aksmf1442 commented on PR #10360:
URL: https://github.com/apache/seatunnel/pull/10360#issuecomment-3799851205
> > Hi @aksmf1442, good job. I found one issue you could improve ^_^
> > ### Performance Bottleneck
> > **Description**: The code calls `DateTimeFormatter.ofPattern(dateTimeFormat.getPattern())` inside the `parsedatetime` method. Since this UDF executes **per row**, compiling the pattern every time is extremely inefficient for high-throughput scenarios.
> >
> > **Location**: `seatunnel-transforms-v2/src/main/java/org/apache/seatunnel/transform/sql/zeta/functions/DateTimeFunction.java` (Line ~646)
> >
> > **Recommendation**: Cache the `DateTimeFormatter` instance within the `ZetaDateTimeFormat` enum. `DateTimeFormatter` is thread-safe and should be created only once.
> >
> > **Suggested Change**: Refactor `ZetaDateTimeFormat` to initialize the formatter in its constructor:
> > ```java
> > public enum ZetaDateTimeFormat {
> >     // ...
> >     private final DateTimeFormatter formatter;
> >
> >     ZetaDateTimeFormat(String pattern, FormatType type) {
> >         this.pattern = pattern;
> >         this.type = type;
> >         this.formatter = DateTimeFormatter.ofPattern(pattern); // Initialize once
> >     }
> >
> >     public DateTimeFormatter getFormatter() { return formatter; }
> > }
> > ```
> >
> > Then use `dateTimeFormat.getFormatter()` in the function.
> > By the way, there is a small issue: the whitelist is too strict.
> >
> > **Description**: The current whitelist misses common formats such as compact dates (`yyyyMMdd`) and slash separators (`yyyy/MM/dd`).
> >
> > **Recommendation**: Consider expanding the whitelist to include these common variations to reduce migration friction for users.
>
> @davidzollo Thank you for the valuable suggestions. I’ll reflect them in t
@davidzollo
I've addressed both issues.
1. **Performance optimization**: Cached `DateTimeFormatter` in the enum constructor to avoid repeated pattern compilation.
2. **Expanded whitelist**: Added common formats including:
   - Compact: `yyyyMMdd`, `yyyyMMddHHmmss`, `HHmmss`
   - Slash separator: `yyyy/MM/dd`, `yyyy/MM/dd HH:mm:ss`, `yyyy/MM/dd HH:mm:ss.SSS`
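
For readers following along, here is a minimal sketch of how the two changes could fit together. It is not the code from the PR: the enum name `CachedDateTimeFormat`, its constant names, the `FormatType` values, and the `fromPattern` lookup are all assumptions made for illustration. Only the six patterns and the idea of building the formatter once in the constructor come from the discussion above.

```java
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical category tag; the real FormatType in the PR may have different values.
enum FormatType { DATE, TIME, DATETIME }

// Illustrative stand-in for ZetaDateTimeFormat; constant names are assumptions.
enum CachedDateTimeFormat {
    COMPACT_DATE("yyyyMMdd", FormatType.DATE),
    COMPACT_DATETIME("yyyyMMddHHmmss", FormatType.DATETIME),
    COMPACT_TIME("HHmmss", FormatType.TIME),
    SLASH_DATE("yyyy/MM/dd", FormatType.DATE),
    SLASH_DATETIME("yyyy/MM/dd HH:mm:ss", FormatType.DATETIME),
    SLASH_DATETIME_MILLIS("yyyy/MM/dd HH:mm:ss.SSS", FormatType.DATETIME);

    // Pattern -> constant lookup, filled once when the enum class is initialized.
    private static final Map<String, CachedDateTimeFormat> BY_PATTERN = new HashMap<>();

    static {
        for (CachedDateTimeFormat format : values()) {
            BY_PATTERN.put(format.pattern, format);
        }
    }

    private final String pattern;
    private final FormatType type;
    private final DateTimeFormatter formatter;

    CachedDateTimeFormat(String pattern, FormatType type) {
        this.pattern = pattern;
        this.type = type;
        // Built once per constant; DateTimeFormatter is immutable and thread-safe.
        this.formatter = DateTimeFormatter.ofPattern(pattern);
    }

    public String getPattern() { return pattern; }

    public FormatType getType() { return type; }

    public DateTimeFormatter getFormatter() { return formatter; }

    // Whitelist check: returns empty for any pattern that is not listed above.
    public static Optional<CachedDateTimeFormat> fromPattern(String pattern) {
        return Optional.ofNullable(BY_PATTERN.get(pattern));
    }
}
```

With this shape, a per-row call would resolve the constant and reuse its formatter, for example `LocalDate.parse(value, format.getFormatter())`, instead of calling `DateTimeFormatter.ofPattern` on every row.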