Hi,

Some time ago I prepared a PR, https://github.com/apache/nifi/pull/4773, that changes the formatter used for converting dates to and from strings from SimpleDateFormat to DateTimeFormatter. I did it because my benchmarks showed that date formatting is a significant bottleneck in my flows. The cause is that SimpleDateFormat is not thread-safe, so a new instance is created for every format/parse call. DateTimeFormatter, on the other hand, can be created once per expression and reused many times.
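To illustrate the difference, here is a minimal sketch (not the code from the PR). The class name, method names and the pattern are hypothetical and chosen only for the example; it assumes UTC and Locale.US so the output does not depend on the machine it runs on.

```java
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical example class, not part of NiFi.
class FormatterReuse {
    private static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    // DateTimeFormatter is immutable and thread-safe, so one instance can be
    // built once and shared by every call (and every thread).
    private static final DateTimeFormatter SHARED =
            DateTimeFormatter.ofPattern(PATTERN, Locale.US).withZone(ZoneOffset.UTC);

    // SimpleDateFormat is not thread-safe, so the safe way to use it without
    // synchronization is to build a fresh instance on every call.
    static String formatLegacy(long epochMillis) {
        SimpleDateFormat sdf = new SimpleDateFormat(PATTERN, Locale.US);
        sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
        return sdf.format(new Date(epochMillis));
    }

    // Reuses the shared formatter; no per-call allocation of a formatter.
    static String formatShared(long epochMillis) {
        return SHARED.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        long millis = 1611846014000L;
        System.out.println(formatLegacy(millis));
        System.out.println(formatShared(millis));
    }
}
```

For a simple pattern like this one both methods should print the same string; the difference is only in how often a formatter object has to be constructed.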

The problem with this approach is that DateTimeFormatter does not accept exactly the same patterns. It gets close to a backward-compatible formatter thanks to:
DateTimeFormatterBuilder()
                .parseLenient()
                .parseCaseInsensitive()

but for some edge cases the two formatters behave differently, which can require modifications in existing flows. Some examples of patterns that would have to change:
- yyyy-MM-dd'T'HH:mm:ss.SSSX with input 2021-01-28T15:00:14.270+01:00 -> yyyy-MM-dd'T'HH:mm:ss.SSSXXX
- dd/MMM/yyyy:HH:mm:ss with input 28/Jan/2021:14:58:00 +0100 -> dd/MMM/yyyy:HH:mm:ss X
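The edge cases above can be reproduced with a small sketch like the one below. It is only an illustration, not the PR code: the class and helper names are hypothetical, and Locale.US is my assumption so that "Jan" is recognized. As far as I can tell, the difference comes from SimpleDateFormat.parse(String) silently ignoring trailing text, while DateTimeFormatter.parse requires the whole input to be consumed.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical example class, not part of NiFi.
class FormatterCompat {

    // Builds a DateTimeFormatter with the lenient, case-insensitive settings
    // mentioned above. Locale.US is an assumption made for this example.
    static DateTimeFormatter lenientFormatter(String pattern) {
        return new DateTimeFormatterBuilder()
                .parseLenient()
                .parseCaseInsensitive()
                .appendPattern(pattern)
                .toFormatter(Locale.US);
    }

    // True if SimpleDateFormat accepts the input for the given pattern.
    static boolean legacyParses(String pattern, String input) {
        try {
            new SimpleDateFormat(pattern, Locale.US).parse(input);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    // True if the lenient DateTimeFormatter accepts the whole input.
    static boolean modernParses(String pattern, String input) {
        try {
            lenientFormatter(pattern).parse(input);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[][] cases = {
            // { old pattern, adjusted pattern, input }
            {"yyyy-MM-dd'T'HH:mm:ss.SSSX", "yyyy-MM-dd'T'HH:mm:ss.SSSXXX",
                "2021-01-28T15:00:14.270+01:00"},
            {"dd/MMM/yyyy:HH:mm:ss", "dd/MMM/yyyy:HH:mm:ss X",
                "28/Jan/2021:14:58:00 +0100"},
        };
        for (String[] c : cases) {
            System.out.println(c[2]);
            System.out.println("  SimpleDateFormat, old pattern:       " + legacyParses(c[0], c[2]));
            System.out.println("  DateTimeFormatter, old pattern:      " + modernParses(c[0], c[2]));
            System.out.println("  DateTimeFormatter, adjusted pattern: " + modernParses(c[1], c[2]));
        }
    }
}
```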

Because of these differences, and after discussions with @exceptionfactory and @turcsanyip, we see two options for handling this issue:
1. Modify the implementation of the current format / toDate functions. This won't clutter the API, but it will require some date format changes on the users' side (these will be described in tests and a migration guide). An additional question is in which version this should be introduced.
2. Add new formatDateTime / toDateTime functions (using DateTimeFormatter) next to the existing format / toDate functions (using SimpleDateFormat, to be deprecated). This won't break flows that use the current functions, but it will add some complexity to the API, which will vanish once the deprecated functions are removed.

What do you guys think about both options?

Cheers,
Arek
