Hi,
Some time ago I prepared a PR:
https://github.com/apache/nifi/pull/4773 that changes the formatter used
for formatting/parsing dates to/from strings from SimpleDateFormat to
DateTimeFormatter.
I did it because I ran some benchmarks and found that date formatting
is quite a significant bottleneck in my flows.
This is because SimpleDateFormat is not thread-safe and is created for
each format/parse call. On the other hand, DateTimeFormatter can be
created once per expression and reused many times.
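To illustrate the difference, here is a minimal sketch (not the code
from the PR). The class and method names are hypothetical, and the
pattern, Locale.US and the UTC zone are just assumptions picked for the
example. The legacy path builds a new SimpleDateFormat on every call,
while the new path shares one immutable DateTimeFormatter:

```java
import java.text.SimpleDateFormat;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical illustration class, not part of NiFi.
class FormatterReuse {
    static final String PATTERN = "yyyy-MM-dd'T'HH:mm:ss.SSSXXX";

    // DateTimeFormatter is immutable and thread-safe, so a single
    // instance can be built once and shared by all callers.
    static final DateTimeFormatter SHARED =
            DateTimeFormatter.ofPattern(PATTERN, Locale.US);

    // Legacy style: SimpleDateFormat is mutable and not thread-safe,
    // so a fresh instance is created for every call.
    static String formatLegacy(Date date) {
        SimpleDateFormat sdf = new SimpleDateFormat(PATTERN, Locale.US);
        sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
        return sdf.format(date);
    }

    // New style: reuse the shared formatter.
    static String formatShared(ZonedDateTime dateTime) {
        return SHARED.format(dateTime);
    }

    public static void main(String[] args) {
        ZonedDateTime ts = ZonedDateTime.of(
                2021, 1, 28, 14, 0, 14, 270_000_000, ZoneOffset.UTC);
        System.out.println(formatLegacy(Date.from(ts.toInstant())));
        System.out.println(formatShared(ts));
    }
}
```

In a real flow the formatter would presumably be cached per expression
rather than in a static field; the static field just keeps the sketch
short.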
The problem with that approach is that DateTimeFormatter does not
accept exactly the same formats. It provides an almost
backward-compatible formatter thanks to:
DateTimeFormatterBuilder()
.parseLenient()
.parseCaseInsensitive()
, but for some edge cases the behaviour of the two formatters differs
and may require some modifications in flows. Some examples (current
format with input -> format needed with DateTimeFormatter):
- yyyy-MM-dd'T'HH:mm:ss.SSSX with input 2021-01-28T15:00:14.270+01:00 ->
yyyy-MM-dd'T'HH:mm:ss.SSSXXX
- dd/MMM/yyyy:HH:mm:ss with input: 28/Jan/2021:14:58:00 +0100 ->
dd/MMM/yyyy:HH:mm:ss X
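The two examples above can be checked with a small sketch like the one
below. It is not the code from the PR: the class and helper names are
hypothetical and Locale.US is an assumption. It relies on the fact that
SimpleDateFormat.parse(String) stops once the pattern is satisfied and
ignores trailing text, while DateTimeFormatter.parse(CharSequence)
requires the whole input to be consumed:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical illustration class, not part of NiFi.
class PatternDifferences {

    // Lenient, case-insensitive formatter built as described above.
    static DateTimeFormatter lenient(String pattern) {
        return new DateTimeFormatterBuilder()
                .parseLenient()
                .parseCaseInsensitive()
                .appendPattern(pattern)
                .toFormatter(Locale.US);
    }

    // True if SimpleDateFormat accepts the input for this pattern.
    static boolean legacyParses(String pattern, String input) {
        try {
            new SimpleDateFormat(pattern, Locale.US).parse(input);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    // True if the DateTimeFormatter-based parser accepts the input.
    static boolean newParses(String pattern, String input) {
        try {
            lenient(pattern).parse(input);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String iso = "2021-01-28T15:00:14.270+01:00";
        String log = "28/Jan/2021:14:58:00 +0100";

        System.out.println(legacyParses("yyyy-MM-dd'T'HH:mm:ss.SSSX", iso));
        System.out.println(newParses("yyyy-MM-dd'T'HH:mm:ss.SSSX", iso));
        System.out.println(newParses("yyyy-MM-dd'T'HH:mm:ss.SSSXXX", iso));

        System.out.println(legacyParses("dd/MMM/yyyy:HH:mm:ss", log));
        System.out.println(newParses("dd/MMM/yyyy:HH:mm:ss", log));
        System.out.println(newParses("dd/MMM/yyyy:HH:mm:ss X", log));
    }
}
```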
Due to these differences, and after discussions with @exceptionfactory
and @turcsanyip, we see two options for handling this issue:
1. Modify the implementation of the current format / toDate functions.
This won't clutter the API, but will require some date format changes
on the users' side (which will be described in tests and a migration
guide). An additional question is in which version this should be
introduced.
2. Add new formatDateTime and toDateTime functions (using
DateTimeFormatter) next to the existing format and toDate functions
(using SimpleDateFormat, to be deprecated). This won't break flows
using the current functions, but will add some complexity to the API,
which will go away once the deprecated functions are removed.
What do you guys think about both options?
Cheers,
Arek