Thanks for the contribution, Joshua. Overall I think this is a really good
improvement to this connector, for all the reasons cited in the motivation
section. I do have some requests, though.

The biggest issue I have is that the change is not backward compatible.
We've worked really hard to make sure that all Connect APIs (including
configuration properties) are always backward compatible because this makes
it trivial for users to upgrade from one Connect release to a newer one.
After all, any users that are already using this SMT in their connectors
should be able to upgrade Connect and still have the SMT behave exactly as
it did in earlier releases -- unless they opt in to changing their
configuration to use the new feature. So I would request that the KIP and
proposed implementation be changed to not *change* existing configuration
properties. We can of course introduce new config properties, but they
should have a default that results in exactly the same old behavior as in
earlier releases. A third option is to create a new SMT, though we should
try really hard to just improve the existing SMT since that's generally a
much better UX. Assuming we can make the existing transformation
work, the option to create a new SMT should be documented in the Rejected
Alternatives section, too.

For example, there are (at least?) two options with regard to specifying
the field names. 1) just keep `field` as the name even though a
comma-separated list is allowed, or 2) add `fields` but keep (and
deprecate?) the existing `field` configuration, using `fields` if defined
or if `field` is not defined, and maybe even supporting comma separated
lists of names in both fields. The same goes for `format` and
`format.output`. In general, I think 2 *might* provide a better experience
at the cost of higher implementation complexity. Regardless of which you
choose, the KIP should explicitly propose one approach, and document the
other(s) in the rejected alternatives section. After all, the KIP should be
very clear about what will be proposed.

In terms of deprecating the old configs in option 2, it'd be fine to do
this in the documentation (and code, if applicable) as part of this
feature. That means they could be removed at the earliest in the next major
release after the feature is merged, which would be AK 4.0. And it would be
good to mention this explicitly in the KIP so that we don't need another
KIP just to remove the deprecation; so something like "This configuration
property will be deprecated and documented as such, and eventually removed
in a subsequent major release."

Supporting nested fields would also be a nice touch here, too. However, as
you mention, we'd likely want to make the same change to other SMTs, and as
such we might consider adding support for nested names on all SMTs with a
separate KIP. The two KIPs could be written such that they are independent.

Finally, a KIP need not go into much detail about the implementation, and
should instead focus on the API and behavioral changes. For example, it's
probably not necessary for the KIP to mention that the implementation would
use DateTimeFormatter. However, it would be worth mentioning explicitly
that the `format.input` configuration property will take a regular
expression that is compatible with the JDK's DateTimeFormatter.ofPattern method
-- this very clearly states the constraint on the input values.

Again, very nice job on the KIP so far!

Best regards,

Randall

On Sat, May 15, 2021 at 6:50 AM Joshua Grisham <grishamj...@gmail.com>
wrote:

> Hello again Kafka Developers community!
>
> I thought it has been some time and I would try to dust KIP-682 off and see
> if anyone wanted to take a discussion on them, if it still makes sense or
> if this should go in a different direction.
>
> Anyone have thoughts on this?  That Connect's TimestampConverter could be
> updated to support converting multiple fields in one pass of the
> transformation (instead of chaining multiple instances together if you have
> multiple timestamps to convert) and that it could support a regex-like
> pattern for matching string input instead of only allowing one string
> format and expect that all producers are sending in 100% the same format?
>
> Since some time has gone there are a few conflicts in my proposed PR but
> they seem quite trivial (some error message and formatting of a test result
> from a quick glance) so it should not be hard to resolve them.
>
> Thanks!
>
> Best,
> Joshua
>
>
> On Sat, Nov 7, 2020 at 9:24 PM Joshua Grisham <grishamj...@gmail.com>
> wrote:
>
> > Hi Brandon,
> >
> > I have added what i called "recursive" support to find child fields on
> any
> > level to the Cast and ReplaceField transforms as well, the PR is here
> > https://github.com/apache/kafka/pull/9493
> >
> > I plan to try and write up a KIP for that one as well maybe in the next
> > day or two. It is not exactly the same as specifically targeting one
> nested
> > field based on its full path, but instead will look for the field name at
> > any level of the structure and cast/replace all of them if they match.
> >
> > The same could be added in theory to almost all of the transforms i guess
> > but maybe that is for another discussion 😊
> >
> > But for now with this TimestampConverter the only thing which I have
> > proposed are the two mentioned before - multiple fields and support for
> > pattern like matching of string timestamps.
> >
> > Joshua
> >
> > Den lör 7 nov. 2020 18:42Brandon Brown <bran...@bbrownsound.com> skrev:
> >
> >> I love this idea! I have a current KIP for a hash transform and am
> >> working adding multi field/nested support to that one. I can think of a
> few
> >> times we’ve needed functionality like that. I’d add that adding support
> for
> >> transforming nested fields would be a great feature for this.
> >>
> >> -Brandon
> >>
> >> Brandon Brown
> >> > On Nov 7, 2020, at 11:11 AM, Joshua Grisham <grishamj...@gmail.com>
> >> wrote:
> >> >
> >> > Hello everyone!
> >> > (Thanks to Tom Bentley for directing me in this direction!)
> >> >
> >> > I have made a series of changes to some of the standard Connect
> >> transforms
> >> > to meet some of the challenges at my company to consume messages using
> >> > Connect, and have been running them for a few weeks now as custom
> SMTs.
> >> >
> >> > I realized that several of these changes might actually be really good
> >> > features to be included in the standard transforms so I will open some
> >> KIPs
> >> > to go along with PRs which I already submitted (apologize for going a
> >> bit
> >> > backwards in the process as this was my first time working with the
> >> Kafka
> >> > project).
> >> >
> >> > The first one I have created a KIP for is KIP-682 on the
> >> TimestampConverter
> >> > transform.
> >> >
> >> > Basically the 2 limitations that I am proposing to remove are the
> >> ability
> >> > to only convert one field at a time, but instead convert multiple
> fields
> >> > using a common config, and that if any producer sends a slightly
> >> different
> >> > date/timestamp string format on any message then the transformation
> >> > fails... so to instead add the ability to support variations in the
> >> string
> >> > input format.
> >> >
> >> > More info here:
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-682%3A+Connect+TimestampConverter+support+for+multiple+fields+and+multiple+input+formats
> >> >
> >> > I would appreciate any kind of discussion to make improvements and it
> >> would
> >> > be great to see changes like this being pushed up.
> >> >
> >> > Thanks for now!
> >> >
> >> > Best regards,
> >> > Joshua Grisham
> >>
> >
>

Reply via email to