My 2c on this: this is a super valuable source for DeltaStreamer, and in
the interest of making forward progress, it is totally fine to make things
like "arbitrary queries" a follow-on item.
Just supporting incremental columns, timestamp columns, and maybe compound
columns would already bring immense value.
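
To make the incremental/compound column case concrete, here is a rough
sketch of how the pull query could be built. This is only an illustration
under my own assumptions, not what the PR or the RFC does: the function name
build_incremental_query and its arguments are hypothetical, and table and
column names are assumed to come from trusted configuration because they are
interpolated into the SQL text.

```python
def build_incremental_query(table, columns, checkpoint=None):
    """Return (sql, params) selecting rows strictly after `checkpoint`.

    `columns` is the ordered list of incremental columns (one column for the
    simple case, several for a compound key). `checkpoint` is a tuple of the
    same length, or None for the first (bulk) pull.
    """
    order_by = ", ".join(columns)
    if checkpoint is None:
        # No checkpoint yet: pull everything.
        return f"SELECT * FROM {table} ORDER BY {order_by}", []

    # Lexicographic "greater than", expanded so it does not rely on
    # row-value comparison support in the database:
    #   (c1 > ?) OR (c1 = ? AND c2 > ?) OR ...
    clauses, params = [], []
    for i, col in enumerate(columns):
        equal_prefix = [f"{c} = ?" for c in columns[:i]]
        clauses.append("(" + " AND ".join(equal_prefix + [f"{col} > ?"]) + ")")
        params.extend(checkpoint[:i])
        params.append(checkpoint[i])

    where = " OR ".join(clauses)
    return f"SELECT * FROM {table} WHERE {where} ORDER BY {order_by}", params
```

With columns ["updated_at", "id"] and a checkpoint tuple, this yields a
parameterized WHERE clause that skips everything at or before the
checkpoint. The "?" placeholder style is an assumption; a real JDBC source
would bind parameters through a PreparedStatement.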

My only additional suggestion is to go over the existing PR and see if
there are some implementation aspects that need more upfront design, e.g.,
deriving the checkpoints from extracted data.
Also, if any change to the existing Source or DeltaStreamer framework is
required, it might be best to call this out briefly in the RFC. (I think we
should be ok; just being thorough).
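
As an example of the kind of question I mean, one way to derive the
checkpoint from extracted data is to take the largest incremental-column
value seen in the batch. The sketch below is hypothetical (derive_checkpoint
is not an existing Hudi API) and assumes rows are dicts, the incremental
columns are never null, and the values compare in the same order the
database sorts them.

```python
def derive_checkpoint(rows, columns, previous=None):
    """Return the largest incremental-column tuple seen in `rows`.

    Falls back to `previous` for an empty batch, so the next pull resumes
    from the same position, and never moves the checkpoint backwards.
    """
    if not rows:
        return previous
    newest = max(tuple(row[c] for c in columns) for row in rows)
    if previous is not None and newest < previous:
        return previous
    return newest
```

Things this does not answer, and which may deserve a line in the RFC: rows
that share the checkpoint value but commit later, and nulls in the
incremental columns.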

Overall, a big +1 from me.

On Mon, Jan 6, 2020 at 6:01 AM 蒋晓峰 <[email protected]> wrote:

> Yeah, I got it. Thanks to Rushiraj for explaining. Good idea.
>
> On 01/06/2020 19:58, rushiraj chavan wrote:
> Hi Nicholas,
>
> We discussed incremental mode for arbitrary queries at our end
> and we came up with an idea. We can put a filter criterion in a
> WHERE clause and provide a placeholder for the incremental conditions.
>
> Thanks,
>
> Rushiraj
>
> On Mon, Jan 6, 2020 at 4:50 PM rushiraj chavan <[email protected]>
> wrote:
> >
> > Hi Nicholas,
> >
> > I guess it can be made pluggable. Sorry, I didn't understand what you
> > meant by ... SPI....
> >
> > Purushotham ([email protected]) and I are working on the JDBC Delta
> > Streamer. It is WIP.
> >
> > Thanks,
> > Rushiraj
> >
> > On Mon, Jan 6, 2020 at 4:20 PM 蒋晓峰 <[email protected]> wrote:
> > >
> > > Hi Rushiraj,
> > > As you said, is the idea to make the way of passing custom parameters
> > > to the arbitrary queries pluggable? I think users could extend the
> > > implementation via SPI. Also, is the work on the JDBC Delta Streamer
> > > already done?
> > >
> > >
> > > Bests,
> > > Nicholas
> > >
> > > At 2020-01-06 18:05:28, "rushiraj chavan" <[email protected]> wrote:
> > > >Hi Nicholas,
> > > >
> > > >We haven't given enough thought to it. At a high level, it looks a bit
> > > >hard to generalise, as we won't have any control over the arbitrary
> > > >queries. In that case, the burden would be on the user to configure
> > > >complex queries to be incremental in nature. We would also need to
> > > >provide a way to pass custom parameters to the arbitrary queries, which
> > > >would be defined by the user. It is feasible but needs significant
> > > >design work. We will make a note of this in our future work.
> > > >
> > > >I hope that makes sense.
> > > >
> > > >Thanks,
> > > >Rushiraj
> > > >
> > > >On Mon, Jan 6, 2020 at 2:40 PM 蒋晓峰 <[email protected]> wrote:
> > > >>
> > > >> Hi Purushotham,
> > > >> About arbitrary queries (multi-table complex queries), why support
> > > >> this only in Bulk Mode? What is the concern here?
> > > >> Thanks,
> > > >> Nicholas
> > > >>
> > > >>
> > > >> At 2020-01-06 15:41:31, "Purushotham Pushpavanthar" <[email protected]> wrote:
> > > >> >Hi everyone,
> > > >> >
> > > >> >We are working on introducing the JDBC Delta Streamer as one of the
> > > >> >sources for HUDI. We've drafted an initial version of the design in
> > > >> >RFC-14.
> > > >> >Kindly review and let us know your thoughts.
> > > >> >
> > > >> >I'm initiating this thread to discuss a few comments raised by Vinoth.
> > > >> >
> > > >> >   1. As discussed on the RFC page, we are going to support compound
> > > >> >   incremental columns.
> > > >> >   2. About arbitrary queries (multi-table complex queries), we are
> > > >> >   planning to support this only in Bulk Mode.
> > > >> >
> > > >> >
> > > >> >[1] https://issues.apache.org/jira/browse/HUDI-251
> > > >> >[2] https://cwiki.apache.org/confluence/display/HUDI/RFC+-+14+%3A+JDBC+incremental+puller
> > > >> >
> > > >> >Regards,
> > > >> >Purushotham Pushpavanth
>
