+1 to incremental scan.

When newly added data is also ingested, are we issuing the same query
multiple times? Or is the result set of the first query updated
continuously with the newer records? In the latter case, the result set
can effectively be an infinitely iterable set.
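
To make the first option concrete, here is a minimal sketch of re-issuing
the query once per window, comparing the offset calculation described in
the quoted message below (fetchSize * pageNumber) with an offset based on
the tuples read so far. It is not the operator's actual code: the class
OffsetSketch, the methods fetch and run, and the in-memory list standing in
for the MySQL table are all hypothetical, and it assumes the real query
would return rows in a stable order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical simulation; the list stands in for a table read with
// something like "SELECT ... ORDER BY id LIMIT fetchSize OFFSET offset".
public class OffsetSketch {

    // Returns at most fetchSize rows starting at offset, or nothing if the
    // offset is past the end of the table.
    static List<Integer> fetch(List<Integer> table, int offset, int fetchSize) {
        if (offset >= table.size()) {
            return new ArrayList<>();
        }
        int end = Math.min(table.size(), offset + fetchSize);
        return new ArrayList<>(table.subList(offset, end));
    }

    // Runs one query per window and returns every row ingested.
    // offsetFromRowsRead = false: offset is fetchSize * pageNumber.
    // offsetFromRowsRead = true:  offset is the number of rows read so far.
    static List<Integer> run(boolean offsetFromRowsRead, int fetchSize, int windows) {
        List<Integer> table = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        List<Integer> ingested = new ArrayList<>();
        int pageNumber = 0;
        for (int w = 0; w < windows; w++) {
            if (w == 5) {
                // New rows arrive after the existing data has been drained.
                table.add(6);
                table.add(7);
            }
            int offset = offsetFromRowsRead ? ingested.size() : fetchSize * pageNumber;
            ingested.addAll(fetch(table, offset, fetchSize));
            pageNumber++; // incremented every window, even when nothing was read
        }
        return ingested;
    }

    public static void main(String[] args) {
        System.out.println(run(false, 2, 8)); // prints [1, 2, 3, 4, 5]
        System.out.println(run(true, 2, 8));  // prints [1, 2, 3, 4, 5, 6, 7]
    }
}
```

In this sketch the page-based offset keeps growing during the empty
windows, so rows 6 and 7 are never reached, while the rows-read offset
stays at 5 and picks them up as soon as they appear.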


~Bhupesh

On Mon, May 9, 2016 at 10:34 AM, Priyanka Gugale <[email protected]>
wrote:

> Incremental scan was not available with the JDBC operator until now. +1
> for adding that.
>
> -Priyanka
>
> On Mon, May 9, 2016 at 8:56 AM, Mohit Jotwani <[email protected]>
> wrote:
>
> > +1 for incremental data.
> >
> > Regards,
> > Mohit
> > On 9 May 2016 19:59, "Yogi Devendra" <[email protected]>
> > wrote:
> >
> > > +1 for incremental data fetching.
> > > > For the fetchDirection variable, it is better to get input from the
> > > > original author (if possible).
> > >
> > > ~ Yogi
> > >
> > > On 9 May 2016 at 19:04, Akshay Gore <[email protected]> wrote:
> > >
> > > > +1 for incremental data fetching. This is a must-have feature.
> > > >
> > > > -Akshay
> > > > On 09-May-2016 3:39 pm, "Sandeep Deshmukh" <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > > I am using JdbcPOJOInputOperator to ingest data from MySQL to HDFS.
> > > > > > I observed that once the existing data is ingested, newly added
> > > > > > data in MySQL is not ingested. At the same time, if I add some data
> > > > > > to MySQL while the ingestion is still going on, the newly added
> > > > > > data is also ingested into HDFS.
> > > > >
> > > > > > In the code, fetching data in batches is achieved using the
> > > > > > fetchSize parameter, which limits the number of tuples to fetch per
> > > > > > result set; pageNumber is used internally to manage the offset
> > > > > > calculation as (fetchSize * pageNumber). The pageNumber is
> > > > > > incremented per window.
> > > > >
> > > > > > When the existing tuples are ingested, there is no further data to
> > > > > > ingest, but the pageNumber variable is still incremented. This
> > > > > > results in trying to fetch data that is beyond the number of tuples
> > > > > > in the table/query result.
> > > > >
> > > > > > Changing the offset calculation to use the tuples read so far will
> > > > > > fix this issue, and the operator can then be used to poll for newer
> > > > > > data in the table.
> > > > >
> > > > > > If you need to have a quick look at the code:
> > > > > > https://github.com/apache/incubator-apex-malhar/blob/master/library/src/main/java/com/datatorrent/lib/db/jdbc/JdbcPOJOInputOperator.java
> > > > >
> > > > > > Side observation: the fetchDirection variable is unused in the
> > > > > > code. I will remove it from the class.
> > > > >
> > > > > > I would like to get your thoughts on my observations. I will create
> > > > > > a JIRA and open a PR based on the inputs received on this thread.
> > > > >
> > > > > Regards,
> > > > > Sandeep
> > > > >
> > > >
> > >
> >
>
