FWIW, even if the rows arrive in order, they can end up not rolled up if they
are read by two separate Kafka tasks (or even by the same task across a
segment boundary caused by the max row count or an intermediate handoff).
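
To illustrate, here is a toy sketch in Python. It is not Druid code: the
helper names, the row layout (timestamp, dimension, count) and the rule of
"start a new segment after N input rows" are assumptions made only to show
the effect of rolling up each segment independently.

```python
from collections import defaultdict


def roll_up(rows):
    """Sum `count` over rows that share the same (timestamp, dimension) key."""
    totals = defaultdict(int)
    for timestamp, dimension, count in rows:
        totals[(timestamp, dimension)] += count
    return [(ts, dim, n) for (ts, dim), n in sorted(totals.items())]


def ingest_in_segments(rows, max_rows_per_segment):
    """Hypothetical model: each chunk of input rows becomes its own segment,
    and rollup only happens inside a segment, never across segments."""
    stored = []
    for start in range(0, len(rows), max_rows_per_segment):
        stored.extend(roll_up(rows[start:start + max_rows_per_segment]))
    return stored


# Rows arrive strictly in timestamp order from a single partition.
rows = [
    ("10:00", "a", 1),
    ("10:00", "a", 1),
    ("10:00", "a", 1),
    ("10:01", "a", 1),
]

perfect = roll_up(rows)                 # one rollup over everything
stored = ingest_in_segments(rows, 2)    # segment boundary after 2 rows

print(len(perfect), len(stored))
```

With a boundary after two rows, the 10:00 key is stored twice (once per
segment), so the datasource holds three rows instead of two. The summed
counts are still the same, which is why queries that aggregate at query time
return correct results even when rollup is not perfect.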

On Fri, Aug 9, 2019 at 10:50 AM Prashant Deva <prashant.d...@gmail.com>
wrote:

> In this case there is only a single Kafka partition and rows are coming in
> the order of timestamp and at the timestamp itself (no out of order rows
> are being ingested).
>
> On Fri, Aug 9, 2019 at 10:21 AM Gian Merlino <g...@apache.org> wrote:
>
> > Hey Prashant,
> >
> > Thanks for the report, we'll look into this.
> >
> > Btw- I would generally not expect datasources that use Kafka ingestion to
> > be _fully_ aggregated, since it doesn't guarantee perfect rollup (it'll
> > roll up rows that come in to the same Kafka ingestion task, which is based
> > on when they come in and what partitions they come in from). But
> > nevertheless we'll check out this issue to see if it looks like something
> > is wrong.
> >
> > On Fri, Aug 9, 2019 at 9:57 AM Prashant Deva <prashant.d...@gmail.com>
> > wrote:
> >
> > > By 'Data sources that don't aggregate' I mean data sources that have the
> > > same query granularity as the data that is generated.
> > >
> > > Prashant
> > >
> > >
> > > On Fri, Aug 9, 2019 at 9:55 AM Prashant Deva <prashant.d...@gmail.com>
> > > wrote:
> > >
> > > > Only posting this because I think this is a major issue.
> > > > See bug #8276 <https://github.com/apache/incubator-druid/issues/8276>
> > > > for details.
> > > > Seems to be happening on every data source that is aggregating during
> > > > ingestion.
> > > >
> > > > Data sources that don't aggregate don't have this issue.
> > > >
> > > > Prashant
> > > >
> > >
> >
> --
> Prashant
>
