FWIW, even if the rows arrive in order, they can still end up not rolled up if they land in two separate Kafka tasks (or in the same task but on either side of a segment boundary, which can be triggered by the max row count or an intermediate handoff).
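
To make that concrete, here is a toy model in plain Python. It is not Druid code: the `rollup` function, the sample rows, and the idea of splitting a list in two to stand in for a task or segment boundary are all assumptions made for illustration. It only shows why rows that are rolled up independently on each side of a boundary stay as separate rows, while summing them at query time still gives the same total.

```python
from collections import defaultdict


def rollup(rows):
    """Sum the metric of rows that share the same (timestamp, dimension) key.

    Hypothetical stand-in for ingestion-time rollup within ONE task/segment.
    """
    totals = defaultdict(int)
    for timestamp, dimension, value in rows:
        totals[(timestamp, dimension)] += value
    return [(ts, dim, total) for (ts, dim), total in sorted(totals.items())]


# Four in-order rows with the same truncated timestamp and dimension value.
rows = [("10:00", "a", 1), ("10:00", "a", 1), ("10:00", "a", 1), ("10:00", "a", 1)]

# Case 1: everything lands in a single task and a single segment.
single_segment = rollup(rows)

# Case 2: a boundary falls after the second row (a second task, or a new
# segment in the same task). Each side is rolled up on its own.
split_segments = rollup(rows[:2]) + rollup(rows[2:])

print(single_segment)   # one fully rolled-up row
print(split_segments)   # two partially rolled-up rows for the same key

# Aggregating again (as a query or a later compaction would) gives the same total.
print(rollup(split_segments) == single_segment)
```

In the split case the stored row count is higher than with perfect rollup, but aggregated results are unchanged, which is the sense in which Kafka ingestion rollup is best-effort.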

On Fri, Aug 9, 2019 at 10:50 AM Prashant Deva <[email protected]> wrote:

> In this case there is only a single Kafka partition and rows are coming in
> the order of timestamp and at the timestamp itself (no out of order rows
> are being ingested).
>
> On Fri, Aug 9, 2019 at 10:21 AM Gian Merlino <[email protected]> wrote:
>
> > Hey Prashant,
> >
> > Thanks for the report, we'll look into this.
> >
> > Btw- I would generally not expect datasources that use Kafka ingestion to
> > be _fully_ aggregated, since it doesn't guarantee perfect rollup (it'll
> > roll up rows that come in to the same Kafka ingestion task, which is based
> > on when they come in and what partitions they come in from). But
> > nevertheless we'll check out this issue to see if it looks like something
> > is wrong.
> >
> > On Fri, Aug 9, 2019 at 9:57 AM Prashant Deva <[email protected]> wrote:
> >
> > > by ' Data sources that dont aggregate' i mean data sources that have same
> > > query granularity as the data that is generated.
> > >
> > > Prashant
> > >
> > > On Fri, Aug 9, 2019 at 9:55 AM Prashant Deva <[email protected]> wrote:
> > >
> > > > Only posting this because I think this is a major issue.
> > > > See bug #8276 <https://github.com/apache/incubator-druid/issues/8276> for
> > > > details.
> > > > Seems to be happening on every data source that is aggregating during
> > > > ingestion.
> > > >
> > > > Data sources that dont aggregate dont have this issue.
> > > >
> > > > Prashant
>
> --
> Prashant
