I agree with Parth's earlier comment. Getting a framework in place sounds like the right long-term solution. But given that we have more immediate needs (warnings + skip records), can we implement something simpler now until the more general framework is in place?
-- Zelaine

On Wed, Dec 2, 2015 at 10:43 AM, Hsuan Yi Chu <hyi...@maprtech.com> wrote:

> +1 on having a framework.
>
> But how about the #skipped records use case, and warning one
> (https://github.com/abhipol/drill/commit/137059cd44ec28e8ba3bf2aa73d2c1cbcd55d604)
>
> Implementing the framework at this moment sounds a good timing because it
> can benefit those two use cases in one shot.
>
> On Tue, Dec 1, 2015 at 3:52 PM, Parth Chandra <par...@apache.org> wrote:
>
> > +1 on having a framework.
> > OTOH, as with the warnings implementation, we might want to go ahead with a
> > simpler implementation while we get a more generic framework design in
> > place.
> >
> > Jacques, do you have any preliminary thoughts on the framework?
> >
> > On Tue, Dec 1, 2015 at 2:08 PM, Julian Hyde <jh...@apache.org> wrote:
> >
> > > +1 for a sideband mechanism.
> > >
> > > Sideband can also allow correlated restart of sub-queries.
> > >
> > > In sideband use cases you described, the messages ran in the opposite
> > > direction to the data. Would the sideband also run in the same direction as
> > > the data? If so it could carry warnings, rejected rows, progress
> > > indications, and (for online aggregation[1]) notifications that a better
> > > approximate query result is available.
> > >
> > > Julian
> > >
> > > [1] https://en.wikipedia.org/wiki/Online_aggregation
> > >
> > > > On Dec 1, 2015, at 1:51 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
> > > >
> > > > This seems like a form of sideband communication. I think we should have a
> > > > framework for this type of thing in general rather than a one-off for this
> > > > particular need. Other forms of sideband might be small table bloomfilter
> > > > generation and pushdown into hbase, separate file assignment/partitioning
> > > > providers balancing/generating scanner workloads, statistics generation for
> > > > adaptive execution, etc.
> > > >
> > > > --
> > > > Jacques Nadeau
> > > > CTO and Co-Founder, Dremio
> > > >
> > > > On Tue, Dec 1, 2015 at 11:35 AM, Hsuan Yi Chu <hyi...@maprtech.com> wrote:
> > > >
> > > >> I am trying to deal with the following scenario:
> > > >>
> > > >> A bunch of minor fragments are doing things in parallel. Each of them could
> > > >> skip some records. Since the downstream minor fragment needs to know the
> > > >> sum of skipped-record-counts (in order to just display or see if the number
> > > >> exceeds the threshold) in the upstreams, each upstream minor fragment needs
> > > >> to pass this scalar with RecordBatch.
> > > >>
> > > >> Since this seems impacting the protocol of RecordBatch, I am looking for
> > > >> some advice here.
> > > >>
> > > >> Thanks.
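For reference, the scenario in the original message at the bottom of the thread (each upstream minor fragment passing a skipped-record count along with its batch, and the downstream fragment summing the counts and checking a threshold) could be sketched roughly as below. This is only an illustration under stated assumptions: `BatchWithSideband`, `SkipThresholdExceeded`, and `merge_upstream` are hypothetical names, not part of Drill's RecordBatch protocol, and Python is used for brevity rather than Drill's Java.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class BatchWithSideband:
    """Hypothetical stand-in for a record batch plus one extra scalar."""
    records: List[object] = field(default_factory=list)
    skipped_count: int = 0  # records the sending upstream fragment skipped


class SkipThresholdExceeded(Exception):
    """Raised when the summed skipped-record count passes the threshold."""


def merge_upstream(batches: Iterable[BatchWithSideband],
                   threshold: int) -> Tuple[List[object], int]:
    """Downstream side: collect records and sum the skipped counts.

    Fails as soon as the running total exceeds `threshold`.
    """
    records: List[object] = []
    total_skipped = 0
    for batch in batches:
        records.extend(batch.records)
        total_skipped += batch.skipped_count
        if total_skipped > threshold:
            raise SkipThresholdExceeded(
                f"skipped {total_skipped} records, threshold is {threshold}")
    return records, total_skipped
```

A downstream fragment would call `merge_upstream(incoming, threshold=...)` and either display the returned total or let the exception fail the query. A general sideband framework would presumably replace the extra field with a separate message channel.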