Re: [DISCUSS] 1.14.0 release

Vlad Rozov Fri, 13 Jul 2018 13:04:23 -0700

My 2 cents:

From the Apache point of view it is OK to do a release even if unit tests do not pass at all or a large number of regressions have been introduced. An Apache release is a source release, and as long as it compiles and does not have license issues, it is up to the community (PMC) to decide on any other criteria for a release.

The issue in DRILL-6453 is not limited to a large number of hash joins. It should be possible to reproduce it even with a single hash join as long as the left and right sides are getting batches from one (many) to many exchanges (broadcast or hash partitioner senders).


Thank you,

Vlad

On 7/13/18 08:41, Aman Sinha wrote:

I would say we have to take a measured approach to this and decide on a
case-by-case basis which issue is a show stopper.
While of course we have to make every effort to avoid regressions, we cannot
claim that a particular release will not cause any regression.
I believe there are 10000+ passing tests, so that should provide a level
of confidence. TPC-DS 72 is a 10-table join, which in the Hadoop world of
denormalized schemas is not very common. The main question is: does
the issue reproduce with fewer joins that have the same type of distribution
plan?


Aman

On Fri, Jul 13, 2018 at 7:36 AM Arina Yelchiyeva <[email protected]>
wrote:

We cannot release with existing regressions, especially taking into account
that these are not minor issues.
As far as I understand, reverting is not an option since the hash join spill
feature is spread across several commits plus subsequent fixes.
I guess we need to consider postponing the release until the issues are
resolved.

Kind regards,
Arina

On Fri, Jul 13, 2018 at 5:14 PM Boaz Ben-Zvi <[email protected]> wrote:

(Guessing ...) It is possible that the root cause for DRILL-6606 is
similar to that in DRILL-6453 -- that is the new "early sniffing" in the
Hash-Join, which repeatedly invokes next() on the two "children" of the
join *during schema discovery* until non-empty data is returned (or NONE,
STOP, etc). Last night Salim, Vlad and I briefly discussed alternatives,
like postponing the "sniffing" to a later time (beginning of the build for
the right child, and beginning of the probe for the left child).
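
[Editorial illustration, not part of the original message.] The "early
sniffing" loop described above can be sketched roughly as follows. This is
a minimal Python sketch, not Drill's actual code: Outcome, Child and sniff
are hypothetical names, and the outcome values only loosely mirror the
NONE / STOP states mentioned in the message.

```python
from enum import Enum, auto


class Outcome(Enum):
    """Hypothetical stand-in for the outcomes a child batch can return."""
    OK_NEW_SCHEMA = auto()
    OK = auto()
    NONE = auto()
    STOP = auto()


class Child:
    """Hypothetical upstream operator replaying (outcome, row_count) pairs."""

    def __init__(self, batches):
        self._batches = iter(batches)
        self.row_count = 0
        self.calls = 0  # how many times next() was invoked on this child

    def next(self):
        self.calls += 1
        # Once the scripted batches run out, report end of data.
        outcome, self.row_count = next(self._batches, (Outcome.NONE, 0))
        return outcome


def sniff(child):
    """Call next() until the child returns rows or a terminal outcome."""
    while True:
        outcome = child.next()
        if outcome in (Outcome.NONE, Outcome.STOP):
            return outcome
        if child.row_count > 0:
            return outcome
        # Empty batch: keep pulling from upstream.


# Left side sends a schema, one empty batch, then real data.
left = Child([(Outcome.OK_NEW_SCHEMA, 0), (Outcome.OK, 0), (Outcome.OK, 5)])
# Right side sends only a schema and then ends.
right = Child([(Outcome.OK_NEW_SCHEMA, 0)])

left_outcome = sniff(left)
right_outcome = sniff(right)
print(left_outcome, left.calls, right_outcome, right.calls)
```

In this sketch both children are drained past their empty batches before
any build or probe work starts. The alternative mentioned in the message
would move each sniff() call to the start of the build (right child) and
the start of the probe (left child) instead.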

However this would require some work time. So what should we do about 1.14?

   Thanks,

           Boaz

On Fri, Jul 13, 2018 at 3:46 AM, Arina Yelchiyeva <
[email protected]> wrote:

While implementing the late limit 0 optimization, Bohdan found one more
regression introduced by Hash Join spill to disk.
https://issues.apache.org/jira/browse/DRILL-6606

Boaz please take a look.

Kind regards,
Arina
