Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/22144
I think we should add the support back. It sounded like some people didn't
like this PR for a fix so we would need to investigate something else,
@cloud-fan did you have more specifics about
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22144
So, 2.4 is out. Where are we? Rereading the comments above, looks we should:
1. Find the root cause
2. Officially drop it if the workaround is not easy
3. Fix it if the
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22144
Maybe we should discuss it in another thread, I think we should officially
write down the procedure about how to evaluate a bug report and label it as a
blocker or not. Currently
Github user markhamstra commented on the issue:
https://github.com/apache/spark/pull/22144
Thanks @tgravescs for your latest posts -- they've saved me from posting
something similar in many respects but more strongly worded.
What is bothering me (not just in the discussion of
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/22144
@cloud-fan
I agree with you that most issues do not have to block a release but I
don't agree with most of your criteria for making that decision. I've already
talked about the other
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22144
@tgravescs please quote my full comment instead of part of it.
> After all, this is a bug and a regression from previous releases, like
other 1000 we've fixed before.
The point I
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/22144
@tgravescs yes, I didn't say it was normal or good, but that it's not
forbidden. Ex: no more Java 7 support in Spark 2.3. The point was: if this
change were on purpose, then no it's not a
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/22144
while I'm ok with not blocking 2.4 for this as well, not for many of the
reasons stated though. Note the jira was filed a Major not a blocker. Based on
the information we have, the impact on the
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22144
Adding it as a known issue sounds reasonable to me as well.
---
-
To unsubscribe, e-mail:
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/22144
I think this is different from the blocker tickets we opened before. We
should try our best to avoid accidentally dropping the existing support. Please
encourage more people in the community to
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22144
Unfortunately, we didn't drop it mistakenly. It's a mistake and we should
fix it. What I try to avoid is adding back the `supportsPartial` flag. We
should look into the root cause and see how to
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22144
> According to the policy, we don't have to block the current release
because of i
@cloud-fan, BTW, would you mind if I ask to share what you read? I want to
be aware of the policy as
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22144
After all, this is a bug and a regression from previous releases, like
other 1000 we've fixed before. According to the policy, we don't have to block
the current release because of it, and this
Github user markhamstra commented on the issue:
https://github.com/apache/spark/pull/22144
Yes, @rxin I know that I was a little unfair to you in order to make my
point sharper. Apologies. My concern is real, though.
---
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/22144
@markhamstra how did you arrive at that conclusion? I said "itâs not a
new regression and also
somewhat esoteric"
---
-
Github user markhamstra commented on the issue:
https://github.com/apache/spark/pull/22144
@srowen I understand and agree. What bothers me is that the block-no block
decision now often seems to be "not a regression; automatic no block" -- and
that doesn't seem right to me.
---
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/22144
@markhamstra I do think there's an unspoken but legitimate consideration
here, and that's that there is also a cost to not shipping the N thousand other
things users are waiting on in this release.
Github user markhamstra commented on the issue:
https://github.com/apache/spark/pull/22144
> Itâs certainly not a blocker since itâs not a new regression
I really hate this line of argument. Somehow we seem to have slipped from
"if it is a regression, then we must block"
Github user AlexanderSaydakov commented on the issue:
https://github.com/apache/spark/pull/22144
I believe there is a misunderstanding. It is not just about HLL Sketch
UDFs. It seems to be a wrong way of executing Hive UDFs in general. It does not
always manifest itself as a failure.
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/22144
So, some particular functionality worked in Spark 2.1, but didn't work
starting in 2.2. If anything, that was the breaking change. (I wonder if it was
on purpose, or documented, but, what's done is
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/22144
The fact that it isn't a new regression shouldn't determine if its a
blocker, there are lots of things you don't hit right away.
The impact does affect if we decide if this is a
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22144
Wait wait .. do we only care about regressions as blockers for the last
release (2.3)? I'm asking this because I really don't know. If so, I'm okay.
---
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/22144
Itâs certainly not a blocker since itâs not a new regression and also
somewhat esoteric. Would be good to fix though.
On Tue, Oct 23, 2018 at 8:20 AM Wenchen Fan
wrote:
>
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22144
This is not a PR that is ready to merge. We are likely talking about
delaying 2.4.0 for multiple weeks because of this issue. Is it really worth?
I'm not sure what's the exact policy,
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22144
For instance,
https://groups.google.com/forum/?utm_medium=email_source=footer#!msg/sketches-user/GmH4-OlHP9g/MW-J7Hg4BwAJ
this discussion thread was started almost one year ago.
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22144
It does try to fix a backward compatibility issue. It is found later now
but still it is true we found a breaking issue to fix. Broken backward
compatibility that potentially affects a set of
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22144
Note that the `supportsPartial` flag was dropped at Spark 2.2, not 2.4.
I'm not very familiar with Hive code so I don't clearly know how it is
broken. The worst case is, Hive has some
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/22144
@cloud-fan I disagree, it seems like you broke functionality that was
there and you can't do that in a minor release. 3.0 would be fine to drop I
think but we should fix it for 2.4, this PR
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22144
If we were going for 3.0, then I would definitely leave +1 and I agree that
we should rather focus on Spark itself as a higher priority - we should do that
when we go 3.0 and rather drop such
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22144
My feeling is that, hive compatibility is not that important to Spark at
this point. *ALL* aggregate functions in Spark (including Spark UDAF) support
partial aggregate, but now we need to
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/22144
hmm .. this is actually a regression and compatibility issue we should take
a look into ..
---
-
To unsubscribe, e-mail:
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/22144
@cloud-fan would you have time to look at this to at least see if this is
something important?
---
-
To unsubscribe, e-mail:
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/22144
hey, this looks important, could someone review this?
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22144
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95763/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22144
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22144
**[Test build #95763 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95763/testReport)**
for PR 22144 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22144
**[Test build #95763 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95763/testReport)**
for PR 22144 at commit
Github user tgravescs commented on the issue:
https://github.com/apache/spark/pull/22144
ok to test
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user pgandhi999 commented on the issue:
https://github.com/apache/spark/pull/22144
@cloud-fan I have updated the PR and added the configurable flags for
Window UDAF and Hive UDAF for now. Would appreciate to hear your thoughts.
Thank you.
---
Github user pgandhi999 commented on the issue:
https://github.com/apache/spark/pull/22144
@cloud-fan I see a flag supportsPartial which was removed in PR
https://github.com/apache/spark/pull/16461. Is this the flag you were talking
about? If so, I was thinking that we make this flag
Github user pgandhi999 commented on the issue:
https://github.com/apache/spark/pull/22144
@cloud-fan your comment makes sense, will work on it.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22144
This should not be a global config but per-UDAF flag. IIRC we do have such
a flag before, but get removed later. Maybe we should bring it back.
---
Github user pgandhi999 commented on the issue:
https://github.com/apache/spark/pull/22144
@dilipbiswal This property is by default set to true so it does not effect
anything currently in the way UDAF's run. This property has been added purely
for the purpose of maintaining backward
Github user dilipbiswal commented on the issue:
https://github.com/apache/spark/pull/22144
@pgandhi999 I have a basic question. Setting this property will have a
global effect on all the aggregations ?
---
-
To
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22144
Can one of the admins verify this patch?
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22144
Can one of the admins verify this patch?
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22144
Can one of the admins verify this patch?
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
47 matches
Mail list logo