Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
@steveloughran Agree. We always can make an exception if most need it.
- For any external behavior change (even in experimental APIs), we need to
document it and also mention it in the
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/21060
* from the ASF process-police perspective, something like
versioning/backport policy is something which should be done on the ASF dev
list...consider asking in user@ to see what people's
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
I am fine to accept different opinions for this specific PR. Reverting this
PR is not my goal here. This is a public community. It sounds like the commit
message clearly delivers what this PR
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/21060
I do not see a problem with the commit message here. Is that really the
issue? it accurately describes _what_ changes. The _why_ has always been
documented in discussion, and it is here already.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
I might not explain it well. Sorry for the misunderstanding. Thank you
@rxin for helping me clarify my points. It sounds like many of you think this
backport is fine. I am not against this
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/21060
Adding a flag just in 2.3 is, at least, an unusual thing to do. By this
logic lots of backports should be flag protected but we don't. Why is this
special?
I still don't see much argument
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
I am okay if there's a specific reason. I think this is the point - if
there's a specific reason, that should be mentioned and explained ahead.
Actually, I (and @srowen did as well IIUC) asked
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/21060
It looks to me this is a bug fix that can merit backporting, as
QueryExecutionListener is also marked as experimental,
In this case, I think @gatorsmile is worried one might have written a
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21060
I agree with what @srowen said:
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands,
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/21060
This was a bug fix from my perspective and looked to be low risk. I don't
think this changes any behavior for the user, except if you do a `collect` from
pyspark and have a
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/21060
This is one of those great problems in software engineering: no good
answer. I think case-by-case is generally the best tactic, with a bias against
feature backport, though my track record is
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
cc @rdblue and @steveloughran too who I guess should be interested in
setting up a backporting policy.
---
-
To
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
I am not saying we shouldn't be careful. I am trying to be careful when I
backport. So, your reasons are:
- any behaviour changes shouldn't be backported and it's the basic backport
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
Like what I said above, we need to be very careful when backporting the PR
with the behavior changes, especially when this is **neither a critical issue
nor a regression**. Even if this is a bug
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
This is not just about just inconsistency but a bug. The previous behaivour
doesn't make sense.
Sure, no need to rush.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
Fixing API inconsistency should not be treated as a bug fix.
Please give me a few days. I need to summarize the Spark 2.3 release and
list all the PRs that were backported to the
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
This is not an new feature addition .. this fixes an exiting functionality
to work as expected and consistently ..
Sure, that'd be great. Will join in the discussion.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
> This case specifically collect in PySpark doesn't work alone whereas all
other actions like foreach, show and other cases in other languages works in
all other APIs. Also, that's what a query
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
> The behavior consistency among Python/Scala/R/JAVA does not mean a bug,
right?
This case specifically `collect` in PySpark doesn't work alone whereas all
other actions like
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
> The callback works for collect in R and Scala but Python doesn't. I think
we should at least match the behaviour. I wonder why it's hard to say a bug
when collect is detected in some APIs but
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
> We need to be very careful when backporting the PR with the behavior
changes, especially when this is neither a critical issue nor a regression.
Thus, I do not think we should backport this
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
> withCallback was added in Spark 1.6 release
https://issues.apache.org/jira/browse/SPARK-11068 Since then, my understanding
is we never clearly define which should be part of withCallback.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
`withCallback` was added in Spark 1.6 release
https://issues.apache.org/jira/browse/SPARK-11068 Since then, my understanding
is we never clearly define which should be part of `withCallback`.
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/21060
This certainly looks like a bug fix. I don't know this area well, but I
don't see an argument here that the current behavior is correct. Right?
When we say we don't back-port behavior
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
Yup, that should reduce some overhead like this. I would like to listen
what you guys think cc @srowenn, @vanzin, @felixcheung, @holdenk too.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
I do think we should clearly document the rule what we can backport.
I do not think we should make an exception for this PR. cc @rxin @marmbrus
@yhuai @cloud-fan @ueshin
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
How about we formally document that in the guide?
I have been always putting more importance on practice and I personally
think we are fine to make a backport if it's a bug and the fix
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
If this can be treated as a bug to backport, we have many behavior change
PRs that can be backported. We are building the system software. We have to be
more principled.
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
I agree that It's better to avoid a behaviour change but this one is a
clearly a bug and the fix is straightforward. I am puzzled why this
specifically prompted you. I wouldn't revert if
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
This is just the basic backport rule we follow for each PR. We should not
make an exception for this PR.
---
-
To
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
I am a bit puzzled because `QueryExecutionListener` should call the
callback for actions and `collect` triggers it in Scala and R but it doesn't in
PySpark specifically. It sounds a bug and
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
Users apps should not be blamed in this case. If they want this change,
they should upgrade to the newer release. Basically, we should not introduce
any external behavior change in the
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
I guess the behaviour changes here is that a custom query execution
listener now can recognise the action `collect` in PySpark which other APIs
have detected. Mind explaining how it breaks
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
This will introduce the behavior change and it is not a regression. The
changes we made in this PR could break the external app. We should not do it in
the maintenance release.
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
hm I would say it's a bug since the action is not detected which is
supposed to call the callback. The test is a bit complicated but the fix is
relatively straightforward.
---
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/21060
Since this is not a bug fix, I plan to revert this PR. WDYT? @HyukjinKwon
@BryanCutler
---
-
To unsubscribe, e-mail:
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
Merged to branch-2.3.
Thanks for reviewing this @BryanCutler.
---
-
To unsubscribe, e-mail:
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89363/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21060
**[Test build #89363 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89363/testReport)**
for PR 21060 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2328/
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21060
**[Test build #89363 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89363/testReport)**
for PR 21060 at commit
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/21060
retest this please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89352/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21060
**[Test build #89352 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89352/testReport)**
for PR 21060 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2323/
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21060
**[Test build #89352 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89352/testReport)**
for PR 21060 at commit
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/21060
retest this please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89327/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21060
**[Test build #89327 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89327/testReport)**
for PR 21060 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2306/
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21060
**[Test build #89327 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89327/testReport)**
for PR 21060 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
retest this please
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89312/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21060
**[Test build #89312 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89312/testReport)**
for PR 21060 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Merged build finished. Test PASSed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21060
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2294/
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/21060
**[Test build #89312 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89312/testReport)**
for PR 21060 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/21060
cc @BryanCutler
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
65 matches
Mail list logo