Github user viirya commented on the issue:
https://github.com/apache/spark/pull/14847
@rxin Thanks for the recommendation! Let me close it for now and work on it.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/14847
@viirya This is pretty interesting. Based on the suggestion from @rxin,
shall we discuss it? Also cc @ioana-delaney @nsyca
---
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/14847
So I'd recommend closing this for now. @viirya you are welcome to work on
it still, but it should:
1. Be obvious in explain (e.g. through an operator) that this behavior
exists.
2. Works
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/14847
FYI this is a relevant paper
http://www.comp.nus.edu.sg/~tankl/cs5208/readings/enough.pdf
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14847
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66730/
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14847
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14847
**[Test build #66730 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66730/consoleFull)**
for PR 14847 at commit
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/14847
@viirya can you try to create a new operator for this optimization and make
it work with whole-stage-codegen? thanks!
---
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/14847
This is really only interesting if it works with whole-stage codegen.
In addition, it'd make sense to have an explicit operator for this, e.g.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14847
**[Test build #66730 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66730/consoleFull)**
for PR 14847 at commit
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/14847
Re-opening this to see if we can reach some consensus about this direction.
---
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/14847
If the filter condition is deterministic, I think this is a good
optimization, especially for bucketed (with SORT BY) tables. cc @yhuai
@gatorsmile
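
For readers following the thread, here is a minimal sketch of why sorted input and a deterministic condition matter. It assumes the optimization is to stop scanning once a predicate on the sort key first fails; the thread does not spell out the mechanism, so treat that as an assumption. It is plain Python rather than Spark code, and the names `filter_full_scan` and `filter_stop_early` are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def filter_full_scan(
    rows: Iterable[int], predicate: Callable[[int], bool]
) -> Tuple[List[int], int]:
    """Ordinary filter: evaluates the predicate on every row."""
    examined = 0
    kept: List[int] = []
    for row in rows:
        examined += 1
        if predicate(row):
            kept.append(row)
    return kept, examined


def filter_stop_early(
    rows: Iterable[int], predicate: Callable[[int], bool]
) -> Tuple[List[int], int]:
    """Filter for input sorted on the predicate's key (assumption).

    Only valid when the predicate is deterministic and, once false,
    stays false for all later rows (e.g. `key < 10` on ascending keys).
    """
    examined = 0
    kept: List[int] = []
    for row in rows:
        examined += 1
        if not predicate(row):
            break  # no later row can match, so stop scanning
        kept.append(row)
    return kept, examined


sorted_keys = list(range(1_000))


def less_than_10(key: int) -> bool:
    return key < 10


full = filter_full_scan(sorted_keys, less_than_10)
early = filter_stop_early(sorted_keys, less_than_10)
```

Both functions return the same rows here, but the early-stopping version examines 11 rows instead of 1,000. With a non-deterministic predicate, or one that can become true again later in the sort order, stopping early would drop rows, which is why the condition above matters.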
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/14847
@cloud-fan If this direction is okay with you, I can investigate it further.
---
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/14847
Is it possible to speed up the case where whole-stage codegen is on? We can
create a new physical operator `FilterOnSortedData` to isolate this special
logic.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/14847
/cc @cloud-fan @rxin @davies for reviewing this. Thanks.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14847
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64520/
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14847
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14847
**[Test build #64520 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64520/consoleFull)**
for PR 14847 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14847
**[Test build #64520 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64520/consoleFull)**
for PR 14847 at commit