Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-146088680
@marmbrus Thanks for clarifying that.
I have quickly scanned the predicates in expressions. Actually, it seems that
I can't find any predicate that only takes a
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-146103912
The point I'm trying to make is that we should generalize this, so that we
don't have to special case every possible `udf(attr) literal`
but can instead push down any
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-146215917
This implementation only considers the use case of evaluating a single
attribute with a UDF and comparing the result with a literal value. We only
consider this because in
Github user viirya closed the pull request at:
https://github.com/apache/spark/pull/8922
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-145926850
Also, I understand you can't share your internal data/code, but you can
create a similar benchmark on synthetic data.
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-145926723
I'm not suggesting we specialize for UDFs that are Int => Boolean. I'm
suggesting that we generate a function for any predicate that only takes a
single attribute as
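One way to picture the generalization suggested here is the minimal sketch below. It is not Spark's Catalyst code and not the Parquet filter API: `Attr`, `Lit`, `Call`, and `to_column_filter` are hypothetical names invented for illustration. The sketch assumes a tiny expression tree and turns any predicate that references exactly one attribute into a plain per-value function, rather than special-casing the `udf(attr) literal` shape.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable


# Hypothetical expression nodes for illustration; not Catalyst classes.
@dataclass(frozen=True)
class Attr:
    name: str


@dataclass(frozen=True)
class Lit:
    value: Any


@dataclass(frozen=True)
class Call:
    fn: Callable[..., Any]
    args: tuple


def references(expr):
    """Collect the names of all attributes used anywhere in the expression."""
    if isinstance(expr, Attr):
        return {expr.name}
    if isinstance(expr, Lit):
        return set()
    names = set()
    for arg in expr.args:
        names |= references(arg)
    return names


def evaluate(expr, value):
    """Evaluate the expression, substituting `value` for the attribute."""
    if isinstance(expr, Attr):
        return value
    if isinstance(expr, Lit):
        return expr.value
    return expr.fn(*(evaluate(arg, value) for arg in expr.args))


def to_column_filter(expr):
    """Return (column name, value -> bool) when the predicate touches exactly
    one attribute; return None otherwise, meaning it cannot be pushed down
    as a single-column filter."""
    names = references(expr)
    if len(names) != 1:
        return None
    (name,) = names
    return name, (lambda value: bool(evaluate(expr, value)))


# Example: the predicate len(name) > 3, where len stands in for a UDF.
predicate = Call(operator.gt, (Call(len, (Attr("name"),)), Lit(3)))
column, keep = to_column_filter(predicate)

# A predicate over two attributes is rejected.
two_columns = Call(operator.eq, (Attr("a"), Attr("b")))
```

Under these assumptions, the shape of the predicate does not matter: a UDF compared with a literal, a nested call, or a built-in comparison all reduce to the same `(column, function)` pair, which is what a data source would need in order to evaluate the filter on one column.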
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-145632836
"daily sql query" is not sufficiently descriptive. Please post actual
benchmark results with code when making pull requests that claim to improve
performance. It
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-145717868
Hmm, because the SQL query and data schema are sensitive to the company's
business, I may not be able to post them publicly here. The data size is
hundreds of GB to 1 TB, and the sql
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-145721815
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-145721619
[Test build #43263 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43263/console)
for PR 8922 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-145721818
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-145699987
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-14565
Merged build started.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-145700872
[Test build #43263 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43263/consoleFull)
for PR 8922 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-145254675
[Test build #43214 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43214/consoleFull)
for PR 8922 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-145254408
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-145254404
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-145264450
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-145264449
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-145264411
[Test build #43214 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43214/console)
for PR 8922 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144935295
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144935281
Merged build triggered.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144935914
[Test build #43175 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43175/consoleFull)
for PR 8922 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144957948
[Test build #43175 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43175/console)
for PR 8922 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144958187
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144958184
Merged build finished. Test PASSed.
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144991461
I've used a daily SQL query to test the performance difference. Roughly, it
shows an improvement of about 20% or more.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144645064
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144645045
Merged build triggered.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144646240
[Test build #43147 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43147/consoleFull)
for PR 8922 at commit
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144697959
ping @liancheng @marmbrus
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144776282
I will post performance comparison later.
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144772135
I'm a little skeptical that this is worth the complexity. Do you have real
workloads that this speeds up significantly?
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144638860
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144638836
Merged build triggered.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144640115
[Test build #43146 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43146/consoleFull)
for PR 8922 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144682666
[Test build #43147 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43147/console)
for PR 8922 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144682791
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144682795
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144663366
[Test build #43146 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43146/console)
for PR 8922 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144663629
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-144663631
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143785651
[Test build #43061 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43061/console)
for PR 8922 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143785810
Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143785812
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143749165
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143749198
Merged build started.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143750658
[Test build #43061 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43061/consoleFull)
for PR 8922 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143408936
[Test build #43048 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43048/consoleFull)
for PR 8922 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143416167
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143414466
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143414469
Merged build started.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143416156
[Test build #43049 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43049/console)
for PR 8922 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143416166
Merged build finished. Test FAILed.
GitHub user viirya opened a pull request:
https://github.com/apache/spark/pull/8922
[SPARK-10841][SQL] Add pushdown support of UDF for Parquet
JIRA: https://issues.apache.org/jira/browse/SPARK-10841
Currently we can't push down filters involving UDFs to Parquet. In
Github user viirya commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143416242
retest this please.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143409014
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143409012
[Test build #43048 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43048/console)
for PR 8922 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143409013
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143416391
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143416396
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143426933
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143426691
[Test build #43050 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43050/console)
for PR 8922 at commit
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143426937
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143408652
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143408654
Merged build started.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143416591
[Test build #43050 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43050/consoleFull)
for PR 8922 at commit
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/8922#issuecomment-143414672
[Test build #43049 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43049/consoleFull)
for PR 8922 at commit