Github user markhamstra commented on the pull request:
https://github.com/apache/spark/pull/13280
Oh, wait... sorry, I just realized that @liancheng said he also merged to
branch-2.0. +1 on reverting that.
---
If your project is set up for it, you can reply to this email and have yo
Github user markhamstra commented on the pull request:
https://github.com/apache/spark/pull/13280
@rxin Huh? The merge was to master, not branch-2.0. Doesn't that put it
on the 2.1 track and not into 2.0.0? I think that is all that Yin was saying,
that @rdblue was mistaken about th
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/13280
+1 on reverting for the reasons Cheng mentioned. It's very risky to do dep
updates at this point for 2.0, and I was surprised this got merged without at
least verifying the prior performance reg
Github user yhuai commented on the pull request:
https://github.com/apache/spark/pull/13280
Hello @rdblue, we are pretty late in this release cycle. I am afraid that
we cannot actually upgrade Parquet to 1.8.1 because of the following two
reasons:
1. Since this change was merged p
Github user rdblue commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-77478
Thanks @liancheng! It will be great to have predicate push-down for strings
in 2.0!
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/13280
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-77051
Great, I'm merging this to master and branch-2.0. Thanks for working on
this!
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-44477
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-44475
Merged build finished. Test PASSed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-44278
**[Test build #59508 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59508/consoleFull)**
for PR 13280 at commit
[`33e07ee`](https://g
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-39131
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-39129
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-38965
**[Test build #59506 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59506/consoleFull)**
for PR 13280 at commit
[`37b4978`](https://g
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-34310
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-34311
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-34126
**[Test build #59505 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59505/consoleFull)**
for PR 13280 at commit
[`40241dc`](https://g
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-21166
**[Test build #59508 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59508/consoleFull)**
for PR 13280 at commit
[`33e07ee`](https://gi
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/13280#discussion_r64948705
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala
---
@@ -1415,6 +1425,18 @@ class ParquetSchemaS
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-18873
**[Test build #59506 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59506/consoleFull)**
for PR 13280 at commit
[`37b4978`](https://gi
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/13280#discussion_r64947459
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala
---
@@ -1415,6 +1425,18 @@ class ParquetSchemaS
Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/13280#discussion_r64947036
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala
---
@@ -1415,6 +1425,18 @@ class ParquetSche
Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/13280#discussion_r64946259
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java
---
@@ -194,7 +196,24 @@ protecte
Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/13280#discussion_r64945499
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala
---
@@ -1415,6 +1425,18 @@ class ParquetSche
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-14171
**[Test build #59505 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59505/consoleFull)**
for PR 13280 at commit
[`40241dc`](https://gi
Github user rdblue commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-13718
@liancheng, fixed. Yeah, IntelliJ has a few annoyances like that with
Scala. Imports are a mess.
Github user liancheng commented on a diff in the pull request:
https://github.com/apache/spark/pull/13280#discussion_r64943553
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala
---
@@ -1066,19 +1066,29 @@ class ParquetSch
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-10870
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-10868
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-10688
**[Test build #59500 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59500/consoleFull)**
for PR 13280 at commit
[`85e03f9`](https://g
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222189589
**[Test build #59500 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59500/consoleFull)**
for PR 13280 at commit
[`85e03f9`](https://gi
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222046470
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222046471
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222046346
**[Test build #59441 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59441/consoleFull)**
for PR 13280 at commit
[`af3957f`](https://g
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222039291
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222039290
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222039177
**[Test build #59430 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59430/consoleFull)**
for PR 13280 at commit
[`d1c79c7`](https://g
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222034843
**[Test build #59441 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59441/consoleFull)**
for PR 13280 at commit
[`af3957f`](https://gi
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222032994
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222032991
Build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222032895
**[Test build #59422 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59422/consoleFull)**
for PR 13280 at commit
[`30769bd`](https://g
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222027264
**[Test build #59430 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59430/consoleFull)**
for PR 13280 at commit
[`d1c79c7`](https://gi
Github user rdblue commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222027030
@liancheng: rebased. Sorry I missed that earlier.
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222025448
@rdblue LGTM pending rebasing and Jenkins. Thanks for fixing this!
Github user rdblue commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222014796
@liancheng, thanks for pointing out that fix, I've added it. I thought that
was already committed since it has been a while since we fixed the Parquet side.
I've
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-222014105
**[Test build #59422 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59422/consoleFull)**
for PR 13280 at commit
[`30769bd`](https://gi
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-221982103
I think we should upgrade Parquet to 1.8.1 in Spark 2.0 due to the
following reasons:
1. Get PARQUET-251 fixed so that we no longer write corrupted statistics
Github user yhuai commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-221935404
@rdblue Is there any perf evaluation of this new version that we can refer
to?
Github user liancheng commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-221457588
I had once tried to upgrade Parquet to 1.8.1, and one more change needs to
be done for the upgrade:
https://github.com/apache/spark/pull/9225/files#diff-b4108187503e0
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-221428658
The dev/test-dependencies script can auto update the deps files for this
purpose.
One thing we ask people to investigate are changes between old and new
versio
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-221417951
cc @liancheng, who might have an idea about past Parquet perf regressions.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-221417895
Yeah, you would need to explicitly update the dependency list. We added that
as a safeguard against accidentally changing dependencies.
Github user rdblue commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-221417279
I'm not sure what should be done to fix the dependency test failure. Looks
like there's a list of dependencies that needs to be updated. Is that something
I should inclu
Github user rdblue commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-221375636
@rxin, I agree that we shouldn't upgrade if there are perf regressions. I
would like to know what they are so we can fix them in Parquet upstream though.
This should be
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-221366492
**[Test build #3015 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3015/consoleFull)**
for PR 13280 at commit
[`022dd6b`](https://
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-221366165
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-221366112
In the past, Parquet upgrades brought perf regressions. Any idea about this
release?
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-221366162
Merged build finished. Test FAILed.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-221366139
**[Test build #59214 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59214/consoleFull)**
for PR 13280 at commit
[`022dd6b`](https://g
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-221365621
**[Test build #3015 has
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3015/consoleFull)**
for PR 13280 at commit
[`022dd6b`](https://g
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/13280#issuecomment-221365216
**[Test build #59214 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59214/consoleFull)**
for PR 13280 at commit
[`022dd6b`](https://gi
GitHub user rdblue opened a pull request:
https://github.com/apache/spark/pull/13280
[SPARK-9876][SQL]: Update Parquet to 1.8.1.
## What changes were proposed in this pull request?
This includes minimal changes to get Spark using the current release of
Parquet, 1.8.1.