[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-16 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-157027289 @marmbrus Is this one OK for branch-1.6? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-16 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-157027571 @HyukjinKwon Thanks! I've merged this one to master. And yes, please feel free to add the decimal test case(s).

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-16 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/9060

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-16 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-157138358 Merging to branch-1.6.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-16 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-157108064 Sure

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156879272 I accidentally saw `TODO Adds test case for reading dictionary encoded decimals written as 'FIXED_LEN_BYTE_ARRAY'`. I will also add this test in the following

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156891545 **[Test build #45964 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45964/consoleFull)** for PR 9060 at commit

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156891628 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156891627 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-15 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156879507 **[Test build #45964 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45964/consoleFull)** for PR 9060 at commit

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-13 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/9060#discussion_r44765188 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala --- @@ -513,6 +515,41 @@ class ParquetIOSuite

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-13 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/9060#discussion_r44764961 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala --- @@ -513,6 +515,41 @@ class ParquetIOSuite

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-13 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/9060#discussion_r44764956 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala --- @@ -513,6 +515,41 @@ class ParquetIOSuite

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-13 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156379942 LGTM except for a few minor styling issues. I can merge it right after you fix them.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156072061 I think we can check for column encoding information, which is accessible from Parquet footers. For example, `PARQUET_2_0` uses `RLE_DICTIONARY` while `PARQUET_1_0`
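
The encoding check suggested above can be sketched in plain Scala. This is only an illustration, not the test that was added to the PR: it assumes the encoding names of each column chunk have already been read from a Parquet footer as strings, and the names `WriterVersionHint` and `looksLikeV2` are hypothetical. It relies on a single signal from the comment, namely that `PARQUET_2_0` uses `RLE_DICTIONARY`.

```scala
// Hypothetical helper, not part of Spark or Parquet.
// Input: one set of encoding names per column chunk, assumed to have been
// extracted from the Parquet footer metadata beforehand.
object WriterVersionHint {
  // True when at least one column chunk reports RLE_DICTIONARY, the encoding
  // the comment above associates with PARQUET_2_0.
  def looksLikeV2(columnEncodings: Seq[Set[String]]): Boolean =
    columnEncodings.exists(_.contains("RLE_DICTIONARY"))
}
```

A column that is not dictionary encoded gives no signal either way, so a check like this is only meaningful for data that is known to be dictionary encoded.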

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156076494 Thank you very much. I will try it that way.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156077334 You may construct a Parquet file consisting of a single column with dictionary encoding using: ```scala val path = "file:///tmp/parquet/dict"

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156306727 Fortunately, I have worked with the Parquet tools once and have looked through the Parquet code several times :). Thank you very much for your help. This could be done

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156306860 [Test build #45810 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45810/consoleFull) for PR 9060 at commit

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156322308 Build finished. Test FAILed.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156322309 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156322233 [Test build #45810 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45810/console) for PR 9060 at commit

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156354310 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156327284 Merged build started.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156327273 Merged build triggered.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156354309 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156354224 **[Test build #45831 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45831/consoleFull)** for PR 9060 at commit

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156099372 Thanks! I will follow that approach.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156309719 **[Test build #45811 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45811/consoleFull)** for PR 9060 at commit

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156325563 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156325565 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156325499 **[Test build #45811 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45811/consoleFull)** for PR 9060 at commit

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156327584 **[Test build #45831 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45831/consoleFull)** for PR 9060 at commit

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156306712 Build started.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156306692 Build triggered.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156309106 Merged build triggered.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-156309116 Merged build started.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-11 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-155718158 @HyukjinKwon Oh yeah, sorry. Finally got some time to clean up my review queue :) I wonder whether there is an easy way to add a test case for this. At first I thought

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-11 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-155718167 ok to test

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-155718924 Merged build triggered.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-155718954 Merged build started.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-155720490 I will try to find and test them first tomorrow before adding a commit!

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-155719264 **[Test build #45626 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45626/consoleFull)** for PR 9060 at commit

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-155752417 **[Test build #45626 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/45626/consoleFull)** for PR 9060 at commit

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-155753066 Merged build finished. Test PASSed.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-155753068 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-155973698 @liancheng I made a few attempts to figure out the version, but as you said, it is pretty tricky to check the writer version, as it only changes the version of data

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-11-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-154597634 @liancheng I assume you missed this.

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-10-18 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-148994769 /cc @liancheng

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-10-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/9060#discussion_r41705069 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystWriteSupport.scala --- @@ -431,6 +431,7 @@ private[parquet]

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-10-10 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/9060#discussion_r41695242 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystWriteSupport.scala --- @@ -431,6 +431,7 @@ private[parquet]

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-10-10 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/9060 [SPARK-11044][SQL] Parquet writer version fixed as version1 https://issues.apache.org/jira/browse/SPARK-11044 Spark only writes Parquet files with writer version1, ignoring the given
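
The behaviour the PR title asks for, honouring a configured writer version rather than always using version 1, might look roughly like the sketch below. It is a simplified model, not the change made in `CatalystWriteSupport.scala`: a plain `Map` stands in for a Hadoop `Configuration`, and `WriterVersionLookup` and `resolve` are hypothetical names. The key `parquet.writer.version` and the short names `v1` and `v2` are the ones used by parquet-hadoop's `ParquetOutputFormat`.

```scala
// Hypothetical model of the lookup; a Map stands in for a Hadoop Configuration.
object WriterVersionLookup {
  // Configuration key read by parquet-hadoop's ParquetOutputFormat.
  val WriterVersionKey = "parquet.writer.version"

  // Use the configured writer version when present; fall back to "v1"
  // only when the key is absent, instead of hard-coding version 1.
  def resolve(conf: Map[String, String]): String =
    conf.getOrElse(WriterVersionKey, "v1")
}
```

For example, `WriterVersionLookup.resolve(Map("parquet.writer.version" -> "v2"))` yields the configured value, while an empty map falls back to the default.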

[GitHub] spark pull request: [SPARK-11044][SQL] Parquet writer version fixe...

2015-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9060#issuecomment-147047845 Can one of the admins verify this patch?