[GitHub] spark issue #16819: [SPARK-16441][YARN] Set maxNumExecutor depends on yarn c...

2017-02-06 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16819 I don't think this is a necessary change. Already, you can't ask for more resources than the cluster has; the cluster won't grant them. Capping it here means the app can't use more resources if the

[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...

2017-02-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16787 **[Test build #72433 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72433/testReport)** for PR 16787 at commit

[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16787 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72433/ Test FAILed. ---

[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16787 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #16803: [SPARK-19458][SQL]load hive jars from local repo which h...

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16803 Merged build finished. Test PASSed.

[GitHub] spark issue #16803: [SPARK-19458][SQL]load hive jars from local repo which h...

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16803 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72428/ Test PASSed. ---

[GitHub] spark issue #16803: [SPARK-19458][SQL]load hive jars from local repo which h...

2017-02-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16803 **[Test build #72428 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72428/testReport)** for PR 16803 at commit

[GitHub] spark pull request #16810: [SPARK-19464][CORE][YARN][test-hadoop2.6] Remove ...

2017-02-06 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/16810#discussion_r99561230 --- Diff: docs/building-spark.md --- @@ -63,57 +63,30 @@ with Maven profile settings and so on like the direct Maven build. Example: This will

[GitHub] spark issue #16815: [SPARK-19407][SS] defaultFS is used FileSystem.get inste...

2017-02-06 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/16815 retest this please.

[GitHub] spark issue #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter te...

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16817 Merged build finished. Test PASSed.

[GitHub] spark issue #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter te...

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16817 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72430/ Test PASSed. ---

[GitHub] spark issue #16677: [SPARK-19355][SQL] Use map output statistices to improve...

2017-02-06 Thread sujith71955
Github user sujith71955 commented on the issue: https://github.com/apache/spark/pull/16677 @viirya I tested the above-mentioned approach with sample data; it improved performance almost 3X. Please find the test report: Total No of Executors = 3 Total

[GitHub] spark issue #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter te...

2017-02-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16817 **[Test build #72430 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72430/testReport)** for PR 16817 at commit

[GitHub] spark issue #16816: Code style improvement

2017-02-06 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16816 @zhoucen please close this PR and read http://spark.apache.org/contributing.html

[GitHub] spark issue #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8.2

2017-02-06 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16751 Pardon me, but is there anywhere else keeping track of the build break with SBT? It's been failing for a while in master:

[GitHub] spark pull request #16815: [SPARK-19407][SS] defaultFS is used FileSystem.ge...

2017-02-06 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/16815#discussion_r99556423 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetadata.scala --- @@ -47,7 +47,7 @@ object StreamMetadata extends Logging

[GitHub] spark issue #16751: [SPARK-19409][BUILD] Bump parquet version to 1.8.2

2017-02-06 Thread robbinspg
Github user robbinspg commented on the issue: https://github.com/apache/spark/pull/16751 Sorry, I've been away for the weekend. Yes, we use Maven for our test runs. Looks like you have it under control. Thanks

[GitHub] spark pull request #16819: [SPARK-16441][YARN] Set maxNumExecutor depends on...

2017-02-06 Thread wangyum
GitHub user wangyum opened a pull request: https://github.com/apache/spark/pull/16819 [SPARK-16441][YARN] Set maxNumExecutor depends on yarn cluster resources. ## What changes were proposed in this pull request? Dynamically set `spark.dynamicAllocation.maxExecutors` by cluster

[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...

2017-02-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16787 **[Test build #72433 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72433/testReport)** for PR 16787 at commit

[GitHub] spark issue #16818: [SPARK-19451][SQL][Core] Underlying integer overflow in ...

2017-02-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16818 **[Test build #72432 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72432/testReport)** for PR 16818 at commit

[GitHub] spark pull request #16818: [SPARK-19451][SQL][Core] Underlying integer overf...

2017-02-06 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/16818 [SPARK-19451][SQL][Core] Underlying integer overflow in Window function ## What changes were proposed in this pull request? reproduce code: ``` val tw =

[GitHub] spark pull request #16625: [SPARK-17874][core] Add SSL port configuration.

2017-02-06 Thread sarutak
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/16625#discussion_r99538768 --- Diff: docs/configuration.md --- @@ -1797,6 +1797,20 @@ Apart from these, the following properties are also available, and may be useful

[GitHub] spark pull request #16625: [SPARK-17874][core] Add SSL port configuration.

2017-02-06 Thread sarutak
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/16625#discussion_r99540244 --- Diff: docs/security.md --- @@ -49,10 +49,6 @@ component-specific configuration namespaces used to override the default setting Component

[GitHub] spark pull request #16625: [SPARK-17874][core] Add SSL port configuration.

2017-02-06 Thread sarutak
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/16625#discussion_r99540452 --- Diff: core/src/main/scala/org/apache/spark/ui/JettyUtils.scala --- @@ -394,8 +410,7 @@ private[spark] object JettyUtils extends Logging {

[GitHub] spark issue #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter te...

2017-02-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16817 Thank you, @cloud-fan.

[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16787 Merged build finished. Test FAILed.

[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16787 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72429/ Test FAILed. ---

[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...

2017-02-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16787 **[Test build #72429 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72429/testReport)** for PR 16787 at commit

[GitHub] spark issue #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter te...

2017-02-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16817 LGTM if tests pass

[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16386 Can we focus on supporting multiline JSON in this PR? We can leave the improvements to new PRs; otherwise this PR is kind of hard to review.

[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16476 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72431/ Test FAILed. ---

[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field

2017-02-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16476 **[Test build #72431 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72431/testReport)** for PR 16476 at commit

[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16476 Merged build finished. Test FAILed.

[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field

2017-02-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16476 **[Test build #72431 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72431/testReport)** for PR 16476 at commit

[GitHub] spark pull request #16476: [SPARK-19084][SQL] Implement expression field

2017-02-06 Thread gczsjdy
Github user gczsjdy commented on a diff in the pull request: https://github.com/apache/spark/pull/16476#discussion_r99535960 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala --- @@ -340,3 +344,99 @@ object CaseKeyWhen {

[GitHub] spark pull request #16386: [SPARK-18352][SQL] Support parsing multiline json...

2017-02-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16386#discussion_r99535380 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala --- @@ -31,10 +31,17 @@ import

[GitHub] spark issue #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter te...

2017-02-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16817 **[Test build #72430 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72430/testReport)** for PR 16817 at commit

[GitHub] spark pull request #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet fi...

2017-02-06 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/16817 [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter tests for binary and string ## What changes were proposed in this pull request? This PR proposes to enable the tests for Parquet

[GitHub] spark issue #16817: [SPARK-17213][SQL][FOLLOWUP] Re-enable Parquet filter te...

2017-02-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16817 cc @liancheng, could you see if it makes sense?

[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...

2017-02-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16787 **[Test build #72429 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72429/testReport)** for PR 16787 at commit

[GitHub] spark pull request #16386: [SPARK-18352][SQL] Support parsing multiline json...

2017-02-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16386#discussion_r99534380 --- Diff: core/src/main/scala/org/apache/spark/input/PortableDataStream.scala --- @@ -194,5 +195,8 @@ class PortableDataStream( } def

[GitHub] spark issue #16803: [SPARK-19458][SQL]load hive jars from local repo which h...

2017-02-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16803 **[Test build #72428 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72428/testReport)** for PR 16803 at commit

[GitHub] spark pull request #16386: [SPARK-18352][SQL] Support parsing multiline json...

2017-02-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16386#discussion_r99534181 --- Diff: core/src/main/scala/org/apache/spark/input/PortableDataStream.scala --- @@ -194,5 +195,8 @@ class PortableDataStream( } def

[GitHub] spark issue #16803: [SPARK-19458][SQL]load hive jars from local repo which h...

2017-02-06 Thread windpiger
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/16803 retest this please

[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field

2017-02-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16476 **[Test build #72427 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72427/testReport)** for PR 16476 at commit

[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16476 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72427/ Test FAILed. ---

[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16476 Merged build finished. Test FAILed.

[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...

2017-02-06 Thread windpiger
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/16787 retest this please

[GitHub] spark issue #13379: [SPARK-12431][GraphX] Add local checkpointing to GraphX.

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13379 Can one of the admins verify this patch?

[GitHub] spark issue #16476: [SPARK-19084][SQL] Implement expression field

2017-02-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16476 **[Test build #72427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72427/testReport)** for PR 16476 at commit

[GitHub] spark pull request #16791: [SPARK-19409][SPARK-17213] Cleanup Parquet workar...

2017-02-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16791

[GitHub] spark issue #16791: [SPARK-19409][SPARK-17213] Cleanup Parquet workarounds/h...

2017-02-06 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16791 Merging in master.

[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16787 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72426/ Test FAILed. ---

[GitHub] spark issue #16803: [SPARK-19458][SQL]load hive jars from local repo which h...

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16803 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72425/ Test FAILed. ---

[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16787 Merged build finished. Test FAILed.

[GitHub] spark issue #16803: [SPARK-19458][SQL]load hive jars from local repo which h...

2017-02-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16803 Merged build finished. Test FAILed.

[GitHub] spark pull request #16750: [SPARK-18937][SQL] Timezone support in CSV/JSON p...

2017-02-06 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16750#discussion_r99531422 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala --- @@ -298,6 +299,8 @@ class DataFrameReader private[sql](sparkSession:
