[GitHub] spark issue #16620: [SPARK-19263] DAGScheduler should handle stage's pending...

2017-01-19 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/16620 @squito `SchedulerIntegrationSuite` is very helpful. I like it very much; I can reproduce this issue in `SchedulerIntegrationSuite` now. To fix this issue, it is more complicated than I

[GitHub] spark issue #16552: [WIP][SPARK-19152][SQL]DataFrameWriter.saveAsTable suppo...

2017-01-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16552 **[Test build #71654 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71654/testReport)** for PR 16552 at commit

[GitHub] spark issue #16552: [WIP][SPARK-19152][SQL]DataFrameWriter.saveAsTable suppo...

2017-01-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16552 **[Test build #71653 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71653/testReport)** for PR 16552 at commit

[GitHub] spark issue #16552: [WIP][SPARK-19152][SQL]DataFrameWriter.saveAsTable suppo...

2017-01-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16552 **[Test build #71652 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71652/testReport)** for PR 16552 at commit

[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...

2017-01-19 Thread scwf
Github user scwf commented on the issue: https://github.com/apache/spark/pull/16633 I didn't get your point, but let me explain more. If we use map output statistics to decide how many elements each global limit should take: 1. local limit shuffle with the maillist partitioner and return

[GitHub] spark issue #16638: spark-19115

2017-01-19 Thread ouyangxiaochen
Github user ouyangxiaochen commented on the issue: https://github.com/apache/spark/pull/16638 Here are the differences between Hive and Spark2.x, as follows: 1. Hive create table test(id int); --> MANAGED_TABLE create table test(id int) location '/warehouse/test'; -->

[GitHub] spark pull request #16639: [SPARK-19276][CORE] Fetch Failure handling robust...

2017-01-19 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/16639#discussion_r96816680 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -405,6 +415,13 @@ private[spark] class Executor(

[GitHub] spark pull request #16639: [SPARK-19276][CORE] Fetch Failure handling robust...

2017-01-19 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/16639#discussion_r96814693 --- Diff: core/src/main/scala/org/apache/spark/shuffle/FetchFailedException.scala --- @@ -45,6 +45,12 @@ private[spark] class FetchFailedException(

[GitHub] spark pull request #16639: [SPARK-19276][CORE] Fetch Failure handling robust...

2017-01-19 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/16639#discussion_r96815319 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -405,6 +415,13 @@ private[spark] class Executor(

[GitHub] spark pull request #16639: [SPARK-19276][CORE] Fetch Failure handling robust...

2017-01-19 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/16639#discussion_r96814948 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -56,6 +57,8 @@ private[spark] class TaskContextImpl( // Whether the task

[GitHub] spark issue #9168: [SPARK-11182] HDFS Delegation Token will be expired when ...

2017-01-19 Thread leocook
Github user leocook commented on the issue: https://github.com/apache/spark/pull/9168 I just added the principal conf when submitting my job, like below: ``` --principal=@yy \ --keytab=a.keytab \ ``` And spark.hadoop.fs.hdfs.impl.disable.cache=true is not

[GitHub] spark issue #12004: [SPARK-7481] [build] Add spark-cloud module to pull in o...

2017-01-19 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/12004 I have the impression that you can't really use Spark with S3 and only S3, not as an intermediate store, because it's too eventually-consistent. Does the presence of additional integration libraries

[GitHub] spark issue #16633: [SPARK-19274][SQL] Make GlobalLimit without shuffling da...

2017-01-19 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16633 @scwf Even if it finally works, I don't think it is better in performance. Simply calculate it. Assume the limit number is `n`, the partition number is `N`, and each partition has `n / r` rows in

[GitHub] spark issue #16632: [SPARK-19273]shuffle stage should retry when fetch shuff...

2017-01-19 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16632 @viper-kun OK but in any event, we would not merge this PR, so it can be closed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If

[GitHub] spark issue #16631: [SPARK-19271] [SQL] Change non-cbo estimation of aggrega...

2017-01-19 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16631 I will try to review it carefully tomorrow, if needed.

[GitHub] spark issue #16592: [SPARK-19235] [SQL] [TESTS] Enable Test Cases in DDLSuit...

2017-01-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16592 **[Test build #71651 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71651/testReport)** for PR 16592 at commit

[GitHub] spark issue #16611: [SPARK-17967][SPARK-17878][SQL][PYTHON] Support for arra...

2017-01-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16611 **[Test build #71650 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71650/testReport)** for PR 16611 at commit

[GitHub] spark pull request #16643: [SPARK-17724][Streaming][WebUI] Unevaluated new l...

2017-01-19 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/16643#discussion_r96813248 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala --- @@ -19,7 +19,7 @@ package org.apache.spark.ui.jobs import

[GitHub] spark issue #16592: [SPARK-19235] [SQL] [TESTS] Enable Test Cases in DDLSuit...

2017-01-19 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16592 retest this please

[GitHub] spark issue #16611: [SPARK-17967][SPARK-17878][SQL][PYTHON] Support for arra...

2017-01-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16611 **[Test build #71649 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71649/testReport)** for PR 16611 at commit

[GitHub] spark issue #15192: [SPARK-14536] [SQL] fix to handle null value in array ty...

2017-01-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15192 **[Test build #71648 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71648/testReport)** for PR 15192 at commit

[GitHub] spark issue #16643: [SPARK-17724][Streaming][WebUI] Unevaluated new lines in...

2017-01-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16643 **[Test build #71647 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71647/testReport)** for PR 16643 at commit

[GitHub] spark pull request #16643: [SPARK-17724][Streaming][WebUI] Unevaluated new l...

2017-01-19 Thread keypointt
GitHub user keypointt opened a pull request: https://github.com/apache/spark/pull/16643 [SPARK-17724][Streaming][WebUI] Unevaluated new lines in tooltip in DAG Visualization of a job https://issues.apache.org/jira/browse/SPARK-17724 ## What changes were proposed in this

[GitHub] spark issue #16638: spark-19115

2017-01-19 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16638 In the PR, you might need to consider more scenarios. For example, let me ask a question. How does Hive behave when the specified location is not empty?

[GitHub] spark issue #16638: spark-19115

2017-01-19 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16638 We have a few test cases you can follow. Please create test cases. Thanks!

[GitHub] spark issue #12064: [SPARK-14272][ML] Add Loglikelihood in GaussianMixtureSu...

2017-01-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12064 **[Test build #71646 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71646/testReport)** for PR 12064 at commit

[GitHub] spark pull request #16638: spark-19115

2017-01-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16638#discussion_r96809883 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala --- @@ -58,6 +58,7 @@ import org.apache.spark.util.Utils case

[GitHub] spark issue #16642: [SPARK-19284][SQL]append to partitioned datasource table...

2017-01-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16642 **[Test build #71645 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71645/testReport)** for PR 16642 at commit

[GitHub] spark issue #12064: [SPARK-14272][ML] Add Loglikelihood in GaussianMixtureSu...

2017-01-19 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/12064 Jenkins, retest this please.

[GitHub] spark issue #16638: spark-19115

2017-01-19 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16638 Could you follow the title requirement in http://spark.apache.org/contributing.html?

[GitHub] spark pull request #16642: [SPARK-19284][SQL]append to datasource partitione...

2017-01-19 Thread windpiger
GitHub user windpiger opened a pull request: https://github.com/apache/spark/pull/16642 [SPARK-19284][SQL]append to datasource partitioned table without custom partition location ## What changes were proposed in this pull request? When we append data to an existing

[GitHub] spark issue #16635: [SPARK-19059] [SQL] Unable to retrieve data from parquet...

2017-01-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16635 **[Test build #71644 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71644/testReport)** for PR 16635 at commit

[GitHub] spark issue #16441: [SPARK-14975][ML] Fixed GBTClassifier to predict probabi...

2017-01-19 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/16441 @imatiach-msft thanks for this, really great to have GBT in the classification trait hierarchy, and now usable with binary evaluator metrics!

[GitHub] spark issue #16635: [SPARK-19059] [SQL] Unable to retrieve data from parquet...

2017-01-19 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/16635 retest this please.

[GitHub] spark issue #16635: [SPARK-19059] [SQL] Unable to retrieve data from parquet...

2017-01-19 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/16635 process was terminated by signal 9

[GitHub] spark pull request #16621: [SPARK-19265][SQL] make table relation cache gene...

2017-01-19 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16621

[GitHub] spark issue #16621: [SPARK-19265][SQL] make table relation cache general and...

2017-01-19 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16621 Thanks! Merging to master.

[GitHub] spark issue #16635: [SPARK-19059] [SQL] Unable to retrieve data from parquet...

2017-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16635 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71635/ Test FAILed.

[GitHub] spark issue #16635: [SPARK-19059] [SQL] Unable to retrieve data from parquet...

2017-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16635 Merged build finished. Test FAILed.

[GitHub] spark issue #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable support hi...

2017-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16552 Merged build finished. Test FAILed.

[GitHub] spark issue #12064: [SPARK-14272][ML] Add Loglikelihood in GaussianMixtureSu...

2017-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12064 Merged build finished. Test FAILed.

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Merged build finished. Test FAILed.

[GitHub] spark issue #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable support hi...

2017-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16552 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71639/ Test FAILed.

[GitHub] spark issue #12064: [SPARK-14272][ML] Add Loglikelihood in GaussianMixtureSu...

2017-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12064 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71641/ Test FAILed.

[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71642/ Test FAILed.

[GitHub] spark issue #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2017-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16344 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71643/ Test FAILed.

[GitHub] spark issue #16621: [SPARK-19265][SQL] make table relation cache general and...

2017-01-19 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16621 LGTM

[GitHub] spark pull request #16621: [SPARK-19265][SQL] make table relation cache gene...

2017-01-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16621#discussion_r96807499 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala --- @@ -586,12 +594,12 @@ class SessionCatalog(
