[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110083521 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +345,104 @@ object JoinReorderDP

[GitHub] spark issue #16906: [SPARK-19570][PYSPARK] Allow to disable hive in pyspark ...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16906 **[Test build #75562 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75562/testReport)** for PR 16906 at commit

[GitHub] spark issue #16906: [SPARK-19570][PYSPARK] Allow to disable hive in pyspark ...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16906 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75562/ Test PASSed. ---

[GitHub] spark issue #16906: [SPARK-19570][PYSPARK] Allow to disable hive in pyspark ...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16906 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request #17541: [SPARK-20229][SQL] add semanticHash to QueryPlan

2017-04-05 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17541#discussion_r110080643 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -423,8 +423,17 @@ case class CatalogRelation(

[GitHub] spark issue #16906: [SPARK-19570][PYSPARK] Allow to disable hive in pyspark ...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16906 **[Test build #75562 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75562/testReport)** for PR 16906 at commit

[GitHub] spark issue #17222: [SPARK-19439][PYSPARK][SQL] PySpark's registerJavaFuncti...

2017-04-05 Thread zjffdu
Github user zjffdu commented on the issue: https://github.com/apache/spark/pull/17222 @holdenk Would you mind reviewing it?

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110078548 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -327,3 +345,104 @@ object JoinReorderDP

[GitHub] spark pull request #17050: [SPARK-19722] [SQL] [MINOR] Clean up the usage of...

2017-04-05 Thread gatorsmile
Github user gatorsmile closed the pull request at: https://github.com/apache/spark/pull/17050

[GitHub] spark issue #17546: [SPARK-20233] [SQL] Apply star-join filter heuristics to...

2017-04-05 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/17546 @wzhfy @gatorsmile @cloud-fan I've integrated star-join with join enumeration. Would you please take a look? Thanks.

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-05 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17546#discussion_r110076651 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -150,12 +148,15 @@ object

[GitHub] spark issue #17546: [SPARK-20233] [SQL] Apply star-join filter heuristics to...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17546 **[Test build #75561 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75561/testReport)** for PR 17546 at commit

[GitHub] spark pull request #17546: [SPARK-20233] [SQL] Apply star-join filter heuris...

2017-04-05 Thread ioana-delaney
GitHub user ioana-delaney opened a pull request: https://github.com/apache/spark/pull/17546 [SPARK-20233] [SQL] Apply star-join filter heuristics to dynamic programming join enumeration ## What changes were proposed in this pull request? Implements star-join filter to

[GitHub] spark issue #17527: [SPARK-20156][CORE][SQL][STREAMING][MLLIB] Java String t...

2017-04-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17527 Yea, that's the concern. The downside is when these are exposed to users. However, it might be an advantage as well. The behavior doesn't depend on default JVM locale and is consistent. I think

[GitHub] spark issue #17527: [SPARK-20156][CORE][SQL][STREAMING][MLLIB] Java String t...

2017-04-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17527 @HyukjinKwon Yep, so for such cases exposed to users, I think it is better to leave it out for the default locale?

[GitHub] spark issue #17527: [SPARK-20156][CORE][SQL][STREAMING][MLLIB] Java String t...

2017-04-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17527 @viirya, I think it is possible, for example in the case of `Lower`, `Upper` and `InitCap`.

[GitHub] spark pull request #17527: [SPARK-20156][CORE][SQL][STREAMING][MLLIB] Java S...

2017-04-05 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17527#discussion_r110064613 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java --- @@ -407,7 +408,7 @@ public UTF8String toLowerCase() { }

[GitHub] spark issue #17527: [SPARK-20156][CORE][SQL][STREAMING][MLLIB] Java String t...

2017-04-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17527 Out of curiosity, is there any situation where we really do need the locale setting instead of `Locale.ROOT`?
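
The locale question in the SPARK-20156 thread above can be illustrated with a minimal sketch. It is not code from the pull request; the class and method names (`LocaleLowerCaseDemo`, `lowerRoot`, `lowerWith`) are hypothetical, and the Turkish locale is passed explicitly as a stand-in for a JVM whose default locale happens to be Turkish. The sketch only shows why lower-casing an internal identifier with a locale-sensitive call can differ from `Locale.ROOT`.

```java
import java.util.Locale;

public class LocaleLowerCaseDemo {
    // Lower-cases with the locale-neutral root locale, so the result
    // does not depend on the JVM's default locale.
    static String lowerRoot(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    // Lower-cases with an explicitly supplied locale (assumption: this
    // stands in for whatever the JVM default locale would be).
    static String lowerWith(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        String word = "TITLE";

        // Root locale: 'I' maps to the ASCII 'i'.
        System.out.println(lowerRoot(word).equals("title"));

        // Turkish locale: 'I' maps to dotless 'ı' (U+0131), so the
        // result no longer equals the ASCII string "title".
        System.out.println(lowerWith(word, turkish).equals("title"));
        System.out.println(lowerWith(word, turkish).indexOf('\u0131') >= 0);
    }
}
```

For user-facing functions such as `Lower`, `Upper` and `InitCap`, whether the root locale or the default locale is the right choice is exactly what the thread is debating; the sketch takes no position on that.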

[GitHub] spark issue #16793: [SPARK-19454][PYTHON][SQL] DataFrame.replace improvement...

2017-04-05 Thread zero323
Github user zero323 commented on the issue: https://github.com/apache/spark/pull/16793 Thanks @holdenk

[GitHub] spark pull request #17531: [SPARK-20217][core] Executor should not fail stag...

2017-04-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17531

[GitHub] spark issue #17535: [SPARK-20222][SQL] Bring back the Spark SQL UI when exec...

2017-04-05 Thread carsonwang
Github user carsonwang commented on the issue: https://github.com/apache/spark/pull/17535 Yes, it is closely related, but these are two scenarios of adding `SQLExecution.withNewExecutionId`. Now some tests fail because `withNewExecutionId` is called twice.

[GitHub] spark issue #17537: [SPARK-20204][SQL][Followup] SQLConf should react to cha...

2017-04-05 Thread dilipbiswal
Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/17537 @viirya @cloud-fan @gatorsmile Thanks a lot.

[GitHub] spark pull request #17544: [SPARK-20231] [SQL] Refactor star schema code for...

2017-04-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17544

[GitHub] spark issue #17544: [SPARK-20231] [SQL] Refactor star schema code for the su...

2017-04-05 Thread ioana-delaney
Github user ioana-delaney commented on the issue: https://github.com/apache/spark/pull/17544 @gatorsmile Thank you!

[GitHub] spark issue #17538: [SPARK-20223][SQL] Fix typo in tpcds q77.sql

2017-04-05 Thread wzhfy
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/17538 @srowen It's not run automatically. And even if it were run, no error would be caught; the predicate would just become `null` because casting `'2000-08-03]'` to Date fails.

[GitHub] spark issue #17544: [SPARK-20231] [SQL] Refactor star schema code for the su...

2017-04-05 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17544 Thanks! Merging to master.

[GitHub] spark issue #17544: [SPARK-20231] [SQL] Refactor star schema code for the su...

2017-04-05 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17544 LGTM. Look forward to your next PR for merging star join detection with CBO.

[GitHub] spark pull request #17544: [SPARK-20231] [SQL] Refactor star schema code for...

2017-04-05 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17544#discussion_r110060504 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/StarSchemaDetection.scala --- @@ -0,0 +1,351 @@ +/* + * Licensed

[GitHub] spark pull request #17149: [SPARK-19257][SQL]location for table/partition/da...

2017-04-05 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17149#discussion_r110060244 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -285,7 +285,7 @@ private[spark] class

[GitHub] spark issue #17506: [SPARK-20189][DStream] Fix spark kinesis testcases to re...

2017-04-05 Thread yssharma
Github user yssharma commented on the issue: https://github.com/apache/spark/pull/17506 @srowen - does the Jenkins re-test trigger automatically? If not, could I request a retest on this patch, please?

[GitHub] spark pull request #17532: [SPARK-20214][ML] Make sure converted csc matrix ...

2017-04-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17532

[GitHub] spark issue #17532: [SPARK-20214][ML] Make sure converted csc matrix has sor...

2017-04-05 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/17532 LGTM. Merging with master, branch-2.1, and branch-2.0. Thanks a lot!

[GitHub] spark pull request #17149: [SPARK-19257][SQL]location for table/partition/da...

2017-04-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17149#discussion_r110058436 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -285,7 +285,7 @@ private[spark] class

[GitHub] spark issue #17149: [SPARK-19257][SQL]location for table/partition/database ...

2017-04-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17149 yea, please go ahead

[GitHub] spark pull request #17537: [SPARK-20204][SQL][Followup] SQLConf should react...

2017-04-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17537

[GitHub] spark issue #17537: [SPARK-20204][SQL][Followup] SQLConf should react to cha...

2017-04-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17537 thanks, merging to master!

[GitHub] spark pull request #17532: [SPARK-20214][ML] Make sure converted csc matrix ...

2017-04-05 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/17532#discussion_r110057924 --- Diff: python/pyspark/mllib/tests.py --- @@ -853,6 +853,17 @@ def serialize(l): self.assertEqual(sv, serialize(lil.tocsr()))

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17531 Thanks. Merging to master.

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17531 Merged build finished. Test PASSed.

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17531 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75559/ Test PASSed. ---

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17531 **[Test build #75559 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75559/testReport)** for PR 17531 at commit

[GitHub] spark issue #17537: [SPARK-20204][SQL][Followup] SQLConf should react to cha...

2017-04-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17537 LGTM

[GitHub] spark issue #17296: [SPARK-19953][ML] Random Forest Models use parent UID wh...

2017-04-05 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/17296 @MLnick this should be good to go. I made https://issues.apache.org/jira/browse/SPARK-20234 to improve consistency in these basic checks.

[GitHub] spark issue #17532: [SPARK-20214][ML] Make sure converted csc matrix has sor...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17532 Merged build finished. Test PASSed.

[GitHub] spark issue #17532: [SPARK-20214][ML] Make sure converted csc matrix has sor...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17532 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75560/ Test PASSed. ---

[GitHub] spark issue #17532: [SPARK-20214][ML] Make sure converted csc matrix has sor...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17532 **[Test build #75560 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75560/testReport)** for PR 17532 at commit

[GitHub] spark issue #17537: [SPARK-20204][SQL][Followup] SQLConf should react to cha...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17537 Merged build finished. Test PASSed.

[GitHub] spark issue #17537: [SPARK-20204][SQL][Followup] SQLConf should react to cha...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17537 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75558/ Test PASSed. ---

[GitHub] spark issue #17537: [SPARK-20204][SQL][Followup] SQLConf should react to cha...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17537 **[Test build #75558 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75558/testReport)** for PR 17537 at commit

[GitHub] spark pull request #17532: [SPARK-20214][ML] Make sure converted csc matrix ...

2017-04-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/17532#discussion_r110053944 --- Diff: python/pyspark/mllib/tests.py --- @@ -853,6 +853,17 @@ def serialize(l): self.assertEqual(sv, serialize(lil.tocsr()))

[GitHub] spark issue #17527: [SPARK-20156][CORE][SQL][STREAMING][MLLIB] Java String t...

2017-04-05 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17527 I support this idea in general. I can at least identify several references, for example, https://hibernate.atlassian.net/plugins/servlet/mobile#issue/HHH-9722,

[GitHub] spark issue #17494: [SPARK-20076][ML][PySpark] Add Python interface for ml.s...

2017-04-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17494 @jkbradley @MLnick @holdenk If there are no more questions about this change, maybe we can get it into 2.2?

[GitHub] spark issue #17532: [SPARK-20214][ML] Make sure converted csc matrix has sor...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17532 **[Test build #75560 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75560/testReport)** for PR 17532 at commit

[GitHub] spark issue #17532: [SPARK-20214][ML] Make sure converted csc matrix has sor...

2017-04-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17532 @jkbradley A unit test is added. Please check this again. Thanks.

[GitHub] spark pull request #17532: [SPARK-20214][ML] Make sure converted csc matrix ...

2017-04-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/17532#discussion_r110052585 --- Diff: python/pyspark/ml/linalg/__init__.py --- @@ -72,7 +72,9 @@ def _convert_to_vector(l): return DenseVector(l) elif _have_scipy

[GitHub] spark issue #17532: [SPARK-20214][ML] Make sure converted csc matrix has sor...

2017-04-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17532 @jkbradley Thanks for the comment. I will add the unit test now.

[GitHub] spark issue #17544: [SPARK-20231] [SQL] Refactor star schema code for the su...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17544 Merged build finished. Test PASSed.

[GitHub] spark issue #17544: [SPARK-20231] [SQL] Refactor star schema code for the su...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17544 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75557/ Test PASSed. ---

[GitHub] spark issue #17544: [SPARK-20231] [SQL] Refactor star schema code for the su...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17544 **[Test build #75557 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75557/testReport)** for PR 17544 at commit

[GitHub] spark pull request #17539: [SPARK-20224][SS] Updated docs for streaming drop...

2017-04-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17539

[GitHub] spark issue #17530: [SPARK-5158] Access kerberized HDFS from Spark standalon...

2017-04-05 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/17530 > That would work for cluster mode but in client mode the driver on the submitting nodes still needs the keytab unfortunately. You're setting up a special cluster for a single user. I'm

[GitHub] spark issue #17530: [SPARK-5158] Access kerberized HDFS from Spark standalon...

2017-04-05 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/17530 I'm just not sold on the idea that this is necessary in the first place. Personally I don't use standalone nor do I play with it at all, so my concerns are purely from a security standpoint. As in,

[GitHub] spark issue #17530: [SPARK-5158] Access kerberized HDFS from Spark standalon...

2017-04-05 Thread themodernlife
Github user themodernlife commented on the issue: https://github.com/apache/spark/pull/17530 BTW, I'm not trying to give you the hard sell, and I appreciate the help rounding out the requirements from the core committers' POV.

[GitHub] spark issue #17530: [SPARK-5158] Access kerberized HDFS from Spark standalon...

2017-04-05 Thread themodernlife
Github user themodernlife commented on the issue: https://github.com/apache/spark/pull/17530 That would work for cluster mode but in client mode the driver on the submitting nodes still needs the keytab unfortunately. Standalone clusters are best viewed as distributed

[GitHub] spark issue #17530: [SPARK-5158] Access kerberized HDFS from Spark standalon...

2017-04-05 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/17530 And BTW, if you really want to pursue this, please write a detailed spec explaining everything that is being done, and describe all the security issues people need to be aware of. It might even be

[GitHub] spark issue #17543: [SPARK-20230] FetchFailedExceptions should invalidate fi...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17543 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75554/ Test PASSed. ---

[GitHub] spark issue #17543: [SPARK-20230] FetchFailedExceptions should invalidate fi...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17543 Merged build finished. Test PASSed.

[GitHub] spark issue #17543: [SPARK-20230] FetchFailedExceptions should invalidate fi...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17543 **[Test build #75554 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75554/testReport)** for PR 17543 at commit

[GitHub] spark issue #17539: [SPARK-20224][SS] Updated docs for streaming dropDuplica...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17539 Merged build finished. Test PASSed.

[GitHub] spark issue #17539: [SPARK-20224][SS] Updated docs for streaming dropDuplica...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17539 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75556/ Test PASSed. ---

[GitHub] spark issue #17539: [SPARK-20224][SS] Updated docs for streaming dropDuplica...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17539 **[Test build #75556 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75556/testReport)** for PR 17539 at commit

[GitHub] spark issue #17512: [SPARK-20196][PYTHON][SQL] update doc for catalog functi...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17512 Merged build finished. Test PASSed.

[GitHub] spark issue #17512: [SPARK-20196][PYTHON][SQL] update doc for catalog functi...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17512 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/7/ Test PASSed. ---

[GitHub] spark issue #17512: [SPARK-20196][PYTHON][SQL] update doc for catalog functi...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17512 **[Test build #7 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/7/testReport)** for PR 17512 at commit

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/17531 LGTM pending Jenkins.

[GitHub] spark issue #17537: [SPARK-20204][SQL][Followup] SQLConf should react to cha...

2017-04-05 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17537 LGTM pending Jenkins

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17531 **[Test build #75559 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75559/testReport)** for PR 17531 at commit

[GitHub] spark pull request #17544: [SPARK-20231] [SQL] Refactor star schema code for...

2017-04-05 Thread ioana-delaney
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/17544#discussion_r11001 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,339 +20,13 @@ package

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/17531 test this please

[GitHub] spark pull request #17544: [SPARK-20231] [SQL] Refactor star schema code for...

2017-04-05 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17544#discussion_r110031650 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala --- @@ -20,339 +20,13 @@ package

[GitHub] spark issue #14617: [SPARK-17019][Core] Expose on-heap and off-heap memory u...

2017-04-05 Thread jsoltren
Github user jsoltren commented on the issue: https://github.com/apache/spark/pull/14617 This looks good to me. Thanks.

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17531 Merged build finished. Test FAILed.

[GitHub] spark issue #17532: [SPARK-20214][ML] Make sure converted csc matrix has sor...

2017-04-05 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/17532 Btw, I'd really like to get this into 2.2, which will be cut soon. Let me know if you'd like me to take it over. Thanks!

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17531 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75553/ Test FAILed. ---

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17531 **[Test build #75553 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75553/testReport)** for PR 17531 at commit

[GitHub] spark issue #17537: [SPARK-20204][SQL][Followup] SQLConf should react to cha...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17537 **[Test build #75558 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75558/testReport)** for PR 17537 at commit

[GitHub] spark pull request #17544: [SPARK-20231] [SQL] Refactor star schema code for...

2017-04-05 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17544#discussion_r110029591 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/StarSchemaDetection.scala --- @@ -0,0 +1,351 @@ +/* + * Licensed

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17531 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75552/ Test FAILed. ---

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17531 Merged build finished. Test FAILed.

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17531 **[Test build #75552 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75552/testReport)** for PR 17531 at commit

[GitHub] spark issue #17531: [SPARK-20217][core] Executor should not fail stage if ki...

2017-04-05 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/17531 I will leave it around a bit in case @JoshRosen has any further comments. Feel free to merge, btw, if you don't!

[GitHub] spark issue #17544: [SPARK-20231] [SQL] Refactor star schema code for the su...

2017-04-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17544 **[Test build #75557 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75557/testReport)** for PR 17544 at commit

[GitHub] spark pull request #17531: [SPARK-20217][core] Executor should not fail stag...

2017-04-05 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/17531#discussion_r110028203 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -432,7 +432,7 @@ private[spark] class Executor(

[GitHub] spark issue #17544: [SPARK-20231] [SQL] Refactor star schema code for the su...

2017-04-05 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17544 ok to test

[GitHub] spark issue #17544: [SPARK-20231] [SQL] Refactor star schema code for the su...

2017-04-05 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17544 add to whitelist

[GitHub] spark issue #17543: [SPARK-20230] FetchFailedExceptions should invalidate fi...

2017-04-05 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/17543 That JIRA is great. I'll close this PR for now and link my JIRA in there.

[GitHub] spark pull request #17543: [SPARK-20230] FetchFailedExceptions should invali...

2017-04-05 Thread brkyvz
Github user brkyvz closed the pull request at: https://github.com/apache/spark/pull/17543

[GitHub] spark issue #17543: [SPARK-20230] FetchFailedExceptions should invalidate fi...

2017-04-05 Thread kayousterhout
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/17543 In theory (as you may know), the way this is supposed to work is that, since each reduce task reads the map outputs in random order, we delay re-scheduling the earlier stage, to try to

[GitHub] spark issue #17543: [SPARK-20230] FetchFailedExceptions should invalidate fi...

2017-04-05 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/17543 Let me try to draw a graph to better explain this.

[GitHub] spark issue #17543: [SPARK-20230] FetchFailedExceptions should invalidate fi...

2017-04-05 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/17543 Yes, your explanation is on point. If I have 4+ executors that died, then all retries of Stage B will also eventually fail. If we didn't ignore these failures, we could have re-computed the outputs
