[GitHub] spark pull request #19459: [SPARK-20791][PYSPARK] Use Arrow to create Spark ...

2017-10-09 Thread BryanCutler
GitHub user BryanCutler opened a pull request: https://github.com/apache/spark/pull/19459 [SPARK-20791][PYSPARK] Use Arrow to create Spark DataFrame from Pandas ## What changes were proposed in this pull request? This change uses Arrow to optimize the creation of a Spark

[GitHub] spark issue #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-10-09 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19269 Several things to discuss: 1. Since Spark can't disable speculation during runtime, currently there is not much benefit to provide an interface for data source to disable speculation,

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19394 **[Test build #82558 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82558/testReport)** for PR 19394 at commit

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19394 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82555/ Test PASSed. ---

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19394 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19394 **[Test build #82555 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82555/testReport)** for PR 19394 at commit

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18853 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82554/ Test PASSed. ---

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18853 Merged build finished. Test PASSed.

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18853 **[Test build #82554 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82554/testReport)** for PR 18853 at commit

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82553/ Test PASSed. ---

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Merged build finished. Test PASSed.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82553 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82553/testReport)** for PR 18732 at commit

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19433 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82557/ Test FAILed. ---

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19433 **[Test build #82557 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82557/testReport)** for PR 19433 at commit

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19433 Merged build finished. Test FAILed.

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19394 Merged build finished. Test FAILed.

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19394 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82556/ Test FAILed. ---

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19394 **[Test build #82556 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82556/testReport)** for PR 19394 at commit

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-09 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19447#discussion_r143540779 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -279,11 +279,11 @@ class

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19433 **[Test build #82557 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82557/testReport)** for PR 19433 at commit

[GitHub] spark issue #19433: [SPARK-3162] [MLlib] Add local tree training for decisio...

2017-10-09 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/19433 add to whitelist

[GitHub] spark issue #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-09 Thread ArtRand
Github user ArtRand commented on the issue: https://github.com/apache/spark/pull/19437 @susanxhuynh @skonto The secret-containing protos will be valid in Mesos 1.3 onwards, which is why the scheduler has that requirement. DC/OS with file-based secrets has Mesos 1.4, which is why we test it

[GitHub] spark pull request #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-10-09 Thread steveloughran
Github user steveloughran commented on a diff in the pull request: https://github.com/apache/spark/pull/19269#discussion_r143530841 --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/sources/v2/JavaSimpleWritableDataSource.java --- @@ -0,0 +1,297 @@ +/* + *

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143529694 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/Word2VecSuite.scala --- @@ -245,5 +508,28 @@ class Word2VecSuite extends SparkFunSuite

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143529286 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/Word2VecSuite.scala --- @@ -189,6 +305,136 @@ class Word2VecSuite extends SparkFunSuite

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143528339 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -171,20 +210,46 @@ final class Word2Vec @Since("1.4.0") (

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143528173 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -171,20 +210,46 @@ final class Word2Vec @Since("1.4.0") (

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-09 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19447#discussion_r143527821 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -279,11 +279,11 @@ class CodegenContext {

[GitHub] spark pull request #19458: [SPARK-22227][CORE] DiskBlockManager.getAllBlocks...

2017-10-09 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19458#discussion_r143527010 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala --- @@ -100,7 +102,9 @@ private[spark] class DiskBlockManager(conf: SparkConf,

[GitHub] spark issue #19433: [SPARK-3162] [MLlib][WIP] Add local tree training for de...

2017-10-09 Thread smurching
Github user smurching commented on the issue: https://github.com/apache/spark/pull/19433 Thanks! I'll remove the WIP. To clear things up for the future, I'd thought [WIP] was the appropriate tag for a PR that's ready for review but not ready to be merged (based on

[GitHub] spark issue #19458: [SPARK-22227][CORE] DiskBlockManager.getAllBlocks now to...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19458 Can one of the admins verify this patch?

[GitHub] spark pull request #19458: [SPARK-22227][CORE] DiskBlockManager.getAllBlocks...

2017-10-09 Thread superbobry
GitHub user superbobry opened a pull request: https://github.com/apache/spark/pull/19458 [SPARK-22227][CORE] DiskBlockManager.getAllBlocks now tolerates temp files ## What changes were proposed in this pull request? Prior to this commit getAllBlocks implicitly assumed that

[GitHub] spark pull request #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-10-09 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/19269#discussion_r143524057 --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/sources/v2/JavaSimpleWritableDataSource.java --- @@ -0,0 +1,297 @@ +/* + * Licensed to

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-09 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 @HyukjinKwon and @ueshin so with Arrow, the Pandas DataFrame from `toPandas()` timestamp columns will not have a timezone - are we going to do the same thing for `pandas_udf` Series? I was

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19394 LGTM

[GitHub] spark issue #19266: [SPARK-22033][CORE] BufferHolder, other size checks shou...

2017-10-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19266 @srowen Thanks! @liufengdb Could you submit a separate PR to fix the issues and also please include the test cases? ---

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-09 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19061 Hi, @vanzin and @jerryshao . Could you review this `ConsoleProgressBar` issue when you have some time?

[GitHub] spark issue #18460: [SPARK-21247][SQL] Type comparison should respect case-s...

2017-10-09 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/18460 Gentle ping~, @gatorsmile . :)

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19394 **[Test build #82556 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82556/testReport)** for PR 19394 at commit

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19439 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82552/ Test PASSed. ---

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19439 Merged build finished. Test PASSed.

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19439 **[Test build #82552 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82552/testReport)** for PR 19439 at commit

[GitHub] spark pull request #19394: [SPARK-22170][SQL] Reduce memory consumption in b...

2017-10-09 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/19394#discussion_r143517522 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala --- @@ -274,19 +274,26 @@ abstract class SparkPlan extends

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19394 **[Test build #82555 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82555/testReport)** for PR 19394 at commit

[GitHub] spark pull request #19394: [SPARK-22170][SQL] Reduce memory consumption in b...

2017-10-09 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/19394#discussion_r143517490 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala --- @@ -274,19 +274,26 @@ abstract class SparkPlan extends

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143516772 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -106,6 +106,45 @@ private[feature] trait Word2VecBase extends Params

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143516595 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -106,6 +106,45 @@ private[feature] trait Word2VecBase extends Params

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143516496 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -106,6 +106,45 @@ private[feature] trait Word2VecBase extends Params

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-09 Thread shubhamchopra
Github user shubhamchopra commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143516384 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala --- @@ -106,6 +106,45 @@ private[feature] trait Word2VecBase extends Params

[GitHub] spark pull request #19269: [SPARK-22026][SQL][WIP] data source v2 write path

2017-10-09 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19269#discussion_r143514294 --- Diff: sql/core/src/test/java/test/org/apache/spark/sql/sources/v2/JavaSimpleWritableDataSource.java --- @@ -0,0 +1,297 @@ +/* + * Licensed

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18853 **[Test build #82554 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82554/testReport)** for PR 18853 at commit

[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on the issue: https://github.com/apache/spark/pull/19374 @skonto One more question: in your screen shot of the History Server, I noticed the "Completed" time is 1969-12-31 for all the drivers (the original one, retry-1, and retry-2). Is that to be

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-09 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143507748 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -519,3 +519,18 @@ case class CoGroup(

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82553 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82553/testReport)** for PR 18732 at commit

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-09 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143506845 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,69 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark issue #19106: [SPARK-21770][ML] ProbabilisticClassificationModel fix c...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19106 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82551/ Test PASSed. ---

[GitHub] spark issue #19106: [SPARK-21770][ML] ProbabilisticClassificationModel fix c...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19106 Merged build finished. Test PASSed.

[GitHub] spark issue #19106: [SPARK-21770][ML] ProbabilisticClassificationModel fix c...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19106 **[Test build #82551 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82551/testReport)** for PR 19106 at commit

[GitHub] spark issue #19457: [SPARK] Misleading error message for missing --proxy-use...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19457 **[Test build #3946 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3946/testReport)** for PR 19457 at commit

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19439 **[Test build #82552 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82552/testReport)** for PR 19439 at commit

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19363 **[Test build #82549 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82549/testReport)** for PR 19363 at commit

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855][SQL] Added flatten functions ...

2017-10-09 Thread sohum2002
Github user sohum2002 commented on the issue: https://github.com/apache/spark/pull/19454 @HyukjinKwon - Thank you for your comments and analysis of this PR. I will also try to improve the `flatMap(identity)` as mentioned by @srowen. Also, will add a python implementation. ---

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19363 Merged build finished. Test PASSed.

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19363 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82549/ Test PASSed. ---

[GitHub] spark pull request #19438: [SPARK-22208] [SQL] Improve percentile_approx by ...

2017-10-09 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19438#discussion_r143492975 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/QuantileSummariesSuite.scala --- @@ -58,7 +58,7 @@ class

[GitHub] spark pull request #19374: [SPARK-22145][MESOS] fix supervise with checkpoin...

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request: https://github.com/apache/spark/pull/19374#discussion_r143344688 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala --- @@ -374,6 +375,15 @@

[GitHub] spark pull request #19374: [SPARK-22145][MESOS] fix supervise with checkpoin...

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request: https://github.com/apache/spark/pull/19374#discussion_r143484031 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala --- @@ -804,45 +814,52 @@

[GitHub] spark pull request #19374: [SPARK-22145][MESOS] fix supervise with checkpoin...

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request: https://github.com/apache/spark/pull/19374#discussion_r143487275 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala --- @@ -276,8 +276,8 @@

[GitHub] spark pull request #19374: [SPARK-22145][MESOS] fix supervise with checkpoin...

2017-10-09 Thread susanxhuynh
Github user susanxhuynh commented on a diff in the pull request: https://github.com/apache/spark/pull/19374#discussion_r143361887 --- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosClusterScheduler.scala --- @@ -276,8 +276,8 @@

[GitHub] spark issue #16648: [SPARK-18016][SQL][CATALYST] Code Generation: Constant P...

2017-10-09 Thread bdrillard
Github user bdrillard commented on the issue: https://github.com/apache/spark/pull/16648 I'm blocking out time to prepare the part 2 PR for this issue starting today over this week, regarding compaction of excess primitive state. cc: @kiszk ---

[GitHub] spark issue #19106: [SPARK-21770][ML] ProbabilisticClassificationModel fix c...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19106 **[Test build #82551 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82551/testReport)** for PR 19106 at commit

[GitHub] spark pull request #19438: [SPARK-22208] [SQL] Improve percentile_approx by ...

2017-10-09 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/19438#discussion_r143481784 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameStatSuite.scala --- @@ -157,21 +157,21 @@ class DataFrameStatSuite extends QueryTest with

[GitHub] spark pull request #19438: [SPARK-22208] [SQL] Improve percentile_approx by ...

2017-10-09 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/19438#discussion_r143481416 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/QuantileSummariesSuite.scala --- @@ -58,7 +58,7 @@ class QuantileSummariesSuite

[GitHub] spark pull request #19438: [SPARK-22208] [SQL] Improve percentile_approx by ...

2017-10-09 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/19438#discussion_r143480931 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/QuantileSummariesSuite.scala --- @@ -58,7 +58,7 @@ class QuantileSummariesSuite

[GitHub] spark issue #17968: [SPARK-9792] Make DenseMatrix equality semantical

2017-10-09 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17968 @gglanzani Also, `ml.linalg.DenseMatrix` looks to have the same bug. Can you update it as well?

[GitHub] spark issue #17968: [SPARK-9792] Make DenseMatrix equality semantical

2017-10-09 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17968 ping @gglanzani This bug needs to be fixed ASAP. Can you update the code when you're free? Thanks.

[GitHub] spark pull request #17968: [SPARK-9792] Make DenseMatrix equality semantical

2017-10-09 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/17968#discussion_r143465176 --- Diff: python/pyspark/mllib/linalg/__init__.py --- @@ -1131,14 +1131,21 @@ def __getitem__(self, indices): return self.values[i + j

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-10-09 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19222 ping for review @hvanhovell @tejasapatil

[GitHub] spark pull request #19250: [SPARK-12297] Table timezone correction for Times...

2017-10-09 Thread zivanfi
Github user zivanfi commented on a diff in the pull request: https://github.com/apache/spark/pull/19250#discussion_r143462649 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -266,6 +267,10 @@ final class DataFrameWriter[T] private[sql](ds:

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19454 BTW, for the answer to https://github.com/apache/spark/pull/19454#issuecomment-335138642, I think you should take a look at, for example, `flatMap` as a reference in `rdd.py` and related

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19454 I think @srowen requested to fix it in a more performant way as well, for example, referring to https://github.com/apache/spark/pull/16276, if I understood correctly and otherwise closing it.

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-10-09 Thread DaimonPl
Github user DaimonPl commented on the issue: https://github.com/apache/spark/pull/16578 @mallman @viirya From my understanding, the current workaround is for the case of reading columns which are not in the file schema > Parquet-mr will throw an exception if we try to read a superset of

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19454 Merged build finished. Test FAILed.

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19454 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82550/ Test FAILed. ---

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19454 **[Test build #82550 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82550/testReport)** for PR 19454 at commit

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19454 **[Test build #82550 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82550/testReport)** for PR 19454 at commit

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19363 **[Test build #82549 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82549/testReport)** for PR 19363 at commit

[GitHub] spark issue #19457: [SPARK] Misleading error message for missing --proxy-use...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19457 **[Test build #3946 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3946/testReport)** for PR 19457 at commit

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19454 Let's fix up the PR title from `[SPARK-18855 ][SQL]` to `[SPARK-18855][SQL]` BTW.

[GitHub] spark issue #19454: [SPARK-22152][SPARK-18855 ][SQL] Added flatten functions...

2017-10-09 Thread sohum2002
Github user sohum2002 commented on the issue: https://github.com/apache/spark/pull/19454 Would appreciate some help in the Python implementation of the `flatten` function as I have never used pyspark. Could someone help me out?

[GitHub] spark issue #19457: [SPARK] Misleading error message for missing --proxy-use...

2017-10-09 Thread pavel-sakun
Github user pavel-sakun commented on the issue: https://github.com/apache/spark/pull/19457 Not aware ATM, this one handles missing value for args expecting one.

[GitHub] spark pull request #19106: [SPARK-21770][ML] ProbabilisticClassificationMode...

2017-10-09 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19106#discussion_r143445232 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/ProbabilisticClassifier.scala --- @@ -230,21 +230,22 @@ private[ml] object

[GitHub] spark pull request #19106: [SPARK-21770][ML] ProbabilisticClassificationMode...

2017-10-09 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19106#discussion_r143445323 --- Diff: mllib/src/main/scala/org/apache/spark/ml/classification/ProbabilisticClassifier.scala --- @@ -230,21 +230,22 @@ private[ml] object

[GitHub] spark issue #19457: [SPARK] Misleading error message

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19457 Can one of the admins verify this patch?

[GitHub] spark pull request #19457: [SPARK] Misleading error message

2017-10-09 Thread pavel-sakun
GitHub user pavel-sakun opened a pull request: https://github.com/apache/spark/pull/19457 [SPARK] Misleading error message Fix misleading error message when argument is expected. ## What changes were proposed in this pull request? Change message to be accurate.

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19363 Merged build finished. Test PASSed.

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19363 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82548/ Test PASSed.

[GitHub] spark issue #19363: [SPARK-22224][Minor]Override toString of KeyValue/Relati...

2017-10-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19363 **[Test build #82548 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82548/testReport)** for PR 19363 at commit

[GitHub] spark pull request #19077: [SPARK-21860][core]Improve memory reuse for heap ...

2017-10-09 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/19077#discussion_r143439369 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala --- @@ -116,9 +116,10 @@ private [sql] object

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19424 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82547/ Test PASSed.
