[GitHub] spark pull request #19106: [SPARK-21770][ML] ProbabilisticClassificationMode...

2017-10-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19106 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #17359: [SPARK-20028][SQL] Add aggreagate expression nGrams

2017-10-10 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/17359 Sorry, but I think this is inactive. Thanks for your attention. @wzhfy @viirya @gatorsmile

[GitHub] spark issue #19106: [SPARK-21770][ML] ProbabilisticClassificationModel fix c...

2017-10-10 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19106 Merged to master. I wasn't clear whether this was a pressing problem that needed to be backported.

[GitHub] spark pull request #17359: [SPARK-20028][SQL] Add aggreagate expression nGra...

2017-10-10 Thread gczsjdy
Github user gczsjdy closed the pull request at: https://github.com/apache/spark/pull/17359

[GitHub] spark pull request #19463: Cleanup comment in RDDSuite test

2017-10-10 Thread sohum2002
Github user sohum2002 closed the pull request at: https://github.com/apache/spark/pull/19463

[GitHub] spark issue #19463: Cleanup comment in RDDSuite test

2017-10-10 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19463 This comment seems valid. It's stating the question the test is trying to answer. I'd close this please, as it would be trivial even if valid

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143646526 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/pythonLogicalOperators.scala --- @@ -0,0 +1,43 @@ +/* + * Licensed

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143646922 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/pythonLogicalOperators.scala --- @@ -0,0 +1,43 @@ +/* + * Licensed

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18732 LGTM

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18664 > Write Arrow data with SESSION_LOCAL timestamp (as is currently in this PR) BTW, could we just use `DateTimeUtils.defaultTimeZone()` instead of `SQLConf.SESSION_LOCAL_TIMEZONE` if you r

[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18711 **[Test build #82578 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82578/testReport)** for PR 18711 at commit [`440f936`](https://github.com/apache/spark/commit/44

[GitHub] spark issue #17968: [SPARK-9792] Make DenseMatrix equality semantical

2017-10-10 Thread gglanzani
Github user gglanzani commented on the issue: https://github.com/apache/spark/pull/17968 @WeichenXu123 Done. Let me know.

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19082 Let me summarize recent interesting PRs for code generation regarding the JVM bytecode limit for JIT compilation. These PRs encourage applying JIT compilation to more methods, since most JIT compilers

[GitHub] spark pull request #17968: [SPARK-9792] Make DenseMatrix equality semantical

2017-10-10 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/17968#discussion_r143656820 --- Diff: python/pyspark/mllib/linalg/__init__.py --- @@ -1131,14 +1131,20 @@ def __getitem__(self, indices): return self.values[i + j

[GitHub] spark pull request #17968: [SPARK-9792] Make DenseMatrix equality semantical

2017-10-10 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/17968#discussion_r143656736 --- Diff: python/pyspark/ml/linalg/__init__.py --- @@ -976,14 +976,20 @@ def __getitem__(self, indices): return self.values[i + j * sel

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18664 I'd say I prefer 1, too. I'm just wondering what if we use timestamp in nested types. Currently we don't support nested types but in the future?

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 @kiszk Thanks for summarizing the PRs. I just have a question about method inlining by the JIT compiler. So you mean the JIT compiler will inline methods into a larger unit and then do JIT compilation

[GitHub] spark issue #17968: [SPARK-9792] Make DenseMatrix equality semantical

2017-10-10 Thread gglanzani
Github user gglanzani commented on the issue: https://github.com/apache/spark/pull/17968 @WeichenXu123 Done again!

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19218 **[Test build #82579 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82579/testReport)** for PR 19218 at commit [`90cbcb3`](https://github.com/apache/spark/commit/90

[GitHub] spark issue #19439: [SPARK-21866][ML][PySpark] Adding spark image reader

2017-10-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19439 I saw there are a few images. Just to make sure: are those images free of license issues, so that they can be included in Spark?

[GitHub] spark pull request #19438: [SPARK-22208] [SQL] Improve percentile_approx by ...

2017-10-10 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19438#discussion_r143684083 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/QuantileSummariesSuite.scala --- @@ -58,7 +58,7 @@ class QuantileSummariesSuite ext

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r143684708 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/HadoopUtils.scala --- @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r143685432 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/HadoopUtils.scala --- @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r143686017 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/HadoopUtils.scala --- @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r143686181 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/HadoopUtils.scala --- @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r143686577 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/HadoopUtils.scala --- @@ -0,0 +1,107 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] spark pull request #19464: Spark 22233

2017-10-10 Thread liutang123
GitHub user liutang123 opened a pull request: https://github.com/apache/spark/pull/19464 Spark 22233 ## What changes were proposed in this pull request? Add the spark.hadoop.filterOutEmptySplit configuration to allow users to filter out empty splits in HadoopRDD. You can merge this p

[GitHub] spark issue #19464: Spark 22233

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19464 Can one of the admins verify this patch?

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r143687378 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,217 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r143688286 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,217 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r143688666 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,217 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r143689246 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,217 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r143689831 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,217 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] spark pull request #19439: [SPARK-21866][ML][PySpark] Adding spark image rea...

2017-10-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19439#discussion_r143689721 --- Diff: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala --- @@ -0,0 +1,217 @@ +/* + * Licensed to the Apache Software Foundation (

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19218 **[Test build #82580 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82580/testReport)** for PR 19218 at commit [`dd6d635`](https://github.com/apache/spark/commit/dd

[GitHub] spark issue #19464: Spark 22233

2017-10-10 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19464 Could you please update the title of this PR appropriately? e.g. `[SPARK-22233][core] ...`

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19218 **[Test build #82579 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82579/testReport)** for PR 19218 at commit [`90cbcb3`](https://github.com/apache/spark/commit/9

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19218 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82579/ Test FAILed.

[GitHub] spark issue #16648: [SPARK-18016][SQL][CATALYST] Code Generation: Constant P...

2017-10-10 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/16648 @bdrillard Thank you very much

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19218 Merged build finished. Test FAILed.

[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18711 **[Test build #82578 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82578/testReport)** for PR 18711 at commit [`440f936`](https://github.com/apache/spark/commit/4

[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18711 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82578/ Test PASSed.

[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18711 Merged build finished. Test PASSed.

[GitHub] spark issue #6751: [SPARK-8300] DataFrame hint for broadcast join.

2017-10-10 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/6751 @rxin @marmbrus Is there now another way to broadcast a table with spark-sql, other than `spark.sql.autoBroadcastJoinThreshold`? And if not, is it a good way to broadcast table by user con

[GitHub] spark issue #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19337 **[Test build #82581 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82581/testReport)** for PR 19337 at commit [`7814968`](https://github.com/apache/spark/commit/78

[GitHub] spark issue #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-10 Thread mpjlu
Github user mpjlu commented on the issue: https://github.com/apache/spark/pull/19337 Thanks, @hhbyyh. I will create a JIRA for python API

[GitHub] spark pull request #17968: [SPARK-9792] Make DenseMatrix equality semantical

2017-10-10 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/17968#discussion_r143702719 --- Diff: python/pyspark/mllib/linalg/__init__.py --- @@ -1131,14 +1131,17 @@ def __getitem__(self, indices): return self.values[i + j

[GitHub] spark pull request #17968: [SPARK-9792] Make DenseMatrix equality semantical

2017-10-10 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/17968#discussion_r143702830 --- Diff: python/pyspark/ml/linalg/__init__.py --- @@ -976,14 +976,18 @@ def __getitem__(self, indices): return self.values[i + j * sel

[GitHub] spark issue #17968: [SPARK-9792] Make DenseMatrix equality semantical

2017-10-10 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17968 @gatorsmile Add this to white list!

[GitHub] spark pull request #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-10 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19337#discussion_r143704472 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/LDASuite.scala --- @@ -119,6 +121,8 @@ class LDASuite extends SparkFunSuite with MLlibTestSp

[GitHub] spark pull request #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-10 Thread mpjlu
Github user mpjlu commented on a diff in the pull request: https://github.com/apache/spark/pull/19337#discussion_r143705102 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/LDASuite.scala --- @@ -119,6 +121,8 @@ class LDASuite extends SparkFunSuite with MLlibTestSpark

[GitHub] spark pull request #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-10 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19337#discussion_r143705705 --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/LDASuite.scala --- @@ -119,6 +121,8 @@ class LDASuite extends SparkFunSuite with MLlibTestSp

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143706308 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -684,6 +684,34 @@ class DataFrameSuite extends QueryTest with SharedSQLCo

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143707170 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -684,6 +684,34 @@ class DataFrameSuite extends QueryTest with SharedSQLCo

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-10 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17819 Yes, fair enough

[GitHub] spark issue #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19337 **[Test build #82582 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82582/testReport)** for PR 19337 at commit [`6a3c6a6`](https://github.com/apache/spark/commit/6a

[GitHub] spark issue #18711: [SPARK-21506][DOC]The description of "spark.executor.cor...

2017-10-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18711 LGTM, merging to master!

[GitHub] spark pull request #18711: [SPARK-21506][DOC]The description of "spark.execu...

2017-10-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18711

[GitHub] spark issue #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19337 **[Test build #82581 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82581/testReport)** for PR 19337 at commit [`7814968`](https://github.com/apache/spark/commit/7

[GitHub] spark issue #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19337 Merged build finished. Test PASSed.

[GitHub] spark issue #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19337 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82581/ Test PASSed.

[GitHub] spark issue #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, while deplo...

2017-10-10 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17357 LGTM, merging to master!

[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-10-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17357

[GitHub] spark pull request #19363: [SPARK-22224][Minor]Override toString of KeyValue...

2017-10-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19363#discussion_r143717543 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala --- @@ -564,4 +565,30 @@ class KeyValueGroupedDataset[K, V] private[sq

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19218 **[Test build #82580 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82580/testReport)** for PR 19218 at commit [`dd6d635`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19218 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82580/ Test PASSed.

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19218 Merged build finished. Test PASSed.

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #82583 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82583/testReport)** for PR 17819 at commit [`1889995`](https://github.com/apache/spark/commit/18

[GitHub] spark pull request #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-10 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19337#discussion_r143720272 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -573,7 +584,8 @@ private[clustering] object OnlineLDAOptimizer {

[GitHub] spark pull request #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-10 Thread mpjlu
Github user mpjlu commented on a diff in the pull request: https://github.com/apache/spark/pull/19337#discussion_r143723408 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -573,7 +584,8 @@ private[clustering] object OnlineLDAOptimizer {

[GitHub] spark issue #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19337 **[Test build #82582 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82582/testReport)** for PR 19337 at commit [`6a3c6a6`](https://github.com/apache/spark/commit/6

[GitHub] spark issue #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19337 Merged build finished. Test PASSed.

[GitHub] spark issue #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19337 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82582/ Test PASSed.

[GitHub] spark pull request #19020: [SPARK-3181] [ML] Implement huber loss for Linear...

2017-10-10 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19020#discussion_r143726258 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala --- @@ -744,11 +754,20 @@ object LinearRegressionModel extends ML

[GitHub] spark pull request #19020: [SPARK-3181] [ML] Implement huber loss for Linear...

2017-10-10 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19020#discussion_r143727489 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/aggregator/HuberAggregator.scala --- @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #19020: [SPARK-3181] [ML] Implement huber loss for Linear...

2017-10-10 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/19020#discussion_r143727850 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala --- @@ -208,6 +292,26 @@ class LinearRegression @Since("1.3.0") (

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143712667 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala --- @@ -108,26 +173,53 @@ final class Bucketizer @Since("1.4.0") (@Since("1.4.0"

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143713315 --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaBucketizerExample.java --- @@ -33,6 +33,13 @@ import org.apache.spark.sql.types.Struc

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143723205 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala --- @@ -96,9 +99,71 @@ final class Bucketizer @Since("1.4.0") (@Since("1.4.0") o

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143730685 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala --- @@ -24,20 +24,23 @@ import org.apache.spark.annotation.Since import org.

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143722947 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/BucketizerSuite.scala --- @@ -187,6 +188,196 @@ class BucketizerSuite extends SparkFunSuite with

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143708258 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala --- @@ -24,20 +24,23 @@ import org.apache.spark.annotation.Since import org.

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143730302 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/BucketizerSuite.scala --- @@ -187,6 +188,196 @@ class BucketizerSuite extends SparkFunSuite with

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143710289 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala --- @@ -96,9 +99,71 @@ final class Bucketizer @Since("1.4.0") (@Since("1.4.0") o

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143724681 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/BucketizerSuite.scala --- @@ -187,6 +188,196 @@ class BucketizerSuite extends SparkFunSuite with

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143728145 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala --- @@ -96,9 +99,71 @@ final class Bucketizer @Since("1.4.0") (@Since("1.4.0") o

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143724079 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/BucketizerSuite.scala --- @@ -187,6 +188,196 @@ class BucketizerSuite extends SparkFunSuite with

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143728324 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/BucketizerSuite.scala --- @@ -187,6 +188,196 @@ class BucketizerSuite extends SparkFunSuite with

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143730103 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/BucketizerSuite.scala --- @@ -187,6 +188,196 @@ class BucketizerSuite extends SparkFunSuite with

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143728663 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/BucketizerSuite.scala --- @@ -187,6 +188,196 @@ class BucketizerSuite extends SparkFunSuite with

[GitHub] spark pull request #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer t...

2017-10-10 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17819#discussion_r143733664 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala --- @@ -96,9 +99,71 @@ final class Bucketizer @Since("1.4.0") (@Since("1.4.0") o

[GitHub] spark issue #19270: [SPARK-21809] : Change Stage Page to use datatables to s...

2017-10-10 Thread pgandhi999
Github user pgandhi999 commented on the issue: https://github.com/apache/spark/pull/19270 @ajbozarth I do not quite understand what you are saying. Everything seems to be working fine on my test setup. Can you please let me know how I can replicate the issue? Thank you. ---

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-10 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143740078 --- Diff: python/pyspark/sql/tests.py --- @@ -3376,6 +3376,151 @@ def test_vectorized_udf_empty_partition(self): res = df.select(f(col('id'))

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-10 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143740129 --- Diff: python/pyspark/sql/tests.py --- @@ -3376,6 +3376,151 @@ def test_vectorized_udf_empty_partition(self): res = df.select(f(col('id'))

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-10 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143740157 --- Diff: python/pyspark/sql/tests.py --- @@ -3376,6 +3376,151 @@ def test_vectorized_udf_empty_partition(self): res = df.select(f(col('id'))

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-10 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143740636 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -519,3 +519,4 @@ case class CoGroup( outp

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-10 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143740773 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/pythonLogicalOperators.scala --- @@ -0,0 +1,43 @@ +/* + * Lice

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-10 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143740882 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/pythonLogicalOperators.scala --- @@ -0,0 +1,43 @@ +/* + * Lice

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-10 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143741944 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala --- @@ -44,14 +73,18 @@ case class ArrowEvalPythonExec

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-10 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143744197 --- Diff: python/pyspark/sql/functions.py --- @@ -2181,30 +2187,66 @@ def udf(f=None, returnType=StringType()): @since(2.3) def pandas_udf(f=Non
