[GitHub] spark issue #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to DataFrame...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15516 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67078/ Test FAILed. ---

[GitHub] spark issue #11105: [SPARK-12469][CORE] Data Property accumulators for Spark

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/11105 **[Test build #67068 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67068/consoleFull)** for PR 11105 at commit [`e027d53`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to DataFrame...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15516 **[Test build #67078 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67078/consoleFull)** for PR 15516 at commit [`4be3e5f`](https://github.com/apache/spark/commit/

[GitHub] spark issue #14847: [SPARK-17254][SQL] Add StopAfter physical plan for the f...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14847 **[Test build #67072 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67072/consoleFull)** for PR 14847 at commit [`141dc51`](https://github.com/apache/spark/commit/1

[GitHub] spark pull request #11105: [SPARK-12469][CORE] Data Property accumulators fo...

2016-10-17 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/11105#discussion_r83690354 --- Diff: core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala --- @@ -136,15 +179,76 @@ abstract class AccumulatorV2[IN, OUT] extends Serializable

[GitHub] spark issue #15497: [Test][SPARK-16002][Follow-up] Fix flaky test in Streami...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15497 **[Test build #67074 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67074/consoleFull)** for PR 15497 at commit [`7ae7782`](https://github.com/apache/spark/commit/7

[GitHub] spark issue #15481: [SPARK-17929] [CORE] Fix deadlock when CoarseGrainedSche...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15481 **[Test build #67067 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67067/consoleFull)** for PR 15481 at commit [`2997ccb`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #15417: [SPARK-17851][SQL][TESTS] Make sure all test sqls in cat...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15417 **[Test build #67077 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67077/consoleFull)** for PR 15417 at commit [`005ff36`](https://github.com/apache/spark/commit/0

[GitHub] spark issue #14650: [SPARK-17062][MESOS] add conf option to mesos dispatcher

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14650 **[Test build #67066 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67066/consoleFull)** for PR 14650 at commit [`c322c27`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #15410: [SPARK-17843][Web UI] Indicate event logs pending for pr...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15410 **[Test build #67070 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67070/consoleFull)** for PR 15410 at commit [`b43e241`](https://github.com/apache/spark/commit/b

[GitHub] spark issue #15515: [SPARK-17970][SQL][WIP] store partition spec in metastor...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15515 **[Test build #67076 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67076/consoleFull)** for PR 15515 at commit [`ceac57b`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to DataFrame...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15516 **[Test build #67078 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67078/consoleFull)** for PR 15516 at commit [`4be3e5f`](https://github.com/apache/spark/commit/4

[GitHub] spark issue #15514: [SPARK-17960][PySpark] [Upgrade to Py4J 0.10.4]

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15514 **[Test build #67075 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67075/consoleFull)** for PR 15514 at commit [`70fa455`](https://github.com/apache/spark/commit/7

[GitHub] spark issue #15436: [SPARK-17875] [BUILD] Remove unneeded direct dependence ...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15436 **[Test build #3355 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3355/consoleFull)** for PR 15436 at commit [`f49f6a6`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #15516: [SPARK-17961][SparkR][SQL] Add storageLevel to Da...

2016-10-17 Thread WeichenXu123
GitHub user WeichenXu123 opened a pull request: https://github.com/apache/spark/pull/15516 [SPARK-17961][SparkR][SQL] Add storageLevel to Dataset for SparkR ## What changes were proposed in this pull request? Add storageLevel to Dataset for SparkR. This is similar to thi

[GitHub] spark issue #15512: [SPARK-17930][CORE]The SerializerInstance instance used ...

2016-10-17 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/15512 Jenkins, retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wish

[GitHub] spark pull request #11105: [SPARK-12469][CORE] Data Property accumulators fo...

2016-10-17 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/11105#discussion_r83689480 --- Diff: core/src/main/scala/org/apache/spark/rdd/ShuffledRDD.scala --- @@ -104,10 +105,26 @@ class ShuffledRDD[K: ClassTag, V: ClassTag, C: ClassTag](

[GitHub] spark issue #15512: [SPARK-17930][CORE]The SerializerInstance instance used ...

2016-10-17 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/15512 serializing will create buffers, but since these are only used for deserializing, I don't think there should even be any buffers created. I guess the time saved is all the registration which can be

[GitHub] spark pull request #11105: [SPARK-12469][CORE] Data Property accumulators fo...

2016-10-17 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/11105#discussion_r83687095 --- Diff: core/src/main/scala/org/apache/spark/rdd/MapPartitionsRDD.scala --- @@ -34,8 +35,29 @@ private[spark] class MapPartitionsRDD[U: ClassTag, T: Class

[GitHub] spark issue #15515: [SPARK-17970][SQL][WIP] store partition spec in metastor...

2016-10-17 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15515 Also cc @ericl

[GitHub] spark issue #15274: [SPARK-17699] Support for parsing JSON string columns

2016-10-17 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15274 @DanielMe The best options for 1.6 are `get_json_object ` and `json_tuple` (their docs can be found at https://spark.apache.org/docs/1.6.0/api/scala/index.html#org.apache.spark.sql.functions$). ---

[GitHub] spark issue #15417: [SPARK-17851][SQL][TESTS] Make sure all test sqls in cat...

2016-10-17 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/15417 @gatorsmile I've updated all testcases that failed checkAnalysis in Optimizer related testsuites, for those testcases for Parser/Analyzer we don't require them to pass checkAnalysis. Please have

[GitHub] spark pull request #15515: [SPARK-17970][SQL][WIP] store partition spec in m...

2016-10-17 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/15515 [SPARK-17970][SQL][WIP] store partition spec in metastore for data source table ## What changes were proposed in this pull request? We should follow hive table and also store partition s

[GitHub] spark issue #15515: [SPARK-17970][SQL][WIP] store partition spec in metastor...

2016-10-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15515 cc @yhuai

[GitHub] spark pull request #15394: [SPARK-17748][ML] One pass solver for Weighted Le...

2016-10-17 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/15394#discussion_r83671808 --- Diff: mllib/src/main/scala/org/apache/spark/ml/optim/NormalEquationSolver.scala --- @@ -0,0 +1,165 @@ +/* + * Licensed to the Apache Software Foun

[GitHub] spark issue #15497: [Test][SPARK-16002][Follow-up] Fix flaky test in Streami...

2016-10-17 Thread lw-lin
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/15497 Jenkins retest this please

[GitHub] spark pull request #15394: [SPARK-17748][ML] One pass solver for Weighted Le...

2016-10-17 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/15394#discussion_r83666140 --- Diff: mllib-local/src/test/scala/org/apache/spark/ml/linalg/BLASSuite.scala --- @@ -422,4 +422,49 @@ class BLASSuite extends SparkMLFunSuite { as

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/11119#discussion_r83662587 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -17,6 +17,8 @@ package org.apache.spark.ml.clustering +

[GitHub] spark issue #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/11119 One minor comment, otherwise LGTM. Thanks a lot @yinxusen and reviewers.

[GitHub] spark issue #15513: [WIP][SPARK-17963][SQL][Documentation] Add examples (ext...

2016-10-17 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15513 I suppose I don't know the conventions here well, but, the format looks better in your change, and more params documentation seems helpful.

[GitHub] spark issue #14650: [SPARK-17062][MESOS] add conf option to mesos dispatcher

2016-10-17 Thread skonto
Github user skonto commented on the issue: https://github.com/apache/spark/pull/14650 jenkins please test

[GitHub] spark pull request #15514: [SPARK-17960][PySpark] [Upgrade to Py4J 0.10.4]

2016-10-17 Thread jagadeesanas2
GitHub user jagadeesanas2 opened a pull request: https://github.com/apache/spark/pull/15514 [SPARK-17960][PySpark] [Upgrade to Py4J 0.10.4] ## What changes were proposed in this pull request? 1) Upgrade the Py4J version on the Java side 2) Update the py4j src zip file we

[GitHub] spark issue #12004: [SPARK-7481] [build] Add spark-cloud module to pull in o...

2016-10-17 Thread steveloughran
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/12004 that's it warning that the manifest has changed. Which it has: there's now hadoop-azure, hadoop-openstack and hadoop-aws JARs on the CP, along with dependencies (amazon-aws SDK, microsoft-azur

[GitHub] spark issue #15513: [WIP][SPARK-17963][SQL][Documentation] Add examples (ext...

2016-10-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15513 @srowen I will definitely double check the changes here before proceeding further. Thank you for your review before getting this too big. I will proceed this in 1-2 days. Please let me know if t

[GitHub] spark pull request #15513: [WIP][SPARK-17963][SQL][Documentation] Add exampl...

2016-10-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15513#discussion_r83648848 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala --- @@ -86,7 +86,13 @@ abstract class Collect ex

[GitHub] spark pull request #15513: [WIP][SPARK-17963][SQL][Documentation] Add exampl...

2016-10-17 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15513#discussion_r83644588 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala --- @@ -109,7 +115,13 @@ case class CollectList(

[GitHub] spark pull request #15513: [WIP][SPARK-17963][SQL][Documentation] Add exampl...

2016-10-17 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15513#discussion_r83645262 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala --- @@ -86,7 +86,13 @@ abstract class Collect extends

[GitHub] spark pull request #15513: [WIP][SPARK-17963][SQL][Documentation] Add exampl...

2016-10-17 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15513#discussion_r83644777 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CentralMomentAgg.scala --- @@ -194,8 +222,15 @@ case class Variance

[GitHub] spark pull request #15513: [WIP][SPARK-17963][SQL][Documentation] Add exampl...

2016-10-17 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15513#discussion_r83644370 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xml/xpath.scala --- @@ -121,8 +176,19 @@ case class XPathFloat(xml: Expressio

[GitHub] spark pull request #15513: [WIP][SPARK-17963][SQL][Documentation] Add exampl...

2016-10-17 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15513#discussion_r83644833 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/CentralMomentAgg.scala --- @@ -209,7 +244,14 @@ case class Skewness

[GitHub] spark issue #14847: [SPARK-17254][SQL] Add StopAfter physical plan for the f...

2016-10-17 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14847 retest this please.

[GitHub] spark issue #15513: [WIP][SPARK-17963][SQL][Documentation] Add examples (ext...

2016-10-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15513 I am sorry for asking repeatedly but this one is slightly different with the JIRA one. cc @rxin @srowen Could you please check if this one is preferable? I am worried to go further with

[GitHub] spark pull request #15513: [WIP][SPARK-17963][SQL][Documentation] Add exampl...

2016-10-17 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/15513 [WIP][SPARK-17963][SQL][Documentation] Add examples (extend) in each function and improve documentation with arguments ## What changes were proposed in this pull request? This PR propo

[GitHub] spark pull request #14847: [SPARK-17254][SQL] Add StopAfter physical plan fo...

2016-10-17 Thread viirya
GitHub user viirya reopened a pull request: https://github.com/apache/spark/pull/14847 [SPARK-17254][SQL] Add StopAfter physical plan for the filtering that can be stopped early ## What changes were proposed in this pull request? This is motivated by: From https:/

[GitHub] spark issue #12004: [SPARK-7481] [build] Add spark-cloud module to pull in o...

2016-10-17 Thread nchammas
Github user nchammas commented on the issue: https://github.com/apache/spark/pull/12004 @steveloughran - Is this message in the most recent build log critical? ``` Spark's published dependencies DO NOT MATCH the manifest file (dev/spark-deps). To update the manifest fi

[GitHub] spark issue #15410: [SPARK-17843][Web UI] Indicate event logs pending for pr...

2016-10-17 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/15410 Jenkins, test this please

[GitHub] spark issue #15410: [SPARK-17843][Web UI] Indicate event logs pending for pr...

2016-10-17 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/15410 Can you update the description with the latest implementation and a screenshot?

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-17 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/15377 Yup, that's what I mean.

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-17 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15377 Yeah, you're saying all the checks can be done upfront? I agree. It will need a tiny bit of code duplication but is more robust. I think we could also potentially get rid of the return value from `se

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-17 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/15377 @srowen I'm not against this change, personally because the usage of flag is weird to me and frankly saying I haven't seen such pattern in the Spark code. Since we want to avoid re-executi

[GitHub] spark pull request #11105: [SPARK-12469][CORE] Data Property accumulators fo...

2016-10-17 Thread holdenk
Github user holdenk commented on a diff in the pull request: https://github.com/apache/spark/pull/11105#discussion_r83623396 --- Diff: core/src/main/scala/org/apache/spark/TaskContextImpl.scala --- @@ -126,4 +126,14 @@ private[spark] class TaskContextImpl( taskMetrics.regis

[GitHub] spark pull request #15376: [SPARK-17796][SQL] Support wildcard character in ...

2016-10-17 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15376#discussion_r83621049 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala --- @@ -246,7 +247,28 @@ case class LoadDataCommand( val loadPa

[GitHub] spark pull request #15376: [SPARK-17796][SQL] Support wildcard character in ...

2016-10-17 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15376#discussion_r83622009 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -1886,6 +1887,37 @@ class SQLQuerySuite extends QueryTest wi

[GitHub] spark pull request #15376: [SPARK-17796][SQL] Support wildcard character in ...

2016-10-17 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15376#discussion_r83621898 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala --- @@ -246,7 +247,28 @@ case class LoadDataCommand( val loadPa

[GitHub] spark issue #15512: [SPARK-17930][CORE]The SerializerInstance instance used ...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15512 Merged build finished. Test FAILed.

[GitHub] spark issue #15512: [SPARK-17930][CORE]The SerializerInstance instance used ...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15512 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67063/ Test FAILed. ---

[GitHub] spark issue #15512: [SPARK-17930][CORE]The SerializerInstance instance used ...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15512 **[Test build #67063 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67063/consoleFull)** for PR 15512 at commit [`037871d`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15376: [SPARK-17796][SQL] Support wildcard character in filenam...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15376 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67064/ Test PASSed. ---

[GitHub] spark issue #15376: [SPARK-17796][SQL] Support wildcard character in filenam...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15376 Merged build finished. Test PASSed.

[GitHub] spark issue #15492: [DO NOT MERGE][TEST] Testing flakiness of StreamingQuery...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15492 Merged build finished. Test FAILed.

[GitHub] spark issue #15492: [DO NOT MERGE][TEST] Testing flakiness of StreamingQuery...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15492 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67057/ Test FAILed. ---

[GitHub] spark issue #15376: [SPARK-17796][SQL] Support wildcard character in filenam...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15376 **[Test build #67064 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67064/consoleFull)** for PR 15376 at commit [`401c4ee`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15492: [DO NOT MERGE][TEST] Testing flakiness of StreamingQuery...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15492 **[Test build #67057 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67057/consoleFull)** for PR 15492 at commit [`5bf08e6`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15481: [SPARK-17929] [CORE] Fix deadlock when CoarseGrainedSche...

2016-10-17 Thread scwf
Github user scwf commented on the issue: https://github.com/apache/spark/pull/15481 retest this please.

[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14136 Merged build finished. Test PASSed.

[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14136 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67061/ Test PASSed. ---

[GitHub] spark issue #14136: [SPARK-16282][SQL] Implement percentile SQL function.

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14136 **[Test build #67061 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67061/consoleFull)** for PR 14136 at commit [`eb264d9`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15408: [SPARK-17839][CORE] Use Nio's directbuffer instead of Bu...

2016-10-17 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15408 Going once, going twice, any more comments?

[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.

2016-10-17 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15377 @jerryshao can I take your temperature about how against this change you are? maybe @lins05 can elaborate again on what this is preventing, and what happens in case of a race between two threads. Tha

[GitHub] spark issue #15450: [SPARK-3261] [MLLIB] KMeans clusterer can return duplica...

2016-10-17 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15450 @sethah I agree that when there are lots of unique points (>> k) then this is almost certain to not happen, and that's most real-world use cases, but the question indeed is what should happen when th

[GitHub] spark issue #15487: [SPARK-17940][SQL] Fixed a typo in LAST function and imp...

2016-10-17 Thread lins05
Github user lins05 commented on the issue: https://github.com/apache/spark/pull/15487 @HyukjinKwon thanks, I'll update the PR accordingly.

[GitHub] spark issue #15505: [SPARK-17931][CORE] taskScheduler has some unneeded seri...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15505 Merged build finished. Test FAILed.

[GitHub] spark issue #15505: [SPARK-17931][CORE] taskScheduler has some unneeded seri...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15505 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67056/ Test FAILed. ---

[GitHub] spark issue #15505: [SPARK-17931][CORE] taskScheduler has some unneeded seri...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15505 **[Test build #67056 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67056/consoleFull)** for PR 15505 at commit [`8a6062d`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15423: [SPARK-17860][SQL] SHOW COLUMN's database conflict check...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15423 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67060/ Test PASSed. ---

[GitHub] spark issue #15423: [SPARK-17860][SQL] SHOW COLUMN's database conflict check...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15423 Merged build finished. Test PASSed.

[GitHub] spark issue #15423: [SPARK-17860][SQL] SHOW COLUMN's database conflict check...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15423 **[Test build #67060 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67060/consoleFull)** for PR 15423 at commit [`cb0691c`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #15511: [SPARK-17969]I think it's user unfriendly to proc...

2016-10-17 Thread codlife
Github user codlife commented on a diff in the pull request: https://github.com/apache/spark/pull/15511#discussion_r83616031 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala --- @@ -240,16 +240,35 @@ class DataFrameReader private[sql](sparkSession: Spar

[GitHub] spark issue #15511: [SPARK-17969]I think it's user unfriendly to process sta...

2016-10-17 Thread codlife
Github user codlife commented on the issue: https://github.com/apache/spark/pull/15511 @srowen, you are right! I proposed this method just to make it more user-friendly. With this method, users can load a standard JSON file directly.

[GitHub] spark issue #14650: [SPARK-17062][MESOS] add conf option to mesos dispatcher

2016-10-17 Thread skonto
Github user skonto commented on the issue: https://github.com/apache/spark/pull/14650 @vanzin & @srowen pls review.

[GitHub] spark issue #15274: [SPARK-17699] Support for parsing JSON string columns

2016-10-17 Thread DanielMe
Github user DanielMe commented on the issue: https://github.com/apache/spark/pull/15274 Is there any workaround I can use to achieve a similar effect in 1.6?

[GitHub] spark issue #15511: [SPARK-17969]I think it's user unfriendly to process sta...

2016-10-17 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15511 OK, I think in both cases "standard" JSON is read, and in both cases, each record is a JSON document. These aren't different cases. If you mean to read small JSON files as records, you just use whole

[GitHub] spark issue #15376: [SPARK-17796][SQL] Support wildcard character in filenam...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15376 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67052/ Test FAILed. ---

[GitHub] spark issue #15376: [SPARK-17796][SQL] Support wildcard character in filenam...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15376 **[Test build #67052 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67052/consoleFull)** for PR 15376 at commit [`401c4ee`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15376: [SPARK-17796][SQL] Support wildcard character in filenam...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15376 Merged build finished. Test FAILed.

[GitHub] spark issue #15512: [SPARK-17930][CORE]The SerializerInstance instance used ...

2016-10-17 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15512 Hm, if the benchmark you give generalizes much that is certainly compelling. I guess I'm surprised that instantiating the object can be so expensive relative to deserialization since it just happens

[GitHub] spark pull request #15503: Fix example of tf_idf with minDocFreq

2016-10-17 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15503

[GitHub] spark issue #15503: Fix example of tf_idf with minDocFreq

2016-10-17 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15503 Merged to master/2.0

[GitHub] spark issue #15302: [SPARK-17732][SQL] ALTER TABLE DROP PARTITION should sup...

2016-10-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/15302 Hi, @hvanhovell . When using `Expression`, I faced two situations. - `checkAnalysis` raises exceptions because the column is unresolved, e.g., `country` is unresolved. - As a w

[GitHub] spark issue #15511: [SPARK-17969]I think it's user unfriendly to process sta...

2016-10-17 Thread codlife
Github user codlife commented on the issue: https://github.com/apache/spark/pull/15511 Compilation is OK, but when we call show(), we get a _corrupt_record; besides, when we call select on this df, we get an exception.

[GitHub] spark issue #15316: [SPARK-17751] [SQL] Remove spark.sql.eagerAnalysis and O...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15316 Merged build finished. Test PASSed.

[GitHub] spark issue #15316: [SPARK-17751] [SQL] Remove spark.sql.eagerAnalysis and O...

2016-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15316 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67058/ Test PASSed. ---

[GitHub] spark issue #15316: [SPARK-17751] [SQL] Remove spark.sql.eagerAnalysis and O...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15316 **[Test build #67058 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67058/consoleFull)** for PR 15316 at commit [`f082643`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15512: [SPARK-17930][CORE]The SerializerInstance instance used ...

2016-10-17 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15512 ok to test

[GitHub] spark issue #15511: [SPARK-17969]I think it's user unfriendly to process sta...

2016-10-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15511 BTW, I guess per-line JSON also complies with a standard - https://tools.ietf.org/html/rfc7159#section-4. We should add a test, fix the title to summarise what the PR proposes and fill the PR descrip

[GitHub] spark issue #15511: [SPARK-17969]I think it's user unfriendly to process sta...

2016-10-17 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15511 I guess it'd be nicer if this PR resembled https://github.com/apache/spark/pull/14151 . The suggested change is to read each JSON object per file, for which I guess we can share some code in the P

[GitHub] spark issue #15512: The SerializerInstance instance used when deserializing ...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15512 **[Test build #67063 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67063/consoleFull)** for PR 15512 at commit [`037871d`](https://github.com/apache/spark/commit/0

[GitHub] spark issue #15376: [SPARK-17796][SQL] Support wildcard character in filenam...

2016-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15376 **[Test build #67064 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67064/consoleFull)** for PR 15376 at commit [`401c4ee`](https://github.com/apache/spark/commit/4

[GitHub] spark pull request #15512: The SerializerInstance instance used when deseria...

2016-10-17 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/15512 The SerializerInstance instance used when deserializing a TaskResult is not reused ## What changes were proposed in this pull request? The following code is called when the DirectTaskResult inst

[GitHub] spark pull request #11119: [SPARK-10780][ML] Add an initial model to kmeans

2016-10-17 Thread dbtsai
Github user dbtsai commented on a diff in the pull request: https://github.com/apache/spark/pull/9#discussion_r83600176 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -303,6 +312,20 @@ class KMeans @Since("1.5.0") ( @Since("1.5.0")
