[GitHub] spark issue #16997: Updated the SQL programming guide to explain about the E...

2017-02-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16997 Could you fix the PR title too while you are online maybe? It might be nice to have a good title for both a commit log and those who like to track down the history. --- If your project is set

[GitHub] spark pull request #15125: [SPARK-5484][GraphX] Periodically do checkpoint i...

2017-02-21 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/15125#discussion_r102168380 --- Diff: docs/graphx-programming-guide.md --- @@ -708,7 +708,9 @@ messages remaining. > messaging function. These constraints allow additional

[GitHub] spark issue #16971: [SPARK-19573][SQL] Make NaN/null handling consistent in ...

2017-02-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16971 **[Test build #73211 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73211/testReport)** for PR 16971 at commit

[GitHub] spark issue #16997: Updated the SQL programming guide to explain about the E...

2017-02-21 Thread HarshSharma8
Github user HarshSharma8 commented on the issue: https://github.com/apache/spark/pull/16997 I updated the content with a demo object. I would appreciate if anyone can have a look at this.

[GitHub] spark pull request #16971: [SPARK-19573][SQL] Make NaN/null handling consist...

2017-02-21 Thread zhengruifeng
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/16971#discussion_r102167055 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameStatSuite.scala --- @@ -214,20 +214,29 @@ class DataFrameStatSuite extends QueryTest

[GitHub] spark issue #16976: [SPARK-19610][SQL] Support parsing multiline CSV files

2017-02-21 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16976 I see, fair enough. cc @falaki

[GitHub] spark issue #17014: [SPARK-18608][ML][WIP] Fix double-caching in ML algorith...

2017-02-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17014 **[Test build #73209 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73209/testReport)** for PR 17014 at commit

[GitHub] spark issue #17014: [SPARK-18608][ML][WIP] Fix double-caching in ML algorith...

2017-02-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17014 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73209/ Test FAILed.

[GitHub] spark issue #17014: [SPARK-18608][ML][WIP] Fix double-caching in ML algorith...

2017-02-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17014 Merged build finished. Test FAILed.

[GitHub] spark issue #17013: [SPARK-19666][SQL] Improve error message for JavaBean wi...

2017-02-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17013 cc @cloud-fan who I saw in the related PR while tracking down the history.

[GitHub] spark issue #17013: [SPARK-19666][SQL] Improve error message for JavaBean wi...

2017-02-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17013 It seems ideally we should filter all out assuming from > // TODO: we should only collect properties that have getter and setter. However, some tests > // pass in scala case class

[GitHub] spark pull request #17013: [SPARK-19666][SQL] Improve error message for Java...

2017-02-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/17013#discussion_r102164904 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/JavaTypeInference.scala --- @@ -123,7 +123,11 @@ object JavaTypeInference {

[GitHub] spark issue #17013: [SPARK-19666][SQL] Improve error message for JavaBean wi...

2017-02-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17013 **[Test build #73210 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73210/testReport)** for PR 17013 at commit

[GitHub] spark issue #17014: [SPARK-18608][ML][WIP] Fix double-caching in ML algorith...

2017-02-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17014 **[Test build #73209 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73209/testReport)** for PR 17014 at commit

[GitHub] spark pull request #17014: [SPARK-18608][ML][WIP] Fix double-caching in ML a...

2017-02-21 Thread zhengruifeng
GitHub user zhengruifeng opened a pull request: https://github.com/apache/spark/pull/17014 [SPARK-18608][ML][WIP] Fix double-caching in ML algorithms ## What changes were proposed in this pull request? 1, For Predictors, use `train(dataset: Dataset[_], handlePersistence:

[GitHub] spark pull request #17013: [SPARK-19666][SQL] Improve error message for Java...

2017-02-21 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/17013 [SPARK-19666][SQL] Improve error message for JavaBean without getter ## What changes were proposed in this pull request? Currently, if we use a JavaBean without the getter as below:

[GitHub] spark issue #16977: [SPARK-19651][CORE] ParallelCollectionRDD.collect should...

2017-02-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16977 Merged build finished. Test FAILed.

[GitHub] spark issue #16977: [SPARK-19651][CORE] ParallelCollectionRDD.collect should...

2017-02-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16977 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73208/ Test FAILed.

[GitHub] spark issue #16977: [SPARK-19651][CORE] ParallelCollectionRDD.collect should...

2017-02-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16977 **[Test build #73208 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73208/testReport)** for PR 16977 at commit

[GitHub] spark issue #16977: [SPARK-19651][CORE] ParallelCollectionRDD.collect should...

2017-02-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16977 **[Test build #73208 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73208/testReport)** for PR 16977 at commit

[GitHub] spark issue #17000: [SPARK-18946][ML] sliceAggregate which is a new aggregat...

2017-02-21 Thread ZunwenYou
Github user ZunwenYou commented on the issue: https://github.com/apache/spark/pull/17000 Hi @MLnick, firstly, `sliceAggregate` is a common aggregate for array-like data. Besides the `MultivariateOnlineSummarizer` case, it can be used in many large machine learning cases. I chose

[GitHub] spark issue #17012: [SPARK-19677][SS] Renaming a file atop an existing one s...

2017-02-21 Thread vitillo
Github user vitillo commented on the issue: https://github.com/apache/spark/pull/17012 @tdas

[GitHub] spark issue #16977: [SPARK-19651][CORE] ParallelCollectionRDD.collect should...

2017-02-21 Thread uncleGen
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/16977 build successfully in local, retest this please.

[GitHub] spark issue #16990: [SPARK-19660][CORE][SQL] Replace the configuration prope...

2017-02-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16990 **[Test build #73207 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73207/testReport)** for PR 16990 at commit

[GitHub] spark issue #17012: [SPARK-19677][SS] Renaming a file atop an existing one s...

2017-02-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17012 Can one of the admins verify this patch?

[GitHub] spark pull request #17012: [SPARK-19677][SS] Renaming a file atop an existin...

2017-02-21 Thread vitillo
GitHub user vitillo opened a pull request: https://github.com/apache/spark/pull/17012 [SPARK-19677][SS] Renaming a file atop an existing one should not fail on HDFS ## What changes were proposed in this pull request? HDFSBackedStateStoreProvider fails to rename files on

[GitHub] spark issue #17011: [SPARK-19676][CORE] Flaky test: FsHistoryProviderSuite.S...

2017-02-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17011 **[Test build #73206 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73206/testReport)** for PR 17011 at commit

[GitHub] spark pull request #17011: [SPARK-19676][CORE] Flaky test: FsHistoryProvider...

2017-02-21 Thread uncleGen
GitHub user uncleGen opened a pull request: https://github.com/apache/spark/pull/17011 [SPARK-19676][CORE] Flaky test: FsHistoryProviderSuite.SPARK-3697: ignore directories that cannot be read. ## What changes were proposed in this pull request? Flaky test:

[GitHub] spark issue #16990: [SPARK-19660][CORE][SQL] Replace the configuration prope...

2017-02-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16990 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73205/ Test FAILed.

[GitHub] spark issue #16990: [SPARK-19660][CORE][SQL] Replace the configuration prope...

2017-02-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16990 Merged build finished. Test FAILed.

[GitHub] spark issue #16995: [SPARK-19340][SQL] CSV file will result in an exception ...

2017-02-21 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16995 Actually I have a simpler fix like this: --- a/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala +++
