[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-18 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79330219 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -0,0 +1,159 @@ +/* + * Licensed to the

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79329962 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -0,0 +1,209 @@ +/* + * Licensed to

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79329866 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -0,0 +1,159 @@ +/* + * Licensed to

[GitHub] spark issue #15146: [SPARK-17590][SQL] Analyze CTE definitions at once and a...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15146 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #15146: [SPARK-17590][SQL] Analyze CTE definitions at once and a...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15146 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65588/ Test PASSed. ---

[GitHub] spark issue #15146: [SPARK-17590][SQL] Analyze CTE definitions at once and a...

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15146 **[Test build #65588 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65588/consoleFull)** for PR 15146 at commit

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-18 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79329639 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/StatisticsSuite.scala --- @@ -101,4 +101,47 @@ class StatisticsSuite extends QueryTest with

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-18 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79329457 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala --- @@ -32,19 +34,70 @@ package

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-18 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79329382 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -0,0 +1,159 @@ +/* + * Licensed to the

[GitHub] spark issue #15053: [Doc] improve python API docstrings

2016-09-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15053 @holdenk I am also cautious, but leaving everything as is, adding `df.show()` in the package docstring, and cleaning up the duplicated dataframe definitions in each docstring would be a minimal change and

[GitHub] spark issue #15054: [SPARK-17502] [SQL] Fix Multiple Bugs in DDL Statements ...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15054 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65587/ Test PASSed. ---

[GitHub] spark issue #15054: [SPARK-17502] [SQL] Fix Multiple Bugs in DDL Statements ...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15054 Merged build finished. Test PASSed.

[GitHub] spark issue #15054: [SPARK-17502] [SQL] Fix Multiple Bugs in DDL Statements ...

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15054 **[Test build #65587 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65587/consoleFull)** for PR 15054 at commit

[GitHub] spark issue #15024: [SPARK-17470][SQL] unify path for data source table and ...

2016-09-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15024 For `FileFormat`, [`allPaths` is changed to `paths ++ new

[GitHub] spark issue #15024: [SPARK-17470][SQL] unify path for data source table and ...

2016-09-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15024 It sounds like we also need to call `optionsToStorageFormat` for `visitCreateTempViewUsing`.

[GitHub] spark pull request #15024: [SPARK-17470][SQL] unify path for data source tab...

2016-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15024#discussion_r79328309 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -507,3 +400,117 @@ case class DataSource(

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2016-09-18 Thread zjffdu
Github user zjffdu commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r79328285 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala --- @@ -69,6 +84,67 @@ private[spark] class

[GitHub] spark pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pys...

2016-09-18 Thread zjffdu
Github user zjffdu commented on a diff in the pull request: https://github.com/apache/spark/pull/13599#discussion_r79328171 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala --- @@ -69,6 +84,67 @@ private[spark] class

[GitHub] spark issue #15053: [Doc] improve python API docstrings

2016-09-18 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15053 I was thinking that the user would probably read the package string documentation before looking at the individual functions (or if they went looking for the definition of the dataframe). I'm a

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79327748 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/StatisticsSuite.scala --- @@ -101,4 +101,47 @@ class StatisticsSuite extends QueryTest with

[GitHub] spark issue #15131: [SPARK-17577][SparkR][Core] SparkR support add files to ...

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15131 **[Test build #65590 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65590/consoleFull)** for PR 15131 at commit

[GitHub] spark issue #15131: [SPARK-17577][SparkR][Core] SparkR support add files to ...

2016-09-18 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/15131 Jenkins, test this please.

[GitHub] spark issue #15102: [SPARK-17346][SQL] Add Kafka source for Structured Strea...

2016-09-18 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15102 > We do need to handle it comparing completely different topicpartitions, because it's entirely possible to have a job with a single topicpartition A, which is deleted or unsubscribed, and then

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79327304 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala --- @@ -32,19 +34,70 @@ package

[GitHub] spark issue #14784: [SPARK-17210][SPARKR] sparkr.zip is not distributed to e...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14784 Merged build finished. Test PASSed.

[GitHub] spark issue #14784: [SPARK-17210][SPARKR] sparkr.zip is not distributed to e...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14784 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65589/ Test PASSed. ---

[GitHub] spark issue #14784: [SPARK-17210][SPARKR] sparkr.zip is not distributed to e...

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14784 **[Test build #65589 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65589/consoleFull)** for PR 14784 at commit

[GitHub] spark issue #11105: [SPARK-12469][CORE] Data Property accumulators for Spark

2016-09-18 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/11105 If @rxin or @squito has the bandwidth to continue reviewing, I'd really appreciate it (especially on the mergeImpl / addImpl wrapping, or on whether I should go about it in another way).

[GitHub] spark issue #15146: [SPARK-17590][SQL] Analyze CTE definitions at once and a...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15146 Merged build finished. Test PASSed.

[GitHub] spark issue #15146: [SPARK-17590][SQL] Analyze CTE definitions at once and a...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15146 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65586/ Test PASSed. ---

[GitHub] spark issue #15146: [SPARK-17590][SQL] Analyze CTE definitions at once and a...

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15146 **[Test build #65586 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65586/consoleFull)** for PR 15146 at commit

[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...

2016-09-18 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79326407 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -0,0 +1,159 @@ +/* + * Licensed to the

[GitHub] spark issue #14784: [SPARK-17210][SPARKR] sparkr.zip is not distributed to e...

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14784 **[Test build #65589 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65589/consoleFull)** for PR 14784 at commit

[GitHub] spark issue #14784: [SPARK-17210][SPARKR] sparkr.zip is not distributed to e...

2016-09-18 Thread zjffdu
Github user zjffdu commented on the issue: https://github.com/apache/spark/pull/14784 @shivaram @felixcheung Sorry for the late response. I just rebased the PR, and it also takes spark.master over master. Please help review.

[GitHub] spark issue #15053: [Doc] improve python API docstrings

2016-09-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15053 Oh, I meant that we should not touch the `globs` in `_test()` but just print the global dataframes (preferably with `show()`, to display the contents) so that users can understand the input and

[GitHub] spark issue #14597: [SPARK-17017][MLLIB][ML] add a chiSquare Selector based ...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14597 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65585/ Test PASSed. ---

[GitHub] spark issue #14597: [SPARK-17017][MLLIB][ML] add a chiSquare Selector based ...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14597 Merged build finished. Test PASSed.

[GitHub] spark issue #14597: [SPARK-17017][MLLIB][ML] add a chiSquare Selector based ...

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14597 **[Test build #65585 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65585/consoleFull)** for PR 14597 at commit

[GitHub] spark issue #15146: [SPARK-17590][SQL] Analyze CTE definitions at once and a...

2016-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15146 cc @hvanhovell @cloud-fan

[GitHub] spark issue #15146: [SPARK-17590][SQL] Analyze CTE definitions at once and a...

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15146 **[Test build #65588 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65588/consoleFull)** for PR 15146 at commit

[GitHub] spark issue #15053: [Doc] improve python API docstrings

2016-09-18 Thread mortada
Github user mortada commented on the issue: https://github.com/apache/spark/pull/15053 @HyukjinKwon I understand we can have `py.test` and `doctest`, but I don't quite see how we could define the input DataFrame globally while at the same time have a clear, self-contained docstring

[GitHub] spark issue #14995: [Test Only][SPARK-6235][CORE]Address various 2G limits

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14995 Merged build finished. Test PASSed.

[GitHub] spark issue #14995: [Test Only][SPARK-6235][CORE]Address various 2G limits

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14995 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65584/ Test PASSed. ---

[GitHub] spark issue #14995: [Test Only][SPARK-6235][CORE]Address various 2G limits

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14995 **[Test build #65584 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65584/consoleFull)** for PR 14995 at commit

[GitHub] spark issue #15041: [SPARK-17488][CORE] TakeAndOrder will OOM when the data ...

2016-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15041 cc @cloud-fan @hvanhovell

[GitHub] spark issue #15054: [SPARK-17502] [SQL] Fix Multiple Bugs in DDL Statements ...

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15054 **[Test build #65587 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65587/consoleFull)** for PR 15054 at commit

[GitHub] spark issue #15147: [SPARK-17545] [SQL] Handle additional time offset format...

2016-09-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15147 FYI, we backported this to branch 2.0 too. So this will be fixed from 2.0.1 https://github.com/apache/spark/pull/14799.

[GitHub] spark issue #15147: [SPARK-17545] [SQL] Handle additional time offset format...

2016-09-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15147 cc @srowen who was in the JIRA too.

[GitHub] spark issue #15147: [SPARK-17545] [SQL] Handle additional time offset format...

2016-09-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15147 To continue the discussion from the JIRA, I think the issue you faced was reading those in CSV? Whether it is intended or not in `FastDateFormat`, the default pattern

[GitHub] spark issue #15147: [SPARK-17545] [SQL] Handle additional time offset format...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15147 Can one of the admins verify this patch?

[GitHub] spark pull request #15147: [SPARK-17545] [SQL] Handle additional time offset...

2016-09-18 Thread nbeyer
GitHub user nbeyer opened a pull request: https://github.com/apache/spark/pull/15147 [SPARK-17545] [SQL] Handle additional time offset formats of ISO 8601 ## What changes were proposed in this pull request? Allows flexibility in handling additional ISO 8601 time offset variants.
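
For context on what "time offset variants" means here: ISO 8601 permits a UTC offset to be written as `Z`, `±hh`, `±hhmm`, or `±hh:mm`. The sketch below is an independent Python illustration of accepting all four forms. It is not the code from this pull request, and `parse_offset` is a hypothetical helper name.

```python
import re
from datetime import timedelta, timezone

# Accepts "Z", "+hh", "+hhmm", "+hh:mm" and the "-" equivalents.
_OFFSET = re.compile(r"(?:Z|([+-])(\d{2})(?::?(\d{2}))?)")


def parse_offset(text: str) -> timezone:
    """Convert an ISO 8601 offset string into a fixed-offset timezone."""
    match = _OFFSET.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported offset: {text!r}")
    sign, hours, minutes = match.groups()
    if sign is None:
        # The literal "Z" matched, which denotes UTC.
        return timezone.utc
    delta = timedelta(hours=int(hours), minutes=int(minutes or 0))
    return timezone(-delta if sign == "-" else delta)
```

With this helper, `+01`, `+0100`, and `+01:00` all map to the same one-hour offset, which is the kind of flexibility the pull request description refers to.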

[GitHub] spark issue #15146: [SPARK-17590][SQL] Analyze CTE definitions at once

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15146 **[Test build #65586 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65586/consoleFull)** for PR 15146 at commit

[GitHub] spark pull request #15146: [SPARK-17590][SQL] Analyze CTE definitions at onc...

2016-09-18 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/15146 [SPARK-17590][SQL] Analyze CTE definitions at once ## What changes were proposed in this pull request? We substitute logical plan with CTE definitions in the analyzer rule CTESubstitution.

[GitHub] spark issue #15145: [SPARK-17589] [TEST] [2.0] Fix test case `create externa...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15145 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65583/ Test PASSed. ---

[GitHub] spark issue #15145: [SPARK-17589] [TEST] [2.0] Fix test case `create externa...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15145 Merged build finished. Test PASSed.

[GitHub] spark issue #15145: [SPARK-17589] [TEST] [2.0] Fix test case `create externa...

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15145 **[Test build #65583 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65583/consoleFull)** for PR 15145 at commit

[GitHub] spark issue #14803: [SPARK-17153][SQL] Should read partition data when readi...

2016-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14803 ping @marmbrus @zsxwing Would you mind taking a look at this and providing your feedback? If this is not going to be fixed, please let me know too. This is a small change and I don't think it should be

[GitHub] spark pull request #15145: [SPARK-17589] [TEST] [2.0] Fix test case `create ...

2016-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15145#discussion_r79321615 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/MetastoreDataSourcesSuite.scala --- @@ -509,7 +509,7 @@ class MetastoreDataSourcesSuite

[GitHub] spark issue #14452: [SPARK-16849][SQL][WIP] Improve subquery execution by de...

2016-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14452 @hvanhovell @davies I have been rethinking this PR in recent days. The changes include some hacky parts and are too big to review. I would like to separate it into individual small PRs which can be reviewed

[GitHub] spark issue #14597: [SPARK-17017][MLLIB][ML] add a chiSquare Selector based ...

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14597 **[Test build #65585 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65585/consoleFull)** for PR 14597 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-09-18 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13680 @cloud-fan would it be possible to review this? I think that I implemented your suggestions.

[GitHub] spark issue #14995: [Test Only][SPARK-6235][CORE]Address various 2G limits

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14995 **[Test build #65584 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65584/consoleFull)** for PR 14995 at commit

[GitHub] spark pull request #15145: [SPARK-17589] [TEST] [2.0] Fix test case `create ...

2016-09-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15145#discussion_r79320510 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/MetastoreDataSourcesSuite.scala --- @@ -509,7 +509,7 @@ class MetastoreDataSourcesSuite

[GitHub] spark pull request #15145: [SPARK-17589] [TEST] [2.0] Fix test case `create ...

2016-09-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15145#discussion_r79320458 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/MetastoreDataSourcesSuite.scala --- @@ -509,7 +509,7 @@ class MetastoreDataSourcesSuite

[GitHub] spark issue #15053: [Doc] improve python API docstrings

2016-09-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15053 Hi @mortada, I am sorry for being a bit noisy here, but I just took a look myself. I mimicked the PySpark structure and made a draft. ```python """

[GitHub] spark pull request #15054: [SPARK-17502] [SQL] Fix Multiple Bugs in DDL Stat...

2016-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15054#discussion_r79320015 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala --- @@ -65,7 +64,11 @@ case class CreateTableLikeCommand(

[GitHub] spark issue #15136: [SPARK-17581] [SQL] Invalidate Statistics After Some ALT...

2016-09-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15136 Will do more investigation on this.

[GitHub] spark issue #15145: [SPARK-17589] [TEST] [2.0] Fix test case `create externa...

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15145 **[Test build #65583 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65583/consoleFull)** for PR 15145 at commit

[GitHub] spark pull request #15145: [SPARK-17589] [TEST] [2.0] Fix test case `create ...

2016-09-18 Thread gatorsmile
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/15145 [SPARK-17589] [TEST] [2.0] Fix test case `create external table` in MetastoreDataSourcesSuite ### What changes were proposed in this pull request? This PR is to fix a test failure on the

[GitHub] spark issue #15145: [SPARK-17589] [TEST] [2.0] Fix test case `create externa...

2016-09-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15145 cc @cloud-fan @srowen

[GitHub] spark issue #15122: [SPARK-17569] Make StructuredStreaming FileStreamSource ...

2016-09-18 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15122 @petermaxlee I believe you will get a runtime exception saying that the file does not exist. Also, regarding your option 2, are you suggesting that users of structured streaming use

[GitHub] spark issue #15142: [SPARK-17297] [DOCS] Clarify window/slide duration as ab...

2016-09-18 Thread peteb4ker
Github user peteb4ker commented on the issue: https://github.com/apache/spark/pull/15142 Looks great, thx Sean.

[GitHub] spark issue #15138: [SPARK-17583][SQL] Remove uesless rowSeparator variable ...

2016-09-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15138 Yes, you are right and also yes, the purpose of the setting is to prevent OOM

[GitHub] spark issue #15099: [SPARK-17541][SQL] fix some DDL bugs about table managem...

2016-09-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15099 Let me do a quick fix.

[GitHub] spark issue #15114: [SPARK-17473][SQL] fixing docker integration tests error...

2016-09-18 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/15114 I tried running this in SBT and ran into a bunch of spurious exceptions from logging code: ``` SLF4J: Failed toString() invocation on an object of type

[GitHub] spark issue #15053: [Doc] improve python API docstrings

2016-09-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15053 I haven't checked whether a package-level docstring can define global variables that are accessible from other docstrings. So, I would like to defer this to @holdenk (if you are not sure either, then we

[GitHub] spark issue #15053: [Doc] improve python API docstrings

2016-09-18 Thread mortada
Github user mortada commented on the issue: https://github.com/apache/spark/pull/15053 @HyukjinKwon thanks for your help! I'm happy to complete this PR and follow what you suggest for testing. How would the package level docstring work? The goal (which I think we all agree

[GitHub] spark issue #15053: [Doc] improve python API docstrings

2016-09-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15053 Hi @mortada, do you mind if I ask you to address my comment or @holdenk's? If you run into any problem with testing, I am willing to take this over, and I will ask the committers to credit it to you.

[GitHub] spark issue #15143: [SPARK-17584][Test] - Add unit test coverage for TaskSta...

2016-09-18 Thread erenavsarogullari
Github user erenavsarogullari commented on the issue: https://github.com/apache/spark/pull/15143 Hi @rxin, first of all, thanks for the quick reply. I was thinking of it from a unit-test coverage perspective and as a starting point for contributing to the project, but it is OK for me if the PR is

[GitHub] spark pull request #15127: [SPARK-17571][SQL] AssertOnQuery.condition should...

2016-09-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15127

[GitHub] spark issue #15127: [SPARK-17571][SQL] AssertOnQuery.condition should always...

2016-09-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15127 Merging in master/2.0.

[GitHub] spark issue #15143: [SPARK-17584][Test] - Add unit test coverage for TaskSta...

2016-09-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15143 hm I agree having good unit test coverage is important -- this seems too trivial to test?

[GitHub] spark issue #15142: [SPARK-17297] [DOCS] Clarify window/slide duration as ab...

2016-09-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15142 LGTM

[GitHub] spark issue #3062: [SPARK-1406] Mllib pmml model export

2016-09-18 Thread manugarri
Github user manugarri commented on the issue: https://github.com/apache/spark/pull/3062 I'm not sure if this is the right place to ask, but is there any plan to implement PMML export from pyspark? I can't find anything in the pyspark docs.

[GitHub] spark issue #15144: [SPARK-17587][PYTHON][MLLIB] SparseVector __getitem__ sh...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15144 Merged build finished. Test PASSed.

[GitHub] spark issue #15144: [SPARK-17587][PYTHON][MLLIB] SparseVector __getitem__ sh...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15144 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65582/ Test PASSed.

[GitHub] spark issue #15144: [SPARK-17587][PYTHON][MLLIB] SparseVector __getitem__ sh...

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15144 **[Test build #65582 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65582/consoleFull)** for PR 15144 at commit

[GitHub] spark issue #15134: [SPARK-17580][CORE]Add random UUID as app name while app...

2016-09-18 Thread sadikovi
Github user sadikovi commented on the issue: https://github.com/apache/spark/pull/15134 @phalodi Does this solve (or intend to solve) the situation where spark-submit is launched with an empty app name? Currently, as of 1.6, it will use an empty application name.

[GitHub] spark issue #15142: [SPARK-17297] [DOCS] Clarify window/slide duration as ab...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15142 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65581/ Test PASSed.

[GitHub] spark issue #15142: [SPARK-17297] [DOCS] Clarify window/slide duration as ab...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15142 Merged build finished. Test PASSed.

[GitHub] spark issue #15142: [SPARK-17297] [DOCS] Clarify window/slide duration as ab...

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15142 **[Test build #65581 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65581/consoleFull)** for PR 15142 at commit

[GitHub] spark issue #15144: [SPARK-17587][PYTHON][MLLIB] SparseVector __getitem__ sh...

2016-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15144 **[Test build #65582 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65582/consoleFull)** for PR 15144 at commit

[GitHub] spark pull request #15144: [SPARK-17587][PYTHON][MLLIB] SparseVector __getit...

2016-09-18 Thread zero323
GitHub user zero323 opened a pull request: https://github.com/apache/spark/pull/15144 [SPARK-17587][PYTHON][MLLIB] SparseVector __getitem__ should follow __getitem__ contract ## What changes were proposed in this pull request? Replaces ValueError with IndexError when index
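A minimal sketch of why the `__getitem__` contract matters, assuming a hypothetical `TinySparseVector` class (this is not pyspark's `SparseVector`, whose internals are not shown in this thread). Python's fallback iteration protocol calls `__getitem__` with 0, 1, 2, ... and stops only on `IndexError`, so a class that raises `ValueError` for an out-of-range index cannot be consumed by `list()` or a `for` loop.

```python
class TinySparseVector:
    """Hypothetical stand-in used only for illustration."""

    def __init__(self, size, values):
        self.size = size
        self.values = dict(values)  # maps index -> non-zero value

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("index must be an int")
        # IndexError (not ValueError) is what the sequence protocol expects
        # for an out-of-range index.
        if index >= self.size or index < -self.size:
            raise IndexError("index out of bounds")
        if index < 0:
            index += self.size
        return self.values.get(index, 0.0)


vec = TinySparseVector(4, {1: 3.0, 3: 5.0})

# With no __iter__ defined, list() falls back to __getitem__ and relies on
# IndexError to know where the vector ends.
dense = list(vec)
print(dense)
```

If the bounds check raised `ValueError` instead, the same `list(vec)` call would propagate that exception rather than terminate cleanly.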

[GitHub] spark issue #15143: [SPARK-17584][Test] - Add unit test coverage for TaskSta...

2016-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15143 Can one of the admins verify this patch?

[GitHub] spark issue #14444: [SPARK-16839] [SQL] redundant aliases after cleanupAlias...

2016-09-18 Thread eyalfa
Github user eyalfa commented on the issue: https://github.com/apache/spark/pull/14444 @cloud-fan, please see this [https://github.com/apache/spark/blob/1dbb725dbef30bf7633584ce8efdb573f2d92bca/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L1104-L1115](url), it seems

[GitHub] spark pull request #15143: [SPARK-17584][Test] - Add unit test coverage for ...

2016-09-18 Thread erenavsarogullari
GitHub user erenavsarogullari opened a pull request: https://github.com/apache/spark/pull/15143 [SPARK-17584][Test] - Add unit test coverage for TaskState and ExecutorState ## What changes were proposed in this pull request? - TaskState and ExecutorState expose isFailed and

[GitHub] spark issue #15114: [SPARK-17473][SQL] fixing docker integration tests error...

2016-09-18 Thread lresende
Github user lresende commented on the issue: https://github.com/apache/spark/pull/15114 I verified this works on native Docker on Linux with: build/mvn -Pdocker-integration-tests -Pscala-2.11 -pl :spark-docker-integration-tests_2.11 clean compile test LGTM.

[GitHub] spark issue #14762: [SPARK-16962][CORE][SQL] Fix misaligned record accesses ...

2016-09-18 Thread sumansomasundar
Github user sumansomasundar commented on the issue: https://github.com/apache/spark/pull/14762 @srowen I ran dev/lint-java, removed a few extra whitespace characters, shortened a few lines longer than 100 characters, and then rebased.

[GitHub] spark issue #14444: [SPARK-16839] [SQL] redundant aliases after cleanupAlias...

2016-09-18 Thread eyalfa
Github user eyalfa commented on the issue: https://github.com/apache/spark/pull/14444 @hvanhovell, I'm currently trying your approach of testing `ne.resolved` prior to accessing `ne.name`. Tests are running as I write this, but a quick dive into the `NamedExpression` hierarchy

[GitHub] spark issue #14981: [SPARK-17418] Remove Kinesis artifacts from Spark releas...

2016-09-18 Thread lresende
Github user lresende commented on the issue: https://github.com/apache/spark/pull/14981 @srowen Please don't get me wrong, I don't have any interest in this extension either, but I just want to make sure we start doing the right thing for Apache Spark. I will try to ping some of the
