[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16579 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16579 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71824/ Test PASSed. ---

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16579 **[Test build #71824 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71824/testReport)** for PR 16579 at commit

[GitHub] spark issue #16638: [SPARK-19115] [SQL] Supporting Create External Table Lik...

2017-01-22 Thread ouyangxiaochen
Github user ouyangxiaochen commented on the issue: https://github.com/apache/spark/pull/16638 I am sorry that I didn't grasp the key points of your question. In Hive, if there are data files under the specified path while creating an external table, then Hive will identify the files

[GitHub] spark issue #16642: [SPARK-19284][SQL]append to partitioned datasource table...

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16642 **[Test build #71829 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71829/testReport)** for PR 16642 at commit

[GitHub] spark issue #16566: [SPARK-18821][SparkR]: Bisecting k-means wrapper in Spar...

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16566 **[Test build #71828 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71828/testReport)** for PR 16566 at commit

[GitHub] spark pull request #16642: [SPARK-19284][SQL]append to partitioned datasourc...

2017-01-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16642#discussion_r97262909 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/PartitionedWriteSuite.scala --- @@ -92,6 +111,16 @@ class PartitionedWriteSuite extends

[GitHub] spark pull request #16642: [SPARK-19284][SQL]append to partitioned datasourc...

2017-01-22 Thread windpiger
Github user windpiger commented on a diff in the pull request: https://github.com/apache/spark/pull/16642#discussion_r97262157 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/PartitionedWriteSuite.scala --- @@ -92,6 +96,47 @@ class PartitionedWriteSuite extends

[GitHub] spark pull request #16642: [SPARK-19284][SQL]append to partitioned datasourc...

2017-01-22 Thread windpiger
Github user windpiger commented on a diff in the pull request: https://github.com/apache/spark/pull/16642#discussion_r97262179 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/PartitionedWriteSuite.scala --- @@ -92,6 +96,47 @@ class PartitionedWriteSuite extends

[GitHub] spark issue #16642: [SPARK-19284][SQL]append to partitioned datasource table...

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16642 **[Test build #71827 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71827/testReport)** for PR 16642 at commit

[GitHub] spark issue #16521: [SPARK-19139][core] New auth mechanism for transport lib...

2017-01-22 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/16521 Made one pass. Looks good overall. Just some nits.

[GitHub] spark issue #16666: [SPARK-19319][SparkR]:SparkR Kmeans summary returns erro...

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16666 **[Test build #71826 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71826/testReport)** for PR 16666 at commit

[GitHub] spark pull request #16666: [SPARK-19319][SparkR]:SparkR Kmeans summary retur...

2017-01-22 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/16666#discussion_r97260863 --- Diff: R/pkg/R/mllib_clustering.R --- @@ -225,10 +225,12 @@ setMethod("spark.kmeans", signature(data = "SparkDataFrame", formula = "formula"

[GitHub] spark issue #16652: [SPARK-19234][MLLib] AFTSurvivalRegression should fail f...

2017-01-22 Thread admackin
Github user admackin commented on the issue: https://github.com/apache/spark/pull/16652 I've addressed all the problems I think – code style now fixed, MLTestingUtils patched (and verified all MLLib test cases still pass), and added a test case for zero-valued labels

[GitHub] spark issue #16638: spark-19115

2017-01-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16638 Please keep updating your PR description. For example, this PR is not relying on `manual tests`. In addition, you also need to summarize what this PR did. List more details to help reviewers

[GitHub] spark issue #16638: spark-19115

2017-01-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16638 Let me rephrase it. If the directory specified in the `LOCATION` spec contains other files, how does Hive behave?

[GitHub] spark issue #16638: spark-19115

2017-01-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16638 First, please change the PR title to `[SPARK-19115] [SQL] Supporting Create External Table Like Location`

[GitHub] spark issue #16645: [SPARK-19290][SQL] add a new extending interface in Anal...

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16645 **[Test build #71825 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71825/testReport)** for PR 16645 at commit

[GitHub] spark issue #16671: [SPARK-19327][SparkSQL] a better balance partition metho...

2017-01-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16671 The connectors by some DBMS vendors are using the UNLOAD utility, which performs much better, and build the RDD in the connectors. Normally, JDBC is not a good option for large table

[GitHub] spark issue #16654: [SPARK-19303][ML][WIP] Add evaluate method in clustering...

2017-01-22 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16654 Metrics evaluate the clustering though; the details of the algorithm are irrelevant. This still clusters points in a continuous space so you can measure WSSSE.
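
As a rough illustration of the metric mentioned above: WSSSE (within-set sum of squared errors) adds up each point's squared Euclidean distance to its nearest cluster center, so it depends only on the points and the centers, not on how the centers were found. The sketch below is plain Python and is not Spark's implementation; `wssse`, `points`, and `centers` are hypothetical names chosen for this example.

```python
from typing import Sequence


def wssse(points: Sequence[Sequence[float]],
          centers: Sequence[Sequence[float]]) -> float:
    """Sum, over all points, of the squared distance to the nearest center."""
    total = 0.0
    for point in points:
        # Squared Euclidean distance from this point to every center;
        # only the closest center contributes to the cost.
        total += min(
            sum((a - b) ** 2 for a, b in zip(point, center))
            for center in centers
        )
    return total
```

Because the function takes only points and centers, the same measurement applies to any algorithm that assigns points in a continuous space to centers.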

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16579 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71821/ Test PASSed. ---

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16579 Merged build finished. Test PASSed.

[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain

2017-01-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16594 :- ) No perfect solution, but we should use the [metric prefix](https://en.wikipedia.org/wiki/Metric_prefix) when the number is huge.
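
A minimal sketch of what metric-prefix formatting could look like, assuming powers of 1000 and the standard SI prefixes. It is written in Python for illustration and is not the code proposed in the PR; `format_with_metric_prefix` and its `digits` parameter are hypothetical.

```python
def format_with_metric_prefix(value: float, digits: int = 1) -> str:
    """Render a non-negative number with an SI prefix (k, M, G, ...)."""
    prefixes = ["", "k", "M", "G", "T", "P", "E", "Z", "Y"]
    magnitude = float(value)
    index = 0
    # Divide by 1000 until the number is below 1000 or prefixes run out.
    while magnitude >= 1000.0 and index < len(prefixes) - 1:
        magnitude /= 1000.0
        index += 1
    return f"{magnitude:.{digits}f}{prefixes[index]}"
```

For example, `format_with_metric_prefix(1500)` yields `1.5k`, which keeps large row counts or byte sizes short in a plan string at the cost of some precision.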

[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain

2017-01-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16594 SQLServer has three ways to show the plan: graphical plans, text plans, and XML plans. Actually, it is pretty advanced. When using the text plans, users can set the output formats: 1.

[GitHub] spark issue #16675: [SPARK-19155][ML] Make family case insensitive in GLM

2017-01-22 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/16675 @yanboliang Thanks. Seems to have passed tests.

[GitHub] spark pull request #16659: [SPARK-19309][SQL] disable common subexpression e...

2017-01-22 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16659

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16579 **[Test build #71824 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71824/testReport)** for PR 16579 at commit

[GitHub] spark issue #16659: [SPARK-19309][SQL] disable common subexpression eliminat...

2017-01-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16659 thanks, merging to master!

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16579 The only failure is irrelevant to this PR. ``` [info] - set spark.sql.warehouse.dir *** FAILED *** (5 minutes, 0 seconds) [info] Timeout of './bin/spark-submit' '--class'

[GitHub] spark pull request #16668: [SPARK-18788][SPARKR] Add API for getNumPartition...

2017-01-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16668#discussion_r97254989 --- Diff: R/pkg/R/DataFrame.R --- @@ -3406,3 +3406,28 @@ setMethod("randomSplit", } sapply(sdfs, dataFrame)

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16579 Retest this please.

[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain

2017-01-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16594 As of MySQL 5.7.3, the EXPLAIN statement is changed so that the effect of the EXTENDED keyword is always enabled. ``` mysql> EXPLAIN EXTENDED -> SELECT t1.a, t1.a IN (SELECT t2.a

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16579 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71822/ Test FAILed. ---

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16579 Merged build finished. Test FAILed.

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16579 **[Test build #71822 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71822/testReport)** for PR 16579 at commit

[GitHub] spark issue #16659: [SPARK-19309][SQL] disable common subexpression eliminat...

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16659 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71818/ Test PASSed. ---

[GitHub] spark issue #16659: [SPARK-19309][SQL] disable common subexpression eliminat...

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16659 Merged build finished. Test PASSed.

[GitHub] spark issue #16659: [SPARK-19309][SQL] disable common subexpression eliminat...

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16659 **[Test build #71818 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71818/testReport)** for PR 16659 at commit

[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain

2017-01-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16594 PostgreSQL has [a few different options in the EXPLAIN command](https://www.postgresql.org/docs/9.3/static/sql-explain.html): ``` EXPLAIN SELECT * FROM foo WHERE i = 4;

[GitHub] spark pull request #16552: [SPARK-19152][SQL]DataFrameWriter.saveAsTable sup...

2017-01-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16552#discussion_r97253775 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -1353,6 +1353,15 @@ class HiveDDLSuite

[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain

2017-01-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16594 DB2 has a tool to format the contents of the EXPLAIN tables. Below is an example of the output with explanation: ![screenshot 2017-01-22 21 05

[GitHub] spark issue #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16344 Merged build finished. Test PASSed.

[GitHub] spark issue #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16344 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71823/ Test PASSed. ---

[GitHub] spark issue #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16344 **[Test build #71823 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71823/testReport)** for PR 16344 at commit

[GitHub] spark issue #16672: [SPARK-19329][SQL]insert data to a not exist location da...

2017-01-22 Thread windpiger
Github user windpiger commented on the issue: https://github.com/apache/spark/pull/16672 In Hive: 1. reading a table with a non-existent path throws no exception and returns 0 rows 2. reading a table whose path lacks permissions throws a runtime exception ``` FAILED: SemanticException

[GitHub] spark issue #16659: [SPARK-19309][SQL] disable common subexpression eliminat...

2017-01-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16659 LGTM pending test.

[GitHub] spark issue #16587: [SPARK-19229] [SQL] Disallow Creating Hive Source Tables...

2017-01-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16587 Thanks! Merging to master.

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16579 LGTM pending test

[GitHub] spark issue #16669: [SPARK-16101][SQL] Refactoring CSV read path to be consi...

2017-01-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16669 thanks, merging to master!

[GitHub] spark issue #16675: [SPARK-19155][ML] Make family case insensitive in GLM

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16675 Merged build finished. Test PASSed.

[GitHub] spark issue #16675: [SPARK-19155][ML] Make family case insensitive in GLM

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16675 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71820/ Test PASSed. ---

[GitHub] spark issue #16675: [SPARK-19155][ML] Make family case insensitive in GLM

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16675 **[Test build #71820 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71820/testReport)** for PR 16675 at commit

[GitHub] spark pull request #16669: [SPARK-16101][SQL] Refactoring CSV read path to b...

2017-01-22 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16669

[GitHub] spark pull request #16587: [SPARK-19229] [SQL] Disallow Creating Hive Source...

2017-01-22 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16587

[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain

2017-01-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16594 Let us research how the other RDBMSs do it. For example, Oracle ``` SQL> explain plan for select * from product; Explained. SQL> select * from

[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain

2017-01-22 Thread wzhfy
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/16594 @rxin Can we add a flag to enable or disable it? Currently there's no other way to see size and row count except debugging.

[GitHub] spark issue #16671: [SPARK-19327][SparkSQL] a better balance partition metho...

2017-01-22 Thread djvulee
Github user djvulee commented on the issue: https://github.com/apache/spark/pull/16671 @HyukjinKwon One assumption behind this design is that the specified column has an index in most real scenarios, so the table scan cost is not very high. What I observed is that most large

[GitHub] spark issue #16587: [SPARK-19229] [SQL] Disallow Creating Hive Source Tables...

2017-01-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16587 LGTM

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/16579 LGTM

[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain

2017-01-22 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16594 sorry this explain plan makes no sense -- it is impossible to read.

[GitHub] spark pull request #16579: [SPARK-19218][SQL] Fix SET command to show a resu...

2017-01-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/16579#discussion_r97250719 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -982,6 +982,33 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark issue #16675: [SPARK-19155][ML] Make family case insensitive in GLM

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16675 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71819/ Test PASSed. ---

[GitHub] spark issue #16675: [SPARK-19155][ML] Make family case insensitive in GLM

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16675 Merged build finished. Test PASSed.

[GitHub] spark issue #16675: [SPARK-19155][ML] Make family case insensitive in GLM

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16675 **[Test build #71819 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71819/testReport)** for PR 16675 at commit

[GitHub] spark pull request #16579: [SPARK-19218][SQL] Fix SET command to show a resu...

2017-01-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16579#discussion_r97250587 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -982,6 +982,33 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark issue #16636: [SPARK-19279] [SQL] Block Creating a Hive Table With an ...

2017-01-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16636 Ideally the table schema must be specified or inferred before saving to the metastore; however, for Hive serde tables, we have to save it to the metastore first and let the Hive metastore infer the

[GitHub] spark pull request #16579: [SPARK-19218][SQL] Fix SET command to show a resu...

2017-01-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/16579#discussion_r97250343 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -982,6 +982,33 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark pull request #16579: [SPARK-19218][SQL] Fix SET command to show a resu...

2017-01-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/16579#discussion_r97250196 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -982,6 +982,33 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark pull request #16579: [SPARK-19218][SQL] Fix SET command to show a resu...

2017-01-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16579#discussion_r97250113 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -982,6 +982,33 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark pull request #16579: [SPARK-19218][SQL] Fix SET command to show a resu...

2017-01-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/16579#discussion_r97249959 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -982,6 +982,33 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark pull request #16579: [SPARK-19218][SQL] Fix SET command to show a resu...

2017-01-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16579#discussion_r97249854 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -982,6 +982,33 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark issue #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2017-01-22 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/16344 @yanboliang Thanks so much for your detailed review. Your suggestions make lots of sense and I have included all of them in the new commit. Let me know if there is any other change needed.

[GitHub] spark issue #16671: [SPARK-19327][SparkSQL] a better balance partition metho...

2017-01-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16671 FWIW, I am negative on this approach too. Requiring full table scans to resolve skew between partitions does not look like a good solution. As said, it is not good for a large table.

[GitHub] spark issue #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16344 **[Test build #71823 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71823/testReport)** for PR 16344 at commit

[GitHub] spark pull request #16579: [SPARK-19218][SQL] Fix SET command to show a resu...

2017-01-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/16579#discussion_r97249644 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -982,6 +982,33 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark pull request #16579: [SPARK-19218][SQL] Fix SET command to show a resu...

2017-01-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16579#discussion_r97249538 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -982,6 +982,33 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark pull request #16579: [SPARK-19218][SQL] Fix SET command to show a resu...

2017-01-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/16579#discussion_r97249317 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -982,6 +982,33 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16579 **[Test build #71822 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71822/testReport)** for PR 16579 at commit

[GitHub] spark issue #16675: [SPARK-19155][ML] Make family case insensitive in GLM

2017-01-22 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/16675 Looks good, I'll merge if it passes test. Thanks.

[GitHub] spark pull request #16579: [SPARK-19218][SQL] Fix SET command to show a resu...

2017-01-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/16579#discussion_r97249218 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -982,6 +982,33 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark pull request #16579: [SPARK-19218][SQL] Fix SET command to show a resu...

2017-01-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16579#discussion_r97249076 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -982,6 +982,33 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16579 **[Test build #71821 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71821/testReport)** for PR 16579 at commit

[GitHub] spark issue #16594: [SPARK-17078] [SQL] Show stats when explain

2017-01-22 Thread wzhfy
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/16594 @hvanhovell I've updated the description which shows a simple example. The explained plan will become hard to read when joining many tables and sizeInBytes is computed by the simple way

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16579 Thank you, @viirya . I noticed that `spark.sessionState.conf.clear()` is useless. I removed that.

[GitHub] spark issue #16675: [SPARK-19155][ML] Make family case insensitive in GLM

2017-01-22 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/16675 @yanboliang Thanks for the quick response. How about the new commit, where I just change the value from `getFamily` to lower case when necessary, i.e., in the calculation of p-value and

[GitHub] spark issue #16675: [SPARK-19155][ML] Make family case insensitive in GLM

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16675 **[Test build #71820 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71820/testReport)** for PR 16675 at commit

[GitHub] spark pull request #16579: [SPARK-19218][SQL] Fix SET command to show a resu...

2017-01-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/16579#discussion_r97248522 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -982,6 +982,33 @@ class SQLQuerySuite extends QueryTest with

[GitHub] spark issue #16675: [SPARK-19155][ML] Make family case insensitive in GLM

2017-01-22 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/16675 @actuaryzhang I don't think the change is appropriate. The function `getFamily` should return the raw value that users specified, which is why I didn't change them in #16516.

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16579 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71816/ Test PASSed. ---

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16579 Merged build finished. Test PASSed.

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16579 **[Test build #71816 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71816/testReport)** for PR 16579 at commit

[GitHub] spark issue #16675: [SPARK-19155][ML] Make family case insensitive in GLM

2017-01-22 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/16675 I would prefer that `getFamily` returns lower case values directly, because using `getFamily.toLowerCase` can get very cumbersome and I use this a lot in another PR #16344. If we want to keep
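The two positions in this thread (return the raw user value from `getFamily`, or return a lower-cased value) can be reconciled by keeping the getter raw and normalising only where the value is compared. The following is a minimal sketch of that idea, not the code from PR #16675 or #16516: `FamilyParam`, `familyKey`, `FamilyDemo`, and the `supported` set are hypothetical names introduced for illustration, and none of them are Spark APIs.

```scala
import java.util.Locale

// Hypothetical stand-in for a parameter holder; this is not Spark's class.
final case class FamilyParam(raw: String) {
  // Returns exactly what the user specified, case preserved.
  def getFamily: String = raw

  // Normalised form, intended for internal comparisons only.
  def familyKey: String = raw.toLowerCase(Locale.ROOT)
}

object FamilyDemo {
  // Illustrative list of family names; an assumption for this sketch.
  val supported: Set[String] = Set("gaussian", "binomial", "poisson", "gamma")

  // The comparison ignores case because it goes through familyKey.
  def isSupported(p: FamilyParam): Boolean = supported.contains(p.familyKey)

  def main(args: Array[String]): Unit = {
    val p = FamilyParam("Binomial")
    println(p.getFamily)    // the value as the user wrote it
    println(isSupported(p)) // case-insensitive membership check
  }
}
```

With a helper like the hypothetical `familyKey`, call sites avoid repeating `getFamily.toLowerCase` while the public getter still reports the user's original spelling.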

[GitHub] spark issue #16675: [SPARK-19155][ML] Make family case insensitive in GLM

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16675 **[Test build #71819 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71819/testReport)** for PR 16675 at commit

[GitHub] spark pull request #16636: [SPARK-19279] [SQL] Block Creating a Hive Table W...

2017-01-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16636#discussion_r97247351 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala --- @@ -1527,6 +1527,21 @@ class DDLSuite extends QueryTest with

[GitHub] spark issue #16479: [SPARK-19085][SQL] cleanup OutputWriterFactory and Outpu...

2017-01-22 Thread koertkuipers
Github user koertkuipers commented on the issue: https://github.com/apache/spark/pull/16479 I will just copy the conversion code over for now. Thanks.

[GitHub] spark pull request #16675: [SPARK-19155][ML] make getFamily case insensitive

2017-01-22 Thread actuaryzhang
GitHub user actuaryzhang opened a pull request: https://github.com/apache/spark/pull/16675 [SPARK-19155][ML] make getFamily case insensitive ## What changes were proposed in this pull request? This is a supplement to PR #16516 which did not make the value from `getFamily` case

[GitHub] spark issue #16659: [SPARK-19309][SQL] disable common subexpression eliminat...

2017-01-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16659 **[Test build #71818 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71818/testReport)** for PR 16659 at commit

[GitHub] spark issue #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16344 Merged build finished. Test PASSed.

[GitHub] spark issue #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16344 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71817/ Test PASSed. ---

[GitHub] spark issue #16579: [SPARK-19218][SQL] Fix SET command to show a result corr...

2017-01-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16579 Merged build finished. Test PASSed.
