[GitHub] spark issue #16308: [SPARK-18936][SQL] Infrastructure for session local time...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16308 **[Test build #70396 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70396/testReport)** for PR 16308 at commit [`4b6900c`](https://github.com/apache/spark/commit/4

[GitHub] spark issue #16347: [SPARK-18934][SQL] Writing to dynamic partitions does no...

2016-12-20 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16347 Thanks for submitting the ticket. In general I don't think the sortWithinPartitions property can carry over to writing out data, because one partition actually corresponds to more than one file.

[GitHub] spark issue #16308: [SPARK-18936][SQL] Infrastructure for session local time...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16308 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature e

[GitHub] spark issue #16308: [SPARK-18936][SQL] Infrastructure for session local time...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16308 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70396/ Test PASSed.

[GitHub] spark pull request #16349: [Doc] bucketing is applicable to all file-based d...

2016-12-20 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/16349 [Doc] bucketing is applicable to all file-based data sources ## What changes were proposed in this pull request? Starting with Spark 2.1.0, the bucketing feature is available for all file-based data source

[GitHub] spark issue #16296: [SPARK-18885][SQL] unify CREATE TABLE syntax for data so...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16296 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70398/ Test FAILed.

[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12775 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70397/ Test FAILed.

[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12775 Merged build finished. Test FAILed.

[GitHub] spark issue #16296: [SPARK-18885][SQL] unify CREATE TABLE syntax for data so...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16296 Merged build finished. Test FAILed.

[GitHub] spark issue #16349: [Doc] bucketing is applicable to all file-based data sou...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16349 **[Test build #70399 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70399/testReport)** for PR 16349 at commit [`c8f1b42`](https://github.com/apache/spark/commit/c8

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2016-12-20 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15018 For the zero-weight values, can we do something similar to scikit-learn and remove them, like https://github.com/amueller/scikit-learn/commit/2415100f79293bbbf52c12c36d63a6cf602cf3c4

[GitHub] spark issue #16232: [SPARK-18800][SQL] Fix UnsafeKVExternalSorter by correct...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16232 **[Test build #70400 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70400/testReport)** for PR 16232 at commit [`e70692d`](https://github.com/apache/spark/commit/e7

[GitHub] spark issue #16347: [SPARK-18934][SQL] Writing to dynamic partitions does no...

2016-12-20 Thread junegunn
Github user junegunn commented on the issue: https://github.com/apache/spark/pull/16347 Thanks for the comment. I was trying to implement the following Hive QL in Spark SQL/API:
```sql
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.mapred.mode = nonstrict;
```

[GitHub] spark issue #16232: [SPARK-18800][SQL] Correct the assert in UnsafeKVExterna...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16232 **[Test build #70401 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70401/testReport)** for PR 16232 at commit [`5a31e37`](https://github.com/apache/spark/commit/5a

[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...

2016-12-20 Thread lirui-intel
Github user lirui-intel commented on the issue: https://github.com/apache/spark/pull/12775 The new test passed locally and I can't find any failures in the Jenkins test report. Not sure what failed exactly.

[GitHub] spark issue #16336: [SPARK-18923][DOC][BUILD] Support skipping R/Python API ...

2016-12-20 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16336 I think it's fine to make this change for consistency and convenience. It's minor. It'd be nice to document them in the README.md, briefly.

[GitHub] spark issue #16349: [Doc] bucketing is applicable to all file-based data sou...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16349 **[Test build #70399 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70399/testReport)** for PR 16349 at commit [`c8f1b42`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #16232: [SPARK-18800][SQL] Correct the assert in UnsafeKVExterna...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16232 **[Test build #70400 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70400/testReport)** for PR 16232 at commit [`e70692d`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #16232: [SPARK-18800][SQL] Correct the assert in UnsafeKVExterna...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16232 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70400/ Test PASSed.

[GitHub] spark issue #16232: [SPARK-18800][SQL] Correct the assert in UnsafeKVExterna...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16232 Merged build finished. Test PASSed.

[GitHub] spark issue #16232: [SPARK-18800][SQL] Correct the assert in UnsafeKVExterna...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16232 **[Test build #70401 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70401/testReport)** for PR 16232 at commit [`5a31e37`](https://github.com/apache/spark/commit/5

[GitHub] spark pull request #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock ...

2016-12-20 Thread xuanyuanking
GitHub user xuanyuanking opened a pull request: https://github.com/apache/spark/pull/16350 [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for each table's relation in cache ## What changes were proposed in this pull request? Backport of #16135 to branch-2.0 ## Ho

[GitHub] spark issue #16232: [SPARK-18800][SQL] Correct the assert in UnsafeKVExterna...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16232 Merged build finished. Test PASSed.

[GitHub] spark issue #16232: [SPARK-18800][SQL] Correct the assert in UnsafeKVExterna...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16232 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70401/ Test PASSed.

[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16350 **[Test build #70402 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70402/consoleFull)** for PR 16350 at commit [`132d12e`](https://github.com/apache/spark/commit/1

[GitHub] spark issue #16135: [SPARK-18700][SQL] Add StripedLock for each table's rela...

2016-12-20 Thread xuanyuanking
Github user xuanyuanking commented on the issue: https://github.com/apache/spark/pull/16135 @hvanhovell Sure, I opened a new BACKPORT-2.0. There's a small difference in branch-2.0: the unit test of this patch is based on `HiveCatalogMetrics`, which was not added in 2.0, so I added the patch nee

[GitHub] spark pull request #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2016-12-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/16344#discussion_r93215941 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -592,6 +629,59 @@ object GeneralizedLinearRegression e

[GitHub] spark pull request #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2016-12-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/16344#discussion_r93216003 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -592,6 +629,59 @@ object GeneralizedLinearRegression e

[GitHub] spark pull request #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2016-12-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/16344#discussion_r93215641 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -592,6 +629,59 @@ object GeneralizedLinearRegression e

[GitHub] spark pull request #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2016-12-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/16344#discussion_r93215688 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -592,6 +629,59 @@ object GeneralizedLinearRegression e

[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/16350 Maybe we should just drop the UT (so we don't have to add the metrics). cc @ericl WDYT?

[GitHub] spark pull request #16351: [SPARK-18943][SQL] Avoid per-record type dispatch...

2016-12-20 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/16351 [SPARK-18943][SQL] Avoid per-record type dispatch in CSV when reading ## What changes were proposed in this pull request? `CSVRelation.csvParser` does type dispatch for each value in ea
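The idea named in the PR title (avoiding per-record type dispatch) can be illustrated outside Spark. The following is only a sketch of the general technique, not the PR's implementation: the type names, the `CONVERTERS` table, and `build_row_parser` are all hypothetical. The converter for each column is chosen once from the schema, so parsing a record only applies functions that were already selected.

```python
from typing import Callable, List, Sequence

# Hypothetical mapping from a schema type name to a string converter.
CONVERTERS = {
    "int": int,
    "double": float,
    "string": str,
    "boolean": lambda s: s.strip().lower() == "true",
}


def build_row_parser(schema: Sequence[str]) -> Callable[[Sequence[str]], List[object]]:
    """Resolve one converter per column up front and return a row parser."""
    try:
        # Type dispatch happens here, once per schema, not once per value.
        converters = [CONVERTERS[type_name] for type_name in schema]
    except KeyError as exc:
        raise ValueError(f"unsupported type: {exc.args[0]}") from None

    def parse_row(tokens: Sequence[str]) -> List[object]:
        if len(tokens) != len(converters):
            raise ValueError("token count does not match schema length")
        # Per record, only the pre-selected converters are applied.
        return [convert(token) for convert, token in zip(converters, tokens)]

    return parse_row


parse_row = build_row_parser(["int", "double", "string", "boolean"])
print(parse_row(["1", "2.5", "spark", "TRUE"]))
```

The dictionary lookup by type name runs once per column when the parser is built, and the returned closure reuses the resulting list for every record.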

[GitHub] spark issue #16351: [SPARK-18943][SQL] Avoid per-record type dispatch in CSV...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16351 **[Test build #70403 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70403/testReport)** for PR 16351 at commit [`e72d1bc`](https://github.com/apache/spark/commit/e7

[GitHub] spark pull request #15018: [SPARK-17455][MLlib] Improve PAVA implementation ...

2016-12-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15018#discussion_r93229282 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala --- @@ -328,74 +336,69 @@ class IsotonicRegression private (private

[GitHub] spark pull request #16351: [SPARK-18943][SQL] Avoid per-record type dispatch...

2016-12-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/16351#discussion_r93230247 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala --- @@ -215,84 +215,133 @@ private[csv] object CSV

[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16350 **[Test build #70402 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70402/consoleFull)** for PR 16350 at commit [`132d12e`](https://github.com/apache/spark/commit/

[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16350 Merged build finished. Test FAILed.

[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16350 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70402/ Test FAILed.

[GitHub] spark issue #16233: [SPARK-18801][SQL] Add `View` operator to help resolve a...

2016-12-20 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16233 We need a way to isolate the analysis of view text with a different context. Using a wrapper is one solution; my proposal doesn't introduce a wrapper and instead applies the context in place, i.

[GitHub] spark issue #16233: [SPARK-18801][SQL] Add `View` operator to help resolve a...

2016-12-20 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16233 Hmm, it seems hard to apply the view context in place, considering things like CTE. I think it's better to introduce an analysis context, which can easily limit the max depth of stacked views.

[GitHub] spark issue #16351: [SPARK-18943][SQL] Avoid per-record type dispatch in CSV...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16351 **[Test build #70404 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70404/testReport)** for PR 16351 at commit [`22c9a8a`](https://github.com/apache/spark/commit/22

[GitHub] spark issue #16329: [SPARK-16046][DOCS] Aggregations in the Spark SQL progra...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16329 **[Test build #70405 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70405/testReport)** for PR 16329 at commit [`2c1f182`](https://github.com/apache/spark/commit/2c

[GitHub] spark pull request #16323: [SPARK-18911] [SQL] Define CatalogStatistics to i...

2016-12-20 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/16323#discussion_r93240303 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -198,6 +200,10 @@ case class CatalogTable( locat

[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16350 **[Test build #70406 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70406/consoleFull)** for PR 16350 at commit [`80b8664`](https://github.com/apache/spark/commit/8

[GitHub] spark issue #16329: [SPARK-16046][DOCS] Aggregations in the Spark SQL progra...

2016-12-20 Thread aokolnychyi
Github user aokolnychyi commented on the issue: https://github.com/apache/spark/pull/16329 @marmbrus I have updated the pull request. The compiled docs can be found [here](https://aokolnychyi.github.io/spark-docs/sql-programming-guide.html). I did not manage to build the Java

[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15996 **[Test build #70407 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70407/testReport)** for PR 15996 at commit [`28f88ef`](https://github.com/apache/spark/commit/28

[GitHub] spark issue #16329: [SPARK-16046][DOCS] Aggregations in the Spark SQL progra...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16329 **[Test build #70405 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70405/testReport)** for PR 16329 at commit [`2c1f182`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #16329: [SPARK-16046][DOCS] Aggregations in the Spark SQL progra...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16329 Merged build finished. Test PASSed.

[GitHub] spark issue #16329: [SPARK-16046][DOCS] Aggregations in the Spark SQL progra...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16329 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70405/ Test PASSed.

[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...

2016-12-20 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/12775 retest this please

[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15996 **[Test build #70408 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70408/testReport)** for PR 15996 at commit [`97dc307`](https://github.com/apache/spark/commit/97

[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12775 **[Test build #70409 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70409/testReport)** for PR 12775 at commit [`9778cef`](https://github.com/apache/spark/commit/97

[GitHub] spark issue #16296: [SPARK-18885][SQL] unify CREATE TABLE syntax for data so...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16296 **[Test build #70410 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70410/testReport)** for PR 16296 at commit [`a553366`](https://github.com/apache/spark/commit/a5

[GitHub] spark pull request #16352: [SPARK-18947][SQL] SQLContext.tableNames should n...

2016-12-20 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/16352 [SPARK-18947][SQL] SQLContext.tableNames should not call Catalog.listTables ## What changes were proposed in this pull request? It's a huge waste to call `Catalog.listTables` in `SQLConte

[GitHub] spark issue #16352: [SPARK-18947][SQL] SQLContext.tableNames should not call...

2016-12-20 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16352 cc @yhuai @gatorsmile

[GitHub] spark issue #16352: [SPARK-18947][SQL] SQLContext.tableNames should not call...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16352 **[Test build #70411 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70411/testReport)** for PR 16352 at commit [`f12dc79`](https://github.com/apache/spark/commit/f1

[GitHub] spark issue #16351: [SPARK-18943][SQL] Avoid per-record type dispatch in CSV...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16351 **[Test build #70403 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70403/testReport)** for PR 16351 at commit [`e72d1bc`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #16351: [SPARK-18943][SQL] Avoid per-record type dispatch in CSV...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16351 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70403/ Test PASSed.

[GitHub] spark issue #16351: [SPARK-18943][SQL] Avoid per-record type dispatch in CSV...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16351 Merged build finished. Test PASSed.

[GitHub] spark issue #16351: [SPARK-18943][SQL] Avoid per-record type dispatch in CSV...

2016-12-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/16351 cc @cloud-fan, could I please ask you to take a look? I remember a similar PR was reviewed by you before.

[GitHub] spark pull request #16329: [SPARK-16046][DOCS] Aggregations in the Spark SQL...

2016-12-20 Thread jnh5y
Github user jnh5y commented on a diff in the pull request: https://github.com/apache/spark/pull/16329#discussion_r93261268 --- Diff: docs/sql-programming-guide.md --- @@ -382,6 +382,52 @@ For example: +## Aggregations + +The [built-in DataFrames functi

[GitHub] spark issue #16352: [SPARK-18947][SQL] SQLContext.tableNames should not call...

2016-12-20 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16352 The same issue also exists in [getTableNames](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala#L278). Could we also fix it there?

[GitHub] spark pull request #16329: [SPARK-16046][DOCS] Aggregations in the Spark SQL...

2016-12-20 Thread jnh5y
Github user jnh5y commented on a diff in the pull request: https://github.com/apache/spark/pull/16329#discussion_r93262242 --- Diff: docs/sql-programming-guide.md --- @@ -382,6 +382,52 @@ For example: +## Aggregations + +The [built-in DataFrames functi

[GitHub] spark issue #16352: [SPARK-18947][SQL] SQLContext.tableNames should not call...

2016-12-20 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16352 LGTM except for the above comment

[GitHub] spark issue #16232: [SPARK-18800][SQL] Correct the assert in UnsafeKVExterna...

2016-12-20 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/16232 lgtm

[GitHub] spark issue #16351: [SPARK-18943][SQL] Avoid per-record type dispatch in CSV...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16351 **[Test build #70404 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70404/testReport)** for PR 16351 at commit [`22c9a8a`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #16351: [SPARK-18943][SQL] Avoid per-record type dispatch in CSV...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16351 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70404/ Test PASSed.

[GitHub] spark issue #16296: [SPARK-18885][SQL] unify CREATE TABLE syntax for data so...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16296 **[Test build #70410 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70410/testReport)** for PR 16296 at commit [`a553366`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #16296: [SPARK-18885][SQL] unify CREATE TABLE syntax for data so...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16296 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70410/ Test FAILed.

[GitHub] spark issue #16296: [SPARK-18885][SQL] unify CREATE TABLE syntax for data so...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16296 Merged build finished. Test FAILed.

[GitHub] spark issue #16351: [SPARK-18943][SQL] Avoid per-record type dispatch in CSV...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16351 Merged build finished. Test PASSed.

[GitHub] spark pull request #15018: [SPARK-17455][MLlib] Improve PAVA implementation ...

2016-12-20 Thread neggert
Github user neggert commented on a diff in the pull request: https://github.com/apache/spark/pull/15018#discussion_r9328 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala --- @@ -328,74 +336,69 @@ class IsotonicRegression private (privat

[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...

2016-12-20 Thread michalsenkyr
Github user michalsenkyr commented on the issue: https://github.com/apache/spark/pull/16240 None of them. The compilation will fail. That is why I had to provide those additional implicits.
```
scala> class Test[T]
defined class Test
scala> implicit def test1[
```

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2016-12-20 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018 @viirya Better to remove them, or throw an error? Personally, I'd rather be alerted that I'm passing invalid input, rather than have it "fixed" for me.
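The two options raised in this exchange (silently dropping zero-weight points versus rejecting them) can be sketched side by side. This is a plain-Python illustration under assumptions, not Spark's `IsotonicRegression` code: the `(label, feature, weight)` tuple layout and both function names are hypothetical, and negative weights are left out of scope.

```python
from typing import List, Sequence, Tuple

# Assumed layout for illustration: (label, feature, weight).
Point = Tuple[float, float, float]


def drop_zero_weights(points: Sequence[Point]) -> List[Point]:
    """Lenient policy: silently discard points whose weight is zero."""
    return [p for p in points if p[2] != 0]


def reject_zero_weights(points: Sequence[Point]) -> List[Point]:
    """Strict policy: fail fast so the caller learns the input is invalid."""
    for index, (_, _, weight) in enumerate(points):
        if weight == 0:
            raise ValueError(f"point {index} has zero weight")
    return list(points)


data = [(1.0, 0.0, 1.0), (2.0, 1.0, 0.0), (3.0, 2.0, 2.0)]
print(drop_zero_weights(data))
try:
    reject_zero_weights(data)
except ValueError as err:
    print("rejected:", err)
```

The lenient variant keeps the pipeline running at the cost of hiding a data problem; the strict variant surfaces it immediately, which is the behavior preferred in the comment above.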

[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16350 **[Test build #70406 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70406/consoleFull)** for PR 16350 at commit [`80b8664`](https://github.com/apache/spark/commit/

[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16350 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70406/ Test PASSed.

[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16350 Merged build finished. Test PASSed.

[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15996 **[Test build #70407 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70407/testReport)** for PR 15996 at commit [`28f88ef`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15996 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70407/ Test PASSed.

[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15996 Merged build finished. Test PASSed.

[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12775 **[Test build #70409 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70409/testReport)** for PR 12775 at commit [`9778cef`](https://github.com/apache/spark/commit/9

[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12775 Merged build finished. Test FAILed.

[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12775 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70409/ Test FAILed. ---

[GitHub] spark pull request #16353: [SPARK-18948][MLlib] Add Mean Percentile Rank met...

2016-12-20 Thread daniloascione
GitHub user daniloascione opened a pull request: https://github.com/apache/spark/pull/16353 [SPARK-18948][MLlib] Add Mean Percentile Rank metric for ranking algorithms ## What changes were proposed in this pull request? This PR adds the implementation of Mean Percentile

[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15996 **[Test build #70408 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70408/testReport)** for PR 15996 at commit [`97dc307`](https://github.com/apache/spark/commit/9

[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15996 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70408/ Test PASSed. ---

[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15996 Merged build finished. Test PASSed.

[GitHub] spark issue #16353: [SPARK-18948][MLlib] Add Mean Percentile Rank metric for...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16353 Can one of the admins verify this patch?

[GitHub] spark issue #16352: [SPARK-18947][SQL] SQLContext.tableNames should not call...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16352 **[Test build #70411 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70411/testReport)** for PR 16352 at commit [`f12dc79`](https://github.com/apache/spark/commit/f

[GitHub] spark issue #16352: [SPARK-18947][SQL] SQLContext.tableNames should not call...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16352 Merged build finished. Test PASSed.

[GitHub] spark issue #16352: [SPARK-18947][SQL] SQLContext.tableNames should not call...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16352 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70411/ Test PASSed. ---

[GitHub] spark issue #16353: [SPARK-18948][MLlib] Add Mean Percentile Rank metric for...

2016-12-20 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16353 This is pretty specific to ALS, and relies on the r_ui strength value in the paper. I'm not sure it is that general. Without this weight, it's somewhat related to simple existing metrics like mean re

[GitHub] spark pull request #16354: [SPARK-18886][Scheduler][WIP] Adjust Delay schedu...

2016-12-20 Thread squito
GitHub user squito opened a pull request: https://github.com/apache/spark/pull/16354 [SPARK-18886][Scheduler][WIP] Adjust Delay scheduling to prevent under-utilization of cluster ## What changes were proposed in this pull request? This is a significant change to delay sched

[GitHub] spark issue #16337: [SPARK-18871][SQL] New test cases for IN/NOT IN subquery

2016-12-20 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16337 I have run `SQLQueryTestSuite` a few times to confirm that it allows sub-directories under `sql/core/src/test/resources/sql-tests/[inputs|results]` to group test files further. By reading th

[GitHub] spark pull request #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2016-12-20 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/16344#discussion_r93289668 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -592,6 +629,59 @@ object GeneralizedLinearRegres

[GitHub] spark issue #16354: [SPARK-18886][Scheduler][WIP] Adjust Delay scheduling to...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16354 **[Test build #70412 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70412/testReport)** for PR 16354 at commit [`449ba20`](https://github.com/apache/spark/commit/44

[GitHub] spark issue #16354: [SPARK-18886][Scheduler][WIP] Adjust Delay scheduling to...

2016-12-20 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/16354 @mridulm @markhamstra @kayousterhout This is *not* ready to merge -- it needs some cleanup and more tests -- but I thought that seeing an implementation might help think through the design. I t

[GitHub] spark pull request #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2016-12-20 Thread actuaryzhang
Github user actuaryzhang commented on a diff in the pull request: https://github.com/apache/spark/pull/16344#discussion_r93290858 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -592,6 +629,59 @@ object GeneralizedLinearRegres

[GitHub] spark pull request #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2016-12-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/16344#discussion_r93291042 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -592,6 +629,59 @@ object GeneralizedLinearRegression e

[GitHub] spark pull request #16344: [SPARK-18929][ML] Add Tweedie distribution in GLM

2016-12-20 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/16344#discussion_r93290854 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/GeneralizedLinearRegression.scala --- @@ -242,7 +275,7 @@ class GeneralizedLinearRegression @Si
