[GitHub] spark issue #18025: [WIP][SparkR] Update doc and examples for sql functions

2017-05-18 Thread actuaryzhang
Github user actuaryzhang commented on the issue: https://github.com/apache/spark/pull/18025 This is what the `'column_aggregate_functions.Rd'` doc looks like: ![image](https://cloud.githubusercontent.com/assets/11082368/26190195/fd353224-3b5c-11e7-9a78-2607cc665f49.png) !

[GitHub] spark issue #18000: [SPARK-20364][SQL] Disable Parquet predicate pushdown fo...

2017-05-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18000 LGTM. Is Parquet going to fix it in the future? Or is there any official way to support filter pushdown for column names containing a dot? --- If your project is set up for it, you can reply to this ema

[GitHub] spark pull request #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated S...

2017-05-18 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/14971#discussion_r117169168 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala --- @@ -232,7 +446,8 @@ class StatisticsSuite extends StatisticsCollectionT

[GitHub] spark pull request #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated S...

2017-05-18 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/14971#discussion_r117168812 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala --- @@ -215,6 +218,217 @@ class StatisticsSuite extends StatisticsCollectio

[GitHub] spark pull request #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated S...

2017-05-18 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/14971#discussion_r117156770 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -414,6 +415,50 @@ private[hive] class HiveClientImpl(

[GitHub] spark pull request #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated S...

2017-05-18 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/14971#discussion_r117169136 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveComparisonTest.scala --- @@ -192,13 +192,7 @@ abstract class HiveComparisonTest

[GitHub] spark pull request #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated S...

2017-05-18 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/14971#discussion_r117172781 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala --- @@ -215,6 +218,217 @@ class StatisticsSuite extends StatisticsCollectio

[GitHub] spark pull request #18002: [SPARK-20770][SQL] Improve ColumnStats

2017-05-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18002#discussion_r117174531 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/ColumnStats.scala --- @@ -53,219 +53,299 @@ private[columnar] sealed trait Colu

[GitHub] spark pull request #17996: [SPARK-20506][DOCS] 2.2 migration guide

2017-05-18 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/17996#discussion_r117174667 --- Diff: docs/ml-guide.md --- @@ -72,35 +72,26 @@ MLlib is under active development. The APIs marked `Experimental`/`DeveloperApi` may change in future

[GitHub] spark issue #18000: [SPARK-20364][SQL] Disable Parquet predicate pushdown fo...

2017-05-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18000 Based on the discussion in https://github.com/apache/parquet-mr/pull/361, it does not sound like Parquet will support it in the short term. We might need to live with it for a long time.

[GitHub] spark pull request #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated S...

2017-05-18 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/14971#discussion_r117174966 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -414,6 +415,50 @@ private[hive] class HiveClientImpl(

[GitHub] spark issue #18000: [SPARK-20364][SQL] Disable Parquet predicate pushdown fo...

2017-05-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18000 retest this please

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117170904 --- Diff: core/src/main/scala/org/apache/spark/memory/MemoryManager.scala --- @@ -20,7 +20,7 @@ package org.apache.spark.memory import javax.annotati

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117171397 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -278,4 +278,21 @@ package object config { "spark.io.compr

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117172780 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -175,33 +197,54 @@ final class ShuffleBlockFetcherIterato

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117170816 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -278,4 +278,21 @@ package object config { "spark.io.compr

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117170463 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java --- @@ -126,4 +150,50 @@ private void failRema

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117169752 --- Diff: common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java --- @@ -95,6 +97,25 @@ public ManagedBuffer get

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117174623 --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala --- @@ -193,8 +217,19 @@ private[spark] object HighlyCompressedMapStatus {

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117171649 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -278,4 +278,21 @@ package object config { "spark.io.compr

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117170538 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java --- @@ -126,4 +150,50 @@ private void failRema

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117172062 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -395,7 +438,6 @@ final class ShuffleBlockFetcherIterator(

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117172461 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -129,6 +137,12 @@ final class ShuffleBlockFetcherIterator

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117175176 --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala --- @@ -128,41 +133,60 @@ private[spark] class CompressedMapStatus( * @param

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117174283 --- Diff: docs/configuration.md --- @@ -954,12 +971,12 @@ Apart from these, the following properties are also available, and may be useful spark.me

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117174976 --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala --- @@ -193,8 +217,19 @@ private[spark] object HighlyCompressedMapStatus {

[GitHub] spark pull request #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated S...

2017-05-18 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/14971#discussion_r117176108 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala --- @@ -175,7 +178,7 @@ class StatisticsSuite extends StatisticsCollectionT

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117173550 --- Diff: core/src/test/scala/org/apache/spark/MapOutputTrackerSuite.scala --- @@ -29,7 +29,11 @@ import org.apache.spark.shuffle.FetchFailedException

[GitHub] spark pull request #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated S...

2017-05-18 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/14971#discussion_r117175951 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala --- @@ -175,7 +178,7 @@ class StatisticsSuite extends StatisticsCollectionT

[GitHub] spark issue #18000: [SPARK-20364][SQL] Disable Parquet predicate pushdown fo...

2017-05-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18000 ok to test

[GitHub] spark issue #18000: [SPARK-20364][SQL] Disable Parquet predicate pushdown fo...

2017-05-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18000 test this please

[GitHub] spark issue #18012: [SPARK-20779][Examples]The ASF header placed in an incor...

2017-05-18 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18012 Jenkins test this please

[GitHub] spark issue #16989: [SPARK-19659] Fetch big blocks to disk when shuffle-read...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/16989 A few more high-level thoughts about this PR: - It seems like the benefits here come from three interrelated changes: - Improving the accuracy of map output size reporting for large s

[GitHub] spark pull request #18002: [SPARK-20770][SQL] Improve ColumnStats

2017-05-18 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/18002#discussion_r117177275 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/ColumnStats.scala --- @@ -53,219 +53,299 @@ private[columnar] sealed trait ColumnSt

[GitHub] spark pull request #17997: [SPARK-20763][SQL]The function of `month` and `da...

2017-05-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17997#discussion_r117178061 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -603,7 +603,13 @@ object DateTimeUtils { */

[GitHub] spark pull request #17997: [SPARK-20763][SQL]The function of `month` and `da...

2017-05-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/17997#discussion_r117178267 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -603,7 +603,13 @@ object DateTimeUtils { */

[GitHub] spark issue #10405: [SPARK-12339] [WebUI] Added a null check that was remove...

2017-05-18 Thread VishnuGowthemT
Github user VishnuGowthemT commented on the issue: https://github.com/apache/spark/pull/10405 Can this fix be added to 1.6 as well? https://github.com/apache/spark/blob/branch-1.6/sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala

[GitHub] spark issue #18024: [SPARK-20792][SS] Support same timeout operations in map...

2017-05-18 Thread tdas
Github user tdas commented on the issue: https://github.com/apache/spark/pull/18024 jenkins test this please

[GitHub] spark issue #16989: [SPARK-19659] Fetch big blocks to disk when shuffle-read...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/16989 Update: I realize that I overlooked the change to set a default for `spark.memory.offHeap.size`. Thus I'll retract my original objections regarding `MemoryMode.OFF_HEAP` but I'd still like to addr

[GitHub] spark issue #16989: [SPARK-19659] Fetch big blocks to disk when shuffle-read...

2017-05-18 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/16989 Also, I noticed that the PR description doesn't quite align with implementation AFAIK: > Track average size and also the outliers(which are larger than 2*avgSize) in MapStatus; d

[GitHub] spark pull request #17997: [SPARK-20763][SQL]The function of `month` and `da...

2017-05-18 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/17997#discussion_r117180644 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -603,7 +603,13 @@ object DateTimeUtils { */

[GitHub] spark issue #18000: [SPARK-20364][SQL] Disable Parquet predicate pushdown fo...

2017-05-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/18000 Seems Jenkins isn't working for now.

[GitHub] spark pull request #17997: [SPARK-20763][SQL]The function of `month` and `da...

2017-05-18 Thread 10110346
Github user 10110346 commented on a diff in the pull request: https://github.com/apache/spark/pull/17997#discussion_r117181219 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeUtils.scala --- @@ -603,7 +603,13 @@ object DateTimeUtils { */

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117181192 --- Diff: core/src/main/scala/org/apache/spark/scheduler/MapStatus.scala --- @@ -193,8 +217,19 @@ private[spark] object HighlyCompressedMapStatus {

[GitHub] spark issue #16989: [SPARK-19659] Fetch big blocks to disk when shuffle-read...

2017-05-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16989 +1 on @JoshRosen's suggestion; we can integrate it with the memory manager later. cc @JoshRosen: shall we put this patch into branch 2.2?

[GitHub] spark pull request #12162: [SPARK-14289][WIP] Support multiple eviction stra...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12162

[GitHub] spark pull request #12085: [SPARK-14293] Improve shuffle load balancing and ...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12085

[GitHub] spark pull request #12419: [SPARK-14661] [MLlib] trim PCAModel by required e...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12419

[GitHub] spark pull request #12420: [SPARK-14585][ML][WIP] Provide accessor methods f...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12420

[GitHub] spark pull request #14481: [WIP][SPARK-16844][SQL] Generate code for sort ba...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14481

[GitHub] spark pull request #17872: [SPARK-20608] allow standby namenodes in spark.ya...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17872

[GitHub] spark pull request #15594: [SPARK-18061][SQL][Security] Spark Thriftserver n...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15594

[GitHub] spark pull request #14091: [SPARK-16412][SQL][WIP] Generate Java code that g...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14091

[GitHub] spark pull request #14557: [SPARK-16709][CORE] Kill the running task if stag...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14557

[GitHub] spark pull request #17001: [SPARK-19667][SQL]create table with hiveenabled i...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17001

[GitHub] spark pull request #17971: Branch 0.5

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17971

[GitHub] spark pull request #16652: [SPARK-19234][MLLib] AFTSurvivalRegression should...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16652

[GitHub] spark pull request #15918: [SPARK-18122][SQL][WIP]Fallback to Kryo for unsup...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15918

[GitHub] spark pull request #18017: [INFRA] Close stale PRs

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/18017

[GitHub] spark pull request #17303: [SPARK-19112][CORE] add codec for ZStandard

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17303

[GitHub] spark pull request #15850: [SPARK-18411] [SQL] Add Argument Types and Test C...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15850

[GitHub] spark pull request #13959: [SPARK-14351] [MLlib] [ML] Optimize findBestSplit...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13959

[GitHub] spark pull request #17272: [SPARK-19724][SQL]create a managed table with an ...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17272

[GitHub] spark pull request #13762: [SPARK-14926] [ML] OneVsRest labelMetadata uses i...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13762

[GitHub] spark pull request #13851: [SPARK-9478] [ml] Add class weights to Random For...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13851

[GitHub] spark pull request #16975: [SPARK-19522] Fix executor memory in local-cluste...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16975

[GitHub] spark pull request #12491: [SPARK-14712][ML]spark.ml.LogisticRegressionModel...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12491

[GitHub] spark pull request #11129: [SPARK-13232][YARN] Fix executor node label

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/11129

[GitHub] spark pull request #14547: [SPARK-16718][MLlib] gbm-style treeboost

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14547

[GitHub] spark pull request #16743: [SPARK-19379][CORE] SparkAppHandle.getState not r...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16743

[GitHub] spark pull request #15652: [SPARK-16987] [None] Add spark-default.conf prope...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15652

[GitHub] spark pull request #16893: [SPARK-19555][SQL] Improve the performance of Str...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16893

[GitHub] spark pull request #17119: [SPARK-19784][SQL][WIP]refresh table after alter ...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17119

[GitHub] spark pull request #13881: [SPARK-3723] [MLlib] Adding instrumentation to ra...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13881

[GitHub] spark pull request #15914: [SPARK-14974][SQL]delete temporary folder after i...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15914

[GitHub] spark pull request #14686: [SPARK-16253][SQL] make spark sql compatible with...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14686

[GitHub] spark pull request #13837: [SPARK-16126] [SQL] Better Error Message When usi...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13837

[GitHub] spark pull request #13891: [SPARK-6685][MLLIB]Use DSYRK to compute AtA in AL...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/13891

[GitHub] spark pull request #16285: [SPARK-18867] [SQL] Throw cause if IsolatedClient...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16285

[GitHub] spark pull request #18026: [SPARK-16202][SQL][DOC] Follow-up to Correct The ...

2017-05-18 Thread jaceklaskowski
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/18026 [SPARK-16202][SQL][DOC] Follow-up to Correct The Description of CreatableRelationProvider's createRelation ## What changes were proposed in this pull request? Follow-up to SPARK-162

[GitHub] spark pull request #16389: [SPARK-18981][Core]The job hang problem when spec...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16389

[GitHub] spark pull request #17778: [SPARK-20494] Implement UDF array_unique in Spark...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17778

[GitHub] spark pull request #17088: [SPARK-19753][CORE] Un-register all shuffle outpu...

2017-05-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17088

[GitHub] spark issue #17996: [SPARK-20506][DOCS] 2.2 migration guide

2017-05-18 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17996 @felixcheung @yanboliang by the way I haven't added any SparkR stuff here as I'm not sure anything breaking, deprecated etc goes here or in the SparkR guide.

[GitHub] spark issue #17723: [SPARK-20434][YARN][CORE] Move kerberos delegation token...

2017-05-18 Thread mgummelt
Github user mgummelt commented on the issue: https://github.com/apache/spark/pull/17723 I'm working on this now, and am definitely willing to execute the plan we've agreed on, but the more I think about it, the more I think it makes sense to make `ServiceCredentialProvider` private an

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add a Bucketizer that can bin mul...

2017-05-18 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17819 I will try to take a look soon. My main concern is whether we should really have a new class - it starts to make things really messy if we introduce `Multi` versions of everything (e.g. we may want t

[GitHub] spark issue #17723: [SPARK-20434][YARN][CORE] Move kerberos delegation token...

2017-05-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/17723 @mgummelt We have in house delegation provider for HiveServer2, multi HBase cluster. I think this is useful in Hadoop world. So better to keep it.

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add a Bucketizer that can bin mul...

2017-05-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @MLnick That's right. I also have concern about this. However, to keep the original single-column Bucketizer and multiple-column Bucketizer in one class seems also producing a messy code. I'

[GitHub] spark issue #17999: [SPARK-20751][SQL] Add built-in SQL Function - COT

2017-05-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17999 Merged build finished. Test FAILed.

[GitHub] spark pull request #18022: [SPARK-20790] [MLlib] Correctly handle negative v...

2017-05-18 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/18022#discussion_r117191129 --- Diff: mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala --- @@ -78,7 +79,7 @@ class ALSSuite val k = 2 val ne0 = n

[GitHub] spark pull request #18022: [SPARK-20790] [MLlib] Correctly handle negative v...

2017-05-18 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/18022#discussion_r117190338 --- Diff: mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala --- @@ -348,6 +349,37 @@ class ALSSuite } /** + * T

[GitHub] spark pull request #18022: [SPARK-20790] [MLlib] Correctly handle negative v...

2017-05-18 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/18022#discussion_r117192420 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -1624,15 +1628,15 @@ object ALS extends DefaultParamsReadable[ALS] with L

[GitHub] spark pull request #18022: [SPARK-20790] [MLlib] Correctly handle negative v...

2017-05-18 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/18022#discussion_r117191375 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -795,8 +799,8 @@ object ALS extends DefaultParamsReadable[ALS] with Loggi

[GitHub] spark pull request #18022: [SPARK-20790] [MLlib] Correctly handle negative v...

2017-05-18 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/18022#discussion_r117192700 --- Diff: mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala --- @@ -763,11 +763,15 @@ object ALS extends DefaultParamsReadable[ALS] with Log

[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)

2017-05-18 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/16478 Is there no SQL committer support for this? Seems like a critical feature for Spark users with no response from any SQL folks. Making UDT public in some way is pretty important no?

[GitHub] spark issue #17094: [SPARK-19762][ML] Hierarchy for consolidating ML aggrega...

2017-05-18 Thread MLnick
Github user MLnick commented on the issue: https://github.com/apache/spark/pull/17094 In terms of the high level intention of this, agree we definitely need it and it should clean things up substantially. I will start taking a look through ASAP. Thanks!

[GitHub] spark issue #18011: [SPARK-19089][SQL] Add support for nested sequences

2017-05-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18011 Merged build finished. Test FAILed.

[GitHub] spark issue #18011: [SPARK-19089][SQL] Add support for nested sequences

2017-05-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18011 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77040/ Test FAILed.

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117203261 --- Diff: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java --- @@ -126,4 +150,50 @@ private void failRemain

[GitHub] spark pull request #16989: [SPARK-19659] Fetch big blocks to disk when shuff...

2017-05-18 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/16989#discussion_r117203833 --- Diff: common/network-common/src/main/java/org/apache/spark/network/server/OneForOneStreamManager.java --- @@ -95,6 +97,25 @@ public ManagedBuffer getCh
