[GitHub] spark pull request: Turn UpdateBlockInfo into case class.

2014-08-10 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1872#issuecomment-51707226 I'm going to merge this one since the test failure is independent of this. --- If your project is set up for it, you can reply to this email and have your reply appear on

[GitHub] spark pull request: Turn UpdateBlockInfo into case class.

2014-08-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1872

[GitHub] spark pull request: Updated Spark SQL README to include the hive-t...

2014-08-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1867

[GitHub] spark pull request: [SPARK-2953] Allow using short names for io co...

2014-08-10 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/1873 [SPARK-2953] Allow using short names for io compression codecs Instead of requiring org.apache.spark.io.LZ4CompressionCodec, it is easier for users if Spark just accepts lz4, lzf, snappy. You can
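
The idea in this pull request, accepting `lz4`, `lzf`, or `snappy` in place of a fully qualified class name, can be sketched roughly as below. This is a minimal illustration and not the code from the pull request: the `CodecNames` object, its `resolve` method, and the case-insensitive lookup are all assumptions made for the example.

```scala
// Hypothetical helper for illustration; not the implementation in PR 1873.
object CodecNames {
  // Short name -> fully qualified codec class name.
  private val shortNames: Map[String, String] = Map(
    "lz4" -> "org.apache.spark.io.LZ4CompressionCodec",
    "lzf" -> "org.apache.spark.io.LZFCompressionCodec",
    "snappy" -> "org.apache.spark.io.SnappyCompressionCodec")

  // Known short names map to class names (case-insensitively, by assumption).
  // Anything else is treated as a class name and returned unchanged.
  def resolve(name: String): String =
    shortNames.getOrElse(name.toLowerCase, name)
}
```

With this shape, existing configurations that already spell out a full class name keep working, because unknown values pass through untouched.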

[GitHub] spark pull request: [SPARK-2953] Allow using short names for io co...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1873#issuecomment-51707622 QA tests have started for PR 1873. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18273/consoleFull

[GitHub] spark pull request: [SPARK-2460] Optimize SparkContext.hadoopFile ...

2014-08-10 Thread scwf
Github user scwf closed the pull request at: https://github.com/apache/spark/pull/1385

[GitHub] spark pull request: [SPARK-2907] [MLlib] Use mutable.HashMap to re...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1871#issuecomment-51708105 QA results for PR 1871: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2953] Allow using short names for io co...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1873#issuecomment-51708328 QA results for PR 1873: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: [PySpark] [SPARK-2954] [SPARK-2948] [SPARK-291...

2014-08-10 Thread JoshRosen
GitHub user JoshRosen opened a pull request: https://github.com/apache/spark/pull/1874 [PySpark] [SPARK-2954] [SPARK-2948] [SPARK-2910] [SPARK-2101] Python 2.6 Fixes - Modify dev/run-tests to test with Python 2.6 - Use unittest2 when running on Python 2.6. - Fix issue with

[GitHub] spark pull request: [PySpark] [SPARK-2954] [SPARK-2948] [SPARK-291...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1874#issuecomment-51708413 QA tests have started for PR 1874. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18274/consoleFull

[GitHub] spark pull request: [PySpark] [SPARK-2954] [SPARK-2948] [SPARK-291...

2014-08-10 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1874#issuecomment-51708455 Jenkins, test this please. I installed `unittest2` on Jenkins, so hopefully these tests should now pass with `python2.6`.

[GitHub] spark pull request: [PySpark] [SPARK-2954] [SPARK-2948] [SPARK-291...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1874#issuecomment-51708493 QA tests have started for PR 1874. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18275/consoleFull

[GitHub] spark pull request: [SPARK-2952] Enable logging actor messages at ...

2014-08-10 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1870#issuecomment-51708496 Yeah that and the `DriverSuite`. Not sure what the reason is yet, but I noticed that it started happening after #1777 went in...

[GitHub] spark pull request: Support executing Spark from symlinks

2014-08-10 Thread roji
GitHub user roji opened a pull request: https://github.com/apache/spark/pull/1875 Support executing Spark from symlinks The current scripts (e.g. pyspark) fail to run when they are executed via symlinks. A common Linux scenario would be to have Spark installed somewhere (e.g.

[GitHub] spark pull request: Support executing Spark from symlinks

2014-08-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1875#issuecomment-51708965 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2677] BasicBlockFetchIterator#next can ...

2014-08-10 Thread sarutak
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/1632#issuecomment-51709077 In #1758 @JoshRosen fixed ConnectionManager to handle the case where a remote executor returns an error message. However, the case where a remote executor hangs is not handled, so if

[GitHub] spark pull request: [PySpark] [SPARK-2954] [SPARK-2948] [SPARK-291...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1874#issuecomment-51709195 QA results for PR 1874: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: Remove extra semicolon in Task.scala

2014-08-10 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/1876 Remove extra semicolon in Task.scala You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark remove_semicolon_in_Task_scala Alternatively

[GitHub] spark pull request: Remove extra semicolon in Task.scala

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1876#issuecomment-51710376 QA tests have started for PR 1876. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18276/consoleFull

[GitHub] spark pull request: [SPARK-2950] Add gc time and shuffle write tim...

2014-08-10 Thread kayousterhout
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/1869#issuecomment-51710728 Looks great!! +1 on this being useful.

[GitHub] spark pull request: Turn UpdateBlockInfo into case class.

2014-08-10 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/1872#issuecomment-51710872 If it is a case class, does it still need to be Externalizable?

[GitHub] spark pull request: Remove extra semicolon in Task.scala

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1876#issuecomment-51711239 QA results for PR 1876: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2590][SQL] Added option to handle incre...

2014-08-10 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/1853#discussion_r16030631 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala --- @@ -30,6 +30,7 @@ private[spark] object SQLConf { val SHUFFLE_PARTITIONS =

[GitHub] spark pull request: [SPARK-1853] Show Streaming application code c...

2014-08-10 Thread mubarak
Github user mubarak commented on the pull request: https://github.com/apache/spark/pull/1723#issuecomment-51712229 @tdas I have removed 'name' from DStream and addressed your review comments. Can you please review? Thanks.

[GitHub] spark pull request: [SPARK-2907] [MLlib] Use mutable.HashMap to re...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1871#issuecomment-51712712 QA tests have started for PR 1871. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18277/consoleFull

[GitHub] spark pull request: [SPARK-2907] [MLlib] Use mutable.HashMap to re...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1871#issuecomment-51712724 QA results for PR 1871: This patch FAILED unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2590][SQL] Added option to handle incre...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1853#issuecomment-51712859 QA tests have started for PR 1853. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18278/consoleFull

[GitHub] spark pull request: [SPARK-2907] [MLlib] Use mutable.HashMap to re...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1871#issuecomment-51713377 QA tests have started for PR 1871. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18279/consoleFull

[GitHub] spark pull request: [SPARK-2929][SQL] Refactored Thrift server and...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1856#issuecomment-51714336 QA tests have started for PR 1856. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18280/consoleFull

[GitHub] spark pull request: [SPARK-2929][SQL] Refactored Thrift server and...

2014-08-10 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/1856#issuecomment-51714349 The cause of the timeout in the build failure is unknown because the necessary logs are missing (maybe something's wrong in the test suites, or maybe it's just running

[GitHub] spark pull request: [SPARK-2907] [MLlib] Use mutable.HashMap to re...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1871#issuecomment-51714618 QA results for PR 1871: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2929][SQL] Refactored Thrift server and...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1856#issuecomment-51715928 QA results for PR 1856: This patch FAILED unit tests. This patch merges cleanly. This patch adds the following public classes (experimental): class CliSuite

[GitHub] spark pull request: [WIP][SPARK-2947] DAGScheduler resubmit the st...

2014-08-10 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/1877 [WIP][SPARK-2947] DAGScheduler resubmit the stage into an infinite loop You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-2947

[GitHub] spark pull request: [WIP][SPARK-2947] DAGScheduler resubmit the st...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1877#issuecomment-51717429 QA tests have started for PR 1877. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18281/consoleFull

[GitHub] spark pull request: replace println to log4j

2014-08-10 Thread critikaled
Github user critikaled commented on the pull request: https://github.com/apache/spark/pull/1372#issuecomment-51718206 Hey, this change has not been included in the 1.0.2 release. Any heads up on the version in which this will be reflected?

[GitHub] spark pull request: SPARK-2083 Add support for spark.local.maxFail...

2014-08-10 Thread roji
Github user roji commented on the pull request: https://github.com/apache/spark/pull/1465#issuecomment-51718343 +1

[GitHub] spark pull request: [WIP][SPARK-2947] DAGScheduler resubmit the st...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1877#issuecomment-51718834 QA results for PR 1877: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: SPARK-2083 Add support for spark.local.maxFail...

2014-08-10 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1465#issuecomment-51719921 I think there's already a mechanism to set this by using `local[N, maxFailures]` to create your SparkContext: ```scala // Regular expression for local[N,

[GitHub] spark pull request: [SPARK-2937] Separate out samplyByKeyExact as ...

2014-08-10 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1866#discussion_r16032164 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala --- @@ -133,68 +133,64 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)]) *

[GitHub] spark pull request: [SPARK-2937] Separate out samplyByKeyExact as ...

2014-08-10 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1866#discussion_r16032165 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala --- @@ -133,68 +133,64 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)]) *

[GitHub] spark pull request: [SPARK-2937] Separate out samplyByKeyExact as ...

2014-08-10 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1866#discussion_r16032166 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -197,33 +197,57 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)]) *

[GitHub] spark pull request: [SPARK-2937] Separate out samplyByKeyExact as ...

2014-08-10 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1866#discussion_r16032170 --- Diff: core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala --- @@ -556,6 +519,97 @@ class PairRDDFunctionsSuite extends FunSuite with

[GitHub] spark pull request: [SPARK-2937] Separate out samplyByKeyExact as ...

2014-08-10 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1866#discussion_r16032172 --- Diff: core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala --- @@ -556,6 +519,97 @@ class PairRDDFunctionsSuite extends FunSuite with

[GitHub] spark pull request: [SPARK-2315] Implement drop, dropRight and dro...

2014-08-10 Thread erikerlandson
Github user erikerlandson commented on the pull request: https://github.com/apache/spark/pull/1839#issuecomment-51720727 Jenkins still not getting the memo. How strict is Jenkins with commands? Is 'okay' same as 'ok'?

[GitHub] spark pull request: [SPARK-2937] Separate out samplyByKeyExact as ...

2014-08-10 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1866#issuecomment-51720797 LGTM except inline comments. Thanks for keeping APIs consistent across languages!!

[GitHub] spark pull request: [SPARK-2934][MLlib] Adding LogisticRegressionW...

2014-08-10 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1862#issuecomment-51720829 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-2907] [MLlib] Use mutable.HashMap to re...

2014-08-10 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1871#issuecomment-51720842 @Ishiihara Did you compare the speed?

[GitHub] spark pull request: [SPARK-2923][MLLIB] Implement some basic BLAS ...

2014-08-10 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/1849#issuecomment-51722636 @mengxr By the way, I actually realized that copying sparse to dense vectors would be useful for me (in an example I wrote for the stats API check). I wanted it for

[GitHub] spark pull request: [SPARK-2850] [mllib] MLlib stats examples + sm...

2014-08-10 Thread jkbradley
GitHub user jkbradley opened a pull request: https://github.com/apache/spark/pull/1878 [SPARK-2850] [mllib] MLlib stats examples + small fixes Added examples for statistical summarization: * Scala: StatisticalSummary.scala ** Tests: correlation, MultivariateOnlineSummarizer

[GitHub] spark pull request: [SPARK-2850] [mllib] MLlib stats examples + sm...

2014-08-10 Thread jkbradley
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/1878#issuecomment-51723114 Q: Is the Python SparseVector.toDense() function too big an API update?

[GitHub] spark pull request: [SPARK-2850] [mllib] MLlib stats examples + sm...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1878#issuecomment-51723167 QA tests have started for PR 1878. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18283/consoleFull

[GitHub] spark pull request: Turn UpdateBlockInfo into case class.

2014-08-10 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1872#issuecomment-51724118 It is using some custom serialization to reduce serialization overhead.
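
To illustrate the point in this reply, a case class can still implement `java.io.Externalizable` when it wants to control exactly which bytes are written. The sketch below is an assumption-laden example, not Spark's `UpdateBlockInfo`: the `BlockUpdate` type, its two fields, and the `ExternalizableDemo.roundTrip` helper are hypothetical names chosen for the illustration.

```scala
import java.io._

// Hypothetical message type for illustration; not Spark's UpdateBlockInfo.
// Fields are vars because readExternal must fill them in after construction.
case class BlockUpdate(var blockId: String, var memSize: Long) extends Externalizable {
  // Externalizable requires a public no-argument constructor.
  def this() = this(null, 0L)

  // Write only the fields we need, in a fixed order.
  // Note: writeUTF rejects null, so blockId must be set before serializing.
  override def writeExternal(out: ObjectOutput): Unit = {
    out.writeUTF(blockId)
    out.writeLong(memSize)
  }

  // Read the fields back in the same order they were written.
  override def readExternal(in: ObjectInput): Unit = {
    blockId = in.readUTF()
    memSize = in.readLong()
  }
}

object ExternalizableDemo {
  // Serialize to an in-memory buffer and deserialize again.
  def roundTrip(msg: BlockUpdate): BlockUpdate = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(msg)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[BlockUpdate] finally in.close()
  }
}
```

The case class keeps its generated `equals`, `hashCode`, and pattern matching support, while the hand-written methods decide the wire format.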

[GitHub] spark pull request: Remove extra semicolon in Task.scala

2014-08-10 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1876#issuecomment-51724140 Thanks. I've merged this in master.

[GitHub] spark pull request: [SPARK-2907] [MLlib] Use mutable.HashMap to re...

2014-08-10 Thread Ishiihara
Github user Ishiihara commented on the pull request: https://github.com/apache/spark/pull/1871#issuecomment-51724432 @mengxr Some benchmark results. Environment: OS X 10.9, 8 GB memory, 2.5 GHz i5 CPU, 4 threads; startingAlpha = 0.0025; vectorSize = 100; driver memory 2g

[GitHub] spark pull request: [SPARK-2950] Add gc time and shuffle write tim...

2014-08-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1869

[GitHub] spark pull request: [SPARK-2898] [PySpark] fix bugs in deamon.py

2014-08-10 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1842#issuecomment-51725452 I've merged this into `master` and `branch-1.1`. Thanks!

[GitHub] spark pull request: [SPARK-2898] [PySpark] fix bugs in deamon.py

2014-08-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1842

[GitHub] spark pull request: SPARK-2955 [BUILD] Test code fails to compile ...

2014-08-10 Thread srowen
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/1879 SPARK-2955 [BUILD] Test code fails to compile with mvn compile without install (This is the corrected follow-up to https://issues.apache.org/jira/browse/SPARK-2903) Right now, `mvn compile

[GitHub] spark pull request: SPARK-2955 [BUILD] Test code fails to compile ...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1879#issuecomment-51725678 QA tests have started for PR 1879. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18284/consoleFull

[GitHub] spark pull request: [SPARK-2936] Migrate Netty network module from...

2014-08-10 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1865#discussion_r16033181 --- Diff: core/src/main/scala/org/apache/spark/network/netty/FileServerHandler.scala --- @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2936] Migrate Netty network module from...

2014-08-10 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1865#discussion_r16033193 --- Diff: core/src/main/scala/org/apache/spark/network/netty/FileServer.scala --- @@ -0,0 +1,90 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2936] Migrate Netty network module from...

2014-08-10 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1865#discussion_r16033197 --- Diff: core/src/main/scala/org/apache/spark/network/netty/FileServer.scala --- @@ -0,0 +1,90 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2936] Migrate Netty network module from...

2014-08-10 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1865#discussion_r16033207 --- Diff: core/src/main/scala/org/apache/spark/network/netty/FileClient.scala --- @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2936] Migrate Netty network module from...

2014-08-10 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1865#discussion_r16033208 --- Diff: core/src/main/scala/org/apache/spark/network/netty/FileClient.scala --- @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2936] Migrate Netty network module from...

2014-08-10 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/1865#issuecomment-51726479 Just had a couple minor issues with the translation, LGTM functionality-wise. Did not do a thorough diff check, though.

[GitHub] spark pull request: [SPARK-2936] Migrate Netty network module from...

2014-08-10 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/1865#discussion_r16033275 --- Diff: core/src/main/scala/org/apache/spark/network/netty/FileClient.scala --- @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2936] Migrate Netty network module from...

2014-08-10 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1865#discussion_r16033305 --- Diff: core/src/main/scala/org/apache/spark/network/netty/FileClient.scala --- @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2936] Migrate Netty network module from...

2014-08-10 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1865#discussion_r1601 --- Diff: core/src/main/scala/org/apache/spark/network/netty/FileClient.scala --- @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-2650] Build column buffers in smaller b...

2014-08-10 Thread marmbrus
GitHub user marmbrus opened a pull request: https://github.com/apache/spark/pull/1880 [SPARK-2650] Build column buffers in smaller batches You can merge this pull request into a Git repository by running: $ git pull https://github.com/marmbrus/spark columnBatches

[GitHub] spark pull request: SPARK-2955 [BUILD] Test code fails to compile ...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1879#issuecomment-51727806 QA results for PR 1879: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2650] Build column buffers in smaller b...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1880#issuecomment-51727834 QA tests have started for PR 1880. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18285/consoleFull

[GitHub] spark pull request: [SPARK-2650][SQL] Build column buffers in smal...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1880#issuecomment-51727923 QA results for PR 1880: This patch FAILED unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: SPARK-2787: Make sort-based shuffle write file...

2014-08-10 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1799#discussion_r16033699 --- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala --- @@ -246,8 +250,13 @@ object SparkEnv extends Logging { . } -

[GitHub] spark pull request: replace println to log4j

2014-08-10 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1372#issuecomment-51728995 It will be in 1.1. I guess we can also backport it to branch-1.0 -- how bad is the issue, does it cause some problems or is it just annoying?

[GitHub] spark pull request: [sql]use SparkSQLEnv.stop() in ShutdownHook

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1852#issuecomment-51729257 QA tests have started for PR 1852. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18287/consoleFull

[GitHub] spark pull request: [SPARK-2650][SQL] Build column buffers in smal...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1880#issuecomment-5173 QA results for PR 1880: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2650][SQL] Build column buffers in smal...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1880#issuecomment-51730127 QA tests have started for PR 1880. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18288/consoleFull

[GitHub] spark pull request: [SPARK-2937] Separate out samplyByKeyExact as ...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1866#issuecomment-51730126 QA tests have started for PR 1866. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18289/consoleFull

[GitHub] spark pull request: [WIP][SPARK-2947] DAGScheduler resubmit the st...

2014-08-10 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1877#issuecomment-51730277 @witgo can you explain how this happens and why the fix works, and add a unit test for it? We can't really merge something like this without a test.

[GitHub] spark pull request: Support executing Spark from symlinks

2014-08-10 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1875#issuecomment-51730319 @roji mind opening a JIRA issue for this on https://issues.apache.org/jira/browse/SPARK and adding it in the pull request's title?

[GitHub] spark pull request: [SPARK-2953] Allow using short names for io co...

2014-08-10 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1873#discussion_r16034012 --- Diff: docs/configuration.md --- @@ -373,12 +373,12 @@ Apart from these, the following properties are also available, and may be useful /tr tr

[GitHub] spark pull request: Support executing Spark from symlinks

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1875#issuecomment-51730388 QA tests have started for PR 1875. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18290/consoleFull

[GitHub] spark pull request: [PySpark] [SPARK-2954] [SPARK-2948] [SPARK-291...

2014-08-10 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1874#discussion_r16034056 --- Diff: python/pyspark/tests.py --- @@ -905,8 +911,9 @@ def createFileInZip(self, name, content): pattern = re.compile(r'^ *\|', re.MULTILINE)

[GitHub] spark pull request: [SPARK-2937] Separate out samplyByKeyExact as ...

2014-08-10 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1866#issuecomment-51730672 Looks good, I also prefer separating this

[GitHub] spark pull request: [SPARK-2907] [MLlib] Use mutable.HashMap to re...

2014-08-10 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1871#issuecomment-51730712 Just FYI, mutable.HashMap can be pretty inefficient in space usage, compared e.g. to java.util.HashMap or to Spark's AppendOnlyMap. In this case it will depend on how

[GitHub] spark pull request: [SPARK-2907] [MLlib] Use mutable.HashMap to re...

2014-08-10 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1871#issuecomment-51730738 Even better might be Spark's PrimitiveKeyOpenHashMap here. Again, if there are lots of keys.

[GitHub] spark pull request: [SPARK-2848] Shade Guava in uber-jars.

2014-08-10 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1813#discussion_r16034129 --- Diff: core/src/main/java/com/google/common/base/Optional.java --- @@ -0,0 +1,243 @@ +/* + * Copyright (C) 2011 The Guava Authors --- End diff

[GitHub] spark pull request: [WIP][SPARK-2816][SQL] Type-safe SQL Queries

2014-08-10 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1759#issuecomment-51730914 @marmbrus how do you intend this to work with things like Hive or JDBC? We won't know the types at compile time there, but we might still want a solution that checks the

[GitHub] spark pull request: [SPARK-2871] [PySpark] Add missing API

2014-08-10 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1791#issuecomment-51731037 BTW leaving TODOs in the Python code would also be okay, if you want to see this in the code.

[GitHub] spark pull request: [SPARK-2871] [PySpark] Add missing API

2014-08-10 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1791#issuecomment-51731023 I also actually prefer leaving out the non-implemented ones instead of putting them in with NotImplementedError. Especially when working in an IDE or something similar,

[GitHub] spark pull request: [SPARK-2848] Shade Guava in uber-jars.

2014-08-10 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1813#discussion_r16034200 --- Diff: core/src/main/java/com/google/common/base/Optional.java --- @@ -0,0 +1,243 @@ +/* + * Copyright (C) 2011 The Guava Authors --- End diff

[GitHub] spark pull request: [SPARK-2937] Separate out samplyByKeyExact as ...

2014-08-10 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1866#issuecomment-51731068 Merged into both master and branch-1.1. Thanks!

[GitHub] spark pull request: [SPARK-2937] Separate out samplyByKeyExact as ...

2014-08-10 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1866

[GitHub] spark pull request: [sql]use SparkSQLEnv.stop() in ShutdownHook

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1852#issuecomment-51731096 QA results for PR 1852: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2937] Separate out samplyByKeyExact as ...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1866#issuecomment-51731332 QA results for PR 1866: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2850] [mllib] MLlib stats examples + sm...

2014-08-10 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1878#issuecomment-51731530 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-2850] [mllib] MLlib stats examples + sm...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1878#issuecomment-51731591 QA tests have started for PR 1878. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18291/consoleFull

[GitHub] spark pull request: Support executing Spark from symlinks

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1875#issuecomment-51731609 QA results for PR 1875: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2650][SQL] Build column buffers in smal...

2014-08-10 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1880#issuecomment-51731948 QA results for PR 1880: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: SPARK-2893: Do not swallow Exceptions when run...

2014-08-10 Thread GrahamDennis
Github user GrahamDennis commented on the pull request: https://github.com/apache/spark/pull/1827#issuecomment-51732795 Jenkins, retest this please.
