[GitHub] spark pull request: [mllib] Decision Tree API update and multiclas...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1582#issuecomment-50110901 QA results for PR 1582: - This patch FAILED unit tests. - This patch merges cleanly - This patch adds no public classes For more information see test

[GitHub] spark pull request: [SPARK-2670] FetchFailedException should be th...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1578#issuecomment-50110964 QA results for PR 1578: - This patch PASSES unit tests. - This patch merges cleanly - This patch adds no public classes For more information see test

[GitHub] spark pull request: [SPARK-2682] Javadoc generated from Scala sour...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1584#issuecomment-50111042 QA tests have started for PR 1584. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17169/consoleFull ---

[GitHub] spark pull request: SPARK-2657 Use more compact data structures th...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1555#issuecomment-50111308 QA tests have started for PR 1555. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17171/consoleFull ---

[GitHub] spark pull request: [mllib] Decision Tree API update and multiclas...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1582#issuecomment-50111307 QA tests have started for PR 1582. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17170/consoleFull ---

[GitHub] spark pull request: [SPARK-2260] Fix standalone-cluster mode, whic...

2014-07-25 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1538#discussion_r15387566 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala --- @@ -45,7 +45,7 @@ private[spark] class

[GitHub] spark pull request: [WIP] SPARK-2157 Ability to write tight firewa...

2014-07-25 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/1107#issuecomment-50111683 Hi @pwendell I had a minor conflict with the fix for SPARK-2392 in #1335 but it's rebased now and merges cleanly. --- If your project is set up for it, you can reply to

[GitHub] spark pull request: [SPARK-2683] unidoc failed because org.apache....

2014-07-25 Thread yhuai
GitHub user yhuai opened a pull request: https://github.com/apache/spark/pull/1585 [SPARK-2683] unidoc failed because org.apache.spark.util.CallSite uses Java keywords as value names Renaming `short` to `shortForm` and `long` to `longForm`. JIRA:

[GitHub] spark pull request: SPARK-1416: PySpark support for SequenceFile a...

2014-07-25 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/455#issuecomment-50111706 You'd need to run the loading code off the master branch. It should be in the 1.1 release in a few weeks. On Fri, Jul 25, 2014 at 4:14 AM, Russell

[GitHub] spark pull request: [WIP][SPARK-2179][SQL] Public API for DataType...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1346#issuecomment-50111861 QA results for PR 1346: - This patch PASSES unit tests. - This patch merges cleanly - This patch adds no public classes For more information see test

[GitHub] spark pull request: [SPARK-2024] Add saveAsSequenceFile to PySpark

2014-07-25 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1338#issuecomment-50111889 @kanzhang it might mean that we're reusing the Bean object on the Java side when we read from the InputFormat. Hadoop's RecordReaders actually reuse the same object as

[GitHub] spark pull request: [SPARK-2024] Add saveAsSequenceFile to PySpark

2014-07-25 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1338#issuecomment-50111934 The same thing happens in normal Spark if you create a hadoopRDD or sequenceFile with Writables inside it, and then call cache(). There will be only one key element and
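
The object reuse described in the two comments above can be illustrated without Spark or Hadoop. The following is a minimal sketch in plain Python, not code from the pull request: `ReusingReader` is a hypothetical stand-in for a record reader that hands back the same mutable object on every step, and the one-element list plays the role of a Writable. Holding on to the yielded references (roughly what caching does) leaves every slot pointing at the last value, while copying each record before retaining it keeps the values distinct.

```python
import copy


class ReusingReader:
    """Hypothetical stand-in for a reader that reuses one mutable record."""

    def __init__(self, values):
        self._values = list(values)
        self._record = [None]  # the single record object that gets reused

    def __iter__(self):
        for value in self._values:
            self._record[0] = value  # overwrite in place, as a reusing reader would
            yield self._record       # always the same object


reader = ReusingReader([1, 2, 3])

# Retaining the references directly: all entries alias the same record.
retained_without_copy = list(reader)

# Copying each record before retaining it: entries stay independent.
retained_with_copy = [copy.copy(record) for record in reader]

print([r[0] for r in retained_without_copy])  # [3, 3, 3]
print([r[0] for r in retained_with_copy])     # [1, 2, 3]
```

In this toy model the copy is cheap; whether copying every record by default is acceptable in a real job depends on record size and type, which the sketch does not address.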

[GitHub] spark pull request: [mllib] Decision Tree API update and multiclas...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1582#issuecomment-50111962 QA results for PR 1582: - This patch FAILED unit tests. - This patch merges cleanly - This patch adds no public classes For more information see test

[GitHub] spark pull request: [SPARK-2024] Add saveAsSequenceFile to PySpark

2014-07-25 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/1338#discussion_r15387719 --- Diff: python/pyspark/rdd.py --- @@ -964,6 +964,106 @@ def first(self): return self.take(1)[0] +def

[GitHub] spark pull request: [SPARK-2529] Clean closures in foreach and for...

2014-07-25 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1583#issuecomment-50112144 Yeah weird, it must've been an oversight while editing. Unfortunately the apache/incubator-spark repo is gone so we can't see the old PRs and comments on them...

[GitHub] spark pull request: [SPARK-2024] Add saveAsSequenceFile to PySpark

2014-07-25 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1338#issuecomment-50112244 BTW you can try WritableUtils.clone. At some point we tried cloning data by default in hadoopRDD, or having a flag for it, and we gave up because it didn't seem to work

[GitHub] spark pull request: [SPARK-2682] Javadoc generated from Scala sour...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1584#issuecomment-50112325 QA results for PR 1584: - This patch FAILED unit tests. - This patch merges cleanly - This patch adds no public classes For more information see test

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-25 Thread javadba
GitHub user javadba opened a pull request: https://github.com/apache/spark/pull/1586 SPARK-2686 Add Length support to Spark SQL and HQL and Strlen support to SQL Syntactic, parsing, and operational support have been added for LEN(GTH) and STRLEN functions. Examples: SQL:

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1586#issuecomment-50112581 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2024] Add saveAsSequenceFile to PySpark

2014-07-25 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/1338#issuecomment-50112894 Matei, what InputFormats did you have problems with when cloning by default? I'd love to figure out what it would take to solve the one element/one value problem.

[GitHub] spark pull request: [SPARK-2656] Python version of stratified samp...

2014-07-25 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1554

[GitHub] spark pull request: [SPARK-2656] Python version of stratified samp...

2014-07-25 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1554#issuecomment-50113405 Merged. Thanks!

[GitHub] spark pull request: SPARK-2657 Use more compact data structures th...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1555#issuecomment-50114262 QA results for PR 1555: - This patch PASSES unit tests. - This patch merges cleanly - This patch adds no public classes For more information see test

[GitHub] spark pull request: [mllib] Decision Tree API update and multiclas...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1582#issuecomment-50114354 QA results for PR 1582: - This patch FAILED unit tests. - This patch merges cleanly - This patch adds no public classes For more information see test

[GitHub] spark pull request: [SPARK-2679] [MLLib] Ser/De for Double

2014-07-25 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1581#issuecomment-50114457 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-2679] [MLLib] Ser/De for Double

2014-07-25 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1581#issuecomment-50114441 I agree it is safer to put the magic byte in front of every record. However, this is not a public API where users can throw in an arbitrary RDD and ask the serializer to
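
As a rough illustration of the "magic byte in front of every record" idea mentioned above, here is a minimal Python sketch. It is not the MLlib serializer: the tag value `DOUBLE_MAGIC`, the big-endian layout, and the function names are all assumptions made up for this example.

```python
import struct

DOUBLE_MAGIC = 1  # hypothetical tag value marking "this record is a double"


def serialize_double(value):
    """Pack a one-byte tag followed by an 8-byte big-endian double."""
    return struct.pack(">bd", DOUBLE_MAGIC, value)


def deserialize_double(data):
    """Check the leading tag, then return the double that follows it."""
    magic, value = struct.unpack(">bd", data)
    if magic != DOUBLE_MAGIC:
        raise ValueError("unexpected magic byte: %d" % magic)
    return value


payload = serialize_double(2.5)
restored = deserialize_double(payload)
```

The tag costs one extra byte per record (9 bytes instead of 8 here), which is the trade-off against being able to reject a record of the wrong type before decoding it.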

[GitHub] spark pull request: [SPARK-2679] [MLLib] Ser/De for Double

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1581#issuecomment-50114833 QA tests have started for PR 1581. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17175/consoleFull ---

[GitHub] spark pull request: [SPARK-2024] Add saveAsSequenceFile to PySpark

2014-07-25 Thread MLnick
Github user MLnick commented on a diff in the pull request: https://github.com/apache/spark/pull/1338#discussion_r15388452 --- Diff: python/pyspark/rdd.py --- @@ -964,6 +964,106 @@ def first(self): return self.take(1)[0] +def

[GitHub] spark pull request: SPARK-2269 Refactor mesos scheduler resourceOf...

2014-07-25 Thread tnachen
Github user tnachen commented on the pull request: https://github.com/apache/spark/pull/1487#issuecomment-50115251 @pwendell this seems to pass tests now, mind taking a look?

[GitHub] spark pull request: (WIP) SPARK-2045 Sort-based shuffle

2014-07-25 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/1499#issuecomment-50115353 Running tests with export SPARK_JAVA_OPTS=-Dspark.shuffle.manager=org.apache.spark.shuffle.sort.SortShuffleManager causes : ''' - sorting using mutable

[GitHub] spark pull request: (WIP) SPARK-2045 Sort-based shuffle

2014-07-25 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/1499#issuecomment-50115453 BTW, this is one of 5 failures from core. I hope there are no merge issues though.

[GitHub] spark pull request: [SPARK-2521] Broadcast RDD object (instead of ...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1498#issuecomment-50115581 QA tests have started for PR 1498. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17176/consoleFull ---

[GitHub] spark pull request: (WIP) SPARK-2045 Sort-based shuffle

2014-07-25 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1499#issuecomment-50115862 @mridulm that should've been fixed recently in https://github.com/mateiz/spark/commit/9c299579f13f004f5fd1f4dd0b98b7d76cac2a55, which got rid of custom return types in

[GitHub] spark pull request: (WIP) SPARK-2045 Sort-based shuffle

2014-07-25 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1499#discussion_r15388684 --- Diff: core/src/main/scala/org/apache/spark/shuffle/sort/SortShuffleManager.scala --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2024] Add saveAsSequenceFile to PySpark

2014-07-25 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1338#issuecomment-50116221 @ash211 this was the JIRA: https://issues.apache.org/jira/browse/SPARK-1018. I don't remember the problematic types exactly, but one might've been Avro records, so try

[GitHub] spark pull request: (WIP) SPARK-2045 Sort-based shuffle

2014-07-25 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/1499#issuecomment-50116492 Ah, thanks! Rerunning with 9c29957. I can't pull the PR, and a manual merge is painful, hence the delays in testing :-)

[GitHub] spark pull request: [sql] fix DEFAULT_INITIAL_BUFFER_SIZE

2014-07-25 Thread scwf
GitHub user scwf opened a pull request: https://github.com/apache/spark/pull/1587 [sql] fix DEFAULT_INITIAL_BUFFER_SIZE You can merge this pull request into a Git repository by running: $ git pull https://github.com/scwf/spark fixColumnBuilder Alternatively you can review

[GitHub] spark pull request: [SPARK-2024] Add saveAsSequenceFile to PySpark

2014-07-25 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1338#discussion_r15388927 --- Diff: python/pyspark/rdd.py --- @@ -964,6 +964,106 @@ def first(self): return self.take(1)[0] +def

[GitHub] spark pull request: [sql] fix DEFAULT_INITIAL_BUFFER_SIZE

2014-07-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1587#issuecomment-50116636 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2010] [PySpark] support nested structur...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1559#issuecomment-50116742 QA tests have started for PR 1559. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17178/consoleFull ---

[GitHub] spark pull request: [SPARK-2010] [PySpark] support nested structur...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1559#issuecomment-50116749 QA results for PR 1559: - This patch FAILED unit tests. - This patch merges cleanly - This patch adds the following public classes (experimental): class

[GitHub] spark pull request: SPARK-2657 Use more compact data structures th...

2014-07-25 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1555#issuecomment-50117105 Merged this.

[GitHub] spark pull request: SPARK-2657 Use more compact data structures th...

2014-07-25 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1555

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-25 Thread javadba
Github user javadba commented on a diff in the pull request: https://github.com/apache/spark/pull/1542#discussion_r15389491 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -130,7 +130,7 @@ private[spark] abstract class MapOutputTracker(conf: SparkConf)

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15389599 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/random/DistributionGenerator.scala --- @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15389612 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/random/RandomRDDGenerators.scala --- @@ -0,0 +1,422 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15389618 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/random/RandomRDDGenerators.scala --- @@ -0,0 +1,422 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15389602 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/random/DistributionGenerator.scala --- @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15389629 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/rdd/RandomRDD.scala --- @@ -0,0 +1,118 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15389640 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/rdd/RandomRDD.scala --- @@ -0,0 +1,118 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15389665 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/rdd/RandomRDD.scala --- @@ -0,0 +1,118 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15389671 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/random/DistributionGeneratorSuite.scala --- @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15389681 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/random/RandomRDDGeneratorsSuite.scala --- @@ -0,0 +1,171 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15389677 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/random/RandomRDDGeneratorsSuite.scala --- @@ -0,0 +1,171 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15389645 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/rdd/RandomRDD.scala --- @@ -0,0 +1,118 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15389694 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/random/RandomRDDGeneratorsSuite.scala --- @@ -0,0 +1,171 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2521] Broadcast RDD object (instead of ...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1498#issuecomment-50118861 QA results for PR 1498: - This patch FAILED unit tests. - This patch merges cleanly - This patch adds the following public classes (experimental): abstract class

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15389761 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/random/RandomRDDGeneratorsSuite.scala --- @@ -0,0 +1,171 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15389784 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/random/RandomRDDGeneratorsSuite.scala --- @@ -0,0 +1,171 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15389923 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/random/DistributionGeneratorSuite.scala --- @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2538] [PySpark] Hash based disk spillin...

2014-07-25 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/1460#issuecomment-50119786 Awesome!

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1520#discussion_r15390040 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/rdd/RandomRDD.scala --- @@ -0,0 +1,118 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-2010] [PySpark] support nested structur...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1559#issuecomment-50119958 QA results for PR 1559: - This patch FAILED unit tests. - This patch merges cleanly - This patch adds the following public classes (experimental): class

[GitHub] spark pull request: [SPARK-2010] [PySpark] support nested structur...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1559#issuecomment-50119951 QA tests have started for PR 1559. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17179/consoleFull ---

[GitHub] spark pull request: [SPARK-2529] Clean closures in foreach and for...

2014-07-25 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1583#issuecomment-50120111 Merged in master and branch-1.0.

[GitHub] spark pull request: [SPARK-2529] Clean closures in foreach and for...

2014-07-25 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1583

[GitHub] spark pull request: [sql] fix DEFAULT_INITIAL_BUFFER_SIZE

2014-07-25 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1587#discussion_r15390285 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnBuilder.scala --- @@ -118,7 +118,7 @@ private[sql] class BinaryColumnBuilder extends

[GitHub] spark pull request: [SPARK-2514] [mllib] Random RDD generator

2014-07-25 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1520#issuecomment-50120556 @dorx Besides comments, could you mark distribution generators and methods that require distribution generators `@Experimental`? Part of the reason is that we don't have

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-25 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/1586#discussion_r15390507 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala --- @@ -208,6 +208,69 @@ case class EndsWith(left:

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-25 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/1586#discussion_r15390509 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringOperations.scala --- @@ -208,6 +208,69 @@ case class EndsWith(left:

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-25 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/1586#discussion_r15390523 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvaluationSuite.scala --- @@ -543,4 +546,46 @@ class

[GitHub] spark pull request: Tests meant to demonstrate the bug in SPARK-26...

2014-07-25 Thread ash211
GitHub user ash211 opened a pull request: https://github.com/apache/spark/pull/1588 Tests meant to demonstrate the bug in SPARK-2620. They pass, though, which is not what I was expecting given the reporter's observations in the ticket. You can merge this pull request into a Git

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-25 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/1586#discussion_r15390570 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/QueryTest.scala --- @@ -39,12 +41,15 @@ class QueryTest extends PlanTest { def

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-25 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/1586#discussion_r15390556 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionEvaluationSuite.scala --- @@ -543,4 +546,46 @@ class

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-25 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1586#issuecomment-50121152 test this please

[GitHub] spark pull request: Tests meant to demonstrate the bug in SPARK-26...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1588#issuecomment-50121149 QA tests have started for PR 1588. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17180/consoleFull ---

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-25 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/1586#discussion_r15390612 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala --- @@ -988,8 +990,15 @@ private[hive] object HiveQl { case

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-25 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1586#issuecomment-50121226 Thanks for doing this! A few minor comments.

[GitHub] spark pull request: [sql] fix DEFAULT_INITIAL_BUFFER_SIZE

2014-07-25 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/1587#discussion_r15390674 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnBuilder.scala --- @@ -118,7 +118,7 @@ private[sql] class BinaryColumnBuilder extends

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1586#issuecomment-50121561 QA tests have started for PR 1586. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17181/consoleFull ---

[GitHub] spark pull request: SPARK-2686 Add Length support to Spark SQL and...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1586#issuecomment-50121672 QA results for PR 1586: - This patch FAILED unit tests. - This patch merges cleanly - This patch adds the following public classes (experimental): trait

[GitHub] spark pull request: [SPARK-2665] [SQL] Add EqualNS Unit Tests

2014-07-25 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1570

[GitHub] spark pull request: [SPARK-2665] [SQL] Add EqualNS Unit Tests

2014-07-25 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1570#issuecomment-50121775 Thanks! I've merged this into master.

[GitHub] spark pull request: [SQL]Update HiveMetastoreCatalog.scala

2014-07-25 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1569#issuecomment-50122704 I guess it is a `def` since it is not serializable... You could make it a `@transient lazy val`. BTW you can run the tests locally: `sbt/sbt

[GitHub] spark pull request: [sql] fix DEFAULT_INITIAL_BUFFER_SIZE

2014-07-25 Thread scwf
Github user scwf commented on a diff in the pull request: https://github.com/apache/spark/pull/1587#discussion_r15391086 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnBuilder.scala --- @@ -118,7 +118,7 @@ private[sql] class BinaryColumnBuilder extends

[GitHub] spark pull request: [SPARK-2648] through shuffling blocksByAddress...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1549#issuecomment-50124523 QA tests have started for PR 1549. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17182/consoleFull

[GitHub] spark pull request: [SPARK-2666] when task failed with FetchFailed...

2014-07-25 Thread lianhuiwang
Github user lianhuiwang commented on the pull request: https://github.com/apache/spark/pull/1572#issuecomment-50126087 OK, I understand your idea. The current implementation lets the remaining tasks run, but that has a problem if one of the remaining tasks is writing to HDFS and other new

[GitHub] spark pull request: add ability to submit multiple jars for Driver

2014-07-25 Thread lianhuiwang
Github user lianhuiwang commented on the pull request: https://github.com/apache/spark/pull/1113#issuecomment-50126195 @andrewor14 can you take a look at this? Thanks.

[GitHub] spark pull request: [sql] fix DEFAULT_INITIAL_BUFFER_SIZE

2014-07-25 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/1587#issuecomment-50126452 @marmbrus has already filed [SPARK-2650](https://issues.apache.org/jira/browse/SPARK-2650) to track this issue and assigned it to me. The `104` here is definitely a

[GitHub] spark pull request: SPARK-2638 MapOutputTracker concurrency improv...

2014-07-25 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/1542#discussion_r15392537 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -162,9 +164,9 @@ private[spark] abstract class MapOutputTracker(conf: SparkConf)

[GitHub] spark pull request: [sql] fix DEFAULT_INITIAL_BUFFER_SIZE

2014-07-25 Thread scwf
Github user scwf closed the pull request at: https://github.com/apache/spark/pull/1587

[GitHub] spark pull request: [sql] fix DEFAULT_INITIAL_BUFFER_SIZE

2014-07-25 Thread scwf
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/1587#issuecomment-50127420 OK, thanks!

[GitHub] spark pull request: [SPARK-2687] [yarn]amClient should remove Cont...

2014-07-25 Thread lianhuiwang
GitHub user lianhuiwang opened a pull request: https://github.com/apache/spark/pull/1589 [SPARK-2687] [yarn]amClient should remove ContainerRequest In https://issues.apache.org/jira/browse/YARN-1902, after receiving allocated containers, if amClient does not remove the ContainerRequest, RM

[GitHub] spark pull request: [SPARK-2687] [yarn]amClient should remove Cont...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1589#issuecomment-50128259 QA tests have started for PR 1589. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17183/consoleFull

[GitHub] spark pull request: [SPARK-2648] through shuffling blocksByAddress...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1549#issuecomment-50128774 QA results for PR 1549: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2687] [yarn]amClient should remove Cont...

2014-07-25 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1589#issuecomment-50132308 QA results for PR 1589: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2547]:The clustering documentaion examp...

2014-07-25 Thread yu-iskw
GitHub user yu-iskw opened a pull request: https://github.com/apache/spark/pull/1590 [SPARK-2547]:The clustering documentaion example provided for spark 0.9 I fixed a trivial mistake in the MLlib documentation. I checked that the Python sample code for a k-means

[GitHub] spark pull request: [SPARK-2547]:The clustering documentaion examp...

2014-07-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1590#issuecomment-50134863 Can one of the admins verify this patch?

[GitHub] spark pull request: Remove console logging in ActorReceiver.scala

2014-07-25 Thread isasmani
GitHub user isasmani opened a pull request: https://github.com/apache/spark/pull/1591 Remove console logging in ActorReceiver.scala For large-scale stream processing, console logging hinders throughput. You can merge this pull request into a Git repository by running:

[GitHub] spark pull request: Remove console logging in ActorReceiver.scala

2014-07-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1591#issuecomment-50142171 Can one of the admins verify this patch?
