[GitHub] spark pull request: [SPARK-1087] Move python traceback utilities i...

2014-09-14 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2385#issuecomment-55516585 Jenkins, this is ok to test. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not

[GitHub] spark pull request: [SPARK-3398] [EC2] Have spark-ec2 intelligentl...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2339#issuecomment-55516618 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20298/consoleFull) for PR 2339 at commit

[GitHub] spark pull request: [SPARK-1087] Move python traceback utilities i...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2385#issuecomment-55516625 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20300/consoleFull) for PR 2385 at commit

[GitHub] spark pull request: [SPARK-3463] [PySpark] aggregate and show spil...

2014-09-14 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2336

[GitHub] spark pull request: [SPARK-1087] Move python traceback utilities i...

2014-09-14 Thread jyotiska
Github user jyotiska commented on the pull request: https://github.com/apache/spark/pull/2385#issuecomment-55517066 LGTM.

[GitHub] spark pull request: [SPARK-3485][SQL] should check parameter type ...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2355#issuecomment-55517166 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20299/consoleFull) for PR 2355 at commit

[GitHub] spark pull request: [WIP][SPARK-2816][SQL] Type-safe SQL Queries

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1759#issuecomment-55517240 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20301/consoleFull) for PR 1759 at commit

[GitHub] spark pull request: [WIP][SPARK-2816][SQL] Type-safe SQL Queries

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1759#issuecomment-55517245 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20301/consoleFull) for PR 1759 at commit

[GitHub] spark pull request: [SPARK-3491] [WIP] [MLlib] [PySpark] use pickl...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55518181 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20304/consoleFull) for PR 2378 at commit

[GitHub] spark pull request: [SPARK-3491] [WIP] [MLlib] [PySpark] use pickl...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55518966 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20302/consoleFull) for PR 2378 at commit

[GitHub] spark pull request: [SPARK-3074] [PySpark] support groupByKey() wi...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1977#issuecomment-55519113 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20305/consoleFull) for PR 1977 at commit

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-09-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r17518799 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -141,6 +193,93 @@ private class MemoryStore(blockManager: BlockManager,

[GitHub] spark pull request: [SPARK-927] detect numpy at time of use

2014-09-14 Thread mattf
Github user mattf commented on the pull request: https://github.com/apache/spark/pull/2313#issuecomment-55523684 @erikerlandson i know you've been doing some serious work w/ sampling, what's your take on this?

[GitHub] spark pull request: [SPARK-3425] do not set MaxPermSize for OpenJD...

2014-09-14 Thread mattf
Github user mattf commented on the pull request: https://github.com/apache/spark/pull/2301#issuecomment-55523986 @mateiz @pwendell pls take another look

[GitHub] spark pull request: allow symlinking to shell scripts

2014-09-14 Thread hsn10
GitHub user hsn10 opened a pull request: https://github.com/apache/spark/pull/2386 allow symlinking to shell scripts patch for SPARK-3482 You can merge this pull request into a Git repository by running: $ git pull https://github.com/hsn10/spark spark-3482 Alternatively you

[GitHub] spark pull request: allow symlinking to shell scripts

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2386#issuecomment-55524878 Can one of the admins verify this patch?

[GitHub] spark pull request: allow symlinking to shell scripts

2014-09-14 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2386#issuecomment-55526164 Again, still looks like a duplicate of https://github.com/apache/spark/pull/1875

[GitHub] spark pull request: [SPARK-1087] Move python traceback utilities i...

2014-09-14 Thread staple
Github user staple commented on the pull request: https://github.com/apache/spark/pull/2385#issuecomment-55526943 Hi, the above failure in NetworkReceiverSuite.scala seems like it may be unrelated to this patch. That test also passed when I ran locally.

[GitHub] spark pull request: make spark-class to work with openjdk

2014-09-14 Thread hsn10
GitHub user hsn10 opened a pull request: https://github.com/apache/spark/pull/2387 make spark-class to work with openjdk fix for SPARK-3520 - java version check in spark-class fails with openjdk You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: make spark-class to work with openjdk

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2387#issuecomment-55528609 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-927] detect numpy at time of use

2014-09-14 Thread erikerlandson
Github user erikerlandson commented on the pull request: https://github.com/apache/spark/pull/2313#issuecomment-55529273 @mattf, one useful question would be: do the results generate equivalent output distributions. The basic methodology would be to collect output in both

[GitHub] spark pull request: [SPARK-3491] [WIP] [MLlib] [PySpark] use pickl...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55531127 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20307/consoleFull) for PR 2378 at commit

[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB]LDA based on Graphx

2014-09-14 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/2388 [WIP][SPARK-1405][MLLIB]LDA based on Graphx cc @mengxr You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark graphx_lda Alternatively you

[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB]LDA based on Graphx

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-55531748 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20308/consoleFull) for PR 2388 at commit

[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB]LDA based on Graphx

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-55532086 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20309/consoleFull) for PR 2388 at commit

[GitHub] spark pull request: [SPARK-3491] [WIP] [MLlib] [PySpark] use pickl...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55532844 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20307/consoleFull) for PR 2378 at commit

[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB]LDA based on Graphx

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-55533316 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20308/consoleFull) for PR 2388 at commit

[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB]LDA based on Graphx

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-55534091 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20309/consoleFull) for PR 2388 at commit

[GitHub] spark pull request: [WIP][SPARK-3468] WebUI Timeline-View feature

2014-09-14 Thread ash211
Github user ash211 commented on a diff in the pull request: https://github.com/apache/spark/pull/2342#discussion_r17521459 --- Diff: core/src/main/scala/org/apache/spark/ui/env/EnvironmentPage.scala --- @@ -26,6 +29,23 @@ import org.apache.spark.ui.{UIUtils, WebUIPage}

[GitHub] spark pull request: [SPARK-3488][MLLIB] Cache python RDDs after de...

2014-09-14 Thread staple
Github user staple commented on the pull request: https://github.com/apache/spark/pull/2362#issuecomment-55539838 I ran a simple logistic regression performance test on my local machine (ubuntu desktop w/ 8gb ram). I used two data sizes: 2m records, which was not memory constrained,

[GitHub] spark pull request: [SPARK-2951] [PySpark] support unpickle array....

2014-09-14 Thread mattf
Github user mattf commented on the pull request: https://github.com/apache/spark/pull/2365#issuecomment-55540972 where's the message that was sent to the pyrolite folks? it looks like SPARK-2378 is targeted for 1.2, so it has a bit of time

[GitHub] spark pull request: [SPARK-2951] [PySpark] support unpickle array....

2014-09-14 Thread mattf
Github user mattf commented on the pull request: https://github.com/apache/spark/pull/2365#issuecomment-55540988 if you do end up merging, what do you think about logging an issue for fixing up the workaround once pyrolite is updated?

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-14 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17523204 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -390,6 +393,79 @@ class RowMatrix( new

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-14 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17523197 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/linalg/distributed/RowMatrixSuite.scala --- @@ -95,6 +95,33 @@ class RowMatrixSuite extends FunSuite

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-14 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17523206 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -390,6 +393,79 @@ class RowMatrix( new

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-14 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17523212 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -390,6 +393,79 @@ class RowMatrix( new

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-14 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17523214 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -390,6 +393,79 @@ class RowMatrix( new

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-14 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17523232 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -390,6 +393,79 @@ class RowMatrix( new

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-14 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17523233 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -390,6 +393,79 @@ class RowMatrix( new

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-14 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17523235 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -27,10 +27,13 @@ import

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-14 Thread rezazadeh
Github user rezazadeh commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-55542396 @mengxr All requested changes made. All tests are passing locally. However, I expect Jenkins to complain because of the new normL1 and normL2 methods added to

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-55542450 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20311/consoleFull) for PR 1778 at commit

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-55542478 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20311/consoleFull) for PR 1778 at commit

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-14 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17523330 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -390,6 +393,79 @@ class RowMatrix( new

[GitHub] spark pull request: [SPARK-1087] Move python traceback utilities i...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2385#issuecomment-55542730 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/104/consoleFull) for PR 2385 at commit

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-55542822 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20312/consoleFull) for PR 1778 at commit

[GitHub] spark pull request: [SPARK-3000][CORE] drop old blocks to disk in ...

2014-09-14 Thread liyezhang556520
Github user liyezhang556520 commented on the pull request: https://github.com/apache/spark/pull/2134#issuecomment-55543007 @andrewor14 any comment on my explanation?

[GitHub] spark pull request: [MLlib] Update SVD documentation in IndexedRow...

2014-09-14 Thread rezazadeh
GitHub user rezazadeh opened a pull request: https://github.com/apache/spark/pull/2389 [MLlib] Update SVD documentation in IndexedRowMatrix Updating this to reflect the newest SVD via ARPACK You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: [MLlib] Update SVD documentation in IndexedRow...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2389#issuecomment-55543271 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20313/consoleFull) for PR 2389 at commit

[GitHub] spark pull request: [SPARK-2594][SQL] Add CACHE TABLE name AS SE...

2014-09-14 Thread ravipesala
GitHub user ravipesala opened a pull request: https://github.com/apache/spark/pull/2390 [SPARK-2594][SQL] Add CACHE TABLE name AS SELECT ... (Updated as per review comments) Updated as per the review comments of Admin. Previous pull request is

[GitHub] spark pull request: [SPARK-2594][SQL] Add CACHE TABLE name AS SE...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2390#issuecomment-55543465 Can one of the admins verify this patch?

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-55544053 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20312/consoleFull) for PR 1778 at commit

[GitHub] spark pull request: [SPARK-1087] Move python traceback utilities i...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2385#issuecomment-55544330 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/104/consoleFull) for PR 2385 at commit

[GitHub] spark pull request: [MLlib] Update SVD documentation in IndexedRow...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2389#issuecomment-55544944 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20313/consoleFull) for PR 2389 at commit

[GitHub] spark pull request: SPARK-3177 (on Master Branch)

2014-09-14 Thread chesterxgchen
Github user chesterxgchen commented on the pull request: https://github.com/apache/spark/pull/2204#issuecomment-55545571 Reformatted based on Andrew's comment #2204

[GitHub] spark pull request: SPARK-3177 (on Master Branch)

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2204#issuecomment-55545608 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20314/consoleFull) for PR 2204 at commit

[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...

2014-09-14 Thread li-zhihui
Github user li-zhihui commented on a diff in the pull request: https://github.com/apache/spark/pull/1616#discussion_r17524360 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -313,15 +313,84 @@ private[spark] object Utils extends Logging { }

[GitHub] spark pull request: [MLLIB] [spark-2352] Implementation of an 1-hi...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1290#issuecomment-55546270 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20315/consoleFull) for PR 1290 at commit

[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...

2014-09-14 Thread li-zhihui
Github user li-zhihui commented on a diff in the pull request: https://github.com/apache/spark/pull/1616#discussion_r17524413 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -805,11 +805,12 @@ class SparkContext(config: SparkConf) extends Logging {

[GitHub] spark pull request: [SPARK-3393] [SQL] Align the log4j configurati...

2014-09-14 Thread chenghao-intel
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/2263#issuecomment-55546610 The Hive `LogUtils` just loads the specified .properties file for `Log4J`, see

[GitHub] spark pull request: [SPARK-3481] [SQL] Eliminate the error log in ...

2014-09-14 Thread chenghao-intel
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/2352#issuecomment-55546663 Thank you @liancheng for such a detailed explanation. Actually, I didn't know any of that when submitting this PR. :)

[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17524564 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -32,12 +32,12 @@ object Optimizer extends

[GitHub] spark pull request: [SPARK-3398] [EC2] Have spark-ec2 intelligentl...

2014-09-14 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/2339#issuecomment-55546927 One more thing we could do is to get instance status check reports using boto. This shows up in the web ui as `2/2 checks passed` etc. and we should be able to get this

[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1616#issuecomment-55547002 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20316/consoleFull) for PR 1616 at commit

[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...

2014-09-14 Thread li-zhihui
Github user li-zhihui commented on a diff in the pull request: https://github.com/apache/spark/pull/1616#discussion_r17524670 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -313,15 +313,84 @@ private[spark] object Utils extends Logging { }

[GitHub] spark pull request: [SPARK-2872] Fix conflict between code and doc...

2014-09-14 Thread li-zhihui
Github user li-zhihui closed the pull request at: https://github.com/apache/spark/pull/1684

[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17524775 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -32,12 +32,12 @@ object Optimizer extends

[GitHub] spark pull request: [SPARK-3124] Fix the jar version conflict in u...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2035#issuecomment-55547567 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/105/consoleFull) for PR 2035 at commit

[GitHub] spark pull request: SPARK-3177 (on Master Branch)

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2204#issuecomment-55547575 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20314/consoleFull) for PR 2204 at commit

[GitHub] spark pull request: [SPARK-2714] DAGScheduler logs jobid when runJ...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1617#issuecomment-55547685 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20317/consoleFull) for PR 1617 at commit

[GitHub] spark pull request: [MLLIB] [spark-2352] Implementation of an 1-hi...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1290#issuecomment-55548415 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20315/consoleFull) for PR 1290 at commit

[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-55548796 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20318/consoleFull) for PR 2368 at commit

[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1616#issuecomment-55549134 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20316/consoleFull) for PR 1616 at commit

[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB]Collapsed Gibbs sampli...

2014-09-14 Thread allwefantasy
Github user allwefantasy commented on the pull request: https://github.com/apache/spark/pull/1983#issuecomment-55549348 @witgo I have seen your new performance test configuration. I will try your new code and test it on my data today.

[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread chenghao-intel
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17525387 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -32,12 +32,12 @@ object Optimizer extends

[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2368#issuecomment-55549403 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20319/consoleFull) for PR 2368 at commit

[GitHub] spark pull request: [SPARK-2714] DAGScheduler logs jobid when runJ...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1617#issuecomment-55549734 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20317/consoleFull) for PR 1617 at commit

[GitHub] spark pull request: [SPARK-3040] pick up a more proper local ip ad...

2014-09-14 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1946#issuecomment-0076 In that case I'd propose merging this tentatively and if it causes issues in the 1.2 dev/QA cycle we can revert it. I dug around a bunch, it looks like there

[GitHub] spark pull request: [SPARK-3124] Fix the jar version conflict in u...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2035#issuecomment-0221 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/105/consoleFull) for PR 2035 at commit

[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-09-14 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1486#issuecomment-0386 Jenkins, test this please. @cmccabe - do you mean `PROCESS_LOCAL`? I'm pretty sure we want to have them be higher priority than `NODE_LOCAL`, which is the

[GitHub] spark pull request: [SPARK-3491] [WIP] [MLlib] [PySpark] use pickl...

2014-09-14 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-0409 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20320/consoleFull) for PR 2378 at commit

[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-09-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1486#discussion_r17525749 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -248,10 +250,22 @@ class HadoopRDD[K, V]( new

[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-09-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1486#discussion_r17525822 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskLocation.scala --- @@ -23,12 +23,33 @@ package org.apache.spark.scheduler * of preference

[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-09-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1486#discussion_r17525861 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskLocation.scala --- @@ -23,12 +23,33 @@ package org.apache.spark.scheduler * of preference

[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB]Collapsed Gibbs sampli...

2014-09-14 Thread allwefantasy
Github user allwefantasy commented on the pull request: https://github.com/apache/spark/pull/1983#issuecomment-1073 @witgo I have tried your latest code on my corpus. It does not get stuck in broadcasting. However, some exceptions are thrown.

[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17525906 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -32,12 +32,12 @@ object Optimizer extends

[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-09-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1486#discussion_r17525907 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -208,8 +208,10 @@ abstract class RDD[T: ClassTag]( } /** - *

[GitHub] spark pull request: [SPARK-3519] add distinct(n) to PySpark

2014-09-14 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/2383#discussion_r17525914 --- Diff: python/pyspark/tests.py --- @@ -586,6 +586,17 @@ def test_repartitionAndSortWithinPartitions(self): self.assertEquals(partitions[0],

[GitHub] spark pull request: [SPARK-3519] add distinct(n) to PySpark

2014-09-14 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/2383#discussion_r17525918 --- Diff: python/pyspark/rdd.py --- @@ -353,7 +353,7 @@ def func(iterator): return ifilter(f, iterator) return

[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-09-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1486#discussion_r17525928 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -309,4 +323,42 @@ private[spark] object HadoopRDD { f(inputSplit,

[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-09-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1486#discussion_r17525938 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -309,4 +323,42 @@ private[spark] object HadoopRDD { f(inputSplit,

[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-09-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1486#discussion_r17525946 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -309,4 +323,42 @@ private[spark] object HadoopRDD { f(inputSplit,

[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-09-14 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1486#issuecomment-1288 Added some more comments, mostly cosmetic. Right now the tests are failing because this makes an API breaking change.

[GitHub] spark pull request: [SPARK-2951] [PySpark] support unpickle array....

2014-09-14 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/2365#issuecomment-1408 I have created https://issues.apache.org/jira/browse/SPARK-3524 to track this.

[GitHub] spark pull request: SPARK-3039: Allow spark to be built using avro...

2014-09-14 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1945

[GitHub] spark pull request: [SPARK-1087] Move python traceback utilities i...

2014-09-14 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/2385#discussion_r17526144 --- Diff: python/pyspark/context.py --- @@ -99,8 +100,8 @@ def __init__(self, master=None, appName=None, sparkHome=None, pyFiles=None, ...

[GitHub] spark pull request: [SPARK-3452] Maven build should skip publishin...

2014-09-14 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2329

[GitHub] spark pull request: Add a Community Projects page

2014-09-14 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2219#issuecomment-1813 @velvia so is this subsumed by the wiki page then?

[GitHub] spark pull request: [SPARK-1087] Move python traceback utilities i...

2014-09-14 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/2385#issuecomment-1894 LGTM, just one minor comment; it's not a must-have.

[GitHub] spark pull request: [SPARK-3501] [SQL] Fix the bug of Hive SimpleU...

2014-09-14 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request: https://github.com/apache/spark/pull/2368#discussion_r17526186 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -32,12 +32,12 @@ object Optimizer extends
