[GitHub] spark pull request: [SPARK-3536][SQL] SELECT on empty parquet tabl...

2014-09-20 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2456#issuecomment-56258118 ok to test --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request: [SPARK-3578] Fix upper bound in GraphGenerator...

2014-09-20 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/2439#issuecomment-56258349 @rnowling Hmm, maybe you're right about that -- I'm not familiar enough with the algorithm to know whether it specifies rounding behavior in the first place.

[GitHub] spark pull request: [SPARK-1987] EdgePartitionBuilder: More memory...

2014-09-20 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/2446#issuecomment-56258447 Is graphx/build.sbt necessary? I thought modifying graphx/pom.xml would be sufficient.

[GitHub] spark pull request: [SPARK-3578] Fix upper bound in GraphGenerator...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2439#issuecomment-56258780 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20607/consoleFull) for PR 2439 at commit

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-56260508 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20608/consoleFull) for PR 1778 at commit

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-20 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17818640 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/linalg/distributed/RowMatrixSuite.scala --- @@ -95,6 +95,33 @@ class RowMatrixSuite extends FunSuite

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-20 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17818648 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -27,10 +28,12 @@ import

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-20 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17818650 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -390,6 +393,113 @@ class RowMatrix( new

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-20 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17818645 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -18,6 +18,7 @@ package

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-20 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17818651 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -390,6 +393,113 @@ class RowMatrix( new

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-20 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17818655 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -390,6 +393,113 @@ class RowMatrix( new

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-20 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17818657 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -390,6 +393,113 @@ class RowMatrix( new

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-20 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17818661 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -390,6 +393,113 @@ class RowMatrix( new

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-20 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17818659 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala --- @@ -390,6 +393,113 @@ class RowMatrix( new

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-20 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/2470 [SPARK-3613] Record only average block size in MapStatus for large stages This changes the way we send MapStatus from executors back to driver for large stages (2000 tasks). For large stages, we no
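
A rough sketch of the idea this pull request describes, under stated assumptions: small stages keep one size estimate per reduce partition, while large stages keep only the block count and the average block size. The cutoff of 2000 is taken from the "(2000 tasks)" remark above and treated here as "more than 2000 blocks"; the class and function names are hypothetical, are not Spark's actual classes, and Python is used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Union

# Assumed cutoff, taken from the "(2000 tasks)" remark in the description.
LARGE_STAGE_THRESHOLD = 2000


@dataclass
class DetailedStatus:
    """Hypothetical status that keeps one size estimate per reduce partition."""
    sizes: List[int]

    def size_for(self, reduce_id: int) -> int:
        return self.sizes[reduce_id]


@dataclass
class AveragedStatus:
    """Hypothetical status that keeps only the block count and the average size."""
    num_blocks: int
    avg_size: int

    def size_for(self, reduce_id: int) -> int:
        if not 0 <= reduce_id < self.num_blocks:
            raise IndexError(reduce_id)
        # Every block reports the same estimate; per-block detail is lost.
        return self.avg_size


def make_status(sizes: List[int]) -> Union[DetailedStatus, AveragedStatus]:
    """Pick the compact representation once the number of blocks passes the cutoff."""
    if len(sizes) > LARGE_STAGE_THRESHOLD:
        return AveragedStatus(len(sizes), sum(sizes) // len(sizes))
    return DetailedStatus(list(sizes))
```

The trade-off in this sketch is that the averaged form uses constant space per map task but can no longer distinguish an empty block from a very large one.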

[GitHub] spark pull request: SPARK-3608 Break if the instance tag naming su...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2466#issuecomment-56260777 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/136/consoleFull) for PR 2466 at commit

[GitHub] spark pull request: SPARK-3608 Break if the instance tag naming su...

2014-09-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2466#issuecomment-56260810 Merging this in master and branch-1.1. Thanks!

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-20 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17818718 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/MultivariateStatisticalSummary.scala --- @@ -53,4 +53,14 @@ trait

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-20 Thread rezazadeh
Github user rezazadeh commented on a diff in the pull request: https://github.com/apache/spark/pull/1778#discussion_r17818724 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/linalg/distributed/RowMatrixSuite.scala --- @@ -95,6 +95,40 @@ class RowMatrixSuite extends FunSuite

[GitHub] spark pull request: Periodic cleanup event logs

2014-09-20 Thread viper-kun
GitHub user viper-kun opened a pull request: https://github.com/apache/spark/pull/2471 Periodic cleanup event logs You can merge this pull request into a Git repository by running: $ git pull https://github.com/viper-kun/spark deletelog2 Alternatively you can review and

[GitHub] spark pull request: Periodic cleanup event logs

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2471#issuecomment-56261050 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs

2014-09-20 Thread viper-kun
Github user viper-kun commented on a diff in the pull request: https://github.com/apache/spark/pull/2391#discussion_r17818744 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -214,6 +245,27 @@ private[history] class

[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs

2014-09-20 Thread viper-kun
Github user viper-kun commented on the pull request: https://github.com/apache/spark/pull/2391#issuecomment-56261135 @vanzin, @andrewor14. Thanks for your opinions. Because I had deleted the source branch, I cannot change it in this commit. I submitted another commit [#2471] and

[GitHub] spark pull request: [SPARK-3304] [YARN] ApplicationMaster's Finish...

2014-09-20 Thread sarutak
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/2198#discussion_r17818868 --- Diff: docs/running-on-yarn.md --- @@ -50,6 +50,13 @@ Most of the configs are the same for Spark on YARN as for other deployment modes /td /tr

[GitHub] spark pull request: [SPARK-3304] [YARN] ApplicationMaster's Finish...

2014-09-20 Thread sarutak
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2198#issuecomment-56261930 I tested it in two ways. The first was to hard-code raising an Exception in the ApplicationMaster. The second was to use a debugger to inject an Exception.

[GitHub] spark pull request: [MLlib] [SPARK-2885] DIMSUM: All-pairs similar...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1778#issuecomment-56262006 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20608/consoleFull) for PR 1778 at commit

[GitHub] spark pull request: Fix Java example in Streaming Programming Guid...

2014-09-20 Thread smola
GitHub user smola opened a pull request: https://github.com/apache/spark/pull/2472 Fix Java example in Streaming Programming Guide `val conf` was used instead of `SparkConf conf` in the Java snippet.

[GitHub] spark pull request: Fix Java example in Streaming Programming Guid...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2472#issuecomment-56264606 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-1987] EdgePartitionBuilder: More memory...

2014-09-20 Thread larryxiao
Github user larryxiao commented on the pull request: https://github.com/apache/spark/pull/2446#issuecomment-56265541 Sorry, I don't know much about the build system. I thought pom.xml was for Maven and build.sbt was for sbt. But I can only run sbt assembly with build.sbt.

[GitHub] spark pull request: [SPARK-3578] Fix upper bound in GraphGenerator...

2014-09-20 Thread rnowling
Github user rnowling commented on the pull request: https://github.com/apache/spark/pull/2439#issuecomment-56269533 I looked at the Pregel paper, but it doesn't specify and doesn't cite other papers. I know it's a common method, though. After some thought, I think your

[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB] topic modeling on Gra...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-56270363 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20610/consoleFull) for PR 2388 at commit

[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB] topic modeling on Gra...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-56270391 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20610/consoleFull) for PR 2388 at commit

[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB] topic modeling on Gra...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-56271458 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20611/consoleFull) for PR 2388 at commit

[GitHub] spark pull request: [WIP][SPARK-1405][MLLIB] topic modeling on Gra...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2388#issuecomment-56272823 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20611/consoleFull) for PR 2388 at commit

[GitHub] spark pull request: [SPARK-3599]Avoid loading properties file freq...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2454#issuecomment-56276117 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20612/consoleFull) for PR 2454 at commit

[GitHub] spark pull request: [SPARK-3599]Avoid loading properties file freq...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2454#issuecomment-56276384 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20613/consoleFull) for PR 2454 at commit

[GitHub] spark pull request: [SPARK-3599]Avoid loading properties file freq...

2014-09-20 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/2454#issuecomment-56276648 I think it is better to use lazy val for readability (putting all elements of defaultSparkProperties into value properties is more comfortable than the converse) and

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-20 Thread Ishiihara
Github user Ishiihara commented on the pull request: https://github.com/apache/spark/pull/2470#issuecomment-56277206 @rxin my understanding is that MapStatus is used to check whether a map output file contains data for a certain reducer. Why do we use actual size instead of a boolean

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2470#issuecomment-56277279 It's more than that. We use estimated sizes to track the total size of outstanding fetches, and try to bound that to a certain size in case an executor sends too many
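
To illustrate why size estimates matter beyond an empty/non-empty flag, here is a minimal sketch of bounding the total estimated size of outstanding fetches. It is a simplification under stated assumptions: requests are grouped into sequential waves rather than issued continuously, and the function name, the (block id, estimated size) tuples, and the `max_in_flight` parameter are all hypothetical rather than Spark's actual fetch logic.

```python
from collections import deque
from typing import List, Tuple


def schedule_fetches(blocks: List[Tuple[str, int]], max_in_flight: int) -> List[List[str]]:
    """Group block ids into waves whose estimated total size stays within max_in_flight.

    A single block larger than the cap still gets a wave of its own,
    so the schedule always makes progress.
    """
    pending = deque(blocks)
    waves: List[List[str]] = []
    while pending:
        wave: List[str] = []
        in_flight = 0
        # Keep adding blocks while the running estimate stays under the cap;
        # the first block of a wave is always admitted.
        while pending and (not wave or in_flight + pending[0][1] <= max_in_flight):
            block_id, size = pending.popleft()
            wave.append(block_id)
            in_flight += size
        waves.append(wave)
    return waves
```

With only a boolean per block, a scheduler like this could not tell how many blocks are safe to request at once, which is the concern the comment above raises.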

[GitHub] spark pull request: [SPARK-3613] Record only average block size in...

2014-09-20 Thread Ishiihara
Github user Ishiihara commented on the pull request: https://github.com/apache/spark/pull/2470#issuecomment-56277955 Thanks for the reply. Another question: in hash shuffle write, the data may be skewed across different map output files. For some cases, the reducer may try to fetch

[GitHub] spark pull request: [SPARK-3599]Avoid loading properties file freq...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2454#issuecomment-56278035 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20612/consoleFull) for PR 2454 at commit

[GitHub] spark pull request: [SPARK-3599]Avoid loading properties file freq...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2454#issuecomment-56278247 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20613/consoleFull) for PR 2454 at commit

[GitHub] spark pull request: [SPARK-3446] Expose underlying job ids in Futu...

2014-09-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2337#issuecomment-56278413 The API is slightly awkward as you suggested. Is this intended to get job progress? If yes, maybe we can do that through the job group to get the list of job ids?

[GitHub] spark pull request: [SPARK-3446] Expose underlying job ids in Futu...

2014-09-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2337#discussion_r17821378 --- Diff: core/src/test/scala/org/apache/spark/FutureActionSuite.scala --- @@ -0,0 +1,49 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

[GitHub] spark pull request: [SPARK-3446] Expose underlying job ids in Futu...

2014-09-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2337#discussion_r17821379 --- Diff: core/src/main/scala/org/apache/spark/FutureAction.scala --- @@ -171,6 +179,8 @@ class ComplexFutureAction[T] extends FutureAction[T] { // is

[GitHub] spark pull request: [PySpark] remove unnecessary use of numSlices ...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2467#issuecomment-56280081 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/137/consoleFull) for PR 2467 at commit

[GitHub] spark pull request: stop, start and destroy require the EC2_REGION

2014-09-20 Thread jeffsteinmetz
GitHub user jeffsteinmetz opened a pull request: https://github.com/apache/spark/pull/2473 stop, start and destroy require the EC2_REGION, i.e. `./spark-ec2 --region=us-west-1 stop yourclustername`

[GitHub] spark pull request: stop, start and destroy require the EC2_REGION

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2473#issuecomment-56280512 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-3446] Expose underlying job ids in Futu...

2014-09-20 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2337#issuecomment-56280786 My initial thought was that a job group-based approach might be a bit cleaner, but there are a few subtleties with that proposal that we need to consider.

[GitHub] spark pull request: [SPARK-3446] Expose underlying job ids in Futu...

2014-09-20 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2337#issuecomment-56281339 @rxin @pwendell Since we have job groups and the ability to cancel all jobs running in a job group (`sc.cancelJobGroup()`), then why do we need FutureAction? It looks

[GitHub] spark pull request: [SPARK-3377] [Metrics] Metrics can be accident...

2014-09-20 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2250#issuecomment-56281680 I feel strongly that we should use the same application ID to refer to the application in every context, since creating a different id based off of

[GitHub] spark pull request: [SPARK-3377] [Metrics] Metrics can be accident...

2014-09-20 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2432#issuecomment-56281694 Can you add closes #1067 to the description here, too?

[GitHub] spark pull request: [SPARK-3377] [Metrics] Metrics can be accident...

2014-09-20 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2432#issuecomment-56281779 Quoting @sarutak from #2250, regarding this PR: And for problem 2, when launching ExecutorBackends, the launcher passes the application id to ExecutorBackends. It

[GitHub] spark pull request: [PySpark] remove unnecessary use of numSlices ...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2467#issuecomment-56281778 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/137/consoleFull) for PR 2467 at commit

[GitHub] spark pull request: Fix Java example in Streaming Programming Guid...

2014-09-20 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2472#issuecomment-56281856 Ah, good catch! Since this is a doc-only markdown change, I'm going to merge it without waiting for Jenkins.

[GitHub] spark pull request: Fix Java example in Streaming Programming Guid...

2014-09-20 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2472

[GitHub] spark pull request: [PySpark] remove unnecessary use of numSlices ...

2014-09-20 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2467#issuecomment-56281993 LGTM. Thanks!

[GitHub] spark pull request: [PySpark] remove unnecessary use of numSlices ...

2014-09-20 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2467

[GitHub] spark pull request: stop, start and destroy require the EC2_REGION

2014-09-20 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2473#discussion_r17822066 --- Diff: docs/ec2-scripts.md --- @@ -137,11 +137,11 @@ cost you any EC2 cycles, but ***will*** continue to cost money for EBS storage. - To

[GitHub] spark pull request: stop, start and destroy require the EC2_REGION

2014-09-20 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2473#issuecomment-56282092 Thanks. This is great to have. I left a tiny comment.

[GitHub] spark pull request: SPARK-3574. Shuffle finish time always reporte...

2014-09-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2440#issuecomment-56283069 Thanks Sandy! Pulling this in.

[GitHub] spark pull request: SPARK-3574. Shuffle finish time always reporte...

2014-09-20 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2440

[GitHub] spark pull request: [SPARK-3609][SQL] Adds sizeInBytes statistics ...

2014-09-20 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2468

[GitHub] spark pull request: [SPARK-3446] Expose underlying job ids in Futu...

2014-09-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2337#issuecomment-56283750 @vanzin it would be helpful to hear what the needs are for Hive on Spark. Other applications I've seen have been using the job group for this purpose. And this will

[GitHub] spark pull request: [SPARK-3414][SQL] Replace LowerCaseSchema with...

2014-09-20 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2382

[GitHub] spark pull request: Adding known issue for MESOS-1688

2014-09-20 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1860#issuecomment-56283933 Great! I'll create a JIRA to update Spark to it when that comes out.

[GitHub] spark pull request: [SPARK-3616] Add basic Selenium tests to WebUI...

2014-09-20 Thread JoshRosen
GitHub user JoshRosen opened a pull request: https://github.com/apache/spark/pull/2474 [SPARK-3616] Add basic Selenium tests to WebUISuite This patch adds Selenium tests for Spark's web UI. To avoid adding extra dependencies to the test environment, the tests use Selenium's

[GitHub] spark pull request: [SPARK-3616] Add basic Selenium tests to WebUI...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2474#issuecomment-56284177 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20614/consoleFull) for PR 2474 at commit

[GitHub] spark pull request: stop, start and destroy require the EC2_REGION

2014-09-20 Thread jeffsteinmetz
Github user jeffsteinmetz commented on a diff in the pull request: https://github.com/apache/spark/pull/2473#discussion_r17822444 --- Diff: docs/ec2-scripts.md --- @@ -137,11 +137,11 @@ cost you any EC2 cycles, but ***will*** continue to cost money for EBS storage.

[GitHub] spark pull request: SPARK-3604. Replace the map call in UnionRDD#g...

2014-09-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2463#issuecomment-56284671 @ericdf is your original issue fixed by using the union utility function? I misread it to be a bug report, but I think the issue is just that you were chaining together

[GitHub] spark pull request: [SPARK-3562]Periodic cleanup event logs

2014-09-20 Thread mattf
Github user mattf commented on the pull request: https://github.com/apache/spark/pull/2471#issuecomment-56284715 I strongly advise against duplicating functionality that is already provided by the system where these logs are written. However, if you proceed with this, the

[GitHub] spark pull request: [SPARK-3595] Respect configured OutputCommitte...

2014-09-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2450#issuecomment-56284797 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-3595] Respect configured OutputCommitte...

2014-09-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2450#issuecomment-56284812 LGTM pending tests.

[GitHub] spark pull request: SPARK-3604. Replace the map call in UnionRDD#g...

2014-09-20 Thread harishreedharan
Github user harishreedharan commented on the pull request: https://github.com/apache/spark/pull/2463#issuecomment-56285227 Agreed. This patch simply makes it more difficult to overflow, so it is not really a fix. Will close this. Thanks, Hari On Sat,

[GitHub] spark pull request: [SPARK-3616] Add basic Selenium tests to WebUI...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2474#issuecomment-56285340 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20614/consoleFull) for PR 2474 at commit

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17822722 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,238 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17822725 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,238 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17822726 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,238 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17822738 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,238 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17822735 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,238 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request: [SPARK-3218, SPARK-3219, SPARK-3261, SPARK-342...

2014-09-20 Thread derrickburns
Github user derrickburns commented on the pull request: https://github.com/apache/spark/pull/2419#issuecomment-56285643 I deleted that file in my original pull request.

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17822775 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala --- @@ -45,7 +45,8 @@ import org.apache.spark.util.Utils private[spark] abstract

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17822777 --- Diff: core/src/test/scala/org/apache/spark/CacheManagerSuite.scala --- @@ -94,7 +94,7 @@ class CacheManagerSuite extends FunSuite with BeforeAndAfter

[GitHub] spark pull request: [WIP][SPARK-3247][SQL] An API for adding forei...

2014-09-20 Thread marmbrus
GitHub user marmbrus opened a pull request: https://github.com/apache/spark/pull/2475 [WIP][SPARK-3247][SQL] An API for adding foreign data sources to Spark SQL **Work in progress - APIs may change** This PR introduces a new set of APIs to Spark SQL that allow other

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17822804 --- Diff: core/src/main/java/org/apache/spark/TaskContext.java --- @@ -0,0 +1,238 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17822812 --- Diff: core/src/main/scala/org/apache/spark/scheduler/Task.scala --- @@ -45,7 +45,8 @@ import org.apache.spark.util.Utils private[spark] abstract

[GitHub] spark pull request: [WIP][SPARK-3247][SQL] An API for adding forei...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2475#issuecomment-56285912 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20615/consoleFull) for PR 2475 at commit

[GitHub] spark pull request: [WIP][SPARK-3247][SQL] An API for adding forei...

2014-09-20 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2475#issuecomment-56285964 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20615/consoleFull) for PR 2475 at commit

[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...

2014-09-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2425#issuecomment-56286065 @ScrapCodes made another pass with some comments. Overall this is looking good

[GitHub] spark pull request: [SPARK-3599]Avoid loading properties file freq...

2014-09-20 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/2454#discussion_r17822883 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala --- @@ -107,7 +107,8 @@ private[spark] class SparkSubmitArguments(args:

[GitHub] spark pull request: [SPARK-3599]Avoid loading properties file freq...

2014-09-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2454#issuecomment-56286238 I don't think the overhead of reading the file is significant here, but I agree it's nice to avoid extra print statements. `lazy val`s like this are a little brittle

[GitHub] spark pull request: [SPARK-3616] Add basic Selenium tests to WebUI...

2014-09-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2474#issuecomment-56286298 @JoshRosen how much time does this add to the test harness?

[GitHub] spark pull request: [SPARK-3599]Avoid loading properties file freq...

2014-09-20 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2454

[GitHub] spark pull request: [SPARK-3616] Add basic Selenium tests to WebUI...

2014-09-20 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2474#issuecomment-56286662 @pwendell [According to Jenkins](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20614/testReport/org.apache.spark.ui/UISuite/), UISuite took ~11

[GitHub] spark pull request: SPARK-3604. Replace the map call in UnionRDD#g...

2014-09-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2463#issuecomment-56287554 Gotcha - sounds good!

[GitHub] spark pull request: Add ValueIncrementableHashMapAccumulator

2014-09-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2314#issuecomment-56287649 Let's close this issue, in that case.

[GitHub] spark pull request: SPARK-2058: Overriding config from SPARK_HOME ...

2014-09-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/997#issuecomment-56289070 Let's close this issue then pending follow up from @EugenCepoi.

[GitHub] spark pull request: SPARK-2582. Make Block Manager Master pluggabl...

2014-09-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1506#issuecomment-56289168 Hey @harishreedharan thanks for submitting, but I'd like to close this PR for now pending a more complete design proposal about how external implementations of the block

[GitHub] spark pull request: SPARK-2387: remove stage barrier

2014-09-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1328#issuecomment-56289215 I'd like to close this issue for now pending more of a design discussion on the JIRA. These Proof of Concept patches are useful to have, but I'd rather not have them

[GitHub] spark pull request: SPARK-1597: Add a version of reduceByKey that ...

2014-09-20 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/550#issuecomment-56289232 It sounds like the conclusion here is to close this issue then.
