[GitHub] spark pull request: [SPARK-3491] [WIP] [MLlib] [PySpark] use pickl...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55699312 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20371/consoleFull) for PR 2378 at commit

[GitHub] spark pull request: [SPARK-3491] [WIP] [MLlib] [PySpark] use pickl...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55699370 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20371/consoleFull) for PR 2378 at commit

[GitHub] spark pull request: [SPARK-2182] Scalastyle rule blocking non asci...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2358#issuecomment-55699350 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20369/consoleFull) for PR 2358 at commit

[GitHub] spark pull request: [SPARK-3485][SQL] should check parameter type ...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2355#issuecomment-55699441 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20368/consoleFull) for PR 2355 at commit

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-09-16 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r17584855 --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala --- @@ -118,21 +118,29 @@ private[spark] class CacheManager(blockManager: BlockManager)

[GitHub] spark pull request: [SPARK-3485][SQL] should check parameter type ...

2014-09-16 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/2355#discussion_r17585678 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala --- @@ -113,30 +113,32 @@ private[hive] case class

[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-16 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/2347#issuecomment-55701824 this is ok to test --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this

[GitHub] spark pull request: [SPARK-3485][SQL] should check parameter type ...

2014-09-16 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/2355#discussion_r17585758 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala --- @@ -113,30 +113,32 @@ private[hive] case class

[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2347#issuecomment-55702131 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20372/consoleFull) for PR 2347 at commit

[GitHub] spark pull request: [Minor]ignore all config files in conf

2014-09-16 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2395#issuecomment-55705179 I was commenting on a comment, suggesting to also ignore conf/slaves. It is not in the PR so LGTM.

[GitHub] spark pull request: [SPARK-3491] [WIP] [MLlib] [PySpark] use pickl...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55706058 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20375/consoleFull) for PR 2378 at commit

[GitHub] spark pull request: [SPARK-3491] [WIP] [MLlib] [PySpark] use pickl...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55707384 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20376/consoleFull) for PR 2378 at commit

[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2347#issuecomment-55708415 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20372/consoleFull) for PR 2347 at commit

[GitHub] spark pull request: [SPARK-3491] [WIP] [MLlib] [PySpark] use pickl...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55709985 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20377/consoleFull) for PR 2378 at commit

[GitHub] spark pull request: [SPARK-3491] [WIP] [MLlib] [PySpark] use pickl...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55711136 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20375/consoleFull) for PR 2378 at commit

[GitHub] spark pull request: [SPARK-3491] [WIP] [MLlib] [PySpark] use pickl...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55712664 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20376/consoleFull) for PR 2378 at commit

[GitHub] spark pull request: [SPARK-2759][CORE] Generic Binary File Support...

2014-09-16 Thread jrabary
Github user jrabary commented on the pull request: https://github.com/apache/spark/pull/1658#issuecomment-55714177 Hi all, I'm trying to use this patch to load a set of jpeg images, but the path (key) of the output is empty: val image = sc.binaryFiles("data/*.jpg")

[GitHub] spark pull request: [SPARK-2062][GraphX] VertexRDD.apply does not ...

2014-09-16 Thread larryxiao
Github user larryxiao commented on the pull request: https://github.com/apache/spark/pull/1903#issuecomment-55714900 As described in commit message: a copy of vertices with defaultVal is created before, and it's b in (a, b) = b see in

[GitHub] spark pull request: Update configuration.md

2014-09-16 Thread viper-kun
GitHub user viper-kun opened a pull request: https://github.com/apache/spark/pull/2406 Update configuration.md change the value of spark.files.fetchTimeout You can merge this pull request into a Git repository by running: $ git pull https://github.com/viper-kun/spark master

[GitHub] spark pull request: Update configuration.md

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2406#issuecomment-55715528 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-3491] [WIP] [MLlib] [PySpark] use pickl...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55716038 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20377/consoleFull) for PR 2378 at commit

[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

2014-09-16 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2014#issuecomment-55717325 @pwendell no I believe that the user still has to install the gem. I did at least. Yes this is GTG from my end.

[GitHub] spark pull request: [SPARK-2062][GraphX] VertexRDD.apply does not ...

2014-09-16 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/1903#issuecomment-55718184 ok to test

[GitHub] spark pull request: [SPARK-2062][GraphX] VertexRDD.apply does not ...

2014-09-16 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/1903#issuecomment-55718245 Yeah, a note about that default would be great.

[GitHub] spark pull request: [SPARK-2062][GraphX] VertexRDD.apply does not ...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1903#issuecomment-55718541 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20378/consoleFull) for PR 1903 at commit

[GitHub] spark pull request: [SPARK-3485][SQL] another way to pass to hive ...

2014-09-16 Thread adrian-wang
GitHub user adrian-wang opened a pull request: https://github.com/apache/spark/pull/2407 [SPARK-3485][SQL] another way to pass to hive simple udf This is just another solution to SPARK-3485, in addition to PR #2355 In this patch, we will use ConventionHelper and FunctionRegistry

[GitHub] spark pull request: [SPARK-3485][SQL] should check parameter type ...

2014-09-16 Thread adrian-wang
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/2355#issuecomment-55722541 Hi @liancheng, I have created another PR (#2407) on this issue, offering another solution. Please let me know which solution you prefer.

[GitHub] spark pull request: [SPARK-3485][SQL] should check parameter type ...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2355#issuecomment-55722631 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20380/consoleFull) for PR 2355 at commit

[GitHub] spark pull request: [SPARK-3485][SQL] should check parameter type ...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2355#issuecomment-55722729 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20380/consoleFull) for PR 2355 at commit

[GitHub] spark pull request: [SPARK-3485][SQL] should check parameter type ...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2355#issuecomment-55723585 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20381/consoleFull) for PR 2355 at commit

[GitHub] spark pull request: [SPARK-2062][GraphX] VertexRDD.apply does not ...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1903#issuecomment-55725153 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20378/consoleFull) for PR 1903 at commit

[GitHub] spark pull request: [SPARK-2594][SQL] Add CACHE TABLE name AS SE...

2014-09-16 Thread ravipesala
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/spark/pull/2397#discussion_r17593929 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala --- @@ -183,9 +183,17 @@ class SqlParser extends StandardTokenParsers

[GitHub] spark pull request: [SPARK-2594][SQL] Add CACHE TABLE name AS SE...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2397#issuecomment-55725414 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20382/consoleFull) for PR 2397 at commit

[GitHub] spark pull request: [SPARK-2062][GraphX] VertexRDD.apply does not ...

2014-09-16 Thread ankurdave
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/1903#issuecomment-55725506 This looks good! I'll merge it pending the doc update.

[GitHub] spark pull request: [SPARK-2594][SQL] Add CACHE TABLE name AS SE...

2014-09-16 Thread ravipesala
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/spark/pull/2397#discussion_r17594424 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala --- @@ -229,7 +229,13 @@ private[hive] object HiveQl {

[GitHub] spark pull request: [SPARK-2594][SQL] Add CACHE TABLE name AS SE...

2014-09-16 Thread ravipesala
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/spark/pull/2397#discussion_r17594549 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala --- @@ -229,7 +229,13 @@ private[hive] object HiveQl {

[GitHub] spark pull request: [SPARK-3529] [SQL] Delete the temp files after...

2014-09-16 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2393#issuecomment-55728033 @chenghao-intel For example, in `FileServerSuite`: ``` override def beforeAll() { super.beforeAll() tmpDir = Files.createTempDir()

[GitHub] spark pull request: [SPARK-2594][SQL] Add CACHE TABLE name AS SE...

2014-09-16 Thread ravipesala
Github user ravipesala commented on the pull request: https://github.com/apache/spark/pull/2397#issuecomment-55729953 Making the cache lazy seems to be a good idea. As ```SQLContext.cacheTable``` and ```CACHE TABLE name``` are both lazy, it is better to make ```CACHE TABLE AS SELECT ..```

[GitHub] spark pull request: [SPARK-3485][SQL] should check parameter type ...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2355#issuecomment-55731253 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20381/consoleFull) for PR 2355 at commit

[GitHub] spark pull request: [SPARK-2594][SQL] Add CACHE TABLE name AS SE...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2397#issuecomment-55732973 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20382/consoleFull) for PR 2397 at commit

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer does...

2014-09-16 Thread sarutak
GitHub user sarutak opened a pull request: https://github.com/apache/spark/pull/2408 [SPARK-3546] InputStream of ManagedBuffer does not close and causes running out of file descriptor You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer does...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2408#issuecomment-55734763 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20383/consoleFull) for PR 2408 at commit

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer is n...

2014-09-16 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2408#discussion_r17598920 --- Diff: core/src/main/scala/org/apache/spark/network/ManagedBuffer.scala --- @@ -66,8 +67,13 @@ final class FileSegmentManagedBuffer(val file: File, val

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer is n...

2014-09-16 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2408#discussion_r17598955 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -111,13 +112,21 @@ final class ShuffleBlockFetcherIterator(

[GitHub] spark pull request: [SPARK-3485][SQL] another way to pass to hive ...

2014-09-16 Thread adrian-wang
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/2407#issuecomment-55740848 test this please.

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer is n...

2014-09-16 Thread sarutak
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/2408#discussion_r17600021 --- Diff: core/src/main/scala/org/apache/spark/network/ManagedBuffer.scala --- @@ -66,8 +67,13 @@ final class FileSegmentManagedBuffer(val file: File, val

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer is n...

2014-09-16 Thread sarutak
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/2408#discussion_r17600041 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -111,13 +112,21 @@ final class ShuffleBlockFetcherIterator(

[GitHub] spark pull request: [SPARK-3545] Put HadoopRDD.getPartitions forwa...

2014-09-16 Thread YanTangZhai
GitHub user YanTangZhai opened a pull request: https://github.com/apache/spark/pull/2409 [SPARK-3545] Put HadoopRDD.getPartitions forward and put TaskScheduler.start back to reduce DAGScheduler.JobSubmitted processing time and shorten cluster resources occupation period We have

[GitHub] spark pull request: [SPARK-3545] Put HadoopRDD.getPartitions forwa...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2409#issuecomment-55741743 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20385/consoleFull) for PR 2409 at commit

[GitHub] spark pull request: [SPARK-3485][SQL] another way to pass to hive ...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2407#issuecomment-55741740 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20386/consoleFull) for PR 2407 at commit

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer is n...

2014-09-16 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/2408#discussion_r17600258 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -111,13 +112,21 @@ final class ShuffleBlockFetcherIterator(

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer is n...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2408#issuecomment-55741767 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20384/consoleFull) for PR 2408 at commit

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer is n...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2408#issuecomment-55742800 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20383/consoleFull) for PR 2408 at commit

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer is n...

2014-09-16 Thread sarutak
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/2408#discussion_r17601336 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -111,13 +112,21 @@ final class ShuffleBlockFetcherIterator(

[GitHub] spark pull request: add ability to submit multiple jars for Driver

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1113#issuecomment-55744803 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20387/consoleFull) for PR 1113 at commit

[GitHub] spark pull request: add ability to submit multiple jars for Driver

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1113#issuecomment-55744971 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20387/consoleFull) for PR 1113 at commit

[GitHub] spark pull request: [SPARK-2301] add ability to submit multiple ja...

2014-09-16 Thread lianhuiwang
Github user lianhuiwang commented on the pull request: https://github.com/apache/spark/pull/1113#issuecomment-55746200 @andrewor14 I updated it per your comments. Thanks.

[GitHub] spark pull request: [SPARK-2301] add ability to submit multiple ja...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1113#issuecomment-55746406 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20388/consoleFull) for PR 1113 at commit

[GitHub] spark pull request: [SPARK-2301] add ability to submit multiple ja...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1113#issuecomment-55746563 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20388/consoleFull) for PR 1113 at commit

[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-16 Thread staple
Github user staple commented on the pull request: https://github.com/apache/spark/pull/2347#issuecomment-55747592 Hi, per the discussion in https://github.com/apache/spark/pull/2362 the plan is to continue caching before deserialization from python rather than after, in order to

[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2347#issuecomment-55747882 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20389/consoleFull) for PR 2347 at commit

[GitHub] spark pull request: [SPARK-3545] Put HadoopRDD.getPartitions forwa...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2409#issuecomment-55749156 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20385/consoleFull) for PR 2409 at commit

[GitHub] spark pull request: add spark.driver.memory to config docs

2014-09-16 Thread nartz
GitHub user nartz opened a pull request: https://github.com/apache/spark/pull/2410 add spark.driver.memory to config docs It took me a minute to track this down, so I thought it could be useful to have it in the docs. I'm unsure if 512mb is the default for

[GitHub] spark pull request: add spark.driver.memory to config docs

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2410#issuecomment-55749884 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer is n...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2408#issuecomment-55751983 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20384/consoleFull) for PR 2408 at commit

[GitHub] spark pull request: [SPARK-3485][SQL] another way to pass to hive ...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2407#issuecomment-55755721 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20386/consoleFull) for PR 2407 at commit

[GitHub] spark pull request: SPARK-1656: Fix potential resource leaks

2014-09-16 Thread zsxwing
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/577#issuecomment-55755661 @andrewor14 any further suggestion?

[GitHub] spark pull request: [SPARK-3548] [WebUI] Display cache hit ratio o...

2014-09-16 Thread sarutak
GitHub user sarutak opened a pull request: https://github.com/apache/spark/pull/2411 [SPARK-3548] [WebUI] Display cache hit ratio on WebUI You can merge this pull request into a Git repository by running: $ git pull https://github.com/sarutak/spark cache-hit-ratio-feature

[GitHub] spark pull request: [SPARK-3540] Add reboot-slaves functionality t...

2014-09-16 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2404#issuecomment-55757044 It happens because, in the git diff, the script compares the PR branch with master; if the PR is not rebased to the tip of master, false reporting will happen.

[GitHub] spark pull request: [SPARK-3548] [WebUI] Display cache hit ratio o...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2411#issuecomment-55757731 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20390/consoleFull) for PR 2411 at commit

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer is n...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2408#issuecomment-55758516 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20391/consoleFull) for PR 2408 at commit

[GitHub] spark pull request: [SPARK-1484][MLLIB] Warn when running an itera...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2347#issuecomment-55758803 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20389/consoleFull) for PR 2347 at commit

[GitHub] spark pull request: [SPARK-3540] Add reboot-slaves functionality t...

2014-09-16 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2404#issuecomment-55759519 Actually, I guess if a class was removed from master but not also removed from the PR fork, then that would show up as an added class.

[GitHub] spark pull request: [SPARK-3550][MLLIB] Disable automatic rdd cach...

2014-09-16 Thread staple
GitHub user staple opened a pull request: https://github.com/apache/spark/pull/2412 [SPARK-3550][MLLIB] Disable automatic rdd caching for relevant learners. The NaiveBayes, ALS, and DecisionTree learners do not require external caching to prevent repeated RDD re-evaluation during

[GitHub] spark pull request: [SPARK-3550][MLLIB] Disable automatic rdd cach...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2412#issuecomment-55762318 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-3488][MLLIB] Cache python RDDs after de...

2014-09-16 Thread staple
Github user staple commented on the pull request: https://github.com/apache/spark/pull/2362#issuecomment-55762293 @davies I created a separate PR for disabling automatic caching for some learners: https://github.com/apache/spark/pull/2412

[GitHub] spark pull request: [SPARK-3548] [WebUI] Display cache hit ratio o...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2411#issuecomment-55766313 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20390/consoleFull) for PR 2411 at commit

[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

2014-09-16 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/2379#issuecomment-55766467 I think I have an opposite view from Andrew in that I dislike using sys.props as an IPC mechanism, but other than that, looks good.

[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

2014-09-16 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/2379#discussion_r17610954 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala --- @@ -167,14 +167,19 @@ class HistoryServer( * This launches the

[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

2014-09-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2014#issuecomment-55768975 Okay I can merge this. One thing though, we've typically had less-than-smooth experiences with jekyll and its dependencies. So if this feature causes issues for users

[GitHub] spark pull request: [SPARK-3551] Remove redundant putting FetchRes...

2014-09-16 Thread sarutak
GitHub user sarutak opened a pull request: https://github.com/apache/spark/pull/2413 [SPARK-3551] Remove redundant putting FetchResult which means Fetch Fail when Remote fetching You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: [SPARK-2759][CORE] Generic Binary File Support...

2014-09-16 Thread kmader
Github user kmader commented on the pull request: https://github.com/apache/spark/pull/1658#issuecomment-55769371 Thanks @jrabary for this find, it had to do with the new method for handling PortableDataStreams which didn't calculate the name correctly. I think I have it fixed now

[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

2014-09-16 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2014

[GitHub] spark pull request: [SPARK-2182] Scalastyle rule blocking non asci...

2014-09-16 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2358

[GitHub] spark pull request: [SPARK-3546] InputStream of ManagedBuffer is n...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2408#issuecomment-55769853 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20391/consoleFull) for PR 2408 at commit

[GitHub] spark pull request: [Minor]ignore all config files in conf

2014-09-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2395#issuecomment-55769881 Jenkins LGTM.

[GitHub] spark pull request: [Minor]ignore all config files in conf

2014-09-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2395#issuecomment-55769901 Jenkins, test this please.

[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

2014-09-16 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2014#issuecomment-55770066 FYI: This page is 404-ing: http://spark.apache.org/docs/latest/building-spark.html Is that temporary?

[GitHub] spark pull request: [Docs] minor punctuation fix

2014-09-16 Thread nchammas
GitHub user nchammas opened a pull request: https://github.com/apache/spark/pull/2414 [Docs] minor punctuation fix You can merge this pull request into a Git repository by running: $ git pull https://github.com/nchammas/spark patch-1 Alternatively you can review and apply

[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-09-16 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1486#discussion_r17612851 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskLocation.scala --- @@ -23,12 +23,35 @@ package org.apache.spark.scheduler * of preference

[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-09-16 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1486#discussion_r17612896 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -181,8 +181,24 @@ private[spark] class TaskSetManager( }

[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-09-16 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1486#discussion_r17612925 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala --- @@ -181,8 +181,24 @@ private[spark] class TaskSetManager( }

Re: [GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...

2014-09-16 Thread Sean Owen
I imagine the new site hasn't been pushed. Yeah, the README.md has the new links immediately though. It's minor and temporary, since I believe the site was going to be updated to fix that 1.1.0-SNAPSHOT ref anyway. On Tue, Sep 16, 2014 at 5:24 PM, nchammas g...@git.apache.org wrote: Github

[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-09-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1486#issuecomment-55771271 On the visibility stuff, understood. I actually forgot the old API is still supported in newer versions of Hadoop. Otherwise, you could put this all in the new hadoop

[GitHub] spark pull request: [Docs] minor punctuation fix

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2414#issuecomment-55771497 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20394/consoleFull) for PR 2414 at commit

[GitHub] spark pull request: [SPARK-2301] add ability to submit multiple ja...

2014-09-16 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1113#issuecomment-55772901 Yeah, that's true. Do the changes here achieve what spark-submit's `--jars` can't achieve?

[GitHub] spark pull request: SPARK-1767: Prefer HDFS-cached replicas when s...

2014-09-16 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1486#discussion_r17614113 --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskLocation.scala --- @@ -23,12 +23,33 @@ package org.apache.spark.scheduler * of preference

[GitHub] spark pull request: [SPARK-3548] [WebUI] Display cache hit ratio o...

2014-09-16 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2411#issuecomment-55773715 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20395/consoleFull) for PR 2411 at commit

[GitHub] spark pull request: [SPARK-787] Add S3 configuration parameters to...

2014-09-16 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1120#issuecomment-55774088 @danosipov Yeah, do you want to just add the docs in this patch? Our docs are versioned in the spark repo under `docs/`. I'd just add one sentence that says you can

[GitHub] spark pull request: [SPARK-2098] All Spark processes should suppor...

2014-09-16 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2379#discussion_r17614596 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1352,6 +1352,33 @@ private[spark] object Utils extends Logging { }
