[GitHub] spark pull request: SPARK-3337 Paranoid quoting in shell to allow ...

2014-09-05 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/2229#discussion_r17158127 --- Diff: sbt/sbt-launch-lib.bash --- @@ -180,7 +180,7 @@ run() { ${SBT_OPTS:-$default_sbt_opts} \ $(get_mem_opts $sbt_mem) \

[GitHub] spark pull request: [SPARK-3362][SQL] bug in casewhen resolve

2014-09-05 Thread adrian-wang
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/2245#issuecomment-54588876 Seems so unlucky to be trapped in different test suite in every run. @marmbrus @rxin Can you give the patch a retest? --- If your project is set up for it, you

[GitHub] spark pull request: [SPARK-3362][SQL] bug in casewhen resolve

2014-09-05 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2245#issuecomment-54588910 Jenkins is down right now ...

[GitHub] spark pull request: Optimize the schedule procedure in Master

2014-09-05 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/1106#issuecomment-54588914 The PR is: https://issues.apache.org/jira/browse/SPARK-3411. Because the filter creates a copy of the workers, I changed the way of filtering. The shuffle

[GitHub] spark pull request: [SPARK-3362][SQL] bug in casewhen resolve

2014-09-05 Thread adrian-wang
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/2245#issuecomment-54589163 All right...

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-05 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2283#issuecomment-54589212 This change makes this shutdown hook lower than `FileSystem`'s, whereas it used to be higher. Also does this compile for `yarn-alpha` too? Given the time it went in, it

[GitHub] spark pull request: SPARK-3211 .take() is OOM-prone with empty par...

2014-09-05 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2117#issuecomment-54589602 Jenkins, test this please

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-05 Thread sarutak
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2283#issuecomment-54589820 @srowen It's confusing, but a lower value means higher priority.

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-05 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2283#issuecomment-54589954 Ah I see. That's fine, I just wasn't sure which the intent was since I think the original description is missing a word.

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-05 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2260#issuecomment-54590041 Ok merging this (and removed io1 for now).

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-05 Thread pdeyhim
Github user pdeyhim commented on the pull request: https://github.com/apache/spark/pull/2260#issuecomment-54590139 And what happens when the additional EBS volumes get added? We probably want to configure spark-env.sh and spark_local_dir with the new volumes correct? the place this

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-05 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2260#issuecomment-54590209 The EBS volumes are not great for shuffle (bad small-write performance). Let's hold that off for now.

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2260

[GitHub] spark pull request: [Docs] fix minor MLlib case typo

2014-09-05 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/2278#issuecomment-54590246 Merged into master and branch-1.1. Thanks!

[GitHub] spark pull request: [Docs] fix minor MLlib case typo

2014-09-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2278

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2283#issuecomment-54590354 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19800/consoleFull) for PR 2283 at commit

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-05 Thread sarutak
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2283#issuecomment-54590502 Ah, sorry, my mistake. I checked the logic of ShutdownHookManager, and a higher value means higher priority.
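
The ordering rule settled on in this exchange (a higher priority value runs earlier) can be illustrated with a toy model. This is only a sketch: `run_hooks`, the hook names, and the numeric priorities are hypothetical and are not taken from Hadoop's `ShutdownHookManager` or from the pull request.

```python
def run_hooks(hooks):
    """Return hook names in execution order for (priority, name) pairs.

    Toy model only: hooks with a HIGHER priority value run FIRST.
    """
    ordered = sorted(hooks, key=lambda hook: hook[0], reverse=True)
    return [name for _, name in ordered]


# Hypothetical priorities, chosen only to show the ordering.
hooks = [(10, "filesystem-cleanup"), (30, "application-cleanup")]
print(run_hooks(hooks))
```

Under this rule, a hook that must finish before the file system is closed needs a priority value above the file system hook's value.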

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-05 Thread pdeyhim
Github user pdeyhim commented on the pull request: https://github.com/apache/spark/pull/2260#issuecomment-54590541 @rxin ok that's correct for smaller instance types. But FYI, EBS on larger instances (and EBS-optimized instances) should perform well on shuffle read/write

[GitHub] spark pull request: [SPARK-3086] [SPARK-3043] [SPARK-3156] [mllib]...

2014-09-05 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/2125#issuecomment-54590611 retest this please

[GitHub] spark pull request: [SPARK-3409][SQL] Avoid pulling in Exchange op...

2014-09-05 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2282#issuecomment-54594495 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-3412] [SQL] Add 3 missing types for Row...

2014-09-05 Thread chenghao-intel
GitHub user chenghao-intel opened a pull request: https://github.com/apache/spark/pull/2284 [SPARK-3412] [SQL] Add 3 missing types for Row API `BinaryType`, `DecimalType` and `TimestampType` are missing in the Row API. You can merge this pull request into a Git repository by

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-05 Thread sarutak
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2283#issuecomment-54595905 Jenkins, test this please.

[GitHub] spark pull request: SPARK-3211 .take() is OOM-prone with empty par...

2014-09-05 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/2117#issuecomment-54596140 @nchammas I'm guessing your OOM issue is unrelated to this one. ``` a = sc.parallelize(["Nick", "John", "Bob"]) a = a.repartition(24000) a.keyBy(lambda x:

[GitHub] spark pull request: SPARK-3211 .take() is OOM-prone with empty par...

2014-09-05 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/2117#issuecomment-54596256 Regarding the merge, I'm guessing this is too late to land in the Spark 1.1 release. Is it a candidate for a backport to a 1.1.x?

[GitHub] spark pull request: [SPARK-3408] Fixed Limit operator so it works ...

2014-09-05 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/2281#issuecomment-54596774 What's the implication here for other client code of the Spark API? It looks like there are mutability concerns in whether you can save a reference to the object you get

[GitHub] spark pull request: [SPARK-3408] Fixed Limit operator so it works ...

2014-09-05 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2281#issuecomment-54596928 The correct assumption is to not reuse objects. However, in Spark SQL we exploited the implementation of the old shuffle behavior (which serializes each row object
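
The mutability concern raised in this exchange can be shown with a small, Spark-free sketch. The generator below is hypothetical and only mimics an iterator that reuses one mutable record; it is not how Spark SQL or the shuffle is implemented.

```python
def reusing_rows(values):
    """Yield the SAME mutable list for every record, overwriting it in place."""
    row = [None]
    for value in values:
        row[0] = value
        yield row


# Saving references: every saved element is the same object,
# so all of them show the last value written.
saved = [row for row in reusing_rows([1, 2, 3])]

# Copying each record before saving keeps the individual values.
copied = [list(row) for row in reusing_rows([1, 2, 3])]

print(saved)   # [[3], [3], [3]]
print(copied)  # [[1], [2], [3]]
```

Client code that holds on to records from such an iterator has to copy them first, which is why the contract (reuse or no reuse) matters to callers.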

[GitHub] spark pull request: Fix for false positives reported by mima on PR...

2014-09-05 Thread ScrapCodes
GitHub user ScrapCodes opened a pull request: https://github.com/apache/spark/pull/2285 Fix for false positives reported by mima on PR 2194. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ScrapCodes/spark-1 mima-fix

[GitHub] spark pull request: [SPARK-3408] Fixed Limit operator so it works ...

2014-09-05 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/2281#issuecomment-54597321 I don't see that contract in the API documented in the Scaladoc for the method: ``` 588 /** 589* Return a new RDD by applying a function to each

[GitHub] spark pull request: SPARK-2895: Add mapPartitionsWithContext relat...

2014-09-05 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/2194#issuecomment-54598665 @rxin There is an explanation and a (workaround-style) fix for this in #2285.

[GitHub] spark pull request: Fix for false positives reported by mima on PR...

2014-09-05 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/2285#discussion_r17161813 --- Diff: dev/mima --- @@ -25,12 +25,16 @@ FWDIR=$(cd `dirname $0`/..; pwd) cd $FWDIR echo -e q\n | sbt/sbt oldDeps/update +rm -f

[GitHub] spark pull request: Fix for false positives reported by mima on PR...

2014-09-05 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/2285#discussion_r17161831 --- Diff: dev/mima --- @@ -25,12 +25,16 @@ FWDIR=$(cd `dirname $0`/..; pwd) cd $FWDIR echo -e q\n | sbt/sbt oldDeps/update +rm -f

[GitHub] spark pull request: Tests meant to demonstrate the bug in SPARK-26...

2014-09-05 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/1588#issuecomment-54598213 Yep good to close -- we can refer to the ticket in the future if it comes back up

[GitHub] spark pull request: [SPARK-3412] [SQL] Add 3 missing types for Row...

2014-09-05 Thread chenghao-intel
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/2284#issuecomment-54597712 test this please.

[GitHub] spark pull request: Fix for false positives reported by mima on PR...

2014-09-05 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/2285#discussion_r17162225 --- Diff: dev/mima --- @@ -25,11 +25,15 @@ FWDIR=$(cd `dirname $0`/..; pwd) cd $FWDIR echo -e q\n | sbt/sbt oldDeps/update +rm -f

[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...

2014-09-05 Thread li-zhihui
Github user li-zhihui commented on the pull request: https://github.com/apache/spark/pull/1616#issuecomment-54600822 @JoshRosen @andrewor14 I use `url.hashCode + timestamp` as `cachedFileName`, I believe it is impossible that existing `url.hashCode` collision

[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-05 Thread cloud-fan
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/2230#issuecomment-54601247 @marmbrus Seems hive parser will pass something like a.b.c... to `LogicalPlan`, so I have to roll back(and I changed `dotExpressionHeader` to `ident . ident {.

[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-05 Thread cloud-fan
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/2230#issuecomment-54601682 I'm not sure how to modify `lazy val resolved` in `GetField` since it handles not only StructType now. Currently I just removed the type check. What do you think?

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-05 Thread sarutak
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2283#issuecomment-54602306 test this please.

[GitHub] spark pull request: Don't include the empty string as a default...

2014-09-05 Thread ash211
GitHub user ash211 opened a pull request: https://github.com/apache/spark/pull/2286 Don't include the empty string as a defaultAclUser Changes logging from ``` 14/09/05 02:01:08 INFO SecurityManager: Changing view acls to: aash, 14/09/05 02:01:08 INFO

[GitHub] spark pull request: [BUILD] Fix for false positives reported by mi...

2014-09-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2285#issuecomment-54604364 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19803/consoleFull) for PR 2285 at commit

[GitHub] spark pull request: Don't include the empty string as a default...

2014-09-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2286#issuecomment-54604796 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19804/consoleFull) for PR 2286 at commit

[GitHub] spark pull request: [SPARK-1853] Show Streaming application code c...

2014-09-05 Thread mubarak
Github user mubarak commented on the pull request: https://github.com/apache/spark/pull/1723#issuecomment-54606563 @tdas Can you please review? Thanks

[GitHub] spark pull request: [SPARK-1853] Show Streaming application code c...

2014-09-05 Thread mubarak
Github user mubarak commented on the pull request: https://github.com/apache/spark/pull/1723#issuecomment-54606650 Jenkins, this is ok to test.

[GitHub] spark pull request: [BUILD] Fix for false positives reported by mi...

2014-09-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2285#issuecomment-54609772 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19803/consoleFull) for PR 2285 at commit

[GitHub] spark pull request: Don't include the empty string as a default...

2014-09-05 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2286#issuecomment-54610134 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19804/consoleFull) for PR 2286 at commit

[GitHub] spark pull request: [SPARK-3415] [PySpark] removes SerializingAdap...

2014-09-05 Thread wardviaene
GitHub user wardviaene opened a pull request: https://github.com/apache/spark/pull/2287 [SPARK-3415] [PySpark] removes SerializingAdapter code This code removes the SerializingAdapter code that was copied from PiCloud You can merge this pull request into a Git repository by

[GitHub] spark pull request: Don't include the empty string as a default...

2014-09-05 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/2286#discussion_r17174246 --- Diff: core/src/main/scala/org/apache/spark/SecurityManager.scala --- @@ -162,7 +162,7 @@ private[spark] class SecurityManager(sparkConf: SparkConf)

[GitHub] spark pull request: Don't include the empty string as a default...

2014-09-05 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2286#issuecomment-54630202 Thanks for working on this, I've been meaning to fix this for a while. Could you also please file a jira and link them. The header of the pr should include

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-05 Thread sarutak
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2283#issuecomment-54631454 Jenkins, retest this please.

[GitHub] spark pull request: SPARK-3211 .take() is OOM-prone with empty par...

2014-09-05 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2117#issuecomment-54634193 @ash211 Thank you for explaining that.

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-05 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2283#issuecomment-54634308 I don't think this is really necessary as I see the value of the Filesystem one as a public api now and changing its value would break compatibility, but I'm ok with

[GitHub] spark pull request: [SPARK-2140] Updating heap memory calculation ...

2014-09-05 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2253#issuecomment-54634933 Jenkins, test this please

[GitHub] spark pull request: pyspark.sql.SQLContext is new-style class

2014-09-05 Thread mrocklin
GitHub user mrocklin opened a pull request: https://github.com/apache/spark/pull/2288 pyspark.sql.SQLContext is new-style class Tiny PR making SQLContext a new-style class. This allows various type logic to work more effectively ```Python In [1]: import pyspark

[GitHub] spark pull request: [SPARK-3260] yarn - pass acls along with execu...

2014-09-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2185

[GitHub] spark pull request: [SPARK-3375] spark on yarn container allocatio...

2014-09-05 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2275

[GitHub] spark pull request: [SPARK-3415] [PySpark] removes SerializingAdap...

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2287#issuecomment-54636076 Hi @wardviaene, Do you have an example program that reproduces this bug? We should probably add it as a regression test (see `python/pyspark/tests.py` for

[GitHub] spark pull request: pyspark.sql.SQLContext is new-style class

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2288#issuecomment-54636484 Good catch! While you're at it, are there any other old-style classes in PySpark that should be made into new-style ones?

[GitHub] spark pull request: pyspark.sql.SQLContext is new-style class

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2288#issuecomment-54636685 Also, do you mind opening a JIRA ticket on https://issues.apache.org/jira/browse/SPARK and editing the title of your pull request to reference it, e.g. `[SPARK-]

[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

2014-09-05 Thread bbejeck
Github user bbejeck commented on the pull request: https://github.com/apache/spark/pull/2227#issuecomment-54637031 Have any of the admins had a chance to check it out? Let me know if you want me to modify anything in it.

[GitHub] spark pull request: pyspark.sql.SQLContext is new-style class

2014-09-05 Thread mrocklin
Github user mrocklin commented on the pull request: https://github.com/apache/spark/pull/2288#issuecomment-54638388 Sure. Next time I find a few free minutes.

[GitHub] spark pull request: [SPARK-3361] Expand PEP 8 checks to include EC...

2014-09-05 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2277#issuecomment-54638477 Jenkinshenck, could you test this please?

[GitHub] spark pull request: pyspark.sql.SQLContext is new-style class

2014-09-05 Thread mrocklin
Github user mrocklin commented on the pull request: https://github.com/apache/spark/pull/2288#issuecomment-54639788 ``` mrocklin@notebook:~/workspace/spark$ git grep "^class \w*:" mrocklin@notebook:~/workspace/spark$ ```
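
The search pattern in the message above can be tried on a string of source text. This is a minimal sketch: `old_style_classes` and the sample source are hypothetical, and the pattern only catches top-level `class Name:` declarations with no base list. The old-style/new-style distinction applies to Python 2; in Python 3 every class is new-style.

```python
import re

# Same pattern as the grep: "class", a name, then a colon with no base list.
OLD_STYLE_CLASS = re.compile(r"^class \w*:", re.MULTILINE)


def old_style_classes(source):
    """Return the declarations in `source` that match the old-style pattern."""
    return OLD_STYLE_CLASS.findall(source)


# Hypothetical sample: one class without a base list, one with.
sample = (
    "class SQLContext:\n"
    "    pass\n"
    "\n"
    "class Row(tuple):\n"
    "    pass\n"
)
print(old_style_classes(sample))  # ['class SQLContext:']
```

An empty result, like the empty grep output in the message, means no declaration of that shape remains.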

[GitHub] spark pull request: [SPARK-3417] -Use of old-style classes in pysp...

2014-09-05 Thread mrocklin
Github user mrocklin commented on the pull request: https://github.com/apache/spark/pull/2288#issuecomment-54639853 Done

[GitHub] spark pull request: Spark-3406 add a default storage level to pyth...

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2280#issuecomment-54641296 It looks like `sql.py` overrides the default `persist()`, so you might want to update it there, too. LGTM otherwise.

[GitHub] spark pull request: [SPARK-3286] - Cannot view ApplicationMaster U...

2014-09-05 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/2276#discussion_r17180863 --- Diff: yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/YarnRMClientImpl.scala --- @@ -96,7 +96,7 @@ private class YarnRMClientImpl(args:

[GitHub] spark pull request: [SPARK-2778] [yarn] Add yarn integration tests...

2014-09-05 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/2257#issuecomment-54648499 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-3399][PySpark] Test for PySpark should ...

2014-09-05 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/2270#discussion_r17182845 --- Diff: bin/pyspark --- @@ -85,6 +85,8 @@ export PYSPARK_SUBMIT_ARGS # For pyspark tests if [[ -n $SPARK_TESTING ]]; then + unset

[GitHub] spark pull request: [SPARK-1825] Fixes cross-platform submit probl...

2014-09-05 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/899#issuecomment-54652280 @zeodtr does this compile with anything hadoop 2.4? If it doesn't, this is a no-go.

[GitHub] spark pull request: [SPARK-3286] - Cannot view ApplicationMaster U...

2014-09-05 Thread benoyantony
Github user benoyantony commented on the pull request: https://github.com/apache/spark/pull/2276#issuecomment-54652578 Sure. I'll do both. Does alpha correspond to Hadoop versions before YARN-1203? As you know, before YARN-1203, we cannot pass AM URLs with a scheme.

[GitHub] spark pull request: [SPARK-3286] - Cannot view ApplicationMaster U...

2014-09-05 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/2276#issuecomment-54653232 No, alpha means pre-branch-2 hadoop (I think, Hadoop branching is not exactly an exact science). Anyway, there are stable releases without YARN-1203. So that probably

[GitHub] spark pull request: [SPARK-3399][PySpark] Test for PySpark should ...

2014-09-05 Thread sarutak
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/2270#discussion_r17184086 --- Diff: bin/pyspark --- @@ -85,6 +85,8 @@ export PYSPARK_SUBMIT_ARGS # For pyspark tests if [[ -n $SPARK_TESTING ]]; then + unset

[GitHub] spark pull request: [SPARK-3361] Expand PEP 8 checks to include EC...

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2277#issuecomment-54654607 Jenkins, retest this please. (Not sure if Jenkins is programmed to listen to @nchammas or not...)

[GitHub] spark pull request: [SPARK-3094] [PySpark] compatitable with PyPy

2014-09-05 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/2144#issuecomment-54654606 @mateiz @JoshRosen @mattf run-tests will try to run tests for spark core and sql with PyPy. One known issue is that serialization of array in PyPy is similar to

[GitHub] spark pull request: [SPARK-3094] [PySpark] compatitable with PyPy

2014-09-05 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/2144#issuecomment-54654638 Jenkins, test this please.

[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2227#issuecomment-54654828 Jenkins, this is ok to test.

[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

2014-09-05 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/2227#issuecomment-54655129 Feels to me like it would be better to fix this in `Utils.memoryStringToMb`. That way all code using it benefits. As for the behavior of that method, maybe it
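
The suggestion to handle this in one shared parsing helper can be sketched as follows. This is an illustration only: `memory_string_to_mb` is a hypothetical Python stand-in, not the Scala `Utils.memoryStringToMb`, and both the suffix set and the treatment of a bare number as bytes are assumptions. The message's own discussion of the desired behavior is cut off above.

```python
def memory_string_to_mb(text):
    """Convert a memory string such as '512m' or '2g' to whole megabytes.

    Assumptions for this sketch: suffixes k/m/g/t are case-insensitive,
    and a value with no suffix is interpreted as a number of bytes.
    """
    value = text.strip().lower()
    if not value:
        raise ValueError("empty memory string")
    suffix = value[-1]
    if suffix == "k":
        return int(value[:-1]) // 1024
    if suffix == "m":
        return int(value[:-1])
    if suffix == "g":
        return int(value[:-1]) * 1024
    if suffix == "t":
        return int(value[:-1]) * 1024 * 1024
    return int(value) // (1024 * 1024)  # assumed: no suffix means bytes


print(memory_string_to_mb("512m"))     # 512
print(memory_string_to_mb("2g"))       # 2048
print(memory_string_to_mb("1048576"))  # 1
```

Keeping the rule in one helper means every caller gets the same answer for an unlabeled value, whatever rule is finally chosen.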

[GitHub] spark pull request: [SPARK-3375] spark on yarn container allocatio...

2014-09-05 Thread vanzin
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/2275#issuecomment-54655288 Oops. Thanks for fixing it.

[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2227#discussion_r17184945 --- Diff: core/src/test/scala/org/apache/spark/deploy/worker/WorkerArgumentsTest.scala --- @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-3399][PySpark] Test for PySpark should ...

2014-09-05 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/2270#discussion_r17184963 --- Diff: bin/pyspark --- @@ -85,6 +85,8 @@ export PYSPARK_SUBMIT_ARGS # For pyspark tests if [[ -n $SPARK_TESTING ]]; then + unset

[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2227#discussion_r17185052 --- Diff: core/src/test/scala/org/apache/spark/deploy/worker/WorkerArgumentsTest.scala --- @@ -0,0 +1,60 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-3399][PySpark] Test for PySpark should ...

2014-09-05 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/2270#issuecomment-54657139 This patch looks good to me. @JoshRosen could you help to re-visit this?

[GitHub] spark pull request: [SPARK-3030] [PySpark] Reuse Python worker

2014-09-05 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/2259#issuecomment-54657265 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-3399][PySpark] Test for PySpark should ...

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2270#issuecomment-54660424 Looks good to me, too. Thanks for fixing this!

[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...

2014-09-05 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1616#discussion_r17188055 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -313,14 +313,74 @@ private[spark] object Utils extends Logging { }

[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...

2014-09-05 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1616#discussion_r17188080 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -313,14 +313,74 @@ private[spark] object Utils extends Logging { }

[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/1616#discussion_r17188168 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -313,14 +313,74 @@ private[spark] object Utils extends Logging { }

[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...

2014-09-05 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1616#issuecomment-54661638 Do we need to clean up the new cache files we created? Or is that handled automatically somewhere?

[GitHub] spark pull request: SPARK-3337 Paranoid quoting in shell to allow ...

2014-09-05 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2229#issuecomment-54661731 retest this please

[GitHub] spark pull request: [SPARK-3415] [PySpark] removes SerializingAdap...

2014-09-05 Thread wardviaene
Github user wardviaene commented on the pull request: https://github.com/apache/spark/pull/2287#issuecomment-54661969 Hi @JoshRosen I added a test script in this pull request. The sys.stderr in a class triggers the bug.

[GitHub] spark pull request: [SPARK-3415] [PySpark] removes SerializingAdap...

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2287#discussion_r17188526 --- Diff: python/pyspark/tests.py --- @@ -180,6 +180,22 @@ def tearDown(self): self.sc.stop() sys.path = self._old_sys_path

[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

2014-09-05 Thread bbejeck
Github user bbejeck commented on the pull request: https://github.com/apache/spark/pull/2227#issuecomment-54662304 Josh, Thanks for the heads up on testing with environment variables. I will look at the PR and make the required changes to the test.

[GitHub] spark pull request: [SPARK-3415] [PySpark] removes SerializingAdap...

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2287#discussion_r17188600 --- Diff: python/pyspark/tests.py --- @@ -180,6 +180,22 @@ def tearDown(self): self.sc.stop() sys.path = self._old_sys_path

[GitHub] spark pull request: [SPARK-3415] [PySpark] removes SerializingAdap...

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2287#discussion_r17188799 --- Diff: python/pyspark/tests.py --- @@ -180,6 +180,22 @@ def tearDown(self): self.sc.stop() sys.path = self._old_sys_path

[GitHub] spark pull request: SPARK-3178 setting SPARK_WORKER_MEMORY to a va...

2014-09-05 Thread bbejeck
Github user bbejeck commented on the pull request: https://github.com/apache/spark/pull/2227#issuecomment-54662763 Feels to me like it would be better to fix this in Utils.memoryStringToMb. That way all code using it benefits. I thought the same thing, but I was not sure

[GitHub] spark pull request: [SPARK-3415] [PySpark] removes SerializingAdap...

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2287#discussion_r17188918 --- Diff: python/pyspark/tests.py --- @@ -180,6 +180,22 @@ def tearDown(self): self.sc.stop() sys.path = self._old_sys_path

[GitHub] spark pull request: [SPARK-3176] Implement 'ABS and 'LAST' for sql

2014-09-05 Thread xinyunh
Github user xinyunh commented on the pull request: https://github.com/apache/spark/pull/2099#issuecomment-54663384 Sorry, I forgot

[GitHub] spark pull request: TEST ONLY DO NOT MERGE

2014-09-05 Thread shaneknapp
GitHub user shaneknapp opened a pull request: https://github.com/apache/spark/pull/2289 TEST ONLY DO NOT MERGE TEST ONLY DO NOT MERGE You can merge this pull request into a Git repository by running: $ git pull https://github.com/shaneknapp/spark sknapptest Alternatively you

[GitHub] spark pull request: [EC2] don't duplicate default values

2014-09-05 Thread nchammas
GitHub user nchammas opened a pull request: https://github.com/apache/spark/pull/2290 [EC2] don't duplicate default values This PR makes two minor changes to the `spark-ec2` script: 1. The script's input parameter default values are duplicated into the help text. This is

[GitHub] spark pull request: [SPARK-2491]: Fix When an fatal error is throw...

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1482#issuecomment-54664325 This seems reasonable to me. /cc @andrewor14 for another pair of eyes. To recap [some discussion on the

[GitHub] spark pull request: [EC2] don't duplicate default values

2014-09-05 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2290#issuecomment-54664634 Woah, I didn't know optparse had `%default`. Cool!
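
For readers unfamiliar with the `%default` feature mentioned above: in Python's `optparse`, the literal string `%default` inside an option's help text is expanded to that option's default value when the help is rendered, so the value only has to be written once. The following is a minimal sketch of that behavior. The option names and default values are hypothetical and chosen for illustration; they are not taken from the actual `spark-ec2` script or from the PR.

```python
from optparse import OptionParser


def build_parser():
    # Hypothetical options for illustration only; not the real spark-ec2 option list.
    parser = OptionParser(usage="%prog [options]")
    parser.add_option(
        "-s", "--slaves", type="int", default=1,
        # optparse substitutes %default with the value of default= (here: 1)
        help="Number of slaves to launch (default: %default)")
    parser.add_option(
        "-t", "--instance-type", default="m1.large",
        help="Type of instance to launch (default: %default)")
    return parser


parser = build_parser()
help_text = parser.format_help()      # rendered help with %default expanded
opts, args = parser.parse_args([])    # no arguments given, so defaults apply
```

With this pattern, changing a `default=` value updates the help text automatically, which appears to be the kind of duplication the PR description refers to.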
