[GitHub] spark pull request: [SPARK-3086] [SPARK-3043] [SPARK-3156] [mllib]...

2014-09-04 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/2125#issuecomment-54590611 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this featu

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-04 Thread pdeyhim
Github user pdeyhim commented on the pull request: https://github.com/apache/spark/pull/2260#issuecomment-54590541 @rxin ok that's correct for smaller instance types. But FYI, EBS on larger instances (and ebs optimized instances) should perform well on shuffle read/write

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-04 Thread sarutak
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2283#issuecomment-54590502 Ah, sorry, my mistake. I checked the logic of ShutdownHookManager, and a higher value means higher priority.

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2283#issuecomment-54590354 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19800/consoleFull) for PR 2283 at commit [`717aba2`](https://github.com/a

[GitHub] spark pull request: [Docs] fix minor MLlib case typo

2014-09-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2278

[GitHub] spark pull request: [Docs] fix minor MLlib case typo

2014-09-04 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/2278#issuecomment-54590246 Merged into master and branch-1.1. Thanks!

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2260

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-04 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2260#issuecomment-54590209 the ebs volumes are not great for shuffle (bad small write performance). Let's hold that off for now.

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-04 Thread pdeyhim
Github user pdeyhim commented on the pull request: https://github.com/apache/spark/pull/2260#issuecomment-54590139 And what happens when the additional EBS volumes get added? We probably want to configure spark-env.sh and spark_local_dir with the new volumes correct? the place this ha

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-04 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2260#issuecomment-54590041 Ok merging this (and removed io1 for now).

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-04 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2283#issuecomment-54589954 Ah I see. That's fine, I just wasn't sure which the intent was since I think the original description is missing a word.

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-04 Thread sarutak
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2283#issuecomment-54589820 @srowen It's confusing, but a lower value means higher priority.

[GitHub] spark pull request: SPARK-3211 .take() is OOM-prone with empty par...

2014-09-04 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2117#issuecomment-54589602 Jenkins, test this please

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-04 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2283#issuecomment-54589212 This change makes this shutdown hook lower than `FileSystem`'s, whereas it used to be higher. Also does this compile for `yarn-alpha` too? Given the time it went in, it pr

[GitHub] spark pull request: [SPARK-3362][SQL] bug in casewhen resolve

2014-09-04 Thread adrian-wang
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/2245#issuecomment-54589163 All right...

[GitHub] spark pull request: Optimize the schedule procedure in Master

2014-09-04 Thread WangTaoTheTonic
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/1106#issuecomment-54588914 The PR is: https://issues.apache.org/jira/browse/SPARK-3411. Because the filter creates a copy of the worker, I changed the way of filtering. The shuffle wil

[GitHub] spark pull request: [SPARK-3362][SQL] bug in casewhen resolve

2014-09-04 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2245#issuecomment-54588910 Jenkins is down right now ...

[GitHub] spark pull request: [SPARK-3362][SQL] bug in casewhen resolve

2014-09-04 Thread adrian-wang
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/2245#issuecomment-54588876 Seems unlucky to get stuck in a different test suite on every run. @marmbrus @rxin Can you give the patch a retest?

[GitHub] spark pull request: SPARK-3337 Paranoid quoting in shell to allow ...

2014-09-04 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/2229#discussion_r17158127 --- Diff: sbt/sbt-launch-lib.bash --- @@ -180,7 +180,7 @@ run() { ${SBT_OPTS:-$default_sbt_opts} \ $(get_mem_opts $sbt_mem) \ ${j

[GitHub] spark pull request: [SPARK-3408] Fixed Limit operator so it works ...

2014-09-04 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2281#issuecomment-54588066 Yea. I marked 1.1.1 as the target version there.

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-04 Thread pdeyhim
Github user pdeyhim commented on the pull request: https://github.com/apache/spark/pull/2260#issuecomment-54588029 For io1, specifying the number of IOPS is required. So we either have to limit this to gp2 and standard, or fully support io1 by allowing users to specify the number

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2283#issuecomment-54587067 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19800/consoleFull) for PR 2283 at commit [`717aba2`](https://github.com/ap

[GitHub] spark pull request: [SPARK-3410] The priority of shutdownhook for ...

2014-09-04 Thread sarutak
GitHub user sarutak opened a pull request: https://github.com/apache/spark/pull/2283 [SPARK-3410] The priority of shutdownhook for ApplicationMaster should not be integer literal, rather than refer constant. I think, it need to keep the priority of shutdown hook for ApplicationMast

[GitHub] spark pull request: [SPARK-3363][SQL] Type Coercion should support...

2014-09-04 Thread adrian-wang
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/2246#issuecomment-54585966 Hmm... you are right, I misunderstood some code in `compatibleType` there. Thank you!

[GitHub] spark pull request: [SPARK-3408] Fixed Limit operator so it works ...

2014-09-04 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2281#issuecomment-54584908 @rxin this should be backported right?

[GitHub] spark pull request: [SPARK-3181][MLLIB]: Add Robust Regression Alg...

2014-09-04 Thread fjiang6
Github user fjiang6 commented on the pull request: https://github.com/apache/spark/pull/2096#issuecomment-54577616 ERROR: Timeout after 10 minutes FATAL: Failed to fetch from https://github.com/apache/spark.git Can you please retest?

[GitHub] spark pull request: [SPARK-3007][SQL]Add "Dynamic Partition" suppo...

2014-09-04 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2226#issuecomment-54577446 test this please

[GitHub] spark pull request: [Build] Removed -Phive-thriftserver since this...

2014-09-04 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2269#issuecomment-54577457 test this please

[GitHub] spark pull request: [SPARK-3325] Add a parameter to the method pri...

2014-09-04 Thread watermen
Github user watermen commented on the pull request: https://github.com/apache/spark/pull/2216#issuecomment-54577266 @srowen it doesn't need to add an existing method, see my [files changed].

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-04 Thread shivaram
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/2260#issuecomment-54576199 LGTM.

[GitHub] spark pull request: [Build] Removed -Phive-thriftserver since this...

2014-09-04 Thread sarutak
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2269#issuecomment-54575742 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-3007][SQL]Add "Dynamic Partition" suppo...

2014-09-04 Thread baishuo
Github user baishuo commented on the pull request: https://github.com/apache/spark/pull/2226#issuecomment-54575671 can this PR be tested? :)

[GitHub] spark pull request: [SPARK-3280] Made sort-based shuffle the defau...

2014-09-04 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2178#issuecomment-54574633 Ok let's test this again after https://github.com/apache/spark/pull/2281 is merged.

[GitHub] spark pull request: [SPARK-3409][SQL] Avoid pulling in Exchange op...

2014-09-04 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/2282 [SPARK-3409][SQL] Avoid pulling in Exchange operator itself in Exchange's closures. This is a tiny teeny optimization. You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: [SPARK-3007][SQL]Add "Dynamic Partition" suppo...

2014-09-04 Thread baishuo
Github user baishuo commented on the pull request: https://github.com/apache/spark/pull/2226#issuecomment-54574495 Can this PR be tested? The golden files related to HiveCompatibilitySuite already exist in the master branch of Spark, so there is no need to add them.

[GitHub] spark pull request: [SPARK-3408] Fixed Limit operator so it works ...

2014-09-04 Thread rxin
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/2281 [SPARK-3408] Fixed Limit operator so it works with sort-based shuffle. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark sql-limit-sort A

[GitHub] spark pull request: [SPARK-3392] [SQL] Show value spark.sql.shuffl...

2014-09-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2261

[GitHub] spark pull request: [SPARK-3408] Fixed Limit operator so it works ...

2014-09-04 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2281#issuecomment-54574292 @marmbrus

[GitHub] spark pull request: [SPARK-3392] [SQL] Show value spark.sql.shuffl...

2014-09-04 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2261#issuecomment-54574089 Thanks! I've merged this to master.

[GitHub] spark pull request: [SPARK-3361] Expand PEP 8 checks to include EC...

2014-09-04 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2277#issuecomment-54573820 This page is like an altar to a mysterious deity. Come and pray for testing! Perhaps today the bald one will listen. :pray:

[GitHub] spark pull request: [SPARK-3176] Implement 'POWER', 'ABS and 'LAST...

2014-09-04 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2099#issuecomment-54573623 Mind fixing the title to remove Power?

[GitHub] spark pull request: [SPARK-3349] [SQL] Output partitioning of limi...

2014-09-04 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2262#issuecomment-54573544 test this please

[GitHub] spark pull request: [SPARK-2096][SQL] Correctly parse dot notation...

2014-09-04 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2230#issuecomment-54573524 Yeah, I'd like to simplify this, but unfortunately I think this version introduces a regression for hive queries. I've made a PR (against your PR) that shows this regre

[GitHub] spark pull request: [SPARK-3000][CORE] drop old blocks to disk in ...

2014-09-04 Thread liyezhang556520
Github user liyezhang556520 commented on the pull request: https://github.com/apache/spark/pull/2134#issuecomment-54572675 retest this please

[GitHub] spark pull request: [SPARK-2219][SQL] Added support for the "add j...

2014-09-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2242

[GitHub] spark pull request: [SPARK-2219][SQL] Added support for the "add j...

2014-09-04 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2242#issuecomment-54572401 Thanks! I've merged this to master.

[GitHub] spark pull request: [SPARK-3310][SQL] Directly use currentTable wi...

2014-09-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2203

[GitHub] spark pull request: [SPARK-3310][SQL] Directly use currentTable wi...

2014-09-04 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2203#issuecomment-54572309 Thanks! I've merged this to master.

[GitHub] spark pull request: [SPARK-3363][SQL] Type Coercion should support...

2014-09-04 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2246#issuecomment-54572192 I don't think the concern about using `findTightestCommonType` with struct types is valid. If the fields match up exactly we will return the type, otherwise we will fal

[GitHub] spark pull request: Spark-3406 add a default storage level to pyth...

2014-09-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2280#issuecomment-54568013 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19796/consoleFull) for PR 2280 at commit [`e658227`](https://github.com/a

[GitHub] spark pull request: SPARK-2978. Transformation with MR shuffle sem...

2014-09-04 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2274#issuecomment-54567422 Ah, I see. Then we can add it, but in that case I'd also add it in Python.

[GitHub] spark pull request: [SPARK-3094] [PySpark] compatitable with PyPy

2014-09-04 Thread JoshRosen
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2144#issuecomment-54567184 Let's just have the PyPy tests run by default on Jenkins. If this causes build speed problems later down the road, we can revisit the issue of selectively running test

[GitHub] spark pull request: [SPARK-3094] [PySpark] compatitable with PyPy

2014-09-04 Thread mattf
Github user mattf commented on the pull request: https://github.com/apache/spark/pull/2144#issuecomment-54566230 > So you guys should figure out a way to run this so that it doesn't get stale. For example it's fine to add some code to the script that runs all the tests except the MLli

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-04 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2260#issuecomment-54566142 Ok I made the ebs volume type configurable.

[GitHub] spark pull request: [SPARK-3124] Fix the jar version conflict in u...

2014-09-04 Thread chenghao-intel
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/2035#issuecomment-54566049 thank you @witgo. @pwendell I think this is ready to be merged.

[GitHub] spark pull request: SPARK-2978. Transformation with MR shuffle sem...

2014-09-04 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2274#discussion_r17152358 --- Diff: core/src/main/scala/org/apache/spark/rdd/OrderedRDDFunctions.scala --- @@ -64,4 +64,17 @@ class OrderedRDDFunctions[K : Ordering : ClassTag, n

[GitHub] spark pull request: [SPARK-3393] [SQL] add configuration template ...

2014-09-04 Thread chenghao-intel
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/2263#issuecomment-54565805 Thank you @liancheng . I didn't notice that file previously. :) test this please.

[GitHub] spark pull request: SPARK-2978. Transformation with MR shuffle sem...

2014-09-04 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2274#issuecomment-54565772 @mateiz The reason to add this is because this is a smaller API that we can support in the long run before finalizing ShuffledRDD (since that one has been in flux

[GitHub] spark pull request: SPARK-1630: Make PythonRDD handle NULL element...

2014-09-04 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/554#issuecomment-54565701 I've closed this since it was fixed separately. Thanks for sending a patch here.

[GitHub] spark pull request: [SPARK-1825] Fixes cross-platform submit probl...

2014-09-04 Thread zeodtr
Github user zeodtr commented on the pull request: https://github.com/apache/spark/pull/899#issuecomment-54565600 @andrewor14 Updated the title.

[GitHub] spark pull request: Tests meant to demonstrate the bug in SPARK-26...

2014-09-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1588

[GitHub] spark pull request: Fix sbt script

2014-09-04 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/260#issuecomment-54565274 If there's no way to reproduce this by the way, would it be okay to close this PR? I haven't seen other people bring it up.

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-04 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/2260#discussion_r17151598 --- Diff: ec2/spark_ec2.py --- @@ -348,13 +353,16 @@ def launch_cluster(conn, opts, cluster_name): print >> stderr, "Could not find AMI " + opts.am

[GitHub] spark pull request: [SPARK-3094] [PySpark] compatitable with PyPy

2014-09-04 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2144#issuecomment-54564941 So you guys should figure out a way to run this so that it doesn't get stale. For example it's fine to add some code to the script that runs all the tests except the MLlib

[GitHub] spark pull request: Spark-3406 add a default storage level to pyth...

2014-09-04 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2280#issuecomment-54564903 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19796/consoleFull) for PR 2280 at commit [`e658227`](https://github.com/ap

[GitHub] spark pull request: [SPARK-2140] Updating heap memory calculation ...

2014-09-04 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2253#issuecomment-54564805 It's been down this afternoon

[GitHub] spark pull request: [SPARK-2140] Updating heap memory calculation ...

2014-09-04 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2253#issuecomment-54564800 Jenkins, test this please

[GitHub] spark pull request: SPARK-3211 .take() is OOM-prone with empty par...

2014-09-04 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2117#issuecomment-54564675 Sorry for the delay on this -- fix looks good but I'd like it to run through Jenkins, and Jenkins has been down today.

[GitHub] spark pull request: SPARK-3211 .take() is OOM-prone with empty par...

2014-09-04 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2117#issuecomment-54564632 Jenkins, test this please

[GitHub] spark pull request: [SPARK-3353] parent stage should have lower st...

2014-09-04 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2273#issuecomment-54564481 Wow, that's a small change -- kind of unfortunate that we didn't do it before.

[GitHub] spark pull request: Spark-3406 add a default storage level to pyth...

2014-09-04 Thread holdenk
GitHub user holdenk opened a pull request: https://github.com/apache/spark/pull/2280 Spark-3406 add a default storage level to python RDD persist API You can merge this pull request into a Git repository by running: $ git pull https://github.com/holdenk/spark SPARK-3406-Pytho

[GitHub] spark pull request: [SPARK-3012] Standardized Distance Functions b...

2014-09-04 Thread yu-iskw
Github user yu-iskw commented on the pull request: https://github.com/apache/spark/pull/1964#issuecomment-54564146 I'm sorry for the delay in replying. Because I hadn't considered the Python API, I am now rethinking the design for distance. Please give me a few days.

[GitHub] spark pull request: SPARK-2978. Transformation with MR shuffle sem...

2014-09-04 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/2274#issuecomment-54564083 Just a nit, it should probably be called repartitionAndSortWithinPartition*s*. Also, this name is pretty long. Another one I'd reconsider is `repartitionWithSort`

[GitHub] spark pull request: SPARK-2621. Update task InputMetrics increment...

2014-09-04 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2087#issuecomment-54563520 Jenkins, test this please.

[GitHub] spark pull request: SPARK-2621. Update task InputMetrics increment...

2014-09-04 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2087#issuecomment-54563140 Hm, test this please

[GitHub] spark pull request: [SPARK-3353] parent stage should have lower st...

2014-09-04 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/2273#issuecomment-54563115 retest this please?

[GitHub] spark pull request: SPARK-3337 Paranoid quoting in shell to allow ...

2014-09-04 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/2229#discussion_r17150612 --- Diff: sbt/sbt-launch-lib.bash --- @@ -180,7 +180,7 @@ run() { ${SBT_OPTS:-$default_sbt_opts} \ $(get_mem_opts $sbt_mem) \ ${j

[GitHub] spark pull request: [SPARK-3361] Expand PEP 8 checks to include EC...

2014-09-04 Thread shaneknapp
Github user shaneknapp commented on the pull request: https://github.com/apache/spark/pull/2277#issuecomment-54562305 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-3361] Expand PEP 8 checks to include EC...

2014-09-04 Thread shaneknapp
Github user shaneknapp commented on the pull request: https://github.com/apache/spark/pull/2277#issuecomment-54562200 Jenkins, test this please.

[GitHub] spark pull request: add ability to submit multiple jars for Driver

2014-09-04 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1113#issuecomment-54562091 Could you add `[SPARK-2301]` to the title? This will help us organize PRs and link them to JIRAs properly.

[GitHub] spark pull request: [Build] suppress curl/wget progress bars

2014-09-04 Thread nchammas
GitHub user nchammas opened a pull request: https://github.com/apache/spark/pull/2279 [Build] suppress curl/wget progress bars In the Jenkins console output, `curl` gives us mountains of `#` symbols as it tries to show its download progress. ![noise from curl in Jenkins ou

[GitHub] spark pull request: [SPARK-3361] Expand PEP 8 checks to include EC...

2014-09-04 Thread shaneknapp
Github user shaneknapp commented on the pull request: https://github.com/apache/spark/pull/2277#issuecomment-54561366 Jenkins, test this please.

[GitHub] spark pull request: [Docs] fix minor MLlib case typo

2014-09-04 Thread nchammas
GitHub user nchammas opened a pull request: https://github.com/apache/spark/pull/2278 [Docs] fix minor MLlib case typo Also make the list of features consistent in style. You can merge this pull request into a Git repository by running: $ git pull https://github.com/nchammas/sp

[GitHub] spark pull request: [SPARK-3394] [SQL] Fix crash in TakeOrdered wh...

2014-09-04 Thread liancheng
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2264#issuecomment-54557694 retest this please. Jenkins was down a couple hours ago and just came alive.

[GitHub] spark pull request: [SPARK-3353] parent stage should have lower st...

2014-09-04 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2273#issuecomment-54556840 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-3361] Expand PEP 8 checks to include EC...

2014-09-04 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/2277#issuecomment-54556818 Jenkins, test this please.

[GitHub] spark pull request: [SPARK-3361] Expand PEP 8 checks to include EC...

2014-09-04 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2277#issuecomment-54556628 Hmm, the [previous build failed](https://amplab.cs.berkeley.edu/jenkins/view/Pull%20Request%20Builders/job/SparkPullRequestBuilder/19794/console). Jenkinmeister,

[GitHub] spark pull request: [SPARK-3397] Bump pom.xml version number of ma...

2014-09-04 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/2268#issuecomment-54556330 Yeah I think it's reasonable to bump the versions in master now rather than wait for the release.

[GitHub] spark pull request: [SQL] Update SQL Programming Guide

2014-09-04 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/2258#discussion_r17147395 --- Diff: python/pyspark/sql.py --- @@ -287,7 +287,7 @@ class StructType(DataType): """Spark SQL StructType -The data type represen

[GitHub] spark pull request: SPARK-2978. Transformation with MR shuffle sem...

2014-09-04 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/2274#issuecomment-54555319 Updated patch removes Python version, adds Java version, and adds some additional doc.

[GitHub] spark pull request: Augmented updateStateByKey API

2014-09-04 Thread xiliu82
Github user xiliu82 commented on a diff in the pull request: https://github.com/apache/spark/pull/2267#discussion_r17147160 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/PairDStreamFunctions.scala --- @@ -396,6 +396,26 @@ class PairDStreamFunctions[K, V](se

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-04 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2260#discussion_r17146850 --- Diff: ec2/spark_ec2.py --- @@ -348,13 +353,16 @@ def launch_cluster(conn, opts, cluster_name): print >> stderr, "Could not find AMI " + opts.ami

[GitHub] spark pull request: [SPARK-3094] [PySpark] compatitable with PyPy

2014-09-04 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/2144#issuecomment-54554607 PyPy does not fully support NumPy right now, so MLlib can not run with PyPy.

[GitHub] spark pull request: [SPARK-3361] Expand PEP 8 checks to include EC...

2014-09-04 Thread nchammas
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2277#issuecomment-54554048 Jenkinmensch, test this please.

[GitHub] spark pull request: [SPARK-3378] [DOCS] Replace the word "SparkSQL...

2014-09-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2251

[GitHub] spark pull request: [SPARK-3378] [DOCS] Replace the word "SparkSQL...

2014-09-04 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2251#issuecomment-54552369 Thanks! Merged to master. :) BTW, no need to create a JIRA for doc updates.

[GitHub] spark pull request: [SQL] Update SQL Programming Guide

2014-09-04 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/2258#discussion_r17145595 --- Diff: python/pyspark/sql.py --- @@ -287,7 +287,7 @@ class StructType(DataType): """Spark SQL StructType -The data type repres

[GitHub] spark pull request: [SPARK-3395] [SQL] DSL sometimes incorrectly r...

2014-09-04 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/2266#issuecomment-54551830 ok to test

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-04 Thread shivaram
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/2260#discussion_r17145258 --- Diff: ec2/spark_ec2.py --- @@ -348,13 +353,16 @@ def launch_cluster(conn, opts, cluster_name): print >> stderr, "Could not find AMI " + opts.

[GitHub] spark pull request: [SPARK-3391][EC2] Support attaching up to 8 EB...

2014-09-04 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2260#discussion_r17144325 --- Diff: ec2/spark_ec2.py --- @@ -348,13 +353,16 @@ def launch_cluster(conn, opts, cluster_name): print >> stderr, "Could not find AMI " + opts.ami
