[GitHub] spark pull request: [SPARK-6551][PYSPARK]

2015-08-05 Thread megatron-me-uk
GitHub user megatron-me-uk opened a pull request: https://github.com/apache/spark/pull/7965 [SPARK-6551][PYSPARK] Implement a test for SPARK-6551. You can merge this pull request into a Git repository by running: $ git pull https://github.com/megatron-me-uk/spark patch-4

[GitHub] spark pull request: [SPARK-6551][PYSPARK] Incorrect aggregate resu...

2015-08-05 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/7965#issuecomment-128062579 I believe that this will now resolve the issues reported in SPARK-6551 and test for regression. --- If your project is set up for it, you can reply

[GitHub] spark pull request: [SPARK-6551][PYSPARK] Incorrect aggregate resu...

2015-08-05 Thread megatron-me-uk
Github user megatron-me-uk closed the pull request at: https://github.com/apache/spark/pull/7965

[GitHub] spark pull request: [SPARK-6551][PYSPARK] Incorrect aggregate resu...

2015-08-05 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/7965#issuecomment-128174983 #7378 has been merged into master and fails half of the tests that I have added in this pull request. The returned result is correct; however, the `zeroValue

[GitHub] spark pull request: [SPARK-6551][PYSPARK] Incorrect aggregate resu...

2015-08-05 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/7965#issuecomment-128193230 OK, good point. I guess I will close this pull request then. I think the best way to resolve this JIRA issue is to backport #7378 or just recommend an upgrade

[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-07-15 Thread megatron-me-uk
Github user megatron-me-uk closed the pull request at: https://github.com/apache/spark/pull/6262

[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-07-10 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-120299302 I have simplified the optional parameter to be a boolean and added this to the docstring.

[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-07-08 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-119549489 I have changed the implementation of the optional parameter to a boolean `checkCode` and updated the docstring to reflect that.

[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-07-03 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-118294969 I guess the benefit of having a third mode is that grep can return 1 for no results without raising an exception but if grep encounters an error of some unknown
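
The trade-off in this comment can be sketched outside Spark. By convention, grep exits with 0 when it finds a match, 1 when it finds none, and a value greater than 1 on a real error, so a caller may want to tolerate 1 while still failing on anything else. The names `run_with_policy`, `exit_with`, and `tolerated` below are hypothetical and are not taken from the pull request; a child Python process stands in for grep so that the sketch does not depend on any particular shell tool.

```python
import subprocess
import sys


def exit_with(code):
    """Build an argv for a child process that exits with `code` (stand-in for grep)."""
    return [sys.executable, "-c", f"import sys; sys.exit({code})"]


def run_with_policy(argv, tolerated=(0,)):
    """Run argv and return its exit status, raising if the status is not tolerated.

    `tolerated` is a hypothetical knob: (0,) behaves like a strict check,
    while (0, 1) accepts grep's "no match" status but still rejects errors.
    """
    code = subprocess.run(argv).returncode
    if code not in tolerated:
        raise RuntimeError(f"command exited with unexpected status {code}")
    return code
```

With `tolerated=(0, 1)`, a "no match" exit passes through while an exit status of 2 still raises, which is the distinction a plain on/off check cannot express.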

[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-07-01 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-117607504 OK so I have implemented an optional argument 'mode' which by default ('permissive') maintains the current behaviour. I have added two other modes: 'strict

[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-29 Thread megatron-me-uk
Github user megatron-me-uk commented on a diff in the pull request: https://github.com/apache/spark/pull/6262#discussion_r33448014 --- Diff: python/pyspark/tests.py --- @@ -874,6 +874,15 @@ def test_sortByKey_uses_all_partitions_not_only_first_and_last(self

[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-24 Thread megatron-me-uk
Github user megatron-me-uk commented on a diff in the pull request: https://github.com/apache/spark/pull/6262#discussion_r33139983 --- Diff: python/pyspark/rdd.py --- @@ -704,7 +704,16 @@ def pipe_objs(out): out.write(s.encode('utf-8

[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-23 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-114423691 ``` Recording test results ERROR: Publisher 'Publish JUnit test result report' failed: No test report files were found. Configuration error? Finished

[GitHub] spark pull request: [SPARK-8541][PySpark] test the absolute error ...

2015-06-22 Thread megatron-me-uk
GitHub user megatron-me-uk opened a pull request: https://github.com/apache/spark/pull/6942 [SPARK-8541][PySpark] test the absolute error in approx doctests A minor change but one which is (presumably) visible on the public api docs webpage. You can merge this pull request
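
To illustrate why testing the absolute error matters, here is a small sketch. It assumes the doctests compare an approximate result against an exact value using a relative tolerance; the doctest text itself is not shown in this excerpt, and the function names and the 0.05 tolerance are illustrative only.

```python
def within_signed(approx, exact, tol=0.05):
    """Signed relative error: any underestimate passes, however large."""
    return (approx - exact) / exact < tol


def within_abs(approx, exact, tol=0.05):
    """Absolute relative error: fails for errors in either direction."""
    return abs(approx - exact) / exact < tol


exact = float(sum(range(1000)))  # 499500.0
bad_estimate = 0.0               # wildly wrong, but below the exact value

print(within_signed(bad_estimate, exact))  # the signed check accepts it
print(within_abs(bad_estimate, exact))     # the absolute check rejects it
```

A signed comparison only bounds overestimates, so a test written that way cannot catch an approximation that comes back far too small.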

[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-22 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-114049600 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-19 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-113471065 It failed an unrelated test.

[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-18 Thread megatron-me-uk
Github user megatron-me-uk commented on a diff in the pull request: https://github.com/apache/spark/pull/6262#discussion_r32792241 --- Diff: python/pyspark/rdd.py --- @@ -704,7 +704,12 @@ def pipe_objs(out): out.write(s.encode('utf-8

[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-18 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-113271342 Woohoo! Took me a while to work out the testing and styles. I have added a test that checks for an error on a clearly incorrect shell command and checks

[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-18 Thread megatron-me-uk
Github user megatron-me-uk commented on a diff in the pull request: https://github.com/apache/spark/pull/6262#discussion_r32785462 --- Diff: python/pyspark/tests.py --- @@ -874,6 +874,15 @@ def test_sortByKey_uses_all_partitions_not_only_first_and_last(self

[GitHub] spark pull request: [SPARK-7735] [pyspark] Raise Exception on non-...

2015-06-04 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-108969028 The test failed on org.apache.spark.network.netty.NettyBlockTransferSecuritySuite.security mismatch auth off on client which I don't think is related

[GitHub] spark pull request: Raise Exception on non-zero exit from pipe com...

2015-05-19 Thread megatron-me-uk
GitHub user megatron-me-uk opened a pull request: https://github.com/apache/spark/pull/6262 Raise Exception on non-zero exit from pipe commands This will allow problems with piped commands to be detected. This will also allow tasks to be retried where errors are rare
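
The behaviour proposed here can be sketched without Spark. The following is a minimal illustration, not the PySpark implementation: `pipe_lines`, its `check_code` flag, and the two sample commands are hypothetical names chosen for this example. It feeds lines to a child process and, when asked, turns a non-zero exit status into an exception, so that a caller (or a scheduler that retries failed tasks) can notice the failure instead of silently receiving empty output.

```python
import subprocess
import sys


def pipe_lines(argv, lines, check_code=False):
    """Send `lines` to the stdin of `argv` and return its stdout as a list of lines.

    When check_code is True, a non-zero exit status raises RuntimeError;
    otherwise the status is ignored and whatever was printed is returned.
    """
    proc = subprocess.run(
        argv,
        input="".join(line + "\n" for line in lines),
        capture_output=True,
        text=True,
    )
    if check_code and proc.returncode != 0:
        raise RuntimeError(
            f"pipe command {argv!r} exited with status {proc.returncode}"
        )
    return proc.stdout.splitlines()


# Child processes used for illustration only.
UPPERCASE = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
FAILING = [sys.executable, "-c", "import sys; sys.exit(3)"]
```

Without the check, `pipe_lines(FAILING, ["a"])` returns an empty list and the failure goes unnoticed; with `check_code=True` the same call raises.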

[GitHub] spark pull request: Raise Exception on non-zero exit from pipe com...

2015-05-19 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-103441329 A simple test of this:

```python
a = sc.parallelize([1, 2, 3])
b = a.pipe('cc')  # a clearly incorrect pipe command
b.collect()
```

[GitHub] spark pull request: Raise Exception on non-zero exit from pipe com...

2015-05-19 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-103447402 Ah, I hadn't seen that! Will take a look.

[GitHub] spark pull request: Raise Exception on non-zero exit from pipe com...

2015-05-19 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-103462427 I believe that this change will bring pyspark more in line with the operation of the scala implementation. See: https://github.com/apache/spark/blob

[GitHub] spark pull request: Raise Exception on non-zero exit from pipe com...

2015-05-19 Thread megatron-me-uk
Github user megatron-me-uk commented on the pull request: https://github.com/apache/spark/pull/6262#issuecomment-103463636 OK, seems I have to create an account etc. I will put it on my to-do list. Thanks for the help!