GitHub user megatron-me-uk opened a pull request:
https://github.com/apache/spark/pull/7965
[SPARK-6551][PYSPARK]
Implement a test for SPARK-6551.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/megatron-me-uk/spark patch-4
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/7965#issuecomment-128062579
I believe that this will now resolve the issues reported in SPARK-6551 and
test for regression.
---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, please contact infrastructure at infrastructure@apache.org or file a JIRA ticket with INFRA.
Github user megatron-me-uk closed the pull request at:
https://github.com/apache/spark/pull/7965
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/7965#issuecomment-128174983
#7378 has been merged into master and fails half of the tests that I have
added in this pull request. The returned result is correct, however, the
`zeroValue
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/7965#issuecomment-128193230
OK, good point, I guess I will close this pull request then. I think the
best way to resolve this JIRA issue is to backport #7378 or just recommend an
upgrade
Github user megatron-me-uk closed the pull request at:
https://github.com/apache/spark/pull/6262
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/6262#issuecomment-120299302
I have simplified the optional parameter to be a boolean and added this to
the docstring.
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/6262#issuecomment-119549489
I have changed the implementation of the optional parameter to a boolean
`checkCode` and updated the docstring to reflect that.
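The boolean `checkCode` flag discussed in these comments can be illustrated with a standalone sketch. This is not the PySpark implementation; `run_pipe` is a hypothetical helper built on `subprocess` to show the idea of gating exit-status checking behind a flag:

```python
import subprocess


def run_pipe(command, data, check_code=False):
    """Pipe newline-joined items through a shell command.

    Illustrative only: when check_code is True, a non-zero exit
    status raises instead of being silently ignored, mirroring the
    opt-in behaviour described in the pull request comments.
    """
    proc = subprocess.run(
        command,
        input="\n".join(data),
        capture_output=True,
        text=True,
        shell=True,
    )
    if check_code and proc.returncode != 0:
        raise RuntimeError(
            f"Pipe command {command!r} exited with code {proc.returncode}"
        )
    return proc.stdout.splitlines()
```

With the default `check_code=False`, the permissive behaviour is kept; opting in surfaces the failure so a bad pipe command is detected rather than producing silently empty output.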
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/6262#issuecomment-118294969
I guess the benefit of having a third mode is that grep can return 1 for no
results without raising an exception, but if grep encounters an error of some
unknown
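For reference, grep's conventional exit codes make that three-way distinction concrete (0 = match found, 1 = no match, greater than 1 = grep itself failed); a quick check in a shell:

```shell
# 0: at least one line matched
printf 'apple\nbanana\n' | grep -q apple; echo "match: $?"
# 1: no lines matched (not necessarily an error)
printf 'apple\nbanana\n' | grep -q cherry; echo "no match: $?"
# >1: grep itself failed (e.g. an unreadable or missing file)
grep -q apple /nonexistent/file 2>/dev/null; echo "error: $?"
```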
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/6262#issuecomment-117607504
OK, so I have implemented an optional argument 'mode', which by default
('permissive') maintains the current behaviour. I have added two other modes:
'strict
Github user megatron-me-uk commented on a diff in the pull request:
https://github.com/apache/spark/pull/6262#discussion_r33448014
--- Diff: python/pyspark/tests.py ---
@@ -874,6 +874,15 @@ def test_sortByKey_uses_all_partitions_not_only_first_and_last(self
Github user megatron-me-uk commented on a diff in the pull request:
https://github.com/apache/spark/pull/6262#discussion_r33139983
--- Diff: python/pyspark/rdd.py ---
@@ -704,7 +704,16 @@ def pipe_objs(out):
out.write(s.encode('utf-8
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/6262#issuecomment-114423691
```
Recording test results
ERROR: Publisher 'Publish JUnit test result report' failed: No test report files were found. Configuration error?
Finished
```
GitHub user megatron-me-uk opened a pull request:
https://github.com/apache/spark/pull/6942
[SPARK-8541][PySpark] test the absolute error in approx doctests
A minor change, but one which is (presumably) visible on the public API docs
webpage.
You can merge this pull request
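The idea behind testing an absolute error in a doctest can be sketched as follows (illustrative only; `within_abs_error` is not part of the PySpark API):

```python
def within_abs_error(estimate, expected, tol):
    """Return True when the estimate is within an absolute error
    bound of the expected value, so a doctest can assert a stable
    True/False instead of an exact, run-dependent number."""
    return abs(estimate - expected) <= tol
```

A doctest for an approximate method can then print the bounded comparison rather than the raw estimate, which keeps the example rendered on the API docs page deterministic across runs.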
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/6262#issuecomment-114049600
Jenkins, retest this please.
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/6262#issuecomment-113471065
It failed an unrelated test.
Github user megatron-me-uk commented on a diff in the pull request:
https://github.com/apache/spark/pull/6262#discussion_r32792241
--- Diff: python/pyspark/rdd.py ---
@@ -704,7 +704,12 @@ def pipe_objs(out):
out.write(s.encode('utf-8
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/6262#issuecomment-113271342
Woohoo! It took me a while to work out the testing and styles. I have added a
test that checks for an error on a clearly incorrect shell command and checks
Github user megatron-me-uk commented on a diff in the pull request:
https://github.com/apache/spark/pull/6262#discussion_r32785462
--- Diff: python/pyspark/tests.py ---
@@ -874,6 +874,15 @@ def test_sortByKey_uses_all_partitions_not_only_first_and_last(self
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/6262#issuecomment-108969028
The test failed on `org.apache.spark.network.netty.NettyBlockTransferSecuritySuite`
("security mismatch auth off on client"), which I don't think is related
GitHub user megatron-me-uk opened a pull request:
https://github.com/apache/spark/pull/6262
Raise Exception on non-zero exit from pipe commands
This will allow problems with piped commands to be detected.
This will also allow tasks to be retried where errors are rare
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/6262#issuecomment-103441329
A simple test of this:
```python
a = sc.parallelize([1, 2, 3])
b = a.pipe('cc')  # a clearly incorrect pipe command
b.collect()
```
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/6262#issuecomment-103447402
Ah, I hadn't seen that! Will take a look.
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/6262#issuecomment-103462427
I believe that this change will bring PySpark more in line with the
operation of the Scala implementation.
See:
https://github.com/apache/spark/blob
Github user megatron-me-uk commented on the pull request:
https://github.com/apache/spark/pull/6262#issuecomment-103463636
OK, seems I have to create an account etc. I will put it on my to-do list.
Thanks for the help!