[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-08-11 Thread belevtsoff
Github user belevtsoff commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-129903372
  
Not sure if this is the right place to post this, but the documentation on 
the official website appears to be outdated. For example, for Spark 1.4.0 and 
1.4.1, 
[this](http://spark.apache.org/docs/1.4.1/programming-guide.html#linking-with-spark)
 paragraph (Python tab) seems particularly misleading. Also, the last line of 
[this](http://spark.apache.org/docs/1.4.1/#downloading) paragraph doesn't 
mention Python 3 support. There may be other places as well.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-08-11 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-129943743
  
This is also a duplicate of 
https://issues.apache.org/jira/browse/SPARK-9705, which I'm going to merge into 
the new issue.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-08-11 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-129938050
  
@belevtsoff Thanks for reporting this. I created 
https://issues.apache.org/jira/browse/SPARK-9822 to track it.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-07-10 Thread delallea
Github user delallea commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-120427873
  
I created a Jira here (I'm running into the same issue): 
https://issues.apache.org/jira/browse/SPARK-8976
It's my first time creating a Jira for Apache products, so someone more 
familiar with the process should probably review it.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-07-09 Thread rilut
Github user rilut commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-119837999
  
Sorry, I'm in a remote location for months. Maybe you or anyone else could
help us create a new issue if it is still unresolved.






[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-07-08 Thread latkin
Github user latkin commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-119745079
  
Is there a link to JIRA yet? I'm hitting the invalid mode issue with 
3.4.3, as well.  
[Searching](https://issues.apache.org/jira/browse/SPARK-7909?jql=project%20%3D%20SPARK%20AND%20text%20~%203.4.3)
 3.4.3 yields only [this 
guy](https://issues.apache.org/jira/browse/SPARK-7909), which is related but 
not quite the same.

FWIW 3.4.3 is now the default Python version that Visual Studio 2015 
suggests you install.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-06-19 Thread rilut
Github user rilut commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-113410652
  
I'm on Python 3.4.3 (Anaconda 2.2.0 64-bit, Windows 10 x64) and 
experiencing a problem.

    >>> sc.parallelize([1, 2]).count()

    File "C:\Anaconda3\lib\socket.py", line 205, in makefile
        raise ValueError("invalid mode %r (only r, w, b allowed)" % (mode,))
    ValueError: invalid mode 'a+' (only r, w, b allowed)
    15/06/19 14:10:54 WARN PythonRDD: Incomplete task interrupted: Attempting to kill Python Worker


I think the cause is this line: 
https://github.com/apache/spark/blob/master/python/pyspark/worker.py#L149
I'm not sure whether the `a+` mode exists in Python 3.

    >>> import socket
    >>> sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    >>> sock.connect(("127.0.0.1", 4040))
    >>> sock_file = sock.makefile("a+", 65536)  ## failed
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Anaconda3\lib\socket.py", line 205, in makefile
        raise ValueError("invalid mode %r (only r, w, b allowed)" % (mode,))
    ValueError: invalid mode 'a+' (only r, w, b allowed)
    >>> sock_file = sock.makefile("r", 65536)  ## r is okay
    >>> sock_file = sock.makefile("x", 65536)  ## x obviously doesn't exist; I made it up
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Anaconda3\lib\socket.py", line 205, in makefile
        raise ValueError("invalid mode %r (only r, w, b allowed)" % (mode,))
    ValueError: invalid mode 'x' (only r, w, b allowed)
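
For anyone who wants to check this without a running Spark instance, here is a minimal sketch (not taken from the Spark code base) that probes which mode strings `socket.makefile` accepts on Python 3. It assumes `socket.socketpair()` is an acceptable stand-in for the connection to port 4040, and the helper name `probe_makefile_mode` is made up for illustration. Note that `socket.socketpair()` is only available on Windows from Python 3.5, so on older Windows builds a real TCP connection would be needed instead.

```python
import socket


def probe_makefile_mode(mode, bufsize=65536):
    """Return None if makefile() accepts `mode`, else the ValueError text.

    probe_makefile_mode is a hypothetical helper, not part of PySpark.
    """
    # A connected socket pair stands in for the worker's TCP connection.
    left, right = socket.socketpair()
    try:
        left.makefile(mode, bufsize).close()
        return None
    except ValueError as exc:
        return str(exc)
    finally:
        left.close()
        right.close()


# Try the failing mode, the made-up mode, and a few modes built from r, w, b.
results = {mode: probe_makefile_mode(mode) for mode in ("a+", "x", "r", "rb", "rwb")}
for mode, error in results.items():
    print(mode, "->", error or "accepted")
```

On Python 3, `a+` and `x` are rejected while `r`, `rb`, and `rwb` are accepted, so a read/write binary mode such as `rwb` is one candidate replacement for `a+` at the `worker.py` line linked above. Whether that is the right fix for Spark is a question for the JIRA discussion.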





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-06-19 Thread twneale
Github user twneale commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-113541699
  
Looks like this originates in the socket module:
https://github.com/python/cpython/blob/master/Lib/socket.py
The `.makefile` method only allows the r, w, and b modes, probably in both
2.x and 3.x. I wonder if that particular socket is usually a UNIX socket but
falls back to TCP on Windows. Rilut, does your Spark installation work with
Python 2.7?








[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-06-19 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-113586048
  
@rilut Can we move this discussion to JIRA so that it's easier to track?  
File a ticket at https://issues.apache.org/jira/browse/SPARK and link it here.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-06-19 Thread rilut
Github user rilut commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-113586217
  
@twneale @JoshRosen OK, I'll repost it to JIRA.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-05-05 Thread evertlammerts
Github user evertlammerts commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-98980872
  
My bad! I forgot to roll out the new module to all nodes after recompiling. 
After doing that I'm back in business.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-05-03 Thread evertlammerts
Github user evertlammerts commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-98541048
  
Now that this PR is merged, I'm having some trouble getting PySpark to work:

```
>>> sc.parallelize([1, 2]).count()
...
AttributeError: 'module' object has no attribute '_builtin_type'
...
```

I'm on Python 2.7.9 (Anaconda 2.2.0) on RHEL 6 with Spark master @ 
f4af92550cb90e47a12d4625fa615dd2b1587d42

I see some tests are skipped; maybe `count` is among them? `distinct` 
seems to have the same problem, as far as I can see now. Does anybody else see 
this?





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-05-03 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-98541978
  
@evertlammerts I don't see that issue locally; I'm using Python 2.7.9 on 
OS X.  Can you file an issue on the [Spark 
JIRA](http://issues.apache.org/jira/browse/SPARK) and mark it as a 1.4.0 
blocker until we've debugged it?





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-97563467
  
  [Test build #28 has 
started](https://hadrian.millennium.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/28/consoleFull)
 for   PR 5173 at commit 
[`59bb492`](https://github.com/apache/spark/commit/59bb49260f62fda2af0d48e35447d1a7dcd0a479).
 * This patch **does not merge cleanly**.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-97600364
  
**[Test build #28 timed 
out](https://hadrian.millennium.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/28/consoleFull)**
 for PR 5173 at commit 
[`59bb492`](https://github.com/apache/spark/commit/59bb49260f62fda2af0d48e35447d1a7dcd0a479)
 after a configured wait of `120m`.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-29 Thread shaneknapp
Github user shaneknapp commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-97603182
  
ignore these comments -- this is me testing on our staging instance





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-17 Thread ogrisel
Github user ogrisel commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-94064445
  
Thank you very much; porting PySpark to Python 3 is much appreciated.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93657492
  
  [Test build #30401 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30401/consoleFull)
 for   PR 5173 at commit 
[`cafd5ec`](https://github.com/apache/spark/commit/cafd5ec1403f47681950233361815f468435d05f).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93660706
  
  [Test build #30404 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30404/consoleFull)
 for   PR 5173 at commit 
[`b716610`](https://github.com/apache/spark/commit/b716610900f54b4f88c5953ab1ad1a27caa3386d).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93659307
  
  [Test build #30401 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30401/consoleFull)
 for   PR 5173 at commit 
[`cafd5ec`](https://github.com/apache/spark/commit/cafd5ec1403f47681950233361815f468435d05f).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93659321
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30401/
Test FAILed.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93673473
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30404/
Test FAILed.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93673463
  
  [Test build #30404 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30404/consoleFull)
 for   PR 5173 at commit 
[`b716610`](https://github.com/apache/spark/commit/b716610900f54b4f88c5953ab1ad1a27caa3386d).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93779040
  
  [Test build #30422 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30422/consoleFull)
 for   PR 5173 at commit 
[`99e334f`](https://github.com/apache/spark/commit/99e334f987c561421b10fe2a6942144d1585ecb1).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93802559
  
  [Test build #30429 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30429/consoleFull)
 for   PR 5173 at commit 
[`6c52a98`](https://github.com/apache/spark/commit/6c52a98dee887e21e115ba1194b0e617ff9f27a8).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93795958
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30422/
Test FAILed.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93795907
  
  [Test build #30422 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30422/consoleFull)
 for   PR 5173 at commit 
[`99e334f`](https://github.com/apache/spark/commit/99e334f987c561421b10fe2a6942144d1585ecb1).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93819774
  
  [Test build #30429 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30429/consoleFull)
 for   PR 5173 at commit 
[`6c52a98`](https://github.com/apache/spark/commit/6c52a98dee887e21e115ba1194b0e617ff9f27a8).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93827547
  
  [Test build #30433 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30433/consoleFull)
 for   PR 5173 at commit 
[`d7d6323`](https://github.com/apache/spark/commit/d7d63237036bec439700085ed31b6225199b38c1).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93846635
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30433/
Test PASSed.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93846620
  
  [Test build #30433 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30433/consoleFull)
 for   PR 5173 at commit 
[`d7d6323`](https://github.com/apache/spark/commit/d7d63237036bec439700085ed31b6225199b38c1).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93816252
  
MLlib changes look good to me.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93819786
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30429/
Test FAILed.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread shaneknapp
Github user shaneknapp commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93862558
  
woot!

On Thu, Apr 16, 2015 at 4:22 PM, Josh Rosen notificati...@github.com
wrote:

 I've merged this into master (1.4.0). Thanks @davies
 https://github.com/davies, @twneale https://github.com/twneale,
 @mengxr https://github.com/mengxr, and everyone else who helped to test
 this patch!

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/spark/pull/5173#issuecomment-93861091.







[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93861091
  
I've merged this into `master` (1.4.0).  Thanks @davies, @twneale, @mengxr, 
and everyone else who helped to test this patch!





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93863171
  
Oh, and a **huge** thanks to @shaneknapp for helping us configure Jenkins 
for Python 3 and PyPy, which was no easy task!





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/5173





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-16 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93806902
  
@JoshRosen Once it passes the tests, I think it's ready to go.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread shaananc
Github user shaananc commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93394590
  
Good idea. I was on YARN 2.4, testing it with just two nodes.

I just tried running it locally rather than on the cluster and it worked 
fine.
If you want more details or the docker images I'm using let me know.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread rgbkrk
Github user rgbkrk commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r28436264
  
--- Diff: python/pyspark/cloudpickle.py ---
@@ -40,164 +40,126 @@
 NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
-
+from __future__ import print_function
 
 import operator
 import os
+import io
 import pickle
 import struct
 import sys
 import types
 from functools import partial
 import itertools
-from copy_reg import _extension_registry, _inverted_registry, 
_extension_cache
-import new
 import dis
 import traceback
-import platform
-
-PyImp = platform.python_implementation()
-
 
-import logging
-cloudLog = logging.getLogger(Cloud.Transport)
--- End diff --

Would you be amenable to moving cloudpickle to a separate repository? We'd 
like to be able to rely on it in IPython parallel as well as other projects. In 
the past couple days, folks at the PyCon sprints have been adding tests for the 
[current codebase](https://github.com/cloudpipe/cloudpickle).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r28444539
  
--- Diff: python/pyspark/streaming/dstream.py ---
@@ -579,9 +580,9 @@ def reduceFunc(t, a, b):
 g = b.groupByKey(numPartitions).mapValues(lambda vs: 
(list(vs), None))
 else:
 g = a.cogroup(b.partitionBy(numPartitions), numPartitions)
-g = g.mapValues(lambda (va, vb): (list(vb), list(va)[0] if 
len(va) else None))
--- End diff --

done
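
For readers following the diff above: the removed line uses tuple unpacking in a lambda's parameter list, which Python 3 no longer accepts (PEP 3113). The replacement line is not shown in this excerpt, so the following is only a sketch of one Python 3-compatible form; `pair_values` is a hypothetical helper name, not code from the patch.

```python
# Python 2 allowed `lambda (va, vb): ...`; Python 3 removed parameter
# unpacking (PEP 3113), so the pair is unpacked inside the function body.

def pair_values(pair):
    """Mirror the removed lambda: (list(vb), first element of va or None)."""
    va, vb = pair
    va = list(va)
    return (list(vb), va[0] if len(va) else None)

result = pair_values(([1, 2], [3, 4]))
empty = pair_values(([], [5]))
```

The same function could be passed to `mapValues` in place of the old lambda, since it takes the value pair as a single argument.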





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r28444580
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -116,7 +114,7 @@ def __init__(self, func, returnType):
 
 def _create_judf(self):
 f = self.func  # put it in closure `func`
-func = lambda _, it: imap(lambda x: f(*x), it)
+func = lambda _, it: map(lambda x: f(*x), it)
--- End diff --

Thanks!
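
As background for the `imap` to `map` change in the diff above: on Python 3 the built-in `map` returns a lazy iterator, much like `itertools.imap` did on Python 2, and `itertools.imap` no longer exists. A small sketch under that assumption, where `apply_rows` is a hypothetical name used only for illustration:

```python
def apply_rows(f, rows):
    # On Python 3, map() is lazy: nothing is computed until iteration.
    return map(lambda x: f(*x), rows)

rows = [(1, 2), (3, 4)]
out = apply_rows(lambda a, b: a + b, rows)

# The result is an iterator rather than a list on Python 3.
is_lazy = not isinstance(out, list)
values = list(out)
```

On Python 2 the same `map` call would build a full list eagerly, so how the patch keeps the lazy behavior there is not visible in this excerpt.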





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r28445096
  
--- Diff: python/pyspark/cloudpickle.py ---
@@ -40,164 +40,126 @@
 NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
-
+from __future__ import print_function
 
 import operator
 import os
+import io
 import pickle
 import struct
 import sys
 import types
 from functools import partial
 import itertools
-from copy_reg import _extension_registry, _inverted_registry, 
_extension_cache
-import new
 import dis
 import traceback
-import platform
-
-PyImp = platform.python_implementation()
-
 
-import logging
-cloudLog = logging.getLogger(Cloud.Transport)
--- End diff --

Right now, vendoring is easier for us because it minimizes dependencies. We'd 
like to contribute these changes back upstream later.

@rgbkrk Have you tried Dill ?  https://github.com/uqfoundation/dill





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread shaananc
Github user shaananc commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93519593
  
@davies  - Thanks so much and good catch!





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93528080
  
  [Test build #30365 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30365/consoleFull)
 for   PR 5173 at commit 
[`8c8b957`](https://github.com/apache/spark/commit/8c8b957fdeb721dc772584ba6135810163ef488c).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93506874
  
Jenkins, retest this please.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93510822
  
@shaananc It seems that you are using Python 3 in the driver but Python 2 in 
YARN. PySpark cannot work with different minor versions in the driver and the workers.

So you could specify the Python version by:
```
PYSPARK_PYTHON=python2 bin/spark-submit xxx 
```
Or change the default version to python3 in YARN.
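
To illustrate the constraint described above, here is a minimal sketch of comparing interpreter versions by their (major, minor) pair. This is not PySpark's actual check; `same_minor_version` is a hypothetical helper, and the worker version is a made-up value standing in for whatever the cluster nodes report.

```python
import sys

def same_minor_version(driver, worker):
    """Return True when both version tuples share (major, minor)."""
    return tuple(driver[:2]) == tuple(worker[:2])

# Version of the interpreter running this script (the "driver" side).
driver_version = tuple(sys.version_info[:3])

# Hypothetical worker version, e.g. a YARN node defaulting to Python 2.7.
worker_version = (2, 7, 6)

compatible = same_minor_version(driver_version, worker_version)
```

If `compatible` is false, the two sides need to be aligned, for example by setting `PYSPARK_PYTHON` as shown above.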





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93516544
  
  [Test build #30363 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30363/consoleFull)
 for   PR 5173 at commit 
[`5c57c95`](https://github.com/apache/spark/commit/5c57c95a0e8b8ca11f88a60d6e48ef0e4caa3a16).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93524365
  
  [Test build #30364 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30364/consoleFull)
 for   PR 5173 at commit 
[`8f8e710`](https://github.com/apache/spark/commit/8f8e7100937f2fd5ce5252c11e42cb2230d21581).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93507501
  
  [Test build #30362 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30362/consoleFull)
 for   PR 5173 at commit 
[`4006829`](https://github.com/apache/spark/commit/400682982bbb6277d4b2c6dca2c3b88d491e5b21).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread rgbkrk
Github user rgbkrk commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r28447868
  
--- Diff: python/pyspark/cloudpickle.py ---
@@ -40,164 +40,126 @@
 NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
-
+from __future__ import print_function
 
 import operator
 import os
+import io
 import pickle
 import struct
 import sys
 import types
 from functools import partial
 import itertools
-from copy_reg import _extension_registry, _inverted_registry, 
_extension_cache
-import new
 import dis
 import traceback
-import platform
-
-PyImp = platform.python_implementation()
-
 
-import logging
-cloudLog = logging.getLogger(Cloud.Transport)
--- End diff --

Yes, others have noted that dill didn't have the same opinionated base for 
pickling functions (especially functions within main). /cc @ogrisel 
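
For context on why function pickling matters here, a small sketch using only the standard library: the stock `pickle` module serializes functions by reference (module plus name), so a lambda, which has no importable name, cannot be pickled. This is roughly the gap that cloudpickle is meant to fill; the snippet does not use cloudpickle or dill and makes no claim about how either behaves.

```python
import pickle

# A built-in function pickles by reference: only its module and name are stored.
by_reference = pickle.dumps(len)
restored = pickle.loads(by_reference)

# A lambda has no importable name, so the stock pickler rejects it.
try:
    pickle.dumps(lambda x: x + 1)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False
```

Functions defined interactively or in `__main__` hit a related problem: the reference may pickle on the driver but not resolve on a worker that never imported that code.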





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread JoshRosen
Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r28459908
  
--- Diff: python/pyspark/cloudpickle.py ---
@@ -40,164 +40,126 @@
 NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
-
+from __future__ import print_function
 
 import operator
 import os
+import io
 import pickle
 import struct
 import sys
 import types
 from functools import partial
 import itertools
-from copy_reg import _extension_registry, _inverted_registry, 
_extension_cache
-import new
 import dis
 import traceback
-import platform
-
-PyImp = platform.python_implementation()
-
 
-import logging
-cloudLog = logging.getLogger(Cloud.Transport)
--- End diff --

I'd be interested in chatting more about cloudpickle, but we should probably 
move this discussion to a mailing list since it's hard to link to GitHub line 
comments.  Mind sending an email to the [dev 
list](https://spark.apache.org/community.html) and CC'ing me?  My email is 
`joshro...@databricks.com`.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread ogrisel
Github user ogrisel commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r28461471
  
--- Diff: python/pyspark/cloudpickle.py ---
@@ -40,164 +40,126 @@
 NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
-
+from __future__ import print_function
 
 import operator
 import os
+import io
 import pickle
 import struct
 import sys
 import types
 from functools import partial
 import itertools
-from copy_reg import _extension_registry, _inverted_registry, 
_extension_cache
-import new
 import dis
 import traceback
-import platform
-
-PyImp = platform.python_implementation()
-
 
-import logging
-cloudLog = logging.getLogger(Cloud.Transport)
--- End diff --

Maybe you can join us on gitter.im:

https://gitter.im/cloudpipe/cloudpickle

@rgbkrk @sdegryze and I can also join the spark-dev mailing list if you 
really want us to.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread rgbkrk
Github user rgbkrk commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r28451434
  
--- Diff: python/pyspark/cloudpickle.py ---
@@ -40,164 +40,126 @@
 NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
-
+from __future__ import print_function
 
 import operator
 import os
+import io
 import pickle
 import struct
 import sys
 import types
 from functools import partial
 import itertools
-from copy_reg import _extension_registry, _inverted_registry, 
_extension_cache
-import new
 import dis
 import traceback
-import platform
-
-PyImp = platform.python_implementation()
-
 
-import logging
-cloudLog = logging.getLogger(Cloud.Transport)
--- End diff --

It's no longer vendoring when changes are happening in your own code base 
of cloudpickle. This ends up being even worse for projects hoping to use it 
when pyspark itself isn't pip installable either.

What's the best path forward for us to help maintain cloudpickle in a way 
that is friendly to you vendoring it?





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93536322
  
  [Test build #30362 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30362/consoleFull)
 for   PR 5173 at commit 
[`4006829`](https://github.com/apache/spark/commit/400682982bbb6277d4b2c6dca2c3b88d491e5b21).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93536338
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30362/
Test FAILed.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93539962
  
  [Test build #678 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/678/consoleFull)
 for   PR 5173 at commit 
[`8c8b957`](https://github.com/apache/spark/commit/8c8b957fdeb721dc772584ba6135810163ef488c).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93542611
  
  [Test build #30364 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30364/consoleFull)
 for   PR 5173 at commit 
[`8f8e710`](https://github.com/apache/spark/commit/8f8e7100937f2fd5ce5252c11e42cb2230d21581).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93542647
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30364/
Test FAILed.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93549729
  
  [Test build #30363 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30363/consoleFull)
 for   PR 5173 at commit 
[`5c57c95`](https://github.com/apache/spark/commit/5c57c95a0e8b8ca11f88a60d6e48ef0e4caa3a16).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93549748
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30363/
Test PASSed.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93551704
  
  [Test build #30365 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30365/consoleFull)
 for   PR 5173 at commit 
[`8c8b957`](https://github.com/apache/spark/commit/8c8b957fdeb721dc772584ba6135810163ef488c).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93551719
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30365/
Test FAILed.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93593451
  
**[Test build #30373 timed 
out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30373/consoleFull)**
 for PR 5173 at commit 
[`179fc8d`](https://github.com/apache/spark/commit/179fc8d7b426cbd2e00640ac3b46b26475cfa73a)
 after a configured wait of `120m`.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93593461
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30373/
Test FAILed.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93575149
  
  [Test build #680 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/680/consoleFull)
 for   PR 5173 at commit 
[`179fc8d`](https://github.com/apache/spark/commit/179fc8d7b426cbd2e00640ac3b46b26475cfa73a).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93595119
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30376/
Test PASSed.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93595106
  
  [Test build #30376 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30376/consoleFull)
 for   PR 5173 at commit 
[`bf225d7`](https://github.com/apache/spark/commit/bf225d7ddd43cf297eeabbfe6e888b0655b6b1a5).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93575338
  
  [Test build #30376 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30376/consoleFull)
 for   PR 5173 at commit 
[`bf225d7`](https://github.com/apache/spark/commit/bf225d7ddd43cf297eeabbfe6e888b0655b6b1a5).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93575307
  
  [Test build #678 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/678/consoleFull)
 for   PR 5173 at commit 
[`8c8b957`](https://github.com/apache/spark/commit/8c8b957fdeb721dc772584ba6135810163ef488c).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r28483754
  
--- Diff: python/pyspark/rdd.py ---
@@ -123,6 +129,13 @@ def _load_from_socket(port, serializer):
 sock.close()
 
 
+def ignore_unicode_prefix(f):
--- End diff --

Please add a docstring for this function. It is not clear what it does from 
the function name alone.
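
A docstring along these lines would address the request. The sketch below is an assumption about the helper's intent, inferred from its name and from the doctest changes in this PR: on Python 3 it strips the `u` prefix from string literals in a function's docstring, so that one set of doctests can pass on both major versions. The regular expression and the `first_word` example are illustrative, not necessarily what the patch uses.

```python
import re
import sys


def ignore_unicode_prefix(f):
    """
    Strip the ``u`` prefix from string literals in the doctests of ``f``.

    Python 3 prints unicode strings without the prefix, so expected
    doctest output written for Python 2 would otherwise not match.
    """
    if sys.version_info[0] >= 3 and f.__doc__:
        # Match a u/U that directly precedes a quote and is not part of a word.
        literal_re = re.compile(r"(\W|^)[uU](['\"])")
        f.__doc__ = literal_re.sub(r"\1\2", f.__doc__)
    return f


@ignore_unicode_prefix
def first_word(text):
    """
    >>> first_word(u'hello world')
    u'hello'
    """
    return text.split()[0]
```

On Python 2 the decorator leaves the docstring untouched, so the same source works on both versions.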





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r28483721
  
--- Diff: python/pyspark/mllib/util.py ---
@@ -40,7 +40,7 @@ def _parse_libsvm_line(line, multiclass=None):
 nnz = len(items) - 1
 indices = np.zeros(nnz, dtype=np.int32)
 values = np.zeros(nnz)
-for i in xrange(nnz):
+for i in range(nnz):
--- End diff --

Can we use the following instead? Using `range` would hurt performance in 
Python 2 here.

~~~
if sys.version >= '3':
    xrange = range
~~~
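
As a self-contained illustration of the suggestion above: aliasing `xrange` to `range` on Python 3 lets loop code keep the lazy iterator on Python 2 without branching at every call site. This sketch checks `sys.version_info` instead of comparing version strings, which is a choice made here and not part of the review comment; `squares` is a hypothetical function used only to exercise the alias.

```python
import sys

if sys.version_info[0] >= 3:
    # Python 3 has no xrange; its range is already lazy.
    xrange = range


def squares(n):
    """Return the squares of 0..n-1, iterating lazily on both major versions."""
    return [i * i for i in xrange(n)]
```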





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r28483636
  
--- Diff: python/pyspark/mllib/recommendation.py ---
@@ -61,7 +61,7 @@ class MatrixFactorizationModel(JavaModelWrapper, 
JavaSaveable, JavaLoader):
 
 >>> model = ALS.train(ratings, 4, seed=10)
 >>> model.userFeatures().collect()
-[(1, array('d', [...])), (2, array('d', [...]))]
+[(1, DenseVector([...])), (2, DenseVector([...]))]
--- End diff --

We shouldn't change the return type. I'm not sure `DenseVector` is a safe 
replacement for array.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r28483597
  
--- Diff: examples/src/main/python/mllib/gradient_boosted_trees.py ---
@@ -49,8 +50,8 @@ def testRegression(trainingData, testData):
 # Evaluate model on test instances and compute test error
 predictions = model.predict(testData.map(lambda x: x.features))
 labelsAndPredictions = testData.map(lambda lp: lp.label).zip(predictions)
-testMSE = labelsAndPredictions.map(lambda (v, p): (v - p) * (v - p)).sum() \
-/ float(testData.count())
+testMSE = labelsAndPredictions.map(lambda v_p1: (v_p1[0] - v_p1[1]) * (v_p1[0] - v_p1[1]))\
--- End diff --

Why use `v_p1` instead of `vp` or `v_p`?
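
For background on the rewritten lambda: Python 3 removed tuple parameters in function signatures, so `lambda (v, p): ...` is a syntax error there and the pair has to be taken as one argument. A minimal sketch using plain lists in place of an RDD; the `pairs` data and the `squared_error` helper are made up for illustration.

```python
# (label, prediction) pairs standing in for the zipped RDD.
pairs = [(3.0, 2.5), (1.0, 1.5), (4.0, 4.0)]

# Python 2 only: lambda (v, p): (v - p) * (v - p)
# Portable form: take the pair as a single argument and index into it.
squared_errors = list(map(lambda v_p: (v_p[0] - v_p[1]) ** 2, pairs))


def squared_error(v_p):
    """Named alternative that unpacks inside the body for readability."""
    v, p = v_p
    return (v - p) * (v - p)


mse = sum(squared_error(v_p) for v_p in pairs) / float(len(pairs))
```

The named function costs a few more lines but avoids the repeated indexing that makes the automatically converted lambda hard to read.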





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r28483601
  
--- Diff: python/pyspark/mllib/feature.py ---
@@ -216,7 +221,10 @@ def __init__(self, numFeatures=1 << 20):
 
 def indexOf(self, term):
 """ Returns the index of the input term. """
-return hash(term) % self.numFeatures
+# hash of string is not portable in Python 3
+if isinstance(term, unicode):
+term = term.encode('utf-8')
+return (binascii.crc32(term) & 0x7FFF) % self.numFeatures
--- End diff --

Any performance overhead with the new approach?
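
One way to answer this empirically is a small benchmark. The sketch below is not the patch's code: `index_of` is a hypothetical stand-alone version of the CRC32 approach, it masks with `0xFFFFFFFF` (a choice made here, to normalize the possibly signed result on Python 2), and the term list is made up. No timings are claimed; the numbers depend on the interpreter and machine.

```python
import binascii
import timeit


def index_of(term, num_features=1 << 20):
    """Map a term to a bucket using CRC32, which is stable across processes."""
    if not isinstance(term, bytes):
        term = term.encode("utf-8")
    # Mask to an unsigned 32-bit value: Python 2 may return a signed result.
    return (binascii.crc32(term) & 0xFFFFFFFF) % num_features


if __name__ == "__main__":
    terms = ["spark", "python", "hashing"] * 1000
    builtin = timeit.timeit(lambda: [hash(t) % (1 << 20) for t in terms], number=100)
    crc = timeit.timeit(lambda: [index_of(t) for t in terms], number=100)
    print("builtin hash: %.3fs, crc32: %.3fs" % (builtin, crc))
```

Unlike the built-in `hash` of a string on Python 3, which is randomized per process unless `PYTHONHASHSEED` is fixed, the CRC32 bucket is the same in every process.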





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r28483605
  
--- Diff: python/pyspark/rdd.py ---
@@ -368,12 +381,14 @@ def randomSplit(self, weights, seed=None):
 :param seed: random seed
 :return: split RDDs in a list
 
->>> rdd = sc.parallelize(range(5), 1)
+>>> rdd = sc.parallelize(range(500), 1)
 >>> rdd1, rdd2 = rdd.randomSplit([2, 3], 17)
->>> rdd1.collect()
-[1, 3]
->>> rdd2.collect()
-[0, 2, 4]
+>>> len(rdd1.collect() + rdd2.collect())
+500
+>>> 180 < rdd1.count() < 220
--- End diff --

This could be relaxed to 150-250, and then `250 < rdd2.count() < 350`.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/5173#discussion_r28483604
  
--- Diff: python/pyspark/rdd.py ---
@@ -353,8 +365,8 @@ def sample(self, withReplacement, fraction, seed=None):
 :param seed: seed for the random number generator
 
 >>> rdd = sc.parallelize(range(100), 4)
->>> rdd.sample(False, 0.1, 81).count()
-10
+>>> 9 <= rdd.sample(False, 0.1, 81).count() <= 11
--- End diff --

We can further relax the bounds to match the theory, e.g., 6-14.
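
A rough way to size such bounds, assuming each element is kept independently with probability `p` (so the count is binomial), is to express them in standard deviations. This is a sketch under that assumption, not a statement about how the sampler is implemented; both helper names are hypothetical.

```python
import math


def binomial_mean_std(n, p):
    """Mean and standard deviation of the number of kept elements."""
    mean = n * p
    std = math.sqrt(n * p * (1.0 - p))
    return mean, std


def bounds_in_sigmas(n, p, low, high):
    """Distance of each bound from the mean, in standard deviations."""
    mean, std = binomial_mean_std(n, p)
    return (mean - low) / std, (high - mean) / std
```

For `sample(False, 0.1)` on 100 elements the mean is 10 and the standard deviation is 3, so 9-11 is about 0.33 standard deviations on each side and 6-14 is about 1.33. For the first part of `randomSplit([2, 3])` on 500 elements, `p` is 0.4, the mean is 200 and the standard deviation is about 10.95, so 180-220 is about 1.8 standard deviations and 150-250 about 4.6.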





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-15 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93569850
  
  [Test build #30373 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30373/consoleFull)
 for   PR 5173 at commit 
[`179fc8d`](https://github.com/apache/spark/commit/179fc8d7b426cbd2e00640ac3b46b26475cfa73a).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-92972052
  
  [Test build #667 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/667/consoleFull)
 for   PR 5173 at commit 
[`71535e9`](https://github.com/apache/spark/commit/71535e9450419adec289685abdd306d2e264e710).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93123888
  
  [Test build #30287 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30287/consoleFull)
 for   PR 5173 at commit 
[`4006829`](https://github.com/apache/spark/commit/400682982bbb6277d4b2c6dca2c3b88d491e5b21).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93147729
  
  [Test build #30287 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30287/consoleFull)
 for   PR 5173 at commit 
[`4006829`](https://github.com/apache/spark/commit/400682982bbb6277d4b2c6dca2c3b88d491e5b21).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch **adds the following new dependencies:**
   * `snappy-java-1.1.1.7.jar`

 * This patch **removes the following dependencies:**
   * `snappy-java-1.1.1.6.jar`






[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93147750
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30287/
Test FAILed.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93119996
  
@mengxr Could you take a look at the MLlib changes?





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread shaananc
Github user shaananc commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93175644
  
I'm getting the following error: 
`File "/spark/python/pyspark/serializers.py", line 419, in loads
return pickle.loads(obj)
TypeError: ('code() takes at most 14 arguments (15 given)', <type 'code'>, 
(2, 0, 2, 2, 19, '\x88\x00\x00|\x01\x00\x83\x01\x00S', (None,), (), (u's', 
u'iterator'), u'/spark/python/pyspark/rdd.py', u'func', 294, '\x00\x01', 
(u'f',), ()))`

It happens whenever I run 

`data = (1, 2)
distData = sc.parallelize(data)
distData.reduce(lambda a, b: a + b)`

after pulling this PR into master. Any clue what the issue could be?






[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93117173
  
  [Test build #675 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/675/consoleFull)
 for   PR 5173 at commit 
[`2fc0066`](https://github.com/apache/spark/commit/2fc0066bc402a5b1579c9355687ea8f46de9e99c).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93131436
  
  [Test build #675 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/675/consoleFull)
 for   PR 5173 at commit 
[`2fc0066`](https://github.com/apache/spark/commit/2fc0066bc402a5b1579c9355687ea8f46de9e99c).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93125045
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30285/
Test FAILed.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93125649
  
  [Test build #676 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/676/consoleFull)
 for   PR 5173 at commit 
[`4006829`](https://github.com/apache/spark/commit/400682982bbb6277d4b2c6dca2c3b88d491e5b21).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93156062
  
  [Test build #676 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/676/consoleFull)
 for   PR 5173 at commit 
[`4006829`](https://github.com/apache/spark/commit/400682982bbb6277d4b2c6dca2c3b88d491e5b21).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93184039
  
@shaananc  It works fine here:
```
Using Python version 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014 00:54:21)
SparkContext available as sc, SQLContext available as sqlContext.
>>> data = (1, 2)
>>> sc.parallelize(data).reduce(lambda a, b: a + b)
3
```

What is your environment?
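
The `code() takes at most 14 arguments (15 given)` error reported in this thread is consistent with a function being pickled by a Python 3 interpreter and rebuilt by a Python 2 one, since Python 3 code objects carry an extra keyword-only-argument count. That reading is an inference from the message, not something confirmed in the thread. A minimal sketch for reporting the relevant parts of the environment follows; `describe_python` is a hypothetical helper, and `PYSPARK_PYTHON` is the environment variable PySpark reads to choose the worker interpreter.

```python
import os
import sys


def describe_python():
    """Summarize the running interpreter and the configured worker interpreter."""
    return {
        "driver_version": "%d.%d.%d" % tuple(sys.version_info[:3]),
        "driver_executable": sys.executable,
        # None here means the variable is unset in this process.
        "PYSPARK_PYTHON": os.environ.get("PYSPARK_PYTHON"),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_python().items()):
        print("%s = %s" % (key, value))
```

Running this with the interpreter that starts the shell, and again with whatever `PYSPARK_PYTHON` points to, shows whether the two sides use the same major version.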





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93075612
  
  [Test build #674 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/674/consoleFull)
 for   PR 5173 at commit 
[`2fc0066`](https://github.com/apache/spark/commit/2fc0066bc402a5b1579c9355687ea8f46de9e99c).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93059764
  
  [Test build #30278 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30278/consoleFull)
 for   PR 5173 at commit 
[`2fc0066`](https://github.com/apache/spark/commit/2fc0066bc402a5b1579c9355687ea8f46de9e99c).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93062692
  
  [Test build #669 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/669/consoleFull)
 for   PR 5173 at commit 
[`2fc0066`](https://github.com/apache/spark/commit/2fc0066bc402a5b1579c9355687ea8f46de9e99c).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93106732
  
  [Test build #30278 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30278/consoleFull)
 for   PR 5173 at commit 
[`2fc0066`](https://github.com/apache/spark/commit/2fc0066bc402a5b1579c9355687ea8f46de9e99c).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch **adds the following new dependencies:**
   * `snappy-java-1.1.1.7.jar`

 * This patch **removes the following dependencies:**
   * `snappy-java-1.1.1.6.jar`






[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93106745
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30278/
Test FAILed.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-93113314
  
  [Test build #674 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/674/consoleFull)
 for   PR 5173 at commit 
[`2fc0066`](https://github.com/apache/spark/commit/2fc0066bc402a5b1579c9355687ea8f46de9e99c).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch **adds the following new dependencies:**
   * `commons-math3-3.4.1.jar`
   * `snappy-java-1.1.1.7.jar`

 * This patch **removes the following dependencies:**
   * `commons-math3-3.1.1.jar`
   * `snappy-java-1.1.1.6.jar`






[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-92466565
  
  [Test build #30186 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30186/consoleFull) for PR 5173 at commit [`2ddfba0`](https://github.com/apache/spark/commit/2ddfba04c2b874a632a7b0c49d28ec570109767d).





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-92486933
  
  [Test build #30186 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30186/consoleFull) for PR 5173 at commit [`2ddfba0`](https://github.com/apache/spark/commit/2ddfba04c2b874a632a7b0c49d28ec570109767d).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-92486953
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30186/





[GitHub] spark pull request: [SPARK-4897] [PySpark] Python 3 support

2015-04-13 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5173#issuecomment-92517118
  
  [Test build #30193 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30193/consoleFull) for PR 5173 at commit [`5a55ab4`](https://github.com/apache/spark/commit/5a55ab4cf65dc13e76a32325d99dda81d7e65874).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
 * This patch does not change any dependencies.




