[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-25 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19929
  
Could you change the JIRA number to 
https://issues.apache.org/jira/browse/SPARK-22901 ?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-25 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/19929
  
@gatorsmile, yes, the reason why seed doesn't work is in the way Python 
UDFs are executed, i.e. a new python process is created for each partition to 
evaluate the Python UDF. Thus the seed is set only on the driver, but not in 
the process where the UDF is executed. What I am saying can be easily confirmed 
by this:
```
>>> from pyspark.sql.functions import udf
>>> import os
>>> pid_udf = udf(lambda: str(os.getpid()))
>>> spark.range(2).select(pid_udf()).show()
+--+

|()|
+--+
|  4132|
|  4130|
+--+
>>> os.getpid()
4070
```
Therefore there is no easy way to set the seed. If I set it inside the UDF, 
the UDF would become deterministic. Therefore I think that the best option is 
the current test.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-25 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19929
  
Any update on 
https://github.com/apache/spark/pull/19929/files/cc309b0ce2496365afd8c602c282e3d84aeed940#r158579661?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19929
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85359/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19929
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19929
  
**[Test build #85359 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85359/testReport)**
 for PR 19929 at commit 
[`47801c7`](https://github.com/apache/spark/commit/47801c7dc532aa9a19d59cdef1fe021c61a0b2c8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19929
  
**[Test build #85359 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85359/testReport)**
 for PR 19929 at commit 
[`47801c7`](https://github.com/apache/spark/commit/47801c7dc532aa9a19d59cdef1fe021c61a0b2c8).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19929
  
**[Test build #85358 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85358/testReport)**
 for PR 19929 at commit 
[`a40ba73`](https://github.com/apache/spark/commit/a40ba7384db1030b6facb14b741349da09562d1f).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19929
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19929
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85358/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19929
  
**[Test build #85358 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85358/testReport)**
 for PR 19929 at commit 
[`a40ba73`](https://github.com/apache/spark/commit/a40ba7384db1030b6facb14b741349da09562d1f).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19929
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19929
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85337/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19929
  
**[Test build #85337 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85337/testReport)**
 for PR 19929 at commit 
[`462a92a`](https://github.com/apache/spark/commit/462a92a4237deb63d0b7128ff3585bb7595692fe).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19929
  
**[Test build #85337 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85337/testReport)**
 for PR 19929 at commit 
[`462a92a`](https://github.com/apache/spark/commit/462a92a4237deb63d0b7128ff3585bb7595692fe).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19929
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85322/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19929
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19929
  
**[Test build #85322 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85322/testReport)**
 for PR 19929 at commit 
[`cc309b0`](https://github.com/apache/spark/commit/cc309b0ce2496365afd8c602c282e3d84aeed940).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19929
  
**[Test build #85322 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85322/testReport)**
 for PR 19929 at commit 
[`cc309b0`](https://github.com/apache/spark/commit/cc309b0ce2496365afd8c602c282e3d84aeed940).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/19929
  
Jenkins, retest this please


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19929
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19929
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85316/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19929
  
**[Test build #85316 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85316/testReport)**
 for PR 19929 at commit 
[`cc309b0`](https://github.com/apache/spark/commit/cc309b0ce2496365afd8c602c282e3d84aeed940).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19929
  
**[Test build #85316 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85316/testReport)**
 for PR 19929 at commit 
[`cc309b0`](https://github.com/apache/spark/commit/cc309b0ce2496365afd8c602c282e3d84aeed940).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/19929
  
thank you @cloud-fan, changed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19929
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19929
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85309/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19929
  
**[Test build #85309 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85309/testReport)**
 for PR 19929 at commit 
[`187ff9a`](https://github.com/apache/spark/commit/187ff9a22edecdca582893b2fd836a343972f68b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/19929
  
`UDFRegistration.registerFunction` needs a minor update for the log


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/19929
  
@gatorsmile I added the test, but I didn't get what needs to be updated in 
`registerPython`. May you explain me please? Thanks.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19929
  
**[Test build #85309 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85309/testReport)**
 for PR 19929 at commit 
[`187ff9a`](https://github.com/apache/spark/commit/187ff9a22edecdca582893b2fd836a343972f68b).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-22 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19929
  
We need test cases. Manual tests are not enough.

I will try to review this tomorrow. Thanks!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-20 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/19929
  
kindly ping @cloud-fan @gatorsmile @HyukjinKwon @zero323 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-13 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/19929
  
cc @cloud-fan @HyukjinKwon @zero323 maybe you can help too reviewing this, 
thanks.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-11 Thread mgaido91
Github user mgaido91 commented on the issue:

https://github.com/apache/spark/pull/19929
  
@gatorsmile sorry, I saw that you did the path for scala UDF. Might you 
help reviewing this please? Thanks.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19929
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19929
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84660/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19929
  
**[Test build #84660 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84660/testReport)**
 for PR 19929 at commit 
[`6187d5a`](https://github.com/apache/spark/commit/6187d5a0df7c409a49cd636eb74dea9323044c6b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19929: [SPARK-22629][PYTHON] Add deterministic flag to pyspark ...

2017-12-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19929
  
**[Test build #84660 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84660/testReport)**
 for PR 19929 at commit 
[`6187d5a`](https://github.com/apache/spark/commit/6187d5a0df7c409a49cd636eb74dea9323044c6b).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org