[jira] [Commented] (SPARK-30063) Failure when returning a value from multiple Pandas UDFs

2019-12-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987418#comment-16987418 ] Hyukjin Kwon commented on SPARK-30063: Closing this assuming the issue was resolved.

[jira] [Commented] (SPARK-30063) Failure when returning a value from multiple Pandas UDFs

2019-12-03 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987417#comment-16987417 ] Hyukjin Kwon commented on SPARK-30063: {quote} Set the PYTHONHASHSEED environment variable such

[jira] [Commented] (SPARK-30063) Failure when returning a value from multiple Pandas UDFs

2019-12-03 Thread Tim Kellogg (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987230#comment-16987230 ] Tim Kellogg commented on SPARK-30063: Improvement suggestions: * Set the PYTHONHASHSEED environment

[jira] [Commented] (SPARK-30063) Failure when returning a value from multiple Pandas UDFs

2019-12-03 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987154#comment-16987154 ] Bryan Cutler commented on SPARK-30063: I haven't looked at your bug report in detail but you are

[jira] [Commented] (SPARK-30063) Failure when returning a value from multiple Pandas UDFs

2019-12-03 Thread Ruben Berenguel (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986817#comment-16986817 ] Ruben Berenguel commented on SPARK-30063: Wow, this looks bad for now (since grouped_aggs are

[jira] [Commented] (SPARK-30063) Failure when returning a value from multiple Pandas UDFs

2019-12-02 Thread Tim Kellogg (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986438#comment-16986438 ] Tim Kellogg commented on SPARK-30063: Additional notes on UDFs returning

[jira] [Commented] (SPARK-30063) Failure when returning a value from multiple Pandas UDFs

2019-12-02 Thread Tim Kellogg (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986373#comment-16986373 ] Tim Kellogg commented on SPARK-30063: This appears to be 100% related to pyarrow version. I

[jira] [Commented] (SPARK-30063) Failure when returning a value from multiple Pandas UDFs

2019-12-02 Thread Tim Kellogg (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986304#comment-16986304 ] Tim Kellogg commented on SPARK-30063: I attached variety-of-schemas.ipynb. It seems reproducing

[jira] [Commented] (SPARK-30063) Failure when returning a value from multiple Pandas UDFs

2019-12-02 Thread Tim Kellogg (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986253#comment-16986253 ] Tim Kellogg commented on SPARK-30063: I added the expanded query plan; hopefully that can hold over

[jira] [Commented] (SPARK-30063) Failure when returning a value from multiple Pandas UDFs

2019-12-02 Thread Tim Kellogg (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986242#comment-16986242 ] Tim Kellogg commented on SPARK-30063: [~RBerenguel] I agree that the schema seems to be passed

[jira] [Commented] (SPARK-30063) Failure when returning a value from multiple Pandas UDFs

2019-12-02 Thread Tim Kellogg (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986236#comment-16986236 ] Tim Kellogg commented on SPARK-30063: pyarrow==0.15.1, pandas==0.24.2. I've tried several other

[jira] [Commented] (SPARK-30063) Failure when returning a value from multiple Pandas UDFs

2019-12-01 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16985755#comment-16985755 ] Hyukjin Kwon commented on SPARK-30063: Also can you describe the PyArrow and pandas versions you

[jira] [Commented] (SPARK-30063) Failure when returning a value from multiple Pandas UDFs

2019-11-29 Thread Ruben Berenguel (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16985192#comment-16985192 ] Ruben Berenguel commented on SPARK-30063: Hi [~tkellogg], I’d like to have a look, do you have