[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames

2020-02-27 Thread Jorge Machado (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046958#comment-17046958 ] Jorge Machado commented on SPARK-26412: --- Thanks for the tip. It helps. > Allow Pandas UDF to

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames

2020-02-27 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046542#comment-17046542 ] Hyukjin Kwon commented on SPARK-26412: -- You cannot split one iterator into multiple iterators

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames

2020-02-27 Thread Jorge Machado (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046373#comment-17046373 ] Jorge Machado commented on SPARK-26412: --- Well, I was thinking of something more. Like, I would like

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames

2020-02-27 Thread Hyukjin Kwon (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046365#comment-17046365 ] Hyukjin Kwon commented on SPARK-26412: -- You can do it via: {code} def map_func(batch_iter):
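
The snippet above is cut off after the function signature, and the rest of it is not recoverable from this index. As a rough illustration only (not a reconstruction of the original comment), a function with that signature could look like the sketch below. It assumes the function is meant for `DataFrame.mapInPandas` in PySpark 3.0 or later; the column names `id` and `value` and the filter itself are hypothetical.

```python
from typing import Iterator

import pandas as pd


def map_func(batch_iter: Iterator[pd.DataFrame]) -> Iterator[pd.DataFrame]:
    """Consume a stream of pandas batches and yield one transformed batch per input batch."""
    for pdf in batch_iter:
        # Hypothetical transformation: keep only rows whose "value" is positive.
        yield pdf[pdf["value"] > 0]


# Local check without Spark: feed two hand-made batches through the function.
batches = [
    pd.DataFrame({"id": [1, 2], "value": [10, -1]}),
    pd.DataFrame({"id": [3, 4], "value": [0, 5]}),
]
result = pd.concat(map_func(iter(batches)), ignore_index=True)

# With PySpark 3.0 or later, the same function could be passed as:
#   df.mapInPandas(map_func, schema=df.schema)
```

Because the function only reads from and writes to iterators, it can be exercised with plain pandas before it is handed to Spark.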

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames

2020-02-26 Thread Jorge Machado (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046265#comment-17046265 ] Jorge Machado commented on SPARK-26412: --- Hi, one question: when using "a tuple of pd.Series if

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames

2019-06-19 Thread Terry Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867954#comment-16867954 ] Terry Kim commented on SPARK-26412: --- [~WeichenXu123] and [~mengxr] do you plan to do something similar

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames

2019-05-10 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837353#comment-16837353 ] Xiangrui Meng commented on SPARK-26412: --- [~WeichenXu123] I updated the description. > Allow

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames or Arrow batches

2019-05-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836480#comment-16836480 ] Weichen Xu commented on SPARK-26412: Discussed with [~mengxr]; discard proposal (2), this should be

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames or Arrow batches

2019-05-09 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836377#comment-16836377 ] Weichen Xu commented on SPARK-26412: [~mengxr] There's one issue: there are 2 proposals in the

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames for the entire partition

2019-04-20 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16822521#comment-16822521 ] Xiangrui Meng commented on SPARK-26412: --- [~bryanc] It handles the data exchange for DL model

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames for the entire partition

2019-01-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751785#comment-16751785 ] Bryan Cutler commented on SPARK-26412: -- [~mengxr] I think Arrow record batches would be a much more

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames for the entire partition

2019-01-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16743691#comment-16743691 ] Hyukjin Kwon commented on SPARK-26412: -- [~mengxr], it looks like this can be a subset of SPARK-26413. Did I

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames for the entire partition

2018-12-20 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725957#comment-16725957 ] Li Jin commented on SPARK-26412: So this is similar to the mapPartitions API in Scala but instead of