[jira] [Resolved] (SPARK-27463) Support Dataframe Cogroup via Pandas UDFs

2019-09-17 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-27463. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 24981 [https://gi

[jira] [Assigned] (SPARK-27463) Support Dataframe Cogroup via Pandas UDFs

2019-09-17 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-27463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-27463: Assignee: Chris Martin > Support Dataframe Cogroup via Pandas UDFs > ---

[jira] [Created] (SPARK-29126) Add usage guide for cogroup Pandas UDF

2019-09-17 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-29126: Summary: Add usage guide for cogroup Pandas UDF Key: SPARK-29126 URL: https://issues.apache.org/jira/browse/SPARK-29126 Project: Spark Issue Type: Documentat

[jira] [Commented] (SPARK-28502) Error with struct conversion while using pandas_udf

2019-09-24 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937233#comment-16937233 ] Bryan Cutler commented on SPARK-28502: -- I was able to reproduce in Spark 2.4.3. The

[jira] [Commented] (SPARK-29367) pandas udf not working with latest pyarrow release (0.15.0)

2019-10-07 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946093#comment-16946093 ] Bryan Cutler commented on SPARK-29367: -- There was a change in the Arrow IPC format,

[jira] [Assigned] (SPARK-29367) pandas udf not working with latest pyarrow release (0.15.0)

2019-10-07 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-29367: Assignee: Bryan Cutler > pandas udf not working with latest pyarrow release (0.15.0) > --

[jira] [Updated] (SPARK-29367) pandas udf not working with latest pyarrow release (0.15.0)

2019-10-07 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-29367: - Issue Type: Documentation (was: Bug) > pandas udf not working with latest pyarrow release (0.15

[jira] [Created] (SPARK-29376) Upgrade Apache Arrow to 0.15.0

2019-10-07 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-29376: Summary: Upgrade Apache Arrow to 0.15.0 Key: SPARK-29376 URL: https://issues.apache.org/jira/browse/SPARK-29376 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-29402) Add tests for grouped map pandas_udf using window

2019-10-08 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-29402: Summary: Add tests for grouped map pandas_udf using window Key: SPARK-29402 URL: https://issues.apache.org/jira/browse/SPARK-29402 Project: Spark Issue Type:

[jira] [Commented] (SPARK-29402) Add tests for grouped map pandas_udf using window

2019-10-08 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947266#comment-16947266 ] Bryan Cutler commented on SPARK-29402: -- This is related to SPARK-28502 that using g

[jira] [Commented] (SPARK-28502) Error with struct conversion while using pandas_udf

2019-10-08 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947272#comment-16947272 ] Bryan Cutler commented on SPARK-28502: -- I'm closing this since it is working in mas

[jira] [Resolved] (SPARK-28502) Error with struct conversion while using pandas_udf

2019-10-08 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-28502. -- Fix Version/s: 3.0.0 Resolution: Fixed This was fixed once support for StructType was a

[jira] [Created] (SPARK-32080) Simplify ArrowColumnVector ListArray accessor

2020-06-23 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-32080: Summary: Simplify ArrowColumnVector ListArray accessor Key: SPARK-32080 URL: https://issues.apache.org/jira/browse/SPARK-32080 Project: Spark Issue Type: Imp

[jira] [Updated] (SPARK-32080) Simplify ArrowColumnVector ListArray accessor

2020-06-23 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-32080: - Priority: Trivial (was: Major) > Simplify ArrowColumnVector ListArray accessor > --

[jira] [Updated] (SPARK-31998) Change package references for ArrowBuf

2020-06-24 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-31998: - Component/s: (was: Spark Core) SQL > Change package references for ArrowBuf

[jira] [Updated] (SPARK-31998) Change package references for ArrowBuf

2020-06-24 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-31998: - Issue Type: Improvement (was: Bug) > Change package references for ArrowBuf > -

[jira] [Resolved] (SPARK-32098) Use iloc for positional slicing instead of direct slicing in createDataFrame with Arrow

2020-06-25 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-32098. -- Fix Version/s: 3.1.0 2.4.7 3.0.1 Resolution: Fixed

[jira] [Assigned] (SPARK-32098) Use iloc for positional slicing instead of direct slicing in createDataFrame with Arrow

2020-06-25 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-32098: Assignee: Hyukjin Kwon > Use iloc for positional slicing instead of direct slicing in cre

[jira] [Created] (SPARK-32162) Improve Pandas Grouped Map with Window test output

2020-07-02 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-32162: Summary: Improve Pandas Grouped Map with Window test output Key: SPARK-32162 URL: https://issues.apache.org/jira/browse/SPARK-32162 Project: Spark Issue Type

[jira] [Commented] (SPARK-32174) toPandas attempted Arrow optimization but has reached an error and can not continue

2020-07-07 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152958#comment-17152958 ] Bryan Cutler commented on SPARK-32174: -- From the stacktrace, it looks like you are

[jira] [Resolved] (SPARK-32174) toPandas attempted Arrow optimization but has reached an error and can not continue

2020-07-08 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-32174. -- Resolution: Not A Problem Great, I will mark this as resolved then. We should add the configu

[jira] [Created] (SPARK-32285) Add PySpark support for nested timestamps with arrow

2020-07-12 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-32285: Summary: Add PySpark support for nested timestamps with arrow Key: SPARK-32285 URL: https://issues.apache.org/jira/browse/SPARK-32285 Project: Spark Issue Ty

[jira] [Updated] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2020-07-12 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-21187: - Description: This is to track adding the remaining type support in Arrow Converters. Currently,

[jira] [Created] (SPARK-32312) Upgrade Apache Arrow to 1.0.0

2020-07-14 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-32312: Summary: Upgrade Apache Arrow to 1.0.0 Key: SPARK-32312 URL: https://issues.apache.org/jira/browse/SPARK-32312 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-32312) Upgrade Apache Arrow to 1.0.0

2020-07-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157543#comment-17157543 ] Bryan Cutler commented on SPARK-32312: -- I've been doing local testing and will subm

[jira] [Assigned] (SPARK-32300) toPandas with no partitions should work

2020-07-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-32300: Assignee: Hyukjin Kwon > toPandas with no partitions should work > --

[jira] [Resolved] (SPARK-32300) toPandas with no partitions should work

2020-07-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-32300. -- Fix Version/s: 2.4.7 Resolution: Fixed Issue resolved by pull request 29098 [https://gi

[jira] [Resolved] (SPARK-32413) Guidance for my project

2020-07-23 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-32413. -- Resolution: Not A Problem Hi [~stoksoz] , this type of discussion is more appropriate for the

[jira] [Closed] (SPARK-32413) Guidance for my project

2020-07-23 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler closed SPARK-32413. > Guidance for my project > > > Key: SPARK-32413 >

[jira] [Commented] (SPARK-28502) Error with struct conversion while using pandas_udf

2019-10-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16948981#comment-16948981 ] Bryan Cutler commented on SPARK-28502: -- Thanks for testing it out [~nasirali]! It's

[jira] [Assigned] (SPARK-29402) Add tests for grouped map pandas_udf using window

2019-10-11 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-29402: Assignee: Bryan Cutler > Add tests for grouped map pandas_udf using window >

[jira] [Resolved] (SPARK-29402) Add tests for grouped map pandas_udf using window

2019-10-11 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-29402. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26063 [https://gi

[jira] [Commented] (SPARK-29428) Can't persist/set None-valued param

2019-10-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951212#comment-16951212 ] Bryan Cutler commented on SPARK-29428: -- The usage of {{None}} in pyspark ml is a bi

[jira] [Resolved] (SPARK-29428) Can't persist/set None-valued param

2019-10-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-29428. -- Resolution: Not A Problem > Can't persist/set None-valued param > ---

[jira] [Created] (SPARK-29464) PySpark ML should expose Params.clear() to unset a user supplied Param

2019-10-14 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-29464: Summary: PySpark ML should expose Params.clear() to unset a user supplied Param Key: SPARK-29464 URL: https://issues.apache.org/jira/browse/SPARK-29464 Project: Spark

[jira] [Updated] (SPARK-29464) PySpark ML should expose Params.clear() to unset a user supplied Param

2019-10-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-29464: - Description: PySpark ML currently has a private {{_clear()}} method that will unset a param. Thi

[jira] [Created] (SPARK-29493) Add MapType support for Arrow Java

2019-10-16 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-29493: Summary: Add MapType support for Arrow Java Key: SPARK-29493 URL: https://issues.apache.org/jira/browse/SPARK-29493 Project: Spark Issue Type: Sub-task

[jira] [Reopened] (SPARK-24554) Add MapType Support for Arrow in PySpark

2019-10-16 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reopened SPARK-24554: -- Reopening this to be completed in 2 steps, first Java after Arrow 0.15.0 and then pyspark when Ma

[jira] [Assigned] (SPARK-29464) PySpark ML should expose Params.clear() to unset a user supplied Param

2019-10-17 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-29464: Assignee: Huaxin Gao > PySpark ML should expose Params.clear() to unset a user supplied P

[jira] [Resolved] (SPARK-29464) PySpark ML should expose Params.clear() to unset a user supplied Param

2019-10-17 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-29464. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26130 [https://gi

[jira] [Resolved] (SPARK-29414) HasOutputCol param isSet() property is not preserved after persistence

2019-10-25 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-29414. -- Fix Version/s: 2.4.4 Resolution: Fixed Thanks [~borys.biletskyy], I'll mark this as res

[jira] [Created] (SPARK-29748) Remove sorting of fields in PySpark SQL Row creation

2019-11-04 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-29748: Summary: Remove sorting of fields in PySpark SQL Row creation Key: SPARK-29748 URL: https://issues.apache.org/jira/browse/SPARK-29748 Project: Spark Issue Ty

[jira] [Updated] (SPARK-29748) Remove sorting of fields in PySpark SQL Row creation

2019-11-04 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-29748: - Description: Currently, when a PySpark Row is created with keyword arguments, the fields are so

[jira] [Commented] (SPARK-29691) Estimator fit method fails to copy params (in PySpark)

2019-11-05 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967748#comment-16967748 ] Bryan Cutler commented on SPARK-29691: -- [~JohnHBauer] I'm not sure we should extend

[jira] [Commented] (SPARK-28502) Error with struct conversion while using pandas_udf

2019-11-05 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16967973#comment-16967973 ] Bryan Cutler commented on SPARK-28502: -- That's strange, I added your example as a u

[jira] [Commented] (SPARK-28502) Error with struct conversion while using pandas_udf

2019-11-06 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968744#comment-16968744 ] Bryan Cutler commented on SPARK-28502: -- Ahh, so Arrow 0.15.0+ had a change in the I

[jira] [Commented] (SPARK-29803) remove all instances of 'from __future__ import print_function'

2019-11-08 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16970522#comment-16970522 ] Bryan Cutler commented on SPARK-29803: -- This should be done once Python 2 support i

[jira] [Updated] (SPARK-29376) Upgrade Apache Arrow to 0.15.1

2019-11-08 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-29376: - Summary: Upgrade Apache Arrow to 0.15.1 (was: Upgrade Apache Arrow to 0.15.0) > Upgrade Apache

[jira] [Updated] (SPARK-29376) Upgrade Apache Arrow to 0.15.1

2019-11-08 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-29376: - Description: Apache Arrow 0.15.0 was just released see [https://arrow.apache.org/blog/2019/10/0

[jira] [Assigned] (SPARK-29798) Infers bytes as binary type in Python 3 at PySpark

2019-11-08 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-29798: Assignee: Hyukjin Kwon > Infers bytes as binary type in Python 3 at PySpark > ---

[jira] [Resolved] (SPARK-29798) Infers bytes as binary type in Python 3 at PySpark

2019-11-08 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-29798. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26432 [https://gi

[jira] [Reopened] (SPARK-25351) Handle Pandas category type when converting from Python with Arrow

2019-11-12 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reopened SPARK-25351: -- reopening, this should be straightforward to add > Handle Pandas category type when converting fr

[jira] [Commented] (SPARK-29493) Add MapType support for Arrow Java

2019-11-12 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972833#comment-16972833 ] Bryan Cutler commented on SPARK-29493: -- [~jalpan.randeri] this depends on SPARK-293

[jira] [Commented] (SPARK-29748) Remove sorting of fields in PySpark SQL Row creation

2019-11-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974678#comment-16974678 ] Bryan Cutler commented on SPARK-29748: -- Thanks for discussing [~zero323] . The goal

[jira] [Commented] (SPARK-29748) Remove sorting of fields in PySpark SQL Row creation

2019-11-19 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977749#comment-16977749 ] Bryan Cutler commented on SPARK-29748: -- [~zero323] and [~jhereth] this is targeted

[jira] [Resolved] (SPARK-29691) Estimator fit method fails to copy params (in PySpark)

2019-11-19 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-29691. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26527 [https://gi

[jira] [Assigned] (SPARK-29691) Estimator fit method fails to copy params (in PySpark)

2019-11-19 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-29691: Assignee: John Bauer > Estimator fit method fails to copy params (in PySpark) > -

[jira] [Commented] (SPARK-30063) Failure when returning a value from multiple Pandas UDFs

2019-12-03 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987154#comment-16987154 ] Bryan Cutler commented on SPARK-30063: -- I haven't looked at your bug report in deta

[jira] [Commented] (SPARK-29748) Remove sorting of fields in PySpark SQL Row creation

2019-12-03 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987252#comment-16987252 ] Bryan Cutler commented on SPARK-29748: -- [~zero323] I made some updates to the PR wi

[jira] [Resolved] (SPARK-28502) Error with struct conversion while using pandas_udf

2019-12-18 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-28502. -- Resolution: Fixed > Error with struct conversion while using pandas_udf >

[jira] [Commented] (SPARK-28502) Error with struct conversion while using pandas_udf

2019-12-18 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999545#comment-16999545 ] Bryan Cutler commented on SPARK-28502: -- The problem is that returning nested Struct

[jira] [Updated] (SPARK-32285) Add PySpark support for nested timestamps with arrow

2020-11-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-32285: - Parent: (was: SPARK-21187) Issue Type: Improvement (was: Sub-task) > Add PySpark su

[jira] [Updated] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2020-11-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-21187: - Description: This is to track adding the remaining type support in Arrow Converters. Currently,

[jira] [Resolved] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2020-11-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-21187. -- Fix Version/s: 3.1.0 Resolution: Fixed With MapType now added, all basic types are supp

[jira] [Commented] (SPARK-33489) Support null for conversion from and to Arrow type

2020-11-25 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17238950#comment-17238950 ] Bryan Cutler commented on SPARK-33489: -- Yes, Arrow supports null type. Should be pr

[jira] [Created] (SPARK-33613) [Python][Tests] Replace calls to deprecated test APIs

2020-11-30 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-33613: Summary: [Python][Tests] Replace calls to deprecated test APIs Key: SPARK-33613 URL: https://issues.apache.org/jira/browse/SPARK-33613 Project: Spark Issue T

[jira] [Commented] (SPARK-33489) Support null for conversion from and to Arrow type

2020-11-30 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241092#comment-17241092 ] Bryan Cutler commented on SPARK-33489: -- Great, thanks [~cactice] ! Please feel free

[jira] [Commented] (SPARK-33576) PythonException: An exception was thrown from a UDF: 'OSError: Invalid IPC message: negative bodyLength'.

2020-12-01 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17241769#comment-17241769 ] Bryan Cutler commented on SPARK-33576: -- Is this due to the 2GB limit? As in https:

[jira] [Commented] (SPARK-33576) PythonException: An exception was thrown from a UDF: 'OSError: Invalid IPC message: negative bodyLength'.

2020-12-11 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248064#comment-17248064 ] Bryan Cutler commented on SPARK-33576: -- [~darshats] I believe the only current work

[jira] [Resolved] (SPARK-33576) PythonException: An exception was thrown from a UDF: 'OSError: Invalid IPC message: negative bodyLength'.

2020-12-11 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-33576. -- Resolution: Duplicate Going to resolve as a duplicate, but please reopen if you find it is dif

[jira] [Commented] (SPARK-24632) Allow 3rd-party libraries to use pyspark.ml abstractions for Java wrappers for persistence

2020-12-28 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17255712#comment-17255712 ] Bryan Cutler commented on SPARK-24632: -- Ping [~huaxingao] in case you have some tim

[jira] [Resolved] (SPARK-29748) Remove sorting of fields in PySpark SQL Row creation

2020-01-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-29748. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26496 [https://gi

[jira] [Assigned] (SPARK-29748) Remove sorting of fields in PySpark SQL Row creation

2020-01-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-29748: Assignee: Bryan Cutler > Remove sorting of fields in PySpark SQL Row creation > -

[jira] [Resolved] (SPARK-22232) Row objects in pyspark created using the `Row(**kwars)` syntax do not get serialized/deserialized properly

2020-01-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-22232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-22232. -- Resolution: Won't Fix Closing in favor of fix in SPARK-29748 > Row objects in pyspark create

[jira] [Resolved] (SPARK-24915) Calling SparkSession.createDataFrame with schema can throw exception

2020-01-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-24915. -- Resolution: Won't Fix Closing in favor of fix in SPARK-29748 > Calling SparkSession.createDat

[jira] [Commented] (SPARK-24915) Calling SparkSession.createDataFrame with schema can throw exception

2020-01-13 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014719#comment-17014719 ] Bryan Cutler commented on SPARK-24915: -- [~jhereth] apologies for closing prematurel

[jira] [Reopened] (SPARK-24915) Calling SparkSession.createDataFrame with schema can throw exception

2020-01-13 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reopened SPARK-24915: -- > Calling SparkSession.createDataFrame with schema can throw exception > -

[jira] [Commented] (SPARK-24915) Calling SparkSession.createDataFrame with schema can throw exception

2020-01-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019770#comment-17019770 ] Bryan Cutler commented on SPARK-24915: -- [~jhereth] since there is already a lot of

[jira] [Created] (SPARK-30640) Prevent unnessary copies of data in Arrow to Pandas conversion with Timestamps

2020-01-24 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-30640: Summary: Prevent unnessary copies of data in Arrow to Pandas conversion with Timestamps Key: SPARK-30640 URL: https://issues.apache.org/jira/browse/SPARK-30640 Projec

[jira] [Assigned] (SPARK-30640) Prevent unnessary copies of data in Arrow to Pandas conversion with Timestamps

2020-01-26 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-30640: Assignee: Bryan Cutler > Prevent unnessary copies of data in Arrow to Pandas conversion w

[jira] [Resolved] (SPARK-30640) Prevent unnessary copies of data in Arrow to Pandas conversion with Timestamps

2020-01-26 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-30640. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 27358 [https://gi

[jira] [Created] (SPARK-30777) PySpark test_arrow tests fail with Pandas > 1.0.0

2020-02-10 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-30777: Summary: PySpark test_arrow tests fail with Pandas > 1.0.0 Key: SPARK-30777 URL: https://issues.apache.org/jira/browse/SPARK-30777 Project: Spark Issue Type:

[jira] [Commented] (SPARK-30777) PySpark test_arrow tests fail with Pandas >= 1.0.0

2020-02-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033871#comment-17033871 ] Bryan Cutler commented on SPARK-30777: -- I'm working on the patch > PySpark test_ar

[jira] [Commented] (SPARK-30777) PySpark test_arrow tests fail with Pandas >= 1.0.0

2020-02-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033873#comment-17033873 ] Bryan Cutler commented on SPARK-30777: -- [~dongjoon] I don't think it's a blocker, o

[jira] [Created] (SPARK-30834) Add note for recommended versions of Pandas and PyArrow for 2.4.x

2020-02-14 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-30834: Summary: Add note for recommended versions of Pandas and PyArrow for 2.4.x Key: SPARK-30834 URL: https://issues.apache.org/jira/browse/SPARK-30834 Project: Spark

[jira] [Updated] (SPARK-30834) Add note for recommended versions of Pandas and PyArrow for 2.4.x

2020-02-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-30834: - Description: CI testing for branch 2.4 has been with the versions below. These are recommended a

[jira] [Updated] (SPARK-30834) Add note for recommended versions of Pandas and PyArrow for 2.4.x

2020-02-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-30834: - Description: CI testing for branch 2.4 has been with the versions below. These are recommended a

[jira] [Updated] (SPARK-30834) Add note for recommended versions of Pandas and PyArrow for 2.4.x

2020-02-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-30834: - Component/s: PySpark > Add note for recommended versions of Pandas and PyArrow for 2.4.x > -

[jira] [Assigned] (SPARK-30861) Deprecate constructor of SQLContext and getOrCreate in SQLContext at PySpark

2020-02-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-30861: Assignee: Hyukjin Kwon > Deprecate constructor of SQLContext and getOrCreate in SQLContex

[jira] [Commented] (SPARK-30861) Deprecate constructor of SQLContext and getOrCreate in SQLContext at PySpark

2020-02-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17041268#comment-17041268 ] Bryan Cutler commented on SPARK-30861: -- Issue resolved by pull request 27614 https:

[jira] [Updated] (SPARK-30861) Deprecate constructor of SQLContext and getOrCreate in SQLContext at PySpark

2020-02-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-30861: - Fix Version/s: 3.0.0 > Deprecate constructor of SQLContext and getOrCreate in SQLContext at PySp

[jira] [Resolved] (SPARK-30861) Deprecate constructor of SQLContext and getOrCreate in SQLContext at PySpark

2020-02-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-30861. -- Resolution: Fixed > Deprecate constructor of SQLContext and getOrCreate in SQLContext at PySpa

[jira] [Commented] (SPARK-30961) Arrow enabled: to_pandas with date column fails

2020-02-26 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17045961#comment-17045961 ] Bryan Cutler commented on SPARK-30961: -- [~nicornk] there were a number of fixes rel

[jira] [Commented] (SPARK-30961) Arrow enabled: to_pandas with date column fails

2020-02-27 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17046801#comment-17046801 ] Bryan Cutler commented on SPARK-30961: -- Yes, we should be able to keep Spark 3.x up

[jira] [Resolved] (SPARK-30961) Arrow enabled: to_pandas with date column fails

2020-03-06 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-30961. -- Resolution: Won't Fix Thanks [~KevinAppel] and [~nicornk] for the info, I'll go ahead and clos

[jira] [Commented] (SPARK-30961) Arrow enabled: to_pandas with date column fails

2020-03-06 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17053717#comment-17053717 ] Bryan Cutler commented on SPARK-30961: -- Just to be clear, this is only an issue wit

[jira] [Assigned] (SPARK-27805) toPandas does not propagate SparkExceptions with arrow enabled

2019-06-04 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-27805: Assignee: David Vogelbacher > toPandas does not propagate SparkExceptions with arrow enab

[jira] [Resolved] (SPARK-27805) toPandas does not propagate SparkExceptions with arrow enabled

2019-06-04 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-27805. -- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24677 [https://gi

[jira] [Updated] (SPARK-27805) toPandas does not propagate SparkExceptions with arrow enabled

2019-06-04 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-27805: - Affects Version/s: (was: 3.1.0) 2.4.3 > toPandas does not propagate S

[jira] [Commented] (SPARK-27939) Defining a schema with VectorUDT

2019-06-04 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855966#comment-16855966 ] Bryan Cutler commented on SPARK-27939: -- The problem is the {{Row}} class sorts the
