[jira] [Assigned] (SPARK-38098) Add support for ArrayType of nested StructType to arrow-based conversion

2022-09-22 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-38098: Assignee: Luca Canali > Add support for ArrayType of nested StructType to arrow-based

[jira] [Resolved] (SPARK-38098) Add support for ArrayType of nested StructType to arrow-based conversion

2022-09-22 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-38098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-38098. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 35391

[jira] [Resolved] (SPARK-39160) Remove workaround for ARROW-1948

2022-05-12 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-39160. -- Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 36518

[jira] [Assigned] (SPARK-39160) Remove workaround for ARROW-1948

2022-05-12 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-39160: Assignee: Cheng Pan > Remove workaround for ARROW-1948 >

[jira] [Assigned] (SPARK-34521) spark.createDataFrame does not support Pandas StringDtype extension type

2021-12-16 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-34521: Assignee: Nicolas Azrak > spark.createDataFrame does not support Pandas StringDtype

[jira] [Resolved] (SPARK-34521) spark.createDataFrame does not support Pandas StringDtype extension type

2021-12-15 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-34521. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34509

[jira] [Updated] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2021-10-28 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-21187: - Attachment: (was: 0--1172099527-254246775-1412485878) > Complete support for remaining

[jira] [Comment Edited] (SPARK-34463) toPandas failed with error: buffer source array is read-only when Arrow with self-destruct is enabled

2021-03-02 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17293893#comment-17293893 ] Bryan Cutler edited comment on SPARK-34463 at 3/2/21, 6:11 PM: --- As David

[jira] [Commented] (SPARK-34463) toPandas failed with error: buffer source array is read-only when Arrow with self-destruct is enabled

2021-03-02 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-34463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17293893#comment-17293893 ] Bryan Cutler commented on SPARK-34463: -- As David said, it depends on what is done in Pandas that

[jira] [Assigned] (SPARK-32953) Lower memory usage in toPandas with Arrow self_destruct

2021-02-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-32953: Assignee: David Li > Lower memory usage in toPandas with Arrow self_destruct >

[jira] [Resolved] (SPARK-32953) Lower memory usage in toPandas with Arrow self_destruct

2021-02-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-32953. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 29818

[jira] [Commented] (SPARK-24632) Allow 3rd-party libraries to use pyspark.ml abstractions for Java wrappers for persistence

2020-12-28 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17255712#comment-17255712 ] Bryan Cutler commented on SPARK-24632: -- Ping [~huaxingao] in case you have some time to look into

[jira] [Resolved] (SPARK-33576) PythonException: An exception was thrown from a UDF: 'OSError: Invalid IPC message: negative bodyLength'.

2020-12-11 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-33576. -- Resolution: Duplicate Going to resolve as a duplicate, but please reopen if you find it is

[jira] [Commented] (SPARK-33576) PythonException: An exception was thrown from a UDF: 'OSError: Invalid IPC message: negative bodyLength'.

2020-12-11 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248064#comment-17248064 ] Bryan Cutler commented on SPARK-33576: -- [~darshats] I believe the only current workaround is to

[jira] [Commented] (SPARK-33576) PythonException: An exception was thrown from a UDF: 'OSError: Invalid IPC message: negative bodyLength'.

2020-12-01 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17241769#comment-17241769 ] Bryan Cutler commented on SPARK-33576: -- Is this due to the 2GB limit? As in

[jira] [Commented] (SPARK-33489) Support null for conversion from and to Arrow type

2020-11-30 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17241092#comment-17241092 ] Bryan Cutler commented on SPARK-33489: -- Great, thanks [~cactice] ! Please feel free to ping me if

[jira] [Created] (SPARK-33613) [Python][Tests] Replace calls to deprecated test APIs

2020-11-30 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-33613: Summary: [Python][Tests] Replace calls to deprecated test APIs Key: SPARK-33613 URL: https://issues.apache.org/jira/browse/SPARK-33613 Project: Spark Issue

[jira] [Commented] (SPARK-33489) Support null for conversion from and to Arrow type

2020-11-25 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17238950#comment-17238950 ] Bryan Cutler commented on SPARK-33489: -- Yes, Arrow supports null type. Should be pretty

[jira] [Resolved] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2020-11-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-21187. -- Fix Version/s: 3.1.0 Resolution: Fixed With MapType now added, all basic types are

[jira] [Updated] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2020-11-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-21187: - Description: This is to track adding the remaining type support in Arrow Converters.

[jira] [Updated] (SPARK-32285) Add PySpark support for nested timestamps with arrow

2020-11-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-32285: - Parent: (was: SPARK-21187) Issue Type: Improvement (was: Sub-task) > Add PySpark

[jira] [Comment Edited] (SPARK-33279) Spark 3.0 failure due to lack of Arrow dependency

2020-11-01 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224420#comment-17224420 ] Bryan Cutler edited comment on SPARK-33279 at 11/2/20, 5:21 AM:

[jira] [Commented] (SPARK-33279) Spark 3.0 failure due to lack of Arrow dependency

2020-11-01 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224420#comment-17224420 ] Bryan Cutler commented on SPARK-33279: -- [~fan_li_ya] we should change the Arrow-Spark integration

[jira] [Commented] (SPARK-33213) Upgrade Apache Arrow to 2.0.0

2020-10-23 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17219840#comment-17219840 ] Bryan Cutler commented on SPARK-33213: -- Just a couple notes: The library and format versions are

[jira] [Commented] (SPARK-33189) Support PyArrow 2.0.0+

2020-10-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17217779#comment-17217779 ] Bryan Cutler commented on SPARK-33189: -- There is an env var we can set that will use the old

[jira] [Updated] (SPARK-33073) Improve error handling on Pandas to Arrow conversion failures

2020-10-06 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-33073: - Description: Currently, when converting from Pandas to Arrow for Pandas UDF return values or

[jira] [Created] (SPARK-33073) Improve error handling on Pandas to Arrow conversion failures

2020-10-06 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-33073: Summary: Improve error handling on Pandas to Arrow conversion failures Key: SPARK-33073 URL: https://issues.apache.org/jira/browse/SPARK-33073 Project: Spark

[jira] [Commented] (SPARK-24554) Add MapType Support for Arrow in PySpark

2020-10-01 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205719#comment-17205719 ] Bryan Cutler commented on SPARK-24554: -- I started working on this, but ran into an issue at

[jira] [Commented] (SPARK-32312) Upgrade Apache Arrow to 1.0.0

2020-09-03 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190553#comment-17190553 ] Bryan Cutler commented on SPARK-32312: -- Sorry for the delay, I was holding off for a couple of

[jira] [Assigned] (SPARK-32686) Un-deprecate inferring DataFrame schema from list of dictionaries

2020-08-24 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-32686: Assignee: Nicholas Chammas > Un-deprecate inferring DataFrame schema from list of

[jira] [Resolved] (SPARK-32686) Un-deprecate inferring DataFrame schema from list of dictionaries

2020-08-24 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-32686. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 29510

[jira] [Closed] (SPARK-32413) Guidance for my project

2020-07-23 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler closed SPARK-32413. > Guidance for my project > > > Key: SPARK-32413 >

[jira] [Resolved] (SPARK-32413) Guidance for my project

2020-07-23 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-32413. -- Resolution: Not A Problem Hi [~stoksoz] , this type of discussion is more appropriate for the

[jira] [Assigned] (SPARK-32300) toPandas with no partitions should work

2020-07-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-32300: Assignee: Hyukjin Kwon > toPandas with no partitions should work >

[jira] [Resolved] (SPARK-32300) toPandas with no partitions should work

2020-07-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-32300. -- Fix Version/s: 2.4.7 Resolution: Fixed Issue resolved by pull request 29098

[jira] [Commented] (SPARK-32312) Upgrade Apache Arrow to 1.0.0

2020-07-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17157543#comment-17157543 ] Bryan Cutler commented on SPARK-32312: -- I've been doing local testing and will submit a WIP PR

[jira] [Created] (SPARK-32312) Upgrade Apache Arrow to 1.0.0

2020-07-14 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-32312: Summary: Upgrade Apache Arrow to 1.0.0 Key: SPARK-32312 URL: https://issues.apache.org/jira/browse/SPARK-32312 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2020-07-12 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-21187: - Description: This is to track adding the remaining type support in Arrow Converters.

[jira] [Created] (SPARK-32285) Add PySpark support for nested timestamps with arrow

2020-07-12 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-32285: Summary: Add PySpark support for nested timestamps with arrow Key: SPARK-32285 URL: https://issues.apache.org/jira/browse/SPARK-32285 Project: Spark Issue

[jira] [Resolved] (SPARK-32174) toPandas attempted Arrow optimization but has reached an error and can not continue

2020-07-08 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-32174. -- Resolution: Not A Problem Great, I will mark this as resolved then. We should add the

[jira] [Commented] (SPARK-32174) toPandas attempted Arrow optimization but has reached an error and can not continue

2020-07-07 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152958#comment-17152958 ] Bryan Cutler commented on SPARK-32174: -- From the stacktrace, it looks like you are using JDK9 or

[jira] [Created] (SPARK-32162) Improve Pandas Grouped Map with Window test output

2020-07-02 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-32162: Summary: Improve Pandas Grouped Map with Window test output Key: SPARK-32162 URL: https://issues.apache.org/jira/browse/SPARK-32162 Project: Spark Issue

[jira] [Assigned] (SPARK-32098) Use iloc for positional slicing instead of direct slicing in createDataFrame with Arrow

2020-06-25 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-32098: Assignee: Hyukjin Kwon > Use iloc for positional slicing instead of direct slicing in

[jira] [Resolved] (SPARK-32098) Use iloc for positional slicing instead of direct slicing in createDataFrame with Arrow

2020-06-25 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-32098. -- Fix Version/s: 3.1.0 2.4.7 3.0.1 Resolution:

[jira] [Updated] (SPARK-31998) Change package references for ArrowBuf

2020-06-24 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-31998: - Component/s: (was: Spark Core) SQL > Change package references for

[jira] [Updated] (SPARK-31998) Change package references for ArrowBuf

2020-06-24 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-31998: - Issue Type: Improvement (was: Bug) > Change package references for ArrowBuf >

[jira] [Updated] (SPARK-32080) Simplify ArrowColumnVector ListArray accessor

2020-06-23 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-32080: - Priority: Trivial (was: Major) > Simplify ArrowColumnVector ListArray accessor >

[jira] [Created] (SPARK-32080) Simplify ArrowColumnVector ListArray accessor

2020-06-23 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-32080: Summary: Simplify ArrowColumnVector ListArray accessor Key: SPARK-32080 URL: https://issues.apache.org/jira/browse/SPARK-32080 Project: Spark Issue Type:

[jira] [Created] (SPARK-31964) Avoid Pandas import for CategoricalDtype with Arrow conversion

2020-06-10 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-31964: Summary: Avoid Pandas import for CategoricalDtype with Arrow conversion Key: SPARK-31964 URL: https://issues.apache.org/jira/browse/SPARK-31964 Project: Spark

[jira] [Resolved] (SPARK-31915) Resolve the grouping column properly per the case sensitivity in grouped and cogrouped pandas UDFs

2020-06-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-31915. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 28777

[jira] [Assigned] (SPARK-31915) Resolve the grouping column properly per the case sensitivity in grouped and cogrouped pandas UDFs

2020-06-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-31915: Assignee: Hyukjin Kwon > Resolve the grouping column properly per the case sensitivity

[jira] [Resolved] (SPARK-25351) Handle Pandas category type when converting from Python with Arrow

2020-05-27 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-25351. -- Fix Version/s: 3.1.0 Resolution: Fixed Issue resolved by pull request 26585

[jira] [Assigned] (SPARK-25351) Handle Pandas category type when converting from Python with Arrow

2020-05-27 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-25351: Assignee: Jalpan Randeri > Handle Pandas category type when converting from Python with

[jira] [Commented] (SPARK-31704) PandasUDFType.GROUPED_AGG with Java 11

2020-05-13 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106500#comment-17106500 ] Bryan Cutler commented on SPARK-31704: -- This is due to a Netty API that Arrow uses and

[jira] [Commented] (SPARK-31629) "py4j.protocol.Py4JJavaError: An error occurred while calling o90.save" in pyspark 2.3.1

2020-05-05 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17100154#comment-17100154 ] Bryan Cutler commented on SPARK-31629: -- [~appleyuchi] are you able to try out a more recent version

[jira] [Assigned] (SPARK-31306) rand() function documentation suggests an inclusive upper bound of 1.0

2020-04-13 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-31306: Assignee: Ben > rand() function documentation suggests an inclusive upper bound of 1.0 >

[jira] [Assigned] (SPARK-31306) rand() function documentation suggests an inclusive upper bound of 1.0

2020-04-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-31306: Assignee: Bryan Cutler > rand() function documentation suggests an inclusive upper bound

[jira] [Assigned] (SPARK-31306) rand() function documentation suggests an inclusive upper bound of 1.0

2020-04-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-31306: Assignee: (was: Bryan Cutler) > rand() function documentation suggests an inclusive

[jira] [Resolved] (SPARK-31306) rand() function documentation suggests an inclusive upper bound of 1.0

2020-04-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-31306. -- Resolution: Fixed Issue resolved by pull request 28071

[jira] [Updated] (SPARK-31299) Pyspark.ml.clustering illegalArgumentException with dataframe created from rows

2020-04-01 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-31299: - Description: I hope this is the right place and way to report a bug in (at least) the PySpark

[jira] [Commented] (SPARK-31299) Pyspark.ml.clustering illegalArgumentException with dataframe created from rows

2020-04-01 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073027#comment-17073027 ] Bryan Cutler commented on SPARK-31299: -- It looks like you are using {{DenseVector}} from

[jira] [Resolved] (SPARK-31299) Pyspark.ml.clustering illegalArgumentException with dataframe created from rows

2020-04-01 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-31299. -- Resolution: Not A Problem > Pyspark.ml.clustering illegalArgumentException with dataframe

[jira] [Commented] (SPARK-30961) Arrow enabled: to_pandas with date column fails

2020-03-06 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17053717#comment-17053717 ] Bryan Cutler commented on SPARK-30961: -- Just to be clear, this is only an issue with Spark 2.4.x.

[jira] [Resolved] (SPARK-30961) Arrow enabled: to_pandas with date column fails

2020-03-06 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-30961. -- Resolution: Won't Fix Thanks [~KevinAppel] and [~nicornk] for the info, I'll go ahead and

[jira] [Commented] (SPARK-30961) Arrow enabled: to_pandas with date column fails

2020-02-27 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046801#comment-17046801 ] Bryan Cutler commented on SPARK-30961: -- Yes, we should be able to keep Spark 3.x up to date with

[jira] [Commented] (SPARK-30961) Arrow enabled: to_pandas with date column fails

2020-02-26 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045961#comment-17045961 ] Bryan Cutler commented on SPARK-30961: -- [~nicornk] there were a number of fixes related to Arrow

[jira] [Updated] (SPARK-30861) Deprecate constructor of SQLContext and getOrCreate in SQLContext at PySpark

2020-02-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-30861: - Fix Version/s: 3.0.0 > Deprecate constructor of SQLContext and getOrCreate in SQLContext at

[jira] [Resolved] (SPARK-30861) Deprecate constructor of SQLContext and getOrCreate in SQLContext at PySpark

2020-02-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-30861. -- Resolution: Fixed > Deprecate constructor of SQLContext and getOrCreate in SQLContext at

[jira] [Commented] (SPARK-30861) Deprecate constructor of SQLContext and getOrCreate in SQLContext at PySpark

2020-02-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041268#comment-17041268 ] Bryan Cutler commented on SPARK-30861: -- Issue resolved by pull request 27614

[jira] [Assigned] (SPARK-30861) Deprecate constructor of SQLContext and getOrCreate in SQLContext at PySpark

2020-02-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-30861: Assignee: Hyukjin Kwon > Deprecate constructor of SQLContext and getOrCreate in

[jira] [Updated] (SPARK-30834) Add note for recommended versions of Pandas and PyArrow for 2.4.x

2020-02-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-30834: - Component/s: PySpark > Add note for recommended versions of Pandas and PyArrow for 2.4.x >

[jira] [Updated] (SPARK-30834) Add note for recommended versions of Pandas and PyArrow for 2.4.x

2020-02-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-30834: - Description: CI testing for branch 2.4 has been with the versions below. These are recommended

[jira] [Updated] (SPARK-30834) Add note for recommended versions of Pandas and PyArrow for 2.4.x

2020-02-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-30834: - Description: CI testing for branch 2.4 has been with the versions below. These are recommended

[jira] [Created] (SPARK-30834) Add note for recommended versions of Pandas and PyArrow for 2.4.x

2020-02-14 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-30834: Summary: Add note for recommended versions of Pandas and PyArrow for 2.4.x Key: SPARK-30834 URL: https://issues.apache.org/jira/browse/SPARK-30834 Project: Spark

[jira] [Commented] (SPARK-30777) PySpark test_arrow tests fail with Pandas >= 1.0.0

2020-02-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033873#comment-17033873 ] Bryan Cutler commented on SPARK-30777: -- [~dongjoon] I don't think it's a blocker, only the tests

[jira] [Commented] (SPARK-30777) PySpark test_arrow tests fail with Pandas >= 1.0.0

2020-02-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033871#comment-17033871 ] Bryan Cutler commented on SPARK-30777: -- I'm working on the patch > PySpark test_arrow tests fail

[jira] [Created] (SPARK-30777) PySpark test_arrow tests fail with Pandas > 1.0.0

2020-02-10 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-30777: Summary: PySpark test_arrow tests fail with Pandas > 1.0.0 Key: SPARK-30777 URL: https://issues.apache.org/jira/browse/SPARK-30777 Project: Spark Issue

[jira] [Resolved] (SPARK-30640) Prevent unnessary copies of data in Arrow to Pandas conversion with Timestamps

2020-01-26 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-30640. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 27358

[jira] [Assigned] (SPARK-30640) Prevent unnessary copies of data in Arrow to Pandas conversion with Timestamps

2020-01-26 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-30640: Assignee: Bryan Cutler > Prevent unnessary copies of data in Arrow to Pandas conversion

[jira] [Created] (SPARK-30640) Prevent unnessary copies of data in Arrow to Pandas conversion with Timestamps

2020-01-24 Thread Bryan Cutler (Jira)
Bryan Cutler created SPARK-30640: Summary: Prevent unnessary copies of data in Arrow to Pandas conversion with Timestamps Key: SPARK-30640 URL: https://issues.apache.org/jira/browse/SPARK-30640

[jira] [Commented] (SPARK-24915) Calling SparkSession.createDataFrame with schema can throw exception

2020-01-20 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019770#comment-17019770 ] Bryan Cutler commented on SPARK-24915: -- [~jhereth] since there is already a lot of discussion on

[jira] [Reopened] (SPARK-24915) Calling SparkSession.createDataFrame with schema can throw exception

2020-01-13 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reopened SPARK-24915: -- > Calling SparkSession.createDataFrame with schema can throw exception >

[jira] [Commented] (SPARK-24915) Calling SparkSession.createDataFrame with schema can throw exception

2020-01-13 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014719#comment-17014719 ] Bryan Cutler commented on SPARK-24915: -- [~jhereth] apologies for closing prematurely, I didn't know

[jira] [Resolved] (SPARK-24915) Calling SparkSession.createDataFrame with schema can throw exception

2020-01-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-24915. -- Resolution: Won't Fix Closing in favor of fix in SPARK-29748 > Calling

[jira] [Resolved] (SPARK-22232) Row objects in pyspark created using the `Row(**kwars)` syntax do not get serialized/deserialized properly

2020-01-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-22232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-22232. -- Resolution: Won't Fix Closing in favor of fix in SPARK-29748 > Row objects in pyspark

[jira] [Assigned] (SPARK-29748) Remove sorting of fields in PySpark SQL Row creation

2020-01-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-29748: Assignee: Bryan Cutler > Remove sorting of fields in PySpark SQL Row creation >

[jira] [Resolved] (SPARK-29748) Remove sorting of fields in PySpark SQL Row creation

2020-01-10 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-29748. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26496

[jira] [Commented] (SPARK-28502) Error with struct conversion while using pandas_udf

2019-12-18 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16999545#comment-16999545 ] Bryan Cutler commented on SPARK-28502: -- The problem is that returning nested StructTypes is not

[jira] [Resolved] (SPARK-28502) Error with struct conversion while using pandas_udf

2019-12-18 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-28502. -- Resolution: Fixed > Error with struct conversion while using pandas_udf >

[jira] [Commented] (SPARK-29748) Remove sorting of fields in PySpark SQL Row creation

2019-12-03 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987252#comment-16987252 ] Bryan Cutler commented on SPARK-29748: -- [~zero323] I made some updates to the PR with remove the

[jira] [Commented] (SPARK-30063) Failure when returning a value from multiple Pandas UDFs

2019-12-03 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-30063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987154#comment-16987154 ] Bryan Cutler commented on SPARK-30063: -- I haven't looked at your bug report in detail but you are

[jira] [Assigned] (SPARK-29691) Estimator fit method fails to copy params (in PySpark)

2019-11-19 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-29691: Assignee: John Bauer > Estimator fit method fails to copy params (in PySpark) >

[jira] [Resolved] (SPARK-29691) Estimator fit method fails to copy params (in PySpark)

2019-11-19 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-29691. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26527

[jira] [Commented] (SPARK-29748) Remove sorting of fields in PySpark SQL Row creation

2019-11-19 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977749#comment-16977749 ] Bryan Cutler commented on SPARK-29748: -- [~zero323] and [~jhereth] this is targeted for Spark 3.0

[jira] [Commented] (SPARK-29748) Remove sorting of fields in PySpark SQL Row creation

2019-11-14 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974678#comment-16974678 ] Bryan Cutler commented on SPARK-29748: -- Thanks for discussing [~zero323] . The goal here is to only

[jira] [Commented] (SPARK-29493) Add MapType support for Arrow Java

2019-11-12 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972833#comment-16972833 ] Bryan Cutler commented on SPARK-29493: -- [~jalpan.randeri] this depends on SPARK-29376 for a newer

[jira] [Reopened] (SPARK-25351) Handle Pandas category type when converting from Python with Arrow

2019-11-12 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-25351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reopened SPARK-25351: -- reopening, this should be straightforward to add > Handle Pandas category type when converting

[jira] [Resolved] (SPARK-29798) Infers bytes as binary type in Python 3 at PySpark

2019-11-08 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-29798. -- Fix Version/s: 3.0.0 Resolution: Fixed Issue resolved by pull request 26432

[jira] [Assigned] (SPARK-29798) Infers bytes as binary type in Python 3 at PySpark

2019-11-08 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-29798: Assignee: Hyukjin Kwon > Infers bytes as binary type in Python 3 at PySpark >

[jira] [Updated] (SPARK-29376) Upgrade Apache Arrow to 0.15.1

2019-11-08 Thread Bryan Cutler (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-29376: - Description: Apache Arrow 0.15.0 was just released see
