[jira] [Comment Edited] (SPARK-22034) CrossValidator's training and testing set with different set of labels, resulting in encoder transform error

2017-09-25 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179938#comment-16179938 ] Bryan Cutler edited comment on SPARK-22034 at 9/25/17 11:18 PM: You would

[jira] [Commented] (SPARK-12717) pyspark broadcast fails when using multiple threads

2017-09-25 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179440#comment-16179440 ] Bryan Cutler commented on SPARK-12717: -- Hi [~avloss], the fix will be in Spark 2.1.2 which will be

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-09-01 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150924#comment-16150924 ] Bryan Cutler commented on SPARK-21190: -- I'm good with the API summary proposed by [~ueshin], but I'm

[jira] [Comment Edited] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-09-01 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150924#comment-16150924 ] Bryan Cutler edited comment on SPARK-21190 at 9/1/17 5:56 PM: -- I'm good with

[jira] [Updated] (SPARK-22324) Upgrade Arrow to version 0.8.0

2017-10-23 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-22324: - Description: Arrow version 0.8.0 is slated for release in early November, but I'd like to start

[jira] [Commented] (SPARK-22250) Be less restrictive on type checking

2017-10-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211750#comment-16211750 ] Bryan Cutler commented on SPARK-22250: -- [~ferdonline] maybe SPARK-20791 would help you out when

[jira] [Commented] (SPARK-22209) PySpark does not recognize imports from submodules

2017-10-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211772#comment-16211772 ] Bryan Cutler commented on SPARK-22209: -- As a workaround, you could probably do the following {code}

[jira] [Updated] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2017-11-15 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-21187: - Description: This is to track adding the remaining type support in Arrow Converters.

[jira] [Commented] (SPARK-22530) Add ArrayType Support for working with Pandas and Arrow

2017-11-15 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16254074#comment-16254074 ] Bryan Cutler commented on SPARK-22530: -- working on it > Add ArrayType Support for working with

[jira] [Created] (SPARK-22530) Add ArrayType Support for working with Pandas and Arrow

2017-11-15 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-22530: Summary: Add ArrayType Support for working with Pandas and Arrow Key: SPARK-22530 URL: https://issues.apache.org/jira/browse/SPARK-22530 Project: Spark

[jira] [Commented] (SPARK-22324) Upgrade Arrow to version 0.8.0

2017-11-15 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16254054#comment-16254054 ] Bryan Cutler commented on SPARK-22324: -- I started working on this to test out latest changes in

[jira] [Comment Edited] (SPARK-22534) Add integration test case to explicitly verify optional validity buffer

2017-11-15 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16254450#comment-16254450 ] Bryan Cutler edited comment on SPARK-22534 at 11/15/17 11:37 PM: - Opened

[jira] [Commented] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2017-12-04 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277549#comment-16277549 ] Bryan Cutler commented on SPARK-21187: -- Hi [~icexelloss], StructType has been added on the Java

[jira] [Created] (SPARK-22534) Add integration test case to explicitly verify optional validity buffer

2017-11-15 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-22534: Summary: Add integration test case to explicitly verify optional validity buffer Key: SPARK-22534 URL: https://issues.apache.org/jira/browse/SPARK-22534 Project:

[jira] [Closed] (SPARK-22534) Add integration test case to explicitly verify optional validity buffer

2017-11-15 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler closed SPARK-22534. > Add integration test case to explicitly verify optional validity buffer >

[jira] [Resolved] (SPARK-22534) Add integration test case to explicitly verify optional validity buffer

2017-11-15 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-22534. -- Resolution: Not A Problem Opened by mistake > Add integration test case to explicitly verify

[jira] [Updated] (SPARK-22484) PySpark DataFrame.write.csv(quote="") uses nullchar as quote

2017-11-15 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-22484: - Component/s: PySpark > PySpark DataFrame.write.csv(quote="") uses nullchar as quote >

[jira] [Commented] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2017-11-17 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16257742#comment-16257742 ] Bryan Cutler commented on SPARK-21187: -- [~icexelloss] It looks like there is a bug in older Arrow

[jira] [Commented] (SPARK-22147) BlockId.hashCode allocates a StringBuilder/String on each call

2017-11-03 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237775#comment-16237775 ] Bryan Cutler commented on SPARK-22147: -- Sorry, I linked the above PR to this JIRA accidentally >

[jira] [Created] (SPARK-22417) createDataFrame from a pandas.DataFrame reads datetime64 values as longs

2017-11-01 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-22417: Summary: createDataFrame from a pandas.DataFrame reads datetime64 values as longs Key: SPARK-22417 URL: https://issues.apache.org/jira/browse/SPARK-22417 Project:

[jira] [Updated] (SPARK-22324) Upgrade Arrow to version 0.8.0

2017-11-07 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-22324: - Description: Arrow version 0.8.0 is slated for release in early November, but I'd like to start

[jira] [Resolved] (SPARK-22209) PySpark does not recognize imports from submodules

2017-11-07 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-22209. -- Resolution: Fixed Fix Version/s: 2.3.0 Resolving this as fixed upstream by SPARK-21753,

[jira] [Commented] (SPARK-22209) PySpark does not recognize imports from submodules

2017-10-25 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219322#comment-16219322 ] Bryan Cutler commented on SPARK-22209: -- I tried the example with the latest master and did not get

[jira] [Updated] (SPARK-22324) Upgrade Arrow to version 0.8.0

2017-10-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-22324: - Description: Arrow version 0.8.0 is slated for release in early November, but I'd like to start

[jira] [Comment Edited] (SPARK-22323) Design doc for different types of pandas_udf

2017-10-20 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213182#comment-16213182 ] Bryan Cutler edited comment on SPARK-22323 at 10/20/17 8:30 PM: Is this

[jira] [Commented] (SPARK-22323) Design doc for different types of pandas_udf

2017-10-20 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213151#comment-16213151 ] Bryan Cutler commented on SPARK-22323: -- Should I close SPARK-1 since it looks like the docs will

[jira] [Commented] (SPARK-22323) Design doc for different types of pandas_udf

2017-10-20 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213182#comment-16213182 ] Bryan Cutler commented on SPARK-22323: -- Is this meant to be a user doc? > Design doc for different

[jira] [Updated] (SPARK-23874) Upgrade apache/arrow to 0.10.0

2018-05-14 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-23874: - Description: Version 0.10.0 will allow for the following improvements and bug fixes: * Allow

[jira] [Commented] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2018-05-14 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474664#comment-16474664 ] Bryan Cutler commented on SPARK-21187: -- Hi [~ewohlstadter], thanks for the interest!  The Map type

[jira] [Commented] (SPARK-22232) Row objects in pyspark created using the `Row(**kwars)` syntax do not get serialized/deserialized properly

2018-05-11 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472714#comment-16472714 ] Bryan Cutler commented on SPARK-22232: -- I'm closing the PR for now, will reopen for Spark 3.0.0.

[jira] [Updated] (SPARK-23161) Add missing APIs to Python GBTClassifier

2018-05-07 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-23161: - Description: GBTClassifier is missing {{featureSubsetStrategy}}. This should be moved to

[jira] [Created] (SPARK-24392) Mark pandas_udf as Experimental

2018-05-25 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-24392: Summary: Mark pandas_udf as Experimental Key: SPARK-24392 URL: https://issues.apache.org/jira/browse/SPARK-24392 Project: Spark Issue Type: Task

[jira] [Updated] (SPARK-24392) Mark pandas_udf as Experimental

2018-05-25 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-24392: - Priority: Blocker (was: Critical) > Mark pandas_udf as Experimental >

[jira] [Updated] (SPARK-24392) Mark pandas_udf as Experimental

2018-05-25 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-24392: - Fix Version/s: 2.3.1 > Mark pandas_udf as Experimental > --- > >

[jira] [Comment Edited] (SPARK-24392) Mark pandas_udf as Experimental

2018-05-25 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491305#comment-16491305 ] Bryan Cutler edited comment on SPARK-24392 at 5/25/18 9:53 PM: --- Targeting

[jira] [Commented] (SPARK-24392) Mark pandas_udf as Experimental

2018-05-25 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491305#comment-16491305 ] Bryan Cutler commented on SPARK-24392: -- Targeting 2.3.1 > Mark pandas_udf as Experimental >

[jira] [Updated] (SPARK-24324) Pandas Grouped Map UserDefinedFunction mixes column labels

2018-05-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-24324: - Summary: Pandas Grouped Map UserDefinedFunction mixes column labels (was: UserDefinedFunction

[jira] [Commented] (SPARK-24324) UserDefinedFunction mixes column labels

2018-05-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489533#comment-16489533 ] Bryan Cutler commented on SPARK-24324: -- I was able to reproduce, the problem is that when pyspark

[jira] [Assigned] (SPARK-24303) Update cloudpickle to v0.4.4

2018-05-18 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-24303: Assignee: Hyukjin Kwon > Update cloudpickle to v0.4.4 > > >

[jira] [Commented] (SPARK-24303) Update cloudpickle to v0.4.4

2018-05-18 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480912#comment-16480912 ] Bryan Cutler commented on SPARK-24303: -- Issue resolved by pull request 21350

[jira] [Resolved] (SPARK-24303) Update cloudpickle to v0.4.4

2018-05-18 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-24303. -- Resolution: Fixed Fix Version/s: 2.4.0 > Update cloudpickle to v0.4.4 >

[jira] [Created] (SPARK-24319) run-example can not print usage

2018-05-18 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-24319: Summary: run-example can not print usage Key: SPARK-24319 URL: https://issues.apache.org/jira/browse/SPARK-24319 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-24554) Add MapType Support for Arrow in PySpark

2018-06-13 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-24554: Summary: Add MapType Support for Arrow in PySpark Key: SPARK-24554 URL: https://issues.apache.org/jira/browse/SPARK-24554 Project: Spark Issue Type:

[jira] [Commented] (SPARK-24554) Add MapType Support for Arrow in PySpark

2018-06-13 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511691#comment-16511691 ] Bryan Cutler commented on SPARK-24554: -- There still is work to be done to add a Map logical type to

[jira] [Updated] (SPARK-23874) Upgrade apache/arrow to 0.10.0

2018-06-13 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-23874: - Description: Version 0.10.0 will allow for the following improvements and bug fixes: * Allow

[jira] [Assigned] (SPARK-23161) Add missing APIs to Python GBTClassifier

2018-05-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-23161: Assignee: Huaxin Gao > Add missing APIs to Python GBTClassifier >

[jira] [Resolved] (SPARK-23161) Add missing APIs to Python GBTClassifier

2018-05-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-23161. -- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21413

[jira] [Updated] (SPARK-24392) Mark pandas_udf as Experimental

2018-05-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-24392: - Fix Version/s: 2.4.0 > Mark pandas_udf as Experimental > --- > >

[jira] [Created] (SPARK-24444) Improve pandas_udf GROUPED_MAP docs to explain column assignment

2018-05-31 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-24444: Summary: Improve pandas_udf GROUPED_MAP docs to explain column assignment Key: SPARK-24444 URL: https://issues.apache.org/jira/browse/SPARK-24444 Project: Spark

[jira] [Updated] (SPARK-23874) Upgrade apache/arrow to 0.10.0

2018-05-31 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-23874: - Description: Version 0.10.0 will allow for the following improvements and bug fixes: * Allow

[jira] [Updated] (SPARK-24444) Improve pandas_udf GROUPED_MAP docs to explain column assignment

2018-05-31 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-24444: - Target Version/s: 2.3.1, 2.4.0 (was: 2.3.1) > Improve pandas_udf GROUPED_MAP docs to explain

[jira] [Commented] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2018-05-31 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497244#comment-16497244 ] Bryan Cutler commented on SPARK-21187: -- Hi [~teddy.choi], MapType still needs some work to be done

[jira] [Commented] (SPARK-23858) Need to apply pyarrow adjustments to complex types with DateType/TimestampType

2018-06-25 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522916#comment-16522916 ] Bryan Cutler commented on SPARK-23858: -- [~semanticbeeng] sorry, there aren't failing tests I can

[jira] [Commented] (SPARK-24579) SPIP: Standardize Optimized Data Exchange between Spark and DL/AI frameworks

2018-06-25 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522869#comment-16522869 ] Bryan Cutler commented on SPARK-24579: -- I left some comments on the shared doc, overall sounds

[jira] [Assigned] (SPARK-24057) put the real data type in the AssertionError message

2018-04-26 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-24057: Assignee: Huaxin Gao > put the real data type in the AssertionError message >

[jira] [Resolved] (SPARK-24057) put the real data type in the AssertionError message

2018-04-26 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-24057. -- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21159

[jira] [Resolved] (SPARK-24044) Explicitly print out skipped tests from unittest module

2018-04-26 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-24044. -- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21107

[jira] [Assigned] (SPARK-24044) Explicitly print out skipped tests from unittest module

2018-04-26 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-24044: Assignee: Hyukjin Kwon > Explicitly print out skipped tests from unittest module >

[jira] [Created] (SPARK-22324) Upgrade Arrow to version 0.8.0

2017-10-20 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-22324: Summary: Upgrade Arrow to version 0.8.0 Key: SPARK-22324 URL: https://issues.apache.org/jira/browse/SPARK-22324 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-21750) Use arrow 0.6.0

2017-10-20 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213012#comment-16213012 ] Bryan Cutler commented on SPARK-21750: -- Thanks [~dongjoon], I opened SPARK-22324 under the Arrow

[jira] [Comment Edited] (SPARK-22209) PySpark does not recognize imports from submodules

2017-10-20 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212890#comment-16212890 ] Bryan Cutler edited comment on SPARK-22209 at 10/20/17 5:01 PM: It does

[jira] [Commented] (SPARK-22209) PySpark does not recognize imports from submodules

2017-10-20 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16212890#comment-16212890 ] Bryan Cutler commented on SPARK-22209: -- It does seem like a bug to me so it should be fixed, I

[jira] [Commented] (SPARK-22126) Fix model-specific optimization support for ML tuning

2017-12-31 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307145#comment-16307145 ] Bryan Cutler commented on SPARK-22126: -- Hi All, I've been following the discussions here and the

[jira] [Commented] (SPARK-22126) Fix model-specific optimization support for ML tuning

2018-01-05 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313833#comment-16313833 ] Bryan Cutler commented on SPARK-22126: -- Hi [~bago.amirbekian], I was looking into similar pipeline

[jira] [Commented] (SPARK-23009) PySpark should not assume Pandas cols are a basestring type

2018-01-09 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16318990#comment-16318990 ] Bryan Cutler commented on SPARK-23009: -- I can put in a fix for this > PySpark should not assume

[jira] [Updated] (SPARK-23009) PySpark should not assume Pandas cols are a basestring type

2018-01-09 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-23009: - Description: When calling {{SparkSession.createDataFrame}} using a Pandas DataFrame as input,

[jira] [Updated] (SPARK-23009) PySpark should not assume Pandas cols are a basestring type

2018-01-09 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-23009: - Description: When calling {{SparkSession.createDataFrame}} using a Pandas DataFrame as input,

[jira] [Created] (SPARK-23009) PySpark should not assume Pandas cols are a basestring type

2018-01-09 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-23009: Summary: PySpark should not assume Pandas cols are a basestring type Key: SPARK-23009 URL: https://issues.apache.org/jira/browse/SPARK-23009 Project: Spark

[jira] [Commented] (SPARK-23018) PySpark creatDataFrame causes Pandas warning of assignment to a copy of a reference

2018-01-09 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319417#comment-16319417 ] Bryan Cutler commented on SPARK-23018: -- I can submit a PR > PySpark creatDataFrame causes Pandas

[jira] [Created] (SPARK-23018) PySpark creatDataFrame causes Pandas warning of assignment to a copy of a reference

2018-01-09 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-23018: Summary: PySpark creatDataFrame causes Pandas warning of assignment to a copy of a reference Key: SPARK-23018 URL: https://issues.apache.org/jira/browse/SPARK-23018

[jira] [Updated] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2018-01-10 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-21187: - Description: This is to track adding the remaining type support in Arrow Converters.

[jira] [Commented] (SPARK-23030) Decrease memory consumption with toPandas() collection using Arrow

2018-01-10 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16320942#comment-16320942 ] Bryan Cutler commented on SPARK-23030: -- I'm looking into this, will submit a WIP PR if I see an

[jira] [Created] (SPARK-23030) Decrease memory consumption with toPandas() collection using Arrow

2018-01-10 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-23030: Summary: Decrease memory consumption with toPandas() collection using Arrow Key: SPARK-23030 URL: https://issues.apache.org/jira/browse/SPARK-23030 Project: Spark

[jira] [Commented] (SPARK-12717) pyspark broadcast fails when using multiple threads

2018-01-15 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326459#comment-16326459 ] Bryan Cutler commented on SPARK-12717: -- Hi [~codlife], you can use Spark 2.2.1 which was released in

[jira] [Commented] (SPARK-23159) Update Cloudpickle to match version 0.4.2

2018-01-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332752#comment-16332752 ] Bryan Cutler commented on SPARK-23159: -- I can work on this > Update Cloudpickle to match version

[jira] [Commented] (SPARK-23109) ML 2.3 QA: API: Python API coverage

2018-01-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332698#comment-16332698 ] Bryan Cutler commented on SPARK-23109: -- I did the following: generated HTML doc and checked for

[jira] [Commented] (SPARK-23109) ML 2.3 QA: API: Python API coverage

2018-01-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332703#comment-16332703 ] Bryan Cutler commented on SPARK-23109: -- [~josephkb] the image module is missing many of the get*

[jira] [Created] (SPARK-23159) Update Cloudpickle to match version 0.4.2

2018-01-19 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-23159: Summary: Update Cloudpickle to match version 0.4.2 Key: SPARK-23159 URL: https://issues.apache.org/jira/browse/SPARK-23159 Project: Spark Issue Type:

[jira] [Commented] (SPARK-22711) _pickle.PicklingError: args[0] from __newobj__ args has the wrong class from cloudpickle.py

2018-01-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338116#comment-16338116 ] Bryan Cutler commented on SPARK-22711: -- Yes, normally you would not need to import inside the

[jira] [Commented] (SPARK-22711) _pickle.PicklingError: args[0] from __newobj__ args has the wrong class from cloudpickle.py

2018-01-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338070#comment-16338070 ] Bryan Cutler commented on SPARK-22711: -- Hi [~PrateekRM], here is your code trimmed down to where the

[jira] [Commented] (SPARK-23109) ML 2.3 QA: API: Python API coverage

2018-01-17 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16329481#comment-16329481 ] Bryan Cutler commented on SPARK-23109: -- [~josephkb] I can take this, thanks! > ML 2.3 QA: API:

[jira] [Created] (SPARK-23258) Should not split Arrow record batches based on row count

2018-01-29 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-23258: Summary: Should not split Arrow record batches based on row count Key: SPARK-23258 URL: https://issues.apache.org/jira/browse/SPARK-23258 Project: Spark

[jira] [Commented] (SPARK-23109) ML 2.3 QA: API: Python API coverage

2018-01-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343665#comment-16343665 ] Bryan Cutler commented on SPARK-23109: -- Thanks [~mlnick], yes this is done. > ML 2.3 QA: API:

[jira] [Resolved] (SPARK-23109) ML 2.3 QA: API: Python API coverage

2018-01-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-23109. -- Resolution: Done > ML 2.3 QA: API: Python API coverage > --- >

[jira] [Comment Edited] (SPARK-23109) ML 2.3 QA: API: Python API coverage

2018-01-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332698#comment-16332698 ] Bryan Cutler edited comment on SPARK-23109 at 1/29/18 5:25 PM: --- I did the

[jira] [Comment Edited] (SPARK-23109) ML 2.3 QA: API: Python API coverage

2018-01-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332698#comment-16332698 ] Bryan Cutler edited comment on SPARK-23109 at 1/29/18 5:26 PM: --- I did the

[jira] [Updated] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2018-02-08 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-21187: - Description: This is to track adding the remaining type support in Arrow Converters. Currently,

[jira] [Updated] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2018-02-08 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-21187: - Description: This is to track adding the remaining type support in Arrow Converters. Currently,

[jira] [Updated] (SPARK-21187) Complete support for remaining Spark data types in Arrow Converters

2018-02-08 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-21187: - Description: This is to track adding the remaining type support in Arrow Converters. Currently,

[jira] [Commented] (SPARK-23244) Incorrect handling of default values when deserializing python wrappers of scala transformers

2018-02-08 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357504#comment-16357504 ] Bryan Cutler commented on SPARK-23244: -- This is the same issue as SPARK-21685 caused by pyspark not

[jira] [Comment Edited] (SPARK-23244) Incorrect handling of default values when deserializing python wrappers of scala transformers

2018-02-08 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357504#comment-16357504 ] Bryan Cutler edited comment on SPARK-23244 at 2/8/18 8:08 PM: -- This is the

[jira] [Updated] (SPARK-23360) SparkSession.createDataFrame timestamps can be incorrect with non-Arrow codepath

2018-02-09 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-23360: - Summary: SparkSession.createDataFrame timestamps can be incorrect with non-Arrow codepath (was:

[jira] [Updated] (SPARK-23159) Update Cloudpickle to match version 0.4.3

2018-02-13 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-23159: - Summary: Update Cloudpickle to match version 0.4.3 (was: Update Cloudpickle to match version

[jira] [Updated] (SPARK-23159) Update Cloudpickle to match version 0.4.3

2018-02-13 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-23159: - Description: Update PySpark's version of Cloudpickle to match version 0.4.3.  The reasons for

[jira] [Commented] (SPARK-22126) Fix model-specific optimization support for ML tuning

2018-01-02 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308868#comment-16308868 ] Bryan Cutler commented on SPARK-22126: -- Thanks for taking a look [~josephkb]! I believe it's

[jira] [Assigned] (SPARK-24976) Allow None for Decimal type conversion (specific to PyArrow 0.9.0)

2018-07-31 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-24976: Assignee: Hyukjin Kwon > Allow None for Decimal type conversion (specific to PyArrow

[jira] [Resolved] (SPARK-24976) Allow None for Decimal type conversion (specific to PyArrow 0.9.0)

2018-07-31 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-24976. -- Resolution: Fixed Fix Version/s: 2.3.2 2.4.0 Issue resolved by pull

[jira] [Comment Edited] (SPARK-25060) PySpark UDF in case statement is always run

2018-08-08 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573866#comment-16573866 ] Bryan Cutler edited comment on SPARK-25060 at 8/8/18 9:03 PM: -- I believe

[jira] [Commented] (SPARK-25060) PySpark UDF in case statement is always run

2018-08-08 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573866#comment-16573866 ] Bryan Cutler commented on SPARK-25060: -- I believe this was brought up here

[jira] [Resolved] (SPARK-23874) Upgrade apache/arrow to 0.10.0

2018-08-14 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-23874. -- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21939
