[jira] [Commented] (SPARK-48361) Correctness: CSV corrupt record filter with aggregate ignored

2024-05-23 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848936#comment-17848936 ] Ted Chester Jenks commented on SPARK-48361: --- Ah yes! Sorry [~bersprockets] I messed up my

[jira] [Updated] (SPARK-48361) Correctness: CSV corrupt record filter with aggregate ignored

2024-05-23 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Chester Jenks updated SPARK-48361: -- Description: Using corrupt record in CSV parsing for some data cleaning logic, I came

[jira] [Updated] (SPARK-48361) Correctness: CSV corrupt record filter with aggregate ignored

2024-05-22 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Chester Jenks updated SPARK-48361: -- Description: Using corrupt record in CSV parsing for some data cleaning logic, I came

[jira] [Commented] (SPARK-48361) Correctness: CSV corrupt record filter with aggregate ignored

2024-05-22 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848694#comment-17848694 ] Ted Chester Jenks commented on SPARK-48361: --- {code:java} +---+---+

[jira] [Updated] (SPARK-48361) Correctness: CSV corrupt record filter with aggregate ignored

2024-05-20 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Chester Jenks updated SPARK-48361: -- Description: Using corrupt record in CSV parsing for some data cleaning logic, I came

[jira] [Created] (SPARK-48361) Correctness: CSV corrupt record filter with aggregate ignored

2024-05-20 Thread Ted Chester Jenks (Jira)
Ted Chester Jenks created SPARK-48361: - Summary: Correctness: CSV corrupt record filter with aggregate ignored Key: SPARK-48361 URL: https://issues.apache.org/jira/browse/SPARK-48361 Project:

[jira] [Updated] (SPARK-48361) Correctness: CSV corrupt record filter with aggregate ignored

2024-05-20 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-48361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Chester Jenks updated SPARK-48361: -- Description: Using corrupt record in CSV parsing for some data cleaning logic, I came

[jira] [Resolved] (SPARK-43883) CTAS Command Nodes Prevent Some Optimizer Rules From Running

2024-03-05 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Chester Jenks resolved SPARK-43883. --- Resolution: Won't Fix > CTAS Command Nodes Prevent Some Optimizer Rules From

[jira] [Commented] (SPARK-47288) DataType __repr__ change breaks datatype checking (anti-)pattern

2024-03-05 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823717#comment-17823717 ] Ted Chester Jenks commented on SPARK-47288: --- [~gurwls223] I saw you on the original PR,

[jira] [Created] (SPARK-47288) DataType __repr__ change breaks datatype checking (anti-)pattern

2024-03-05 Thread Ted Chester Jenks (Jira)
Ted Chester Jenks created SPARK-47288: - Summary: DataType __repr__ change breaks datatype checking (anti-)pattern Key: SPARK-47288 URL: https://issues.apache.org/jira/browse/SPARK-47288 Project:

[jira] [Updated] (SPARK-47287) Aggregate in not causes

2024-03-05 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-47287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Chester Jenks updated SPARK-47287: -- Description: The below snippet is confirmed working with Spark 3.2.1 and broken

[jira] [Created] (SPARK-47287) Aggregate in not causes

2024-03-05 Thread Ted Chester Jenks (Jira)
Ted Chester Jenks created SPARK-47287: - Summary: Aggregate in not causes Key: SPARK-47287 URL: https://issues.apache.org/jira/browse/SPARK-47287 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-33152) SPIP: Constraint Propagation code causes OOM issues or increasing compilation time to hours

2024-03-05 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823506#comment-17823506 ] Ted Chester Jenks commented on SPARK-33152: --- [~ashahid7] I see. This is very painful for us

[jira] [Commented] (SPARK-33152) SPIP: Constraint Propagation code causes OOM issues or increasing compilation time to hours

2024-03-04 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-33152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823314#comment-17823314 ] Ted Chester Jenks commented on SPARK-33152: --- [~cloud_fan] [~ashahid7] We have seen a huge

[jira] [Commented] (SPARK-44142) Utility to convert python types to spark types compares Python "type" object rather than user's "tpe" for categorical data types

2023-06-22 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17736100#comment-17736100 ] Ted Chester Jenks commented on SPARK-44142: --- https://github.com/apache/spark/pull/41697 >

[jira] [Updated] (SPARK-44142) Utility to convert python types to spark types compares Python "type" object rather than user's "tpe" for categorical data types

2023-06-22 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-44142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Chester Jenks updated SPARK-44142: -- Description: In the typehints utility that converts python types to spark types, the

[jira] [Created] (SPARK-44142) Utility to convert python types to spark types compares Python "type" object rather than user's "tpe" for categorical data types

2023-06-22 Thread Ted Chester Jenks (Jira)
Ted Chester Jenks created SPARK-44142: - Summary: Utility to convert python types to spark types compares Python "type" object rather than user's "tpe" for categorical data types Key: SPARK-44142 URL:

[jira] [Updated] (SPARK-43883) CTAS Command Nodes Prevent Some Optimizer Rules From Running

2023-05-30 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Chester Jenks updated SPARK-43883: -- Description: The changes introduced to resolve SPARK-41713 in

[jira] [Commented] (SPARK-43883) CTAS Command Nodes Prevent Some Optimizer Rules From Running

2023-05-30 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17727638#comment-17727638 ] Ted Chester Jenks commented on SPARK-43883: --- Working on a fix in

[jira] [Updated] (SPARK-43883) CTAS Command Nodes Prevent Some Optimizer Rules From Running

2023-05-30 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Chester Jenks updated SPARK-43883: -- Description: The changes introduced to resolve SPARK-41713 in

[jira] [Updated] (SPARK-43883) CTAS Command Nodes Prevent Some Optimizer Rules From Running

2023-05-30 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Chester Jenks updated SPARK-43883: -- Description: The changes introduced to resolve SPARK-41713 in

[jira] [Updated] (SPARK-43883) CTAS Command Nodes Prevent Some Optimizer Rules From Running

2023-05-30 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-43883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Chester Jenks updated SPARK-43883: -- Attachment: Not Working - Create Table.png Working - 3.2.0.png

[jira] [Created] (SPARK-43883) CTAS Command Nodes Prevent Some Optimizer Rules From Running

2023-05-30 Thread Ted Chester Jenks (Jira)
Ted Chester Jenks created SPARK-43883: - Summary: CTAS Command Nodes Prevent Some Optimizer Rules From Running Key: SPARK-43883 URL: https://issues.apache.org/jira/browse/SPARK-43883 Project:

[jira] [Comment Edited] (SPARK-42359) Support row skipping when reading CSV files

2023-02-15 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689008#comment-17689008 ] Ted Chester Jenks edited comment on SPARK-42359 at 2/15/23 10:23 AM: -

[jira] [Comment Edited] (SPARK-42359) Support row skipping when reading CSV files

2023-02-15 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689008#comment-17689008 ] Ted Chester Jenks edited comment on SPARK-42359 at 2/15/23 10:23 AM: -

[jira] [Comment Edited] (SPARK-42359) Support row skipping when reading CSV files

2023-02-15 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689008#comment-17689008 ] Ted Chester Jenks edited comment on SPARK-42359 at 2/15/23 10:23 AM: -

[jira] [Commented] (SPARK-42359) Support row skipping when reading CSV files

2023-02-15 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689008#comment-17689008 ] Ted Chester Jenks commented on SPARK-42359: --- I have a PR that implements this feature:

[jira] [Commented] (SPARK-42373) Remove unused blank line removal from CSVExprUtils

2023-02-14 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17688531#comment-17688531 ] Ted Chester Jenks commented on SPARK-42373: --- For the main use-case for this, 

[jira] [Commented] (SPARK-42397) Inconsistent data produced by `FlatMapCoGroupsInPandas`

2023-02-13 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687794#comment-17687794 ] Ted Chester Jenks commented on SPARK-42397: --- Is it ever expected for df.show() and

[jira] [Updated] (SPARK-42397) Inconsistent data produced by `FlatMapCoGroupsInPandas`

2023-02-10 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Chester Jenks updated SPARK-42397: -- Description: We are seeing inconsistent data returned when using

[jira] [Updated] (SPARK-42397) Inconsistent data produced by `FlatMapCoGroupsInPandas`

2023-02-10 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Chester Jenks updated SPARK-42397: -- Description: We are seeing inconsistent data returned when using

[jira] [Commented] (SPARK-42397) Inconsistent data produced by `FlatMapCoGroupsInPandas`

2023-02-10 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17687011#comment-17687011 ] Ted Chester Jenks commented on SPARK-42397: --- {{test_df = spark.createDataFrame(}}

[jira] [Updated] (SPARK-42397) Inconsistent data produced by `FlatMapCoGroupsInPandas`

2023-02-10 Thread Ted Chester Jenks (Jira)
[ https://issues.apache.org/jira/browse/SPARK-42397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Chester Jenks updated SPARK-42397: -- Description: We are seeing inconsistent data returned when using

[jira] [Created] (SPARK-42397) Inconsistent data produced by `FlatMapCoGroupsInPandas`

2023-02-10 Thread Ted Chester Jenks (Jira)
Ted Chester Jenks created SPARK-42397: - Summary: Inconsistent data produced by `FlatMapCoGroupsInPandas` Key: SPARK-42397 URL: https://issues.apache.org/jira/browse/SPARK-42397 Project: Spark