[jira] [Updated] (SPARK-39380) Ignore comment syntax in dfs command

2022-06-04 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhang updated SPARK-39380: --- Description: In version 3.2, Spark SQL ignores semicolons in comment syntax when splitting input

[jira] [Created] (SPARK-39380) Ignore comment syntax in dfs command

2022-06-04 Thread Chen Zhang (Jira)
Chen Zhang created SPARK-39380: -- Summary: Ignore comment syntax in dfs command Key: SPARK-39380 URL: https://issues.apache.org/jira/browse/SPARK-39380 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-32978) Incorrect number of dynamic part metric

2020-09-24 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201516#comment-17201516 ] Chen Zhang commented on SPARK-32978: Hello [~yumwang], I used Spark with the default configuration to run this

[jira] [Commented] (SPARK-32956) Duplicate Columns in a csv file

2020-09-22 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200382#comment-17200382 ] Chen Zhang commented on SPARK-32956: Okay, I will submit a PR later. > Duplicate Columns in a csv

[jira] [Updated] (SPARK-32956) Duplicate Columns in a csv file

2020-09-22 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhang updated SPARK-32956: --- Component/s: (was: Spark Core) SQL > Duplicate Columns in a csv file >

[jira] [Commented] (SPARK-32956) Duplicate Columns in a csv file

2020-09-22 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200092#comment-17200092 ] Chen Zhang commented on SPARK-32956: In SPARK-16896, if the CSV data has duplicate column headers,

[jira] [Updated] (SPARK-32317) Parquet file loading with different schema(Decimal(N, P)) in files is not working as expected

2020-09-02 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhang updated SPARK-32317: --- Labels: (was: easyfix) > Parquet file loading with different schema(Decimal(N, P)) in files is

[jira] [Updated] (SPARK-32317) Parquet file loading with different schema(Decimal(N, P)) in files is not working as expected

2020-09-02 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhang updated SPARK-32317: --- Component/s: (was: PySpark) SQL > Parquet file loading with different

[jira] [Updated] (SPARK-32639) Support GroupType parquet mapkey field

2020-08-17 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhang updated SPARK-32639: --- Attachment: 000.snappy.parquet > Support GroupType parquet mapkey field >

[jira] [Created] (SPARK-32639) Support GroupType parquet mapkey field

2020-08-17 Thread Chen Zhang (Jira)
Chen Zhang created SPARK-32639: -- Summary: Support GroupType parquet mapkey field Key: SPARK-32639 URL: https://issues.apache.org/jira/browse/SPARK-32639 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-32317) Parquet file loading with different schema(Decimal(N, P)) in files is not working as expected

2020-07-19 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17160786#comment-17160786 ] Chen Zhang commented on SPARK-32317: Spark uses requiredSchema to convert the read parquet file

[jira] [Comment Edited] (SPARK-32317) Parquet file loading with different schema(Decimal(N, P)) in files is not working as expected

2020-07-19 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17160207#comment-17160207 ] Chen Zhang edited comment on SPARK-32317 at 7/19/20, 6:40 PM: -- The DECIMAL

[jira] [Comment Edited] (SPARK-32317) Parquet file loading with different schema(Decimal(N, P)) in files is not working as expected

2020-07-19 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17160207#comment-17160207 ] Chen Zhang edited comment on SPARK-32317 at 7/19/20, 6:38 PM: -- The DECIMAL

[jira] [Comment Edited] (SPARK-32317) Parquet file loading with different schema(Decimal(N, P)) in files is not working as expected

2020-07-19 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17160207#comment-17160207 ] Chen Zhang edited comment on SPARK-32317 at 7/19/20, 6:10 PM: -- The DECIMAL

[jira] [Commented] (SPARK-32317) Parquet file loading with different schema(Decimal(N, P)) in files is not working as expected

2020-07-17 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17160207#comment-17160207 ] Chen Zhang commented on SPARK-32317: The DECIMAL type in Parquet is stored as INT32, INT64,

[jira] [Commented] (SPARK-32226) JDBC TimeStamp predicates always append `.0`

2020-07-13 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17156501#comment-17156501 ] Chen Zhang commented on SPARK-32226: [~thesuperzapper], glad to receive your reply. I don't have an

[jira] [Commented] (SPARK-32226) JDBC TimeStamp predicates always append `.0`

2020-07-12 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17156287#comment-17156287 ] Chen Zhang commented on SPARK-32226: Hello [~thesuperzapper], The dialect of Informix database is

[jira] [Commented] (SPARK-31635) Spark SQL Sort fails when sorting big data points

2020-07-11 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17156111#comment-17156111 ] Chen Zhang commented on SPARK-31635: Hello [~george21], I have submitted a PR. Please take a look.

[jira] [Updated] (SPARK-32212) RDD.takeOrdered can choose to merge intermediate results in executor or driver

2020-07-07 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Zhang updated SPARK-32212: --- Summary: RDD.takeOrdered can choose to merge intermediate results in executor or driver (was:

[jira] [Created] (SPARK-32212) RDD.takeOrdered merge intermediate results can be configured in driver or executor

2020-07-07 Thread Chen Zhang (Jira)
Chen Zhang created SPARK-32212: -- Summary: RDD.takeOrdered merge intermediate results can be configured in driver or executor Key: SPARK-32212 URL: https://issues.apache.org/jira/browse/SPARK-32212

[jira] [Comment Edited] (SPARK-31635) Spark SQL Sort fails when sorting big data points

2020-07-07 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147653#comment-17147653 ] Chen Zhang edited comment on SPARK-31635 at 7/7/20, 10:14 AM: -- In fact, the

[jira] [Commented] (SPARK-31635) Spark SQL Sort fails when sorting big data points

2020-07-07 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152618#comment-17152618 ] Chen Zhang commented on SPARK-31635: This problem is not a bug, but I think it is necessary to

[jira] [Comment Edited] (SPARK-31635) Spark SQL Sort fails when sorting big data points

2020-07-07 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147653#comment-17147653 ] Chen Zhang edited comment on SPARK-31635 at 7/7/20, 10:12 AM: -- In fact, the

[jira] [Comment Edited] (SPARK-31635) Spark SQL Sort fails when sorting big data points

2020-06-29 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147653#comment-17147653 ] Chen Zhang edited comment on SPARK-31635 at 6/29/20, 9:51 AM: -- In fact, the

[jira] [Commented] (SPARK-31635) Spark SQL Sort fails when sorting big data points

2020-06-29 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-31635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147653#comment-17147653 ] Chen Zhang commented on SPARK-31635: In fact, the RDD API corresponding to _DF.sort().take()_ is

[jira] [Commented] (SPARK-32109) SQL hash function handling of nulls makes collision too likely

2020-06-28 Thread Chen Zhang (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17147328#comment-17147328 ] Chen Zhang commented on SPARK-32109: The logic in the source code can be represented by the

[jira] [Commented] (SPARK-10925) Exception when joining DataFrames

2016-09-02 Thread Chen Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15458996#comment-15458996 ] Chen Zhang commented on SPARK-10925: I have the same issue too. Very annoying. After I do several