[jira] [Commented] (SPARK-26699) Dataset column output discrepancies
[ https://issues.apache.org/jira/browse/SPARK-26699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16751159#comment-16751159 ]

Praveena commented on SPARK-26699:
----------------------------------

I am trying to understand why it behaves differently in Local and Cluster mode. Please let me know the mailing list so I can reach them. Thanks in advance.

> Dataset column output discrepancies
> -----------------------------------
>
> Key: SPARK-26699
> URL: https://issues.apache.org/jira/browse/SPARK-26699
> Project: Spark
> Issue Type: Question
> Components: Input/Output
> Affects Versions: 2.3.2
> Reporter: Praveena
> Priority: Major
>
> Hi,
>
> When I run my job in Local mode (meaning as standalone in Eclipse) with same
> parquet input files, the output is -
>
> locations
>
> [[[true, [[, phys...
> [[[true, [[, phys...
> [[[true, [[, phys...
> null
> [[[true, [[, phys...
> [[[true, [[, phys...
> [[[true, [[, phys...
> [[[true, [[, phys...
> [[[true, [[, phys...
> [[[true, [[, phys...
>
> But when I run the same code base with same input parquet files in the YARN
> cluster mode, my output is as below -
>
> locations
>
> [*WrappedArray*([tr...
> [*WrappedArray*([tr...
> [WrappedArray([tr...
> null
> [WrappedArray([tr...
> [WrappedArray([tr...
> [WrappedArray([tr...
> [WrappedArray([tr...
> [WrappedArray([tr...
> [WrappedArray([tr...
>
> It's appending WrappedArray :(
> I am using Apache Spark 2.3.2 version and the EMR Version is 5.19.0. What
> could be the reason for discrepancies in the output of certain Table columns?

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-26699) Dataset column output discrepancies
[ https://issues.apache.org/jira/browse/SPARK-26699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Praveena updated SPARK-26699:
-----------------------------
    Issue Type: Question (was: Bug)
[jira] [Updated] (SPARK-26699) Dataset column output discrepancies
[ https://issues.apache.org/jira/browse/SPARK-26699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Praveena updated SPARK-26699:
-----------------------------
    Description:
Hi,

When I run my job in Local mode (meaning as standalone in Eclipse) with same parquet input files, the output is -

locations

[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
null
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...

But when I run the same code base with same input parquet files in the YARN cluster mode, my output is as below -

locations

[*WrappedArray*([tr...
[*WrappedArray*([tr...
[WrappedArray([tr...
null
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...

It's appending WrappedArray :(
I am using Apache Spark 2.3.2 version and the EMR Version is 5.19.0. What could be the reason for discrepancies in the output of certain Table columns?

  was:
Hi,

When I run my job in Local mode (meaning as standalone in Eclipse) with same parquet input files, the output is -

locations

[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
null
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...

But when I run the same code base with same input parquet files in the YARN cluster mode, my output is as below -

locations

[*WrappedArray*([tr...
[*WrappedArray*([tr...
[WrappedArray([tr...
null
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...

It's appending WrappedArray :(
I am using Apache Spark 2.3.2 version and the EMR Version while cluster is 5.19.0. What could be the reason for discrepancies in the output of certain Table columns?
[jira] [Updated] (SPARK-26699) Dataset column output discrepancies
[ https://issues.apache.org/jira/browse/SPARK-26699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Praveena updated SPARK-26699:
-----------------------------
    Description:
Hi,

When I run my job in Local mode (meaning as standalone in Eclipse) with same parquet input files, the output is -

locations

[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
null
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...

But when I run the same code base with same input parquet files in the YARN cluster mode, my output is as below -

locations

[*WrappedArray*([tr...
[*WrappedArray*([tr...
[WrappedArray([tr...
null
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...

It's appending WrappedArray :(
I am using Apache Spark 2.3.2 version and the EMR Version while cluster is 5.19.0. What could be the reason for discrepancies in the output of certain Table columns?

  was:
Hi,

When I run my job in Local mode with same parquet input files, the output is -

locations

[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
null
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...

But when I run the same code base with same input parquet files in the YARN cluster mode, my output is as below -

locations

[*WrappedArray*([tr...
[*WrappedArray*([tr...
[WrappedArray([tr...
null
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...

It's appending WrappedArray :(
I am using Apache Spark 2.3.2 version and the EMR Version while cluster is 5.19.0. What could be the reason for discrepancies in the output of certain Table columns?
[jira] [Created] (SPARK-26699) Dataset column discrepancies between Parquet
Lakshmi Praveena created SPARK-26699:
-------------------------------------

Summary: Dataset column discrepancies between Parquet
Key: SPARK-26699
URL: https://issues.apache.org/jira/browse/SPARK-26699
Project: Spark
Issue Type: Bug
Components: Input/Output
Affects Versions: 2.3.2
Reporter: Lakshmi Praveena

Hi,

When I run my job in Local mode with same parquet input files, the output is -

locations

[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
null
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...
[[[true, [[, phys...

But when I run the same code base with same input parquet files in the YARN cluster mode, my output is as below -

locations

[*WrappedArray*([tr...
[*WrappedArray*([tr...
[WrappedArray([tr...
null
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...
[WrappedArray([tr...

It's appending WrappedArray :(
I am using Apache Spark 2.3.2 version and the EMR Version while cluster is 5.19.0. What could be the reason for discrepancies in the output of certain Table columns?
[jira] [Updated] (SPARK-26699) Dataset column output discrepancies
[ https://issues.apache.org/jira/browse/SPARK-26699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lakshmi Praveena updated SPARK-26699:
-------------------------------------
    Summary: Dataset column output discrepancies (was: Dataset column discrepancies between Parquet )