[jira] [Assigned] (SPARK-36830) Read/write dataframes with ANSI intervals from/to JSON files
[ https://issues.apache.org/jira/browse/SPARK-36830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36830: Assignee: (was: Apache Spark) > Read/write dataframes with ANSI intervals from/to JSON files > > > Key: SPARK-36830 > URL: https://issues.apache.org/jira/browse/SPARK-36830 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > Implement writing and reading of ANSI interval columns (year-month and > day-time intervals) in dataframes to/from the JSON datasource. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36830) Read/write dataframes with ANSI intervals from/to JSON files
[ https://issues.apache.org/jira/browse/SPARK-36830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36830: Assignee: Apache Spark > Read/write dataframes with ANSI intervals from/to JSON files > > > Key: SPARK-36830 > URL: https://issues.apache.org/jira/browse/SPARK-36830 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Assignee: Apache Spark >Priority: Major > > Implement writing and reading of ANSI interval columns (year-month and > day-time intervals) in dataframes to/from the JSON datasource.
[jira] [Commented] (SPARK-36830) Read/write dataframes with ANSI intervals from/to JSON files
[ https://issues.apache.org/jira/browse/SPARK-36830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422555#comment-17422555 ] Apache Spark commented on SPARK-36830: -- User 'sarutak' has created a pull request for this issue: https://github.com/apache/spark/pull/34155 > Read/write dataframes with ANSI intervals from/to JSON files > > > Key: SPARK-36830 > URL: https://issues.apache.org/jira/browse/SPARK-36830 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > Implement writing and reading of ANSI interval columns (year-month and > day-time intervals) in dataframes to/from the JSON datasource.
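For readers unfamiliar with ANSI intervals: a year-month interval is just a signed month count rendered in a "Y-M" text form, which is the natural shape for a JSON string field. The sketch below is a hypothetical pure-Python illustration of that round trip; the function names and the exact serialized form are assumptions for illustration, not Spark's actual implementation.

```python
# Hypothetical sketch of serializing a year-month interval to the ANSI
# "Y-M" text form a JSON writer could emit, and parsing it back.
# Not Spark's implementation -- an illustration of the data shape only.

def format_year_month(months: int) -> str:
    """Render a month count as an ANSI year-month body, e.g. 14 -> '1-2'."""
    sign = "-" if months < 0 else ""
    m = abs(months)
    return f"{sign}{m // 12}-{m % 12}"

def parse_year_month(text: str) -> int:
    """Inverse of format_year_month: '1-2' -> 14 months."""
    sign = -1 if text.startswith("-") else 1
    years, months = text.lstrip("-").split("-")
    return sign * (int(years) * 12 + int(months))
```

A round trip such as `parse_year_month(format_year_month(n)) == n` is the invariant a JSON datasource implementation would need to preserve.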
[jira] [Updated] (SPARK-36900) "SPARK-36464: size returns correct positive number even with over 2GB data" will oom with JDK17
[ https://issues.apache.org/jira/browse/SPARK-36900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Jie updated SPARK-36900: - Description: Execute

{code:java}
build/mvn clean install -pl core -am -Dtest=none -DwildcardSuites=org.apache.spark.util.io.ChunkedByteBufferOutputStreamSuite
{code}

with JDK 17:

{code:java}
ChunkedByteBufferOutputStreamSuite:
- empty output
- write a single byte
- write a single near boundary
- write a single at boundary
- single chunk output
- single chunk output at boundary size
- multiple chunk output
- multiple chunk output at boundary size
*** RUN ABORTED ***
java.lang.OutOfMemoryError: Java heap space
at java.base/java.lang.Integer.valueOf(Integer.java:1081)
at scala.runtime.BoxesRunTime.boxToInteger(BoxesRunTime.java:67)
at org.apache.spark.util.io.ChunkedByteBufferOutputStream.allocateNewChunkIfNeeded(ChunkedByteBufferOutputStream.scala:87)
at org.apache.spark.util.io.ChunkedByteBufferOutputStream.write(ChunkedByteBufferOutputStream.scala:75)
at java.base/java.io.OutputStream.write(OutputStream.java:127)
at org.apache.spark.util.io.ChunkedByteBufferOutputStreamSuite.$anonfun$new$22(ChunkedByteBufferOutputStreamSuite.scala:127)
at org.apache.spark.util.io.ChunkedByteBufferOutputStreamSuite$$Lambda$179/0x0008011a75d8.apply(Unknown Source)
at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
{code}

> "SPARK-36464: size returns correct positive number even with over 2GB data" > will oom with JDK17 > > Key: SPARK-36900 > URL: https://issues.apache.org/jira/browse/SPARK-36900 > Project: Spark > Issue Type: Sub-task > Components: Spark Core > Affects Versions: 3.3.0 > Reporter: Yang Jie > Priority: Major
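The aborted test writes more than 2 GB through a stream that allocates a new fixed-size chunk whenever the current one fills up (the `allocateNewChunkIfNeeded` frame in the trace above). A minimal Python analogue of that allocation scheme, purely illustrative and not Spark's Scala code:

```python
# Illustrative analogue of a chunked output stream: a new fixed-size chunk
# is allocated only when the current one is full, mirroring the idea behind
# ChunkedByteBufferOutputStream.allocateNewChunkIfNeeded. Not Spark's code.

class ChunkedOutput:
    def __init__(self, chunk_size: int):
        self.chunk_size = chunk_size
        self.chunks = []   # list of bytearray chunks
        self.pos = 0       # write position inside the last chunk

    def _allocate_if_needed(self):
        if not self.chunks or self.pos == self.chunk_size:
            self.chunks.append(bytearray(self.chunk_size))
            self.pos = 0

    def write(self, data: bytes):
        for b in data:                       # per-byte write path, as in the trace
            self._allocate_if_needed()
            self.chunks[-1][self.pos] = b
            self.pos += 1

    def size(self) -> int:
        if not self.chunks:
            return 0
        return (len(self.chunks) - 1) * self.chunk_size + self.pos
```

Note the stack trace shows the failure inside `BoxesRunTime.boxToInteger` on the per-byte write path, i.e. allocation pressure while the suite pushes over 2 GB through a constrained heap.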
[jira] [Created] (SPARK-36900) "SPARK-36464: size returns correct positive number even with over 2GB data" will oom with JDK17
Yang Jie created SPARK-36900: Summary: "SPARK-36464: size returns correct positive number even with over 2GB data" will oom with JDK17 Key: SPARK-36900 URL: https://issues.apache.org/jira/browse/SPARK-36900 Project: Spark Issue Type: Sub-task Components: Spark Core Affects Versions: 3.3.0 Reporter: Yang Jie
[jira] [Resolved] (SPARK-36899) Support ILIKE API on R
[ https://issues.apache.org/jira/browse/SPARK-36899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta resolved SPARK-36899. Fix Version/s: 3.3.0 Assignee: Leona Yoda Resolution: Fixed Issue resolved in https://github.com/apache/spark/pull/34152 > Support ILIKE API on R > -- > > Key: SPARK-36899 > URL: https://issues.apache.org/jira/browse/SPARK-36899 > Project: Spark > Issue Type: Sub-task > Components: R >Affects Versions: 3.3.0 >Reporter: Leona Yoda >Assignee: Leona Yoda >Priority: Major > Fix For: 3.3.0 > > > Support ILIKE (case insensitive LIKE) API on R
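ILIKE is LIKE with case-insensitive matching: `%` matches any run of characters and `_` matches exactly one. A hypothetical pure-Python sketch of those semantics via a regex translation, not SparkR's implementation:

```python
import re

def ilike(value: str, pattern: str) -> bool:
    """Case-insensitive LIKE: '%' matches any run, '_' matches one character.
    Illustrative sketch of the semantics only, not Spark's matcher."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.IGNORECASE) is not None
```

For example, `ilike("Spark", "sp%")` holds even though plain LIKE with the same pattern would not.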
[jira] [Assigned] (SPARK-36896) Return boolean for `dropTempView` and `dropGlobalTempView`
[ https://issues.apache.org/jira/browse/SPARK-36896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-36896: Assignee: Xinrong Meng > Return boolean for `dropTempView` and `dropGlobalTempView` > -- > > Key: SPARK-36896 > URL: https://issues.apache.org/jira/browse/SPARK-36896 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.3.0 > Environment: Currently `dropTempView` and `dropGlobalTempView` don't > have a return value, which conflicts with their docstring: > `Returns true if this view is dropped successfully, false otherwise.`. > > We should fix that. >Reporter: Xinrong Meng >Assignee: Xinrong Meng >Priority: Major >
[jira] [Resolved] (SPARK-36896) Return boolean for `dropTempView` and `dropGlobalTempView`
[ https://issues.apache.org/jira/browse/SPARK-36896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-36896. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34147 [https://github.com/apache/spark/pull/34147] > Return boolean for `dropTempView` and `dropGlobalTempView` > -- > > Key: SPARK-36896 > URL: https://issues.apache.org/jira/browse/SPARK-36896 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.3.0 > Environment: Currently `dropTempView` and `dropGlobalTempView` don't > have a return value, which conflicts with their docstring: > `Returns true if this view is dropped successfully, false otherwise.`. > > We should fix that. >Reporter: Xinrong Meng >Assignee: Xinrong Meng >Priority: Major > Fix For: 3.3.0 >
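The bug pattern is easy to reproduce outside Spark: a wrapper whose docstring promises a boolean but silently discards the underlying result. A minimal sketch; `_catalog_drop` is a made-up stand-in for the JVM-side call, not a real PySpark function:

```python
# Illustration of the SPARK-36896 pattern: docstring promises a bool, but the
# buggy wrapper implicitly returns None. `_catalog_drop` is hypothetical.

def _catalog_drop(name: str) -> bool:
    return name == "known_view"   # pretend only this view exists

def drop_temp_view_buggy(name: str):
    """Returns true if this view is dropped successfully, false otherwise."""
    _catalog_drop(name)           # result silently discarded -> returns None

def drop_temp_view_fixed(name: str) -> bool:
    """Returns true if this view is dropped successfully, false otherwise."""
    return _catalog_drop(name)    # propagate the boolean to the caller
```

The fix is the one-line change shown in `drop_temp_view_fixed`: return the underlying result instead of dropping it.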
[jira] [Commented] (SPARK-36892) Disable batch fetch for a shuffle when push based shuffle is enabled
[ https://issues.apache.org/jira/browse/SPARK-36892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422533#comment-17422533 ] Gengliang Wang commented on SPARK-36892: [~zhouyejoe] Thank you! > Disable batch fetch for a shuffle when push based shuffle is enabled > > > Key: SPARK-36892 > URL: https://issues.apache.org/jira/browse/SPARK-36892 > Project: Spark > Issue Type: Bug > Components: Shuffle >Affects Versions: 3.2.0 >Reporter: Mridul Muralidharan >Priority: Blocker > > When push-based shuffle is enabled, merged mapper shuffle output is fetched > efficiently. Unfortunately, this currently interacts badly with > spark.sql.adaptive.fetchShuffleBlocksInBatch, potentially causing shuffle > fetch to hang and/or duplicate data to be fetched, causing correctness issues. > Given that batch fetch does not benefit Spark stages reading merged blocks when > push-based shuffle is enabled, ShuffleBlockFetcherIterator.doBatchFetch can > be disabled when push-based shuffle is enabled. > Thanks to [~Ngone51] for surfacing this issue. > +CC [~Gengliang.Wang]
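As a stopgap before the fix lands, the ticket's reasoning suggests batch fetch can simply be turned off while push-based shuffle stays on. A hypothetical spark-defaults.conf fragment illustrating that combination (an assumption drawn from the discussion, not an official recommendation):

```properties
# Hypothetical mitigation sketch: keep push-based shuffle enabled,
# but disable adaptive batch fetch to avoid the bad interaction.
spark.shuffle.push.enabled                    true
spark.sql.adaptive.fetchShuffleBlocksInBatch  false
```

Both keys are real Spark configuration entries; the pairing as a workaround is this editor's reading of the ticket, so verify against the eventual fix before relying on it.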
[jira] [Resolved] (SPARK-36893) Upgrade Mesos to 1.4.3
[ https://issues.apache.org/jira/browse/SPARK-36893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-36893. --- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34144 [https://github.com/apache/spark/pull/34144] > Upgrade Mesos to 1.4.3 > > > Key: SPARK-36893 > URL: https://issues.apache.org/jira/browse/SPARK-36893 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.1.2 >Reporter: Zhongwei Zhu >Assignee: Zhongwei Zhu >Priority: Minor > Fix For: 3.3.0 > > > Upgrade Mesos to 1.4.3 to fix CVE-2018-11793
[jira] [Assigned] (SPARK-36893) Upgrade Mesos to 1.4.3
[ https://issues.apache.org/jira/browse/SPARK-36893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-36893: - Assignee: Zhongwei Zhu > Upgrade Mesos to 1.4.3 > > > Key: SPARK-36893 > URL: https://issues.apache.org/jira/browse/SPARK-36893 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.1.2 >Reporter: Zhongwei Zhu >Assignee: Zhongwei Zhu >Priority: Minor > > Upgrade Mesos to 1.4.3 to fix CVE-2018-11793
[jira] [Commented] (SPARK-36796) Make all unit tests pass on Java 17
[ https://issues.apache.org/jira/browse/SPARK-36796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422517#comment-17422517 ] Apache Spark commented on SPARK-36796: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/34153 > Make all unit tests pass on Java 17 > --- > > Key: SPARK-36796 > URL: https://issues.apache.org/jira/browse/SPARK-36796 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 3.3.0 >Reporter: Yuming Wang >Priority: Major >
[jira] [Assigned] (SPARK-36796) Make all unit tests pass on Java 17
[ https://issues.apache.org/jira/browse/SPARK-36796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36796: Assignee: (was: Apache Spark) > Make all unit tests pass on Java 17 > --- > > Key: SPARK-36796 > URL: https://issues.apache.org/jira/browse/SPARK-36796 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 3.3.0 >Reporter: Yuming Wang >Priority: Major >
[jira] [Assigned] (SPARK-36796) Make all unit tests pass on Java 17
[ https://issues.apache.org/jira/browse/SPARK-36796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36796: Assignee: Apache Spark > Make all unit tests pass on Java 17 > --- > > Key: SPARK-36796 > URL: https://issues.apache.org/jira/browse/SPARK-36796 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 3.3.0 >Reporter: Yuming Wang >Assignee: Apache Spark >Priority: Major >
[jira] [Commented] (SPARK-36796) Make all unit tests pass on Java 17
[ https://issues.apache.org/jira/browse/SPARK-36796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422516#comment-17422516 ] Apache Spark commented on SPARK-36796: -- User 'LuciferYang' has created a pull request for this issue: https://github.com/apache/spark/pull/34153 > Make all unit tests pass on Java 17 > --- > > Key: SPARK-36796 > URL: https://issues.apache.org/jira/browse/SPARK-36796 > Project: Spark > Issue Type: Sub-task > Components: Tests >Affects Versions: 3.3.0 >Reporter: Yuming Wang >Priority: Major >
[jira] [Commented] (SPARK-36899) Support ILIKE API on R
[ https://issues.apache.org/jira/browse/SPARK-36899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422507#comment-17422507 ] Apache Spark commented on SPARK-36899: -- User 'yoda-mon' has created a pull request for this issue: https://github.com/apache/spark/pull/34152 > Support ILIKE API on R > -- > > Key: SPARK-36899 > URL: https://issues.apache.org/jira/browse/SPARK-36899 > Project: Spark > Issue Type: Sub-task > Components: R >Affects Versions: 3.3.0 >Reporter: Leona Yoda >Priority: Major > > Support ILIKE (case insensitive LIKE) API on R
[jira] [Assigned] (SPARK-36899) Support ILIKE API on R
[ https://issues.apache.org/jira/browse/SPARK-36899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36899: Assignee: (was: Apache Spark) > Support ILIKE API on R > -- > > Key: SPARK-36899 > URL: https://issues.apache.org/jira/browse/SPARK-36899 > Project: Spark > Issue Type: Sub-task > Components: R >Affects Versions: 3.3.0 >Reporter: Leona Yoda >Priority: Major > > Support ILIKE (case insensitive LIKE) API on R
[jira] [Commented] (SPARK-36899) Support ILIKE API on R
[ https://issues.apache.org/jira/browse/SPARK-36899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422506#comment-17422506 ] Apache Spark commented on SPARK-36899: -- User 'yoda-mon' has created a pull request for this issue: https://github.com/apache/spark/pull/34152 > Support ILIKE API on R > -- > > Key: SPARK-36899 > URL: https://issues.apache.org/jira/browse/SPARK-36899 > Project: Spark > Issue Type: Sub-task > Components: R >Affects Versions: 3.3.0 >Reporter: Leona Yoda >Priority: Major > > Support ILIKE (case insensitive LIKE) API on R
[jira] [Assigned] (SPARK-36899) Support ILIKE API on R
[ https://issues.apache.org/jira/browse/SPARK-36899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36899: Assignee: Apache Spark > Support ILIKE API on R > -- > > Key: SPARK-36899 > URL: https://issues.apache.org/jira/browse/SPARK-36899 > Project: Spark > Issue Type: Sub-task > Components: R >Affects Versions: 3.3.0 >Reporter: Leona Yoda >Assignee: Apache Spark >Priority: Major > > Support ILIKE (case insensitive LIKE) API on R
[jira] [Assigned] (SPARK-36886) Inline type hints for python/pyspark/sql/context.py
[ https://issues.apache.org/jira/browse/SPARK-36886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36886: Assignee: (was: Apache Spark) > Inline type hints for python/pyspark/sql/context.py > --- > > Key: SPARK-36886 > URL: https://issues.apache.org/jira/browse/SPARK-36886 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dgd_contributor >Priority: Major > > Inline the type hints from python/pyspark/sql/context.pyi into > python/pyspark/sql/context.py.
[jira] [Commented] (SPARK-36886) Inline type hints for python/pyspark/sql/context.py
[ https://issues.apache.org/jira/browse/SPARK-36886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422503#comment-17422503 ] Apache Spark commented on SPARK-36886: -- User 'dgd-contributor' has created a pull request for this issue: https://github.com/apache/spark/pull/34151 > Inline type hints for python/pyspark/sql/context.py > --- > > Key: SPARK-36886 > URL: https://issues.apache.org/jira/browse/SPARK-36886 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dgd_contributor >Priority: Major > > Inline the type hints from python/pyspark/sql/context.pyi into > python/pyspark/sql/context.py.
[jira] [Assigned] (SPARK-36886) Inline type hints for python/pyspark/sql/context.py
[ https://issues.apache.org/jira/browse/SPARK-36886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36886: Assignee: Apache Spark > Inline type hints for python/pyspark/sql/context.py > --- > > Key: SPARK-36886 > URL: https://issues.apache.org/jira/browse/SPARK-36886 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.3.0 >Reporter: dgd_contributor >Assignee: Apache Spark >Priority: Major > > Inline the type hints from python/pyspark/sql/context.pyi into > python/pyspark/sql/context.py.
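"Inlining" here means the annotations move from the `.pyi` stub file into the `.py` source itself, so type checkers can also verify the function bodies. The signature below is a simplified, hypothetical stand-in for a method in `python/pyspark/sql/context.py`, not the real API:

```python
# Before: the types lived only in context.pyi; after: they are written
# inline in the .py source. Hypothetical simplified signature for
# illustration, not PySpark's actual method.
from typing import List, Optional

def table_names(db_name: Optional[str] = None) -> List[str]:
    """Return table names, with the hint inline rather than in a .pyi stub."""
    return [] if db_name is None else [f"{db_name}.example"]
```

With the hint inline, `table_names.__annotations__` is populated at runtime and mypy can check the body, which a stub file alone cannot do.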
[jira] [Created] (SPARK-36899) Support ILIKE API on R
Leona Yoda created SPARK-36899: -- Summary: Support ILIKE API on R Key: SPARK-36899 URL: https://issues.apache.org/jira/browse/SPARK-36899 Project: Spark Issue Type: Sub-task Components: R Affects Versions: 3.3.0 Reporter: Leona Yoda Support ILIKE (case insensitive LIKE) API on R
[jira] [Commented] (SPARK-36898) Make the shuffle hash join factor configurable
[ https://issues.apache.org/jira/browse/SPARK-36898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422499#comment-17422499 ] Apache Spark commented on SPARK-36898: -- User 'JkSelf' has created a pull request for this issue: https://github.com/apache/spark/pull/34150 > Make the shuffle hash join factor configurable > -- > > Key: SPARK-36898 > URL: https://issues.apache.org/jira/browse/SPARK-36898 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.2 >Reporter: Ke Jia >Priority: Major > > Make the shuffle hash join factor configurable.
[jira] [Assigned] (SPARK-36898) Make the shuffle hash join factor configurable
[ https://issues.apache.org/jira/browse/SPARK-36898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36898: Assignee: Apache Spark > Make the shuffle hash join factor configurable > -- > > Key: SPARK-36898 > URL: https://issues.apache.org/jira/browse/SPARK-36898 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.2 >Reporter: Ke Jia >Assignee: Apache Spark >Priority: Major > > Make the shuffle hash join factor configurable.
[jira] [Commented] (SPARK-36898) Make the shuffle hash join factor configurable
[ https://issues.apache.org/jira/browse/SPARK-36898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422498#comment-17422498 ] Apache Spark commented on SPARK-36898: -- User 'JkSelf' has created a pull request for this issue: https://github.com/apache/spark/pull/34150 > Make the shuffle hash join factor configurable > -- > > Key: SPARK-36898 > URL: https://issues.apache.org/jira/browse/SPARK-36898 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.2 >Reporter: Ke Jia >Priority: Major > > Make the shuffle hash join factor configurable.
[jira] [Assigned] (SPARK-36898) Make the shuffle hash join factor configurable
[ https://issues.apache.org/jira/browse/SPARK-36898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36898: Assignee: (was: Apache Spark) > Make the shuffle hash join factor configurable > -- > > Key: SPARK-36898 > URL: https://issues.apache.org/jira/browse/SPARK-36898 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.2 >Reporter: Ke Jia >Priority: Major > > Make the shuffle hash join factor configurable.
[jira] [Created] (SPARK-36898) Make the shuffle hash join factor configurable
Ke Jia created SPARK-36898: -- Summary: Make the shuffle hash join factor configurable Key: SPARK-36898 URL: https://issues.apache.org/jira/browse/SPARK-36898 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.1.2 Reporter: Ke Jia Make the shuffle hash join factor configurable.
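The "factor" in question is the "much smaller" heuristic Spark's join selection uses when deciding whether one side is small enough to build a hash map for a shuffle hash join. A sketch of that check with the factor exposed as a parameter; the default of 3 mirrors Spark's hard-coded constant, but the function name and shape here are illustrative, not Spark's API:

```python
# Sketch of the join-selection heuristic the ticket makes configurable:
# one side must be "much smaller" than the other before a shuffle hash
# join is preferred over sort-merge join. Illustrative only.

def much_smaller(side_bytes: int, other_bytes: int, factor: int = 3) -> bool:
    """True when `side_bytes` is at most 1/factor of `other_bytes`."""
    return side_bytes * factor <= other_bytes
```

Making `factor` a config knob lets users trade hash-map memory pressure against sort-merge cost per workload instead of inheriting the fixed constant.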
[jira] [Created] (SPARK-36897) Replace collections.namedtuple() by typing.NamedTuple
Xinrong Meng created SPARK-36897: Summary: Replace collections.namedtuple() by typing.NamedTuple Key: SPARK-36897 URL: https://issues.apache.org/jira/browse/SPARK-36897 Project: Spark Issue Type: Sub-task Components: PySpark Affects Versions: 3.3.0 Reporter: Xinrong Meng Per the discussion under [https://github.com/apache/spark/pull/34133#discussion_r718833451], we wanted to replace collections.namedtuple() with typing.NamedTuple.
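The difference fits in one screenful: `collections.namedtuple` carries no field types, while `typing.NamedTuple` records them for static checkers such as mypy. A minimal sketch with invented field names for illustration:

```python
from collections import namedtuple
from typing import NamedTuple

# Before: collections.namedtuple -- field names only, no types.
RowOld = namedtuple("RowOld", ["name", "age"])

# After: typing.NamedTuple -- same runtime tuple behavior, plus type
# annotations that static checkers can verify.
class Row(NamedTuple):
    name: str
    age: int
```

Both forms produce interchangeable tuples at runtime, so the replacement is behavior-preserving while improving type-checking coverage.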
[jira] [Commented] (SPARK-36845) Inline type hint files
[ https://issues.apache.org/jira/browse/SPARK-36845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422487#comment-17422487 ] Takuya Ueshin commented on SPARK-36845: --- For the {{_typing.pyi}}, it uses {{Protocol}}, which is only supported on Python 3.8 and above, so it won't be straightforward, IIUC. > Inline type hint files > -- > > Key: SPARK-36845 > URL: https://issues.apache.org/jira/browse/SPARK-36845 > Project: Spark > Issue Type: Umbrella > Components: PySpark, SQL >Affects Versions: 3.3.0 >Reporter: Takuya Ueshin >Priority: Major > > Currently there are type hint stub files ({{*.pyi}}) to show the expected > types for functions, but we can also take advantage of static type checking > within the functions by inlining the type hints.
[jira] [Comment Edited] (SPARK-36845) Inline type hint files
[ https://issues.apache.org/jira/browse/SPARK-36845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422486#comment-17422486 ] Takuya Ueshin edited comment on SPARK-36845 at 9/30/21, 1:36 AM: - It would be great if we could use the "annotations" future flag! was (Author: ueshin): It would be great if we could use the "annotation" future flag! > Inline type hint files > -- > > Key: SPARK-36845 > URL: https://issues.apache.org/jira/browse/SPARK-36845 > Project: Spark > Issue Type: Umbrella > Components: PySpark, SQL >Affects Versions: 3.3.0 >Reporter: Takuya Ueshin >Priority: Major > > Currently there are type hint stub files ({{*.pyi}}) to show the expected > types for functions, but we can also take advantage of static type checking > within the functions by inlining the type hints.
[jira] [Commented] (SPARK-36845) Inline type hint files
[ https://issues.apache.org/jira/browse/SPARK-36845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422486#comment-17422486 ] Takuya Ueshin commented on SPARK-36845: --- It would be great if we could use the "annotation" future flag! > Inline type hint files > -- > > Key: SPARK-36845 > URL: https://issues.apache.org/jira/browse/SPARK-36845 > Project: Spark > Issue Type: Umbrella > Components: PySpark, SQL >Affects Versions: 3.3.0 >Reporter: Takuya Ueshin >Priority: Major > > Currently there are type hint stub files ({{*.pyi}}) to show the expected > types for functions, but we can also take advantage of static type checking > within the functions by inlining the type hints.
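The "annotations" future flag (PEP 563) stores every annotation as a string and defers its evaluation, which is why it helps when inlining hints into modules that must still import on older interpreters. A minimal demonstration, with `Protocol` included to show the separate Python 3.8 floor mentioned in the comments above (class and function names are illustrative):

```python
from __future__ import annotations  # PEP 563: annotations stored as strings
from typing import Protocol

# Protocol itself still requires Python >= 3.8, which is the caveat raised
# for _typing.pyi; the future flag does not remove that floor.
class SupportsClose(Protocol):
    def close(self) -> None: ...

def finish(resource: SupportsClose) -> None:
    """Close any object that structurally provides close()."""
    resource.close()
```

Because of the future flag, `finish.__annotations__["resource"]` is the string `"SupportsClose"` rather than the class object, so forward references cost nothing at import time.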
[jira] [Updated] (SPARK-36830) Read/write dataframes with ANSI intervals from/to JSON files
[ https://issues.apache.org/jira/browse/SPARK-36830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-36830: --- Description: Implement writing and reading ANSI intervals (year-month and day-time intervals) columns in dataframes to JSON datasources. (was: Implement writing and reading ANSI intervals (year-month and day-time intervals) columns in dataframes to Parquet datasources.) > Read/write dataframes with ANSI intervals from/to JSON files > > > Key: SPARK-36830 > URL: https://issues.apache.org/jira/browse/SPARK-36830 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > Implement writing and reading ANSI intervals (year-month and day-time > intervals) columns in dataframes to JSON datasources.
[jira] [Commented] (SPARK-36830) Read/write dataframes with ANSI intervals from/to JSON files
[ https://issues.apache.org/jira/browse/SPARK-36830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422483#comment-17422483 ] Kousuke Saruta commented on SPARK-36830: Thank you, will do. > Read/write dataframes with ANSI intervals from/to JSON files > > > Key: SPARK-36830 > URL: https://issues.apache.org/jira/browse/SPARK-36830 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > Implement writing and reading ANSI intervals (year-month and day-time > intervals) columns in dataframes to Parquet datasources.
[jira] [Resolved] (SPARK-36888) Sha2 with bit_length 512 not being tested
[ https://issues.apache.org/jira/browse/SPARK-36888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-36888. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34145 [https://github.com/apache/spark/pull/34145] > Sha2 with bit_length 512 not being tested > - > > Key: SPARK-36888 > URL: https://issues.apache.org/jira/browse/SPARK-36888 > Project: Spark > Issue Type: Task > Components: SQL > Affects Versions: 3.2.0 > Reporter: H. Vetinari > Assignee: Richard Chen > Priority: Major > Fix For: 3.3.0 > > > Looking at > [https://github.com/apache/spark/commit/6c6291b3f6ac13b8415b87b2b741a9cd95bc6c3b] > for https://issues.apache.org/jira/browse/SPARK-36836, it's clear that 512 > bits are supported:

{code:java}
bitLength match {
  [...]
  case 512 =>
    UTF8String.fromString(DigestUtils.sha512Hex(input))
{code}

resp.

{code:java}
nullSafeCodeGen(ctx, ev, (eval1, eval2) => {
  [...]
  else if ($eval2 == 512) {
    ${ev.value} = UTF8String.fromString($digestUtils.sha512Hex($eval1));
{code}

but the test claims it is unsupported:

{code:java}
// unsupported bit length
checkEvaluation(Sha2(Literal.create(null, BinaryType), Literal(1024)), null)
checkEvaluation(Sha2(Literal.create(null, BinaryType), Literal(512)), null)
{code}

To avoid a similar fate as SPARK-36836, tests should be added. CC [~richardc-db]
[jira] [Assigned] (SPARK-36888) Sha2 with bit_length 512 not being tested
[ https://issues.apache.org/jira/browse/SPARK-36888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-36888: Assignee: Richard Chen > Sha2 with bit_length 512 not being tested > - > > Key: SPARK-36888 > URL: https://issues.apache.org/jira/browse/SPARK-36888 > Project: Spark > Issue Type: Task > Components: SQL >Affects Versions: 3.2.0 >Reporter: H. Vetinari >Assignee: Richard Chen >Priority: Major > > Looking at > [https://github.com/apache/spark/commit/6c6291b3f6ac13b8415b87b2b741a9cd95bc6c3b] > for https://issues.apache.org/jira/browse/SPARK-36836, it's clear that 512 > bits are supported > {{bitLength match {}} > {{[...]}} > {{ case 512 =>}} > {{ UTF8String.fromString(DigestUtils.sha512Hex(input))}} > resp. > {{nullSafeCodeGen(ctx, ev, (eval1, eval2) => {}} > {{ [...]}} > {{ else if ($eval2 == 512) {}} > {{ ${ev.value} =}} > {{ UTF8String.fromString($digestUtils.sha512Hex($eval1));}} > but the test claims it is unsupported: > {{// unsupported bit length}} > {{checkEvaluation(Sha2(Literal.create(null, BinaryType), Literal(1024)), > null)}} > {{checkEvaluation(Sha2(Literal.create(null, BinaryType), Literal(512)), > null)}} > To avoid a similar fate as SPARK-36836, tests should be added. > > CC [~richardc-db] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36891) Add new test suite to cover Parquet decoding
[ https://issues.apache.org/jira/browse/SPARK-36891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422480#comment-17422480 ] Apache Spark commented on SPARK-36891: -- User 'sunchao' has created a pull request for this issue: https://github.com/apache/spark/pull/34149 > Add new test suite to cover Parquet decoding > > > Key: SPARK-36891 > URL: https://issues.apache.org/jira/browse/SPARK-36891 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.3.0 >Reporter: Chao Sun >Priority: Major > > Add a new test suite to add more coverage for Parquet vectorized decoding, > focusing on different combinations of Parquet column index, dictionary, batch > size, page size, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36891) Add new test suite to cover Parquet decoding
[ https://issues.apache.org/jira/browse/SPARK-36891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36891: Assignee: Apache Spark > Add new test suite to cover Parquet decoding > > > Key: SPARK-36891 > URL: https://issues.apache.org/jira/browse/SPARK-36891 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.3.0 >Reporter: Chao Sun >Assignee: Apache Spark >Priority: Major > > Add a new test suite to add more coverage for Parquet vectorized decoding, > focusing on different combinations of Parquet column index, dictionary, batch > size, page size, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36891) Add new test suite to cover Parquet decoding
[ https://issues.apache.org/jira/browse/SPARK-36891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36891: Assignee: (was: Apache Spark) > Add new test suite to cover Parquet decoding > > > Key: SPARK-36891 > URL: https://issues.apache.org/jira/browse/SPARK-36891 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.3.0 >Reporter: Chao Sun >Priority: Major > > Add a new test suite to add more coverage for Parquet vectorized decoding, > focusing on different combinations of Parquet column index, dictionary, batch > size, page size, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36895) Add Create Index syntax support
[ https://issues.apache.org/jira/browse/SPARK-36895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422468#comment-17422468 ] Apache Spark commented on SPARK-36895: -- User 'huaxingao' has created a pull request for this issue: https://github.com/apache/spark/pull/34148 > Add Create Index syntax support > --- > > Key: SPARK-36895 > URL: https://issues.apache.org/jira/browse/SPARK-36895 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Huaxin Gao >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36895) Add Create Index syntax support
[ https://issues.apache.org/jira/browse/SPARK-36895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36895: Assignee: (was: Apache Spark) > Add Create Index syntax support > --- > > Key: SPARK-36895 > URL: https://issues.apache.org/jira/browse/SPARK-36895 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Huaxin Gao >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36895) Add Create Index syntax support
[ https://issues.apache.org/jira/browse/SPARK-36895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36895: Assignee: Apache Spark > Add Create Index syntax support > --- > > Key: SPARK-36895 > URL: https://issues.apache.org/jira/browse/SPARK-36895 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Huaxin Gao >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36845) Inline type hint files
[ https://issues.apache.org/jira/browse/SPARK-36845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422464#comment-17422464 ] Hyukjin Kwon commented on SPARK-36845: -- Yeah, _typing.pyi one I am not sure. Agree we should think about which approach we'll take on this .. I added SPARK-36145 as a related ticket for now. I will make sure making it resolved all together for Spark 3.3. > Inline type hint files > -- > > Key: SPARK-36845 > URL: https://issues.apache.org/jira/browse/SPARK-36845 > Project: Spark > Issue Type: Umbrella > Components: PySpark, SQL >Affects Versions: 3.3.0 >Reporter: Takuya Ueshin >Priority: Major > > Currently there are type hint stub files ({{*.pyi}}) to show the expected > types for functions, but we can also take advantage of static type checking > within the functions by inlining the type hints. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
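The difference SPARK-36845 is about can be shown in miniature. With a stub file, the signature lives in a separate `.pyi` and the function body goes unchecked; inlining moves the annotations into the `.py` module so static checkers verify the body too. The function below is invented purely for illustration.

```python
# Stub-file style (separate module.pyi, body never type-checked):
#     def add_months(months: int, extra: int) -> int: ...
#
# Inline style: the same annotations live on the definition itself,
# so a checker like mypy validates the body against the signature.
def add_months(months: int, extra: int) -> int:
    return months + extra

# Inline hints are also visible at runtime via __annotations__.
assert add_months.__annotations__ == {"months": int, "extra": int, "return": int}
```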
[jira] [Updated] (SPARK-36894) RDD.toDF should be synchronized with dispatched variants of SparkSession.createDataFrame
[ https://issues.apache.org/jira/browse/SPARK-36894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz updated SPARK-36894: --- Description: There are some variants that are supported: * Providing a schema as a {{str}} object for {{RDD[RowLike]}} objects * Providing a schema as a {{Tuple[str, ...]}} names * Calling {{toDF}} on {{RDD}} of atomic values, when schema of {{str}} or {{AtomicType}} is provided. was:In {{toDF}} docs we explicitly mention that {{str}} schema is supported, so it should be reflected in the type hints. > RDD.toDF should be synchronized with dispatched variants of > SparkSession.createDataFrame > > > Key: SPARK-36894 > URL: https://issues.apache.org/jira/browse/SPARK-36894 > Project: Spark > Issue Type: Improvement > Components: PySpark, SQL >Affects Versions: 3.1.2, 3.2.0, 3.3.0 >Reporter: Maciej Szymkiewicz >Priority: Minor > > There are some variants that are supported: > * Providing a schema as a {{str}} object for {{RDD[RowLike]}} objects > * Providing a schema as a {{Tuple[str, ...]}} names > * Calling {{toDF}} on {{RDD}} of atomic values, when schema of {{str}} or > {{AtomicType}} is provided. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
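The variants listed above suggest `toDF` needs `@overload`-based hints mirroring the dispatched forms of `SparkSession.createDataFrame`. The sketch below shows one possible shape; the classes are simplified stand-ins for the real PySpark types, not the actual signatures.

```python
from typing import List, Optional, Tuple, Union, overload

# Simplified stand-ins for the real PySpark classes (illustration only).
class StructType: ...
class AtomicType: ...

class DataFrame:
    def __init__(self, schema):
        self.schema = schema

class RDD:
    # Row-like elements: schema may be omitted, a StructType, or column names.
    @overload
    def toDF(self, schema: Union[StructType, List[str], Tuple[str, ...], None] = ...) -> DataFrame: ...
    # Atomic elements: schema given as a DDL string or an AtomicType.
    @overload
    def toDF(self, schema: Union[str, AtomicType]) -> DataFrame: ...
    def toDF(self, schema=None) -> DataFrame:
        # The real method forwards to SparkSession.createDataFrame.
        return DataFrame(schema)

assert RDD().toDF("value string").schema == "value string"
```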
[jira] [Updated] (SPARK-36894) RDD.toDF should be synchronized with dispatched variants of SparkSession.createDataFrame
[ https://issues.apache.org/jira/browse/SPARK-36894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz updated SPARK-36894: --- Summary: RDD.toDF should be synchronized with dispatched variants of SparkSession.createDataFrame (was: RDD.toDF should support string schema) > RDD.toDF should be synchronized with dispatched variants of > SparkSession.createDataFrame > > > Key: SPARK-36894 > URL: https://issues.apache.org/jira/browse/SPARK-36894 > Project: Spark > Issue Type: Improvement > Components: PySpark, SQL >Affects Versions: 3.1.2, 3.2.0, 3.3.0 >Reporter: Maciej Szymkiewicz >Priority: Minor > > In {{toDF}} docs we explicitly mention that {{str}} schema is supported, so > it should be reflected in the type hints. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36896) Return boolean for `dropTempView` and `dropGlobalTempView`
[ https://issues.apache.org/jira/browse/SPARK-36896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-36896: - Environment: Currently `dropTempView` and `dropGlobalTempView` don't have return values, which conflicts with their docstring: `Returns true if this view is dropped successfully, false otherwise.`. We should fix that. was: dropTempView, dropGlobalTempView should have return values. setCurrentDatabase shouldn't return anything. We should fix these API accordingly. > Return boolean for `dropTempView` and `dropGlobalTempView` > -- > > Key: SPARK-36896 > URL: https://issues.apache.org/jira/browse/SPARK-36896 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.3.0 > Environment: Currently `dropTempView` and `dropGlobalTempView` don't > have return values, which conflicts with their docstring: > `Returns true if this view is dropped successfully, false otherwise.`. > > We should fix that. >Reporter: Xinrong Meng >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
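The contract SPARK-36896 restores is documented but unimplemented: drop a view, report whether anything was actually dropped. The toy class below illustrates that intended behavior; it is a hypothetical stand-in, not `pyspark.sql.Catalog` itself.

```python
class Catalog:
    """Toy stand-in for pyspark.sql.Catalog showing the intended contract."""

    def __init__(self):
        self._temp_views = {}

    def createTempView(self, name: str, data) -> None:
        self._temp_views[name] = data

    def dropTempView(self, name: str) -> bool:
        # Per the docstring: True if the view was dropped, False otherwise.
        return self._temp_views.pop(name, None) is not None

catalog = Catalog()
catalog.createTempView("people", object())
assert catalog.dropTempView("people") is True
assert catalog.dropTempView("people") is False  # already gone
```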
[jira] [Updated] (SPARK-36896) Return boolean for `dropTempView` and `dropGlobalTempView`
[ https://issues.apache.org/jira/browse/SPARK-36896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-36896: - Summary: Return boolean for `dropTempView` and `dropGlobalTempView` (was: Fix returns of functions in python/pyspark/sql/catalog.py) > Return boolean for `dropTempView` and `dropGlobalTempView` > -- > > Key: SPARK-36896 > URL: https://issues.apache.org/jira/browse/SPARK-36896 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.3.0 > Environment: dropTempView, dropGlobalTempView should have return > values. > setCurrentDatabase shouldn't return anything. > We should fix these API accordingly. >Reporter: Xinrong Meng >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36896) Fix returns of functions in python/pyspark/sql/catalog.py
[ https://issues.apache.org/jira/browse/SPARK-36896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-36896: - Summary: Fix returns of functions in python/pyspark/sql/catalog.py (was: Fix returns of functions python/pyspark/sql/catalog.py) > Fix returns of functions in python/pyspark/sql/catalog.py > - > > Key: SPARK-36896 > URL: https://issues.apache.org/jira/browse/SPARK-36896 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.3.0 > Environment: dropTempView, dropGlobalTempView should have return > values. > setCurrentDatabase shouldn't return anything. > We should fix these API accordingly. >Reporter: Xinrong Meng >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36896) Fix returns of functions python/pyspark/sql/catalog.py
[ https://issues.apache.org/jira/browse/SPARK-36896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422440#comment-17422440 ] Apache Spark commented on SPARK-36896: -- User 'xinrong-databricks' has created a pull request for this issue: https://github.com/apache/spark/pull/34147 > Fix returns of functions python/pyspark/sql/catalog.py > -- > > Key: SPARK-36896 > URL: https://issues.apache.org/jira/browse/SPARK-36896 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.3.0 > Environment: dropTempView, dropGlobalTempView should have return > values. > setCurrentDatabase shouldn't return anything. > We should fix these API accordingly. >Reporter: Xinrong Meng >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36896) Fix returns of functions python/pyspark/sql/catalog.py
[ https://issues.apache.org/jira/browse/SPARK-36896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36896: Assignee: Apache Spark > Fix returns of functions python/pyspark/sql/catalog.py > -- > > Key: SPARK-36896 > URL: https://issues.apache.org/jira/browse/SPARK-36896 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.3.0 > Environment: dropTempView, dropGlobalTempView should have return > values. > setCurrentDatabase shouldn't return anything. > We should fix these API accordingly. >Reporter: Xinrong Meng >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36896) Fix returns of functions python/pyspark/sql/catalog.py
[ https://issues.apache.org/jira/browse/SPARK-36896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36896: Assignee: (was: Apache Spark) > Fix returns of functions python/pyspark/sql/catalog.py > -- > > Key: SPARK-36896 > URL: https://issues.apache.org/jira/browse/SPARK-36896 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.3.0 > Environment: dropTempView, dropGlobalTempView should have return > values. > setCurrentDatabase shouldn't return anything. > We should fix these API accordingly. >Reporter: Xinrong Meng >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36896) Fix returns of functions python/pyspark/sql/catalog.py
Xinrong Meng created SPARK-36896: Summary: Fix returns of functions python/pyspark/sql/catalog.py Key: SPARK-36896 URL: https://issues.apache.org/jira/browse/SPARK-36896 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 3.3.0 Environment: dropTempView, dropGlobalTempView should have return values. setCurrentDatabase shouldn't return anything. We should fix these API accordingly. Reporter: Xinrong Meng -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36895) Add Create Index syntax support
Huaxin Gao created SPARK-36895: -- Summary: Add Create Index syntax support Key: SPARK-36895 URL: https://issues.apache.org/jira/browse/SPARK-36895 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Huaxin Gao -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36894) RDD.toDF should support string schema
[ https://issues.apache.org/jira/browse/SPARK-36894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422430#comment-17422430 ] Apache Spark commented on SPARK-36894: -- User 'zero323' has created a pull request for this issue: https://github.com/apache/spark/pull/34146 > RDD.toDF should support string schema > - > > Key: SPARK-36894 > URL: https://issues.apache.org/jira/browse/SPARK-36894 > Project: Spark > Issue Type: Improvement > Components: PySpark, SQL >Affects Versions: 3.1.2, 3.2.0, 3.3.0 >Reporter: Maciej Szymkiewicz >Priority: Minor > > In {{toDF}} docs we explicitly mention that {{str}} schema is supported, so > it should be reflected in the type hints. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36894) RDD.toDF should support string schema
[ https://issues.apache.org/jira/browse/SPARK-36894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36894: Assignee: (was: Apache Spark) > RDD.toDF should support string schema > - > > Key: SPARK-36894 > URL: https://issues.apache.org/jira/browse/SPARK-36894 > Project: Spark > Issue Type: Improvement > Components: PySpark, SQL >Affects Versions: 3.1.2, 3.2.0, 3.3.0 >Reporter: Maciej Szymkiewicz >Priority: Minor > > In {{toDF}} docs we explicitly mention that {{str}} schema is supported, so > it should be reflected in the type hints. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36894) RDD.toDF should support string schema
[ https://issues.apache.org/jira/browse/SPARK-36894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36894: Assignee: Apache Spark > RDD.toDF should support string schema > - > > Key: SPARK-36894 > URL: https://issues.apache.org/jira/browse/SPARK-36894 > Project: Spark > Issue Type: Improvement > Components: PySpark, SQL >Affects Versions: 3.1.2, 3.2.0, 3.3.0 >Reporter: Maciej Szymkiewicz >Assignee: Apache Spark >Priority: Minor > > In {{toDF}} docs we explicitly mention that {{str}} schema is supported, so > it should be reflected in the type hints. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36892) Disable batch fetch for a shuffle when push based shuffle is enabled
[ https://issues.apache.org/jira/browse/SPARK-36892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422425#comment-17422425 ] Ye Zhou commented on SPARK-36892: - I am working on this issue. We have a job which can reproduce this hanging issue and after disabling the batch fetch, the job can go through. Will post PR soon. > Disable batch fetch for a shuffle when push based shuffle is enabled > > > Key: SPARK-36892 > URL: https://issues.apache.org/jira/browse/SPARK-36892 > Project: Spark > Issue Type: Bug > Components: Shuffle >Affects Versions: 3.2.0 >Reporter: Mridul Muralidharan >Priority: Blocker > > When push based shuffle is enabled, efficient fetch of merged mapper shuffle > output happens. > Unfortunately, this currently interacts badly with > spark.sql.adaptive.fetchShuffleBlocksInBatch, potentially causing shuffle > fetch to hang and/or duplicate data to be fetched, causing correctness issues. > Given batch fetch does not benefit spark stages reading merged blocks when > push based shuffle is enabled, ShuffleBlockFetcherIterator.doBatchFetch can > be disabled when push based shuffle is enabled. > Thx to [~Ngone51] for surfacing this issue. > +CC [~Gengliang.Wang] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
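The fix described in SPARK-36892 amounts to a guard: batch fetch (`spark.sql.adaptive.fetchShuffleBlocksInBatch`) yields no benefit for merged blocks and can hang or duplicate data, so it should be forced off whenever push-based shuffle is enabled. The predicate below is a hypothetical distillation of that guard, not code from `ShuffleBlockFetcherIterator`.

```python
def use_batch_fetch(fetch_in_batch_enabled: bool,
                    push_based_shuffle_enabled: bool) -> bool:
    # Batch fetch is only honored when push-based shuffle is off;
    # otherwise doBatchFetch must be disabled regardless of the conf.
    return fetch_in_batch_enabled and not push_based_shuffle_enabled

assert use_batch_fetch(True, False) is True
assert use_batch_fetch(True, True) is False  # forced off under push-based shuffle
```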
[jira] [Created] (SPARK-36894) RDD.toDF should support string schema
Maciej Szymkiewicz created SPARK-36894: -- Summary: RDD.toDF should support string schema Key: SPARK-36894 URL: https://issues.apache.org/jira/browse/SPARK-36894 Project: Spark Issue Type: Improvement Components: PySpark, SQL Affects Versions: 3.1.2, 3.2.0, 3.3.0 Reporter: Maciej Szymkiewicz In {{toDF}} docs we explicitly mention that {{str}} schema is supported, so it should be reflected in the type hints. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36888) Sha2 with bit_length 512 not being tested
[ https://issues.apache.org/jira/browse/SPARK-36888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36888: Assignee: (was: Apache Spark) > Sha2 with bit_length 512 not being tested > - > > Key: SPARK-36888 > URL: https://issues.apache.org/jira/browse/SPARK-36888 > Project: Spark > Issue Type: Task > Components: SQL >Affects Versions: 3.2.0 >Reporter: H. Vetinari >Priority: Major > > Looking at > [https://github.com/apache/spark/commit/6c6291b3f6ac13b8415b87b2b741a9cd95bc6c3b] > for https://issues.apache.org/jira/browse/SPARK-36836, it's clear that 512 > bits are supported > {{bitLength match {}} > {{[...]}} > {{ case 512 =>}} > {{ UTF8String.fromString(DigestUtils.sha512Hex(input))}} > resp. > {{nullSafeCodeGen(ctx, ev, (eval1, eval2) => {}} > {{ [...]}} > {{ else if ($eval2 == 512) {}} > {{ ${ev.value} =}} > {{ UTF8String.fromString($digestUtils.sha512Hex($eval1));}} > but the test claims it is unsupported: > {{// unsupported bit length}} > {{checkEvaluation(Sha2(Literal.create(null, BinaryType), Literal(1024)), > null)}} > {{checkEvaluation(Sha2(Literal.create(null, BinaryType), Literal(512)), > null)}} > To avoid a similar fate as SPARK-36836, tests should be added. > > CC [~richardc-db] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36888) Sha2 with bit_length 512 not being tested
[ https://issues.apache.org/jira/browse/SPARK-36888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422424#comment-17422424 ] Apache Spark commented on SPARK-36888: -- User 'richardc-db' has created a pull request for this issue: https://github.com/apache/spark/pull/34145 > Sha2 with bit_length 512 not being tested > - > > Key: SPARK-36888 > URL: https://issues.apache.org/jira/browse/SPARK-36888 > Project: Spark > Issue Type: Task > Components: SQL >Affects Versions: 3.2.0 >Reporter: H. Vetinari >Priority: Major > > Looking at > [https://github.com/apache/spark/commit/6c6291b3f6ac13b8415b87b2b741a9cd95bc6c3b] > for https://issues.apache.org/jira/browse/SPARK-36836, it's clear that 512 > bits are supported > {{bitLength match {}} > {{[...]}} > {{ case 512 =>}} > {{ UTF8String.fromString(DigestUtils.sha512Hex(input))}} > resp. > {{nullSafeCodeGen(ctx, ev, (eval1, eval2) => {}} > {{ [...]}} > {{ else if ($eval2 == 512) {}} > {{ ${ev.value} =}} > {{ UTF8String.fromString($digestUtils.sha512Hex($eval1));}} > but the test claims it is unsupported: > {{// unsupported bit length}} > {{checkEvaluation(Sha2(Literal.create(null, BinaryType), Literal(1024)), > null)}} > {{checkEvaluation(Sha2(Literal.create(null, BinaryType), Literal(512)), > null)}} > To avoid a similar fate as SPARK-36836, tests should be added. > > CC [~richardc-db] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36888) Sha2 with bit_length 512 not being tested
[ https://issues.apache.org/jira/browse/SPARK-36888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36888: Assignee: Apache Spark > Sha2 with bit_length 512 not being tested > - > > Key: SPARK-36888 > URL: https://issues.apache.org/jira/browse/SPARK-36888 > Project: Spark > Issue Type: Task > Components: SQL >Affects Versions: 3.2.0 >Reporter: H. Vetinari >Assignee: Apache Spark >Priority: Major > > Looking at > [https://github.com/apache/spark/commit/6c6291b3f6ac13b8415b87b2b741a9cd95bc6c3b] > for https://issues.apache.org/jira/browse/SPARK-36836, it's clear that 512 > bits are supported > {{bitLength match {}} > {{[...]}} > {{ case 512 =>}} > {{ UTF8String.fromString(DigestUtils.sha512Hex(input))}} > resp. > {{nullSafeCodeGen(ctx, ev, (eval1, eval2) => {}} > {{ [...]}} > {{ else if ($eval2 == 512) {}} > {{ ${ev.value} =}} > {{ UTF8String.fromString($digestUtils.sha512Hex($eval1));}} > but the test claims it is unsupported: > {{// unsupported bit length}} > {{checkEvaluation(Sha2(Literal.create(null, BinaryType), Literal(1024)), > null)}} > {{checkEvaluation(Sha2(Literal.create(null, BinaryType), Literal(512)), > null)}} > To avoid a similar fate as SPARK-36836, tests should be added. > > CC [~richardc-db] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36893) upgrade mesos into 1.4.3
[ https://issues.apache.org/jira/browse/SPARK-36893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422395#comment-17422395 ] Apache Spark commented on SPARK-36893: -- User 'warrenzhu25' has created a pull request for this issue: https://github.com/apache/spark/pull/34144 > upgrade mesos into 1.4.3 > > > Key: SPARK-36893 > URL: https://issues.apache.org/jira/browse/SPARK-36893 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.1.2 >Reporter: Zhongwei Zhu >Priority: Minor > > Upgrade mesos to 1.4.3 to fix CVE-2018-11793 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36893) upgrade mesos into 1.4.3
[ https://issues.apache.org/jira/browse/SPARK-36893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422393#comment-17422393 ] Apache Spark commented on SPARK-36893: -- User 'warrenzhu25' has created a pull request for this issue: https://github.com/apache/spark/pull/34144 > upgrade mesos into 1.4.3 > > > Key: SPARK-36893 > URL: https://issues.apache.org/jira/browse/SPARK-36893 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.1.2 >Reporter: Zhongwei Zhu >Priority: Minor > > Upgrade mesos to 1.4.3 to fix CVE-2018-11793 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36893) upgrade mesos into 1.4.3
[ https://issues.apache.org/jira/browse/SPARK-36893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36893: Assignee: Apache Spark > upgrade mesos into 1.4.3 > > > Key: SPARK-36893 > URL: https://issues.apache.org/jira/browse/SPARK-36893 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.1.2 >Reporter: Zhongwei Zhu >Assignee: Apache Spark >Priority: Minor > > Upgrade mesos to 1.4.3 to fix CVE-2018-11793 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36893) upgrade mesos into 1.4.3
[ https://issues.apache.org/jira/browse/SPARK-36893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36893: Assignee: (was: Apache Spark) > upgrade mesos into 1.4.3 > > > Key: SPARK-36893 > URL: https://issues.apache.org/jira/browse/SPARK-36893 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 3.1.2 >Reporter: Zhongwei Zhu >Priority: Minor > > Upgrade mesos to 1.4.3 to fix CVE-2018-11793 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36893) upgrade mesos into 1.4.3
Zhongwei Zhu created SPARK-36893: Summary: upgrade mesos into 1.4.3 Key: SPARK-36893 URL: https://issues.apache.org/jira/browse/SPARK-36893 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 3.1.2 Reporter: Zhongwei Zhu Upgrade mesos to 1.4.3 to fix CVE-2018-11793 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36845) Inline type hint files
[ https://issues.apache.org/jira/browse/SPARK-36845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422379#comment-17422379 ] Maciej Szymkiewicz commented on SPARK-36845: Additionally, we should probably rethink `_typing.pyi` modules. In general it should be safe to move these to plain Python modules (see for example how Pandas folks do similar thing). > Inline type hint files > -- > > Key: SPARK-36845 > URL: https://issues.apache.org/jira/browse/SPARK-36845 > Project: Spark > Issue Type: Umbrella > Components: PySpark, SQL >Affects Versions: 3.3.0 >Reporter: Takuya Ueshin >Priority: Major > > Currently there are type hint stub files ({{*.pyi}}) to show the expected > types for functions, but we can also take advantage of static type checking > within the functions by inlining the type hints. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36869) Spark job fails due to java.io.InvalidClassException: scala.collection.mutable.WrappedArray$ofRef; local class incompatible
[ https://issues.apache.org/jira/browse/SPARK-36869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422375#comment-17422375 ] Hamid EL MAAZOUZ commented on SPARK-36869: -- Thank you :) > Spark job fails due to java.io.InvalidClassException: > scala.collection.mutable.WrappedArray$ofRef; local class incompatible > --- > > Key: SPARK-36869 > URL: https://issues.apache.org/jira/browse/SPARK-36869 > Project: Spark > Issue Type: Bug > Components: Input/Output >Affects Versions: 3.1.2 > Environment: * RHEL 8.4 > * Java 11.0.12 > * Spark 3.1.2 (only prebuilt with *2.12.10)* > * Scala *2.12.14* for the application code >Reporter: Hamid EL MAAZOUZ >Priority: Blocker > Labels: scala, serialization, spark > > This is a Scala problem. It has already been reported here > [https://github.com/scala/bug/issues/5046] and a fix has been merged here > [https://github.com/scala/scala/pull/9166.|https://github.com/scala/scala/pull/9166] > According to > [https://github.com/scala/bug/issues/5046#issuecomment-928108088], the *fix* > is available on *Scala 2.12.14*, but *Spark 3.0+* is only pre-built with > Scala *2.12.10*.
> > * Stacktrace of the failure: (Taken from stderr of a worker process) > {code:java} > Spark Executor Command: "/usr/java/jdk-11.0.12/bin/java" "-cp" > "/opt/apache/spark-3.1.2-bin-hadoop3.2/conf/:/opt/apache/spark-3.1.2-bin-hadoop3.2/jars/*" > "-Xmx1024M" "-Dspark.driver.port=45887" > "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" > "spark://CoarseGrainedScheduler@192.168.0.191:45887" "--executor-id" "0" > "--hostname" "192.168.0.191" "--cores" "12" "--app-id" > "app-20210927231035-" "--worker-url" "spark://Worker@192.168.0.191:35261" > Using Spark's default log4j profile: > org/apache/spark/log4j-defaults.properties > 21/09/27 23:10:36 INFO CoarseGrainedExecutorBackend: Started daemon with > process name: 18957@localhost > 21/09/27 23:10:36 INFO SignalUtils: Registering signal handler for TERM > 21/09/27 23:10:36 INFO SignalUtils: Registering signal handler for HUP > 21/09/27 23:10:36 INFO SignalUtils: Registering signal handler for INT > 21/09/27 23:10:36 WARN Utils: Your hostname, localhost resolves to a loopback > address: 127.0.0.1; using 192.168.0.191 instead (on interface wlp82s0) > 21/09/27 23:10:36 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to > another address > WARNING: An illegal reflective access operation has occurred > WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform > (file:/opt/apache/spark-3.1.2-bin-hadoop3.2/jars/spark-unsafe_2.12-3.1.2.jar) > to constructor java.nio.DirectByteBuffer(long,int) > WARNING: Please consider reporting this to the maintainers of > org.apache.spark.unsafe.Platform > WARNING: Use --illegal-access=warn to enable warnings of further illegal > reflective access operations > WARNING: All illegal access operations will be denied in a future release > 21/09/27 23:10:36 WARN NativeCodeLoader: Unable to load native-hadoop library > for your platform... 
using builtin-java classes where applicable > 21/09/27 23:10:36 INFO SecurityManager: Changing view acls to: hamidelmaazouz > 21/09/27 23:10:36 INFO SecurityManager: Changing modify acls to: > hamidelmaazouz > 21/09/27 23:10:36 INFO SecurityManager: Changing view acls groups to: > 21/09/27 23:10:36 INFO SecurityManager: Changing modify acls groups to: > 21/09/27 23:10:36 INFO SecurityManager: SecurityManager: authentication > disabled; ui acls disabled; users with view permissions: > Set(hamidelmaazouz); groups with view permissions: Set(); users with modify > permissions: Set(hamidelmaazouz); groups with modify permissions: Set() > 21/09/27 23:10:37 INFO TransportClientFactory: Successfully created > connection to /192.168.0.191:45887 after 44 ms (0 ms spent in bootstraps) > 21/09/27 23:10:37 WARN TransportChannelHandler: Exception in connection from > /192.168.0.191:45887 > java.io.InvalidClassException: scala.collection.mutable.WrappedArray$ofRef; > local class incompatible: stream classdesc serialVersionUID = > 3456489343829468865, local class serialVersionUID = 1028182004549731694 > at > java.base/java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:689) > at > java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2012) > at > java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1862) > at > java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2169) > at > java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679) > at > java.base/
[jira] [Created] (SPARK-36892) Disable batch fetch for a shuffle when push based shuffle is enabled
Mridul Muralidharan created SPARK-36892: --- Summary: Disable batch fetch for a shuffle when push based shuffle is enabled Key: SPARK-36892 URL: https://issues.apache.org/jira/browse/SPARK-36892 Project: Spark Issue Type: Bug Components: Shuffle Affects Versions: 3.2.0 Reporter: Mridul Muralidharan When push based shuffle is enabled, efficient fetch of merged mapper shuffle output happens. Unfortunately, this currently interacts badly with spark.sql.adaptive.fetchShuffleBlocksInBatch, potentially causing shuffle fetch to hang and/or duplicate data to be fetched, causing correctness issues. Given batch fetch does not benefit spark stages reading merged blocks when push based shuffle is enabled, ShuffleBlockFetcherIterator.doBatchFetch can be disabled when push based shuffle is enabled. Thx to [~Ngone51] for surfacing this issue. +CC [~Gengliang.Wang] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
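Until {{ShuffleBlockFetcherIterator.doBatchFetch}} is disabled automatically as proposed above, the bad interaction can be sidestepped by turning batch fetch off manually whenever push-based shuffle is on. A hypothetical submission sketch (the application class and jar are placeholders, not from this issue):

```shell
# Sketch of a manual workaround (app class/jar are placeholders):
# enable push-based shuffle but keep AQE from batch-fetching shuffle
# blocks, avoiding the hang / duplicate-fetch interaction described above.
spark-submit \
  --conf spark.shuffle.push.enabled=true \
  --conf spark.sql.adaptive.fetchShuffleBlocksInBatch=false \
  --class com.example.MyApp \
  my-app.jar
```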
[jira] [Comment Edited] (SPARK-36845) Inline type hint files
[ https://issues.apache.org/jira/browse/SPARK-36845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422371#comment-17422371 ] Maciej Szymkiewicz edited comment on SPARK-36845 at 9/29/21, 8:14 PM: -- I'd recommend adding SPARK-36145 as blocker for this, so we can proceed cleanly with {code:python} from __future__ import annotations {code} and avoid all the quoting in in-lined annotations. See [PEP 563|https://www.python.org/dev/peps/pep-0563/#id6] was (Author: zero323): I'd recommend adding SPARK-36145 as blocker for this, so we can proceed cleanly with {code:python} from __future__ import annotations {code} and avoid all the quoting in in-lined annotations. > Inline type hint files > -- > > Key: SPARK-36845 > URL: https://issues.apache.org/jira/browse/SPARK-36845 > Project: Spark > Issue Type: Umbrella > Components: PySpark, SQL >Affects Versions: 3.3.0 >Reporter: Takuya Ueshin >Priority: Major > > Currently there are type hint stub files ({{*.pyi}}) to show the expected > types for functions, but we can also take advantage of static type checking > within the functions by inlining the type hints. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-36845) Inline type hint files
[ https://issues.apache.org/jira/browse/SPARK-36845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422371#comment-17422371 ] Maciej Szymkiewicz edited comment on SPARK-36845 at 9/29/21, 8:13 PM: -- I'd recommend adding SPARK-36145 as blocker for this, so we can proceed cleanly with {code:python} from __future__ import annotations {code} and avoid all the quoting in in-lined annotations. was (Author: zero323): I'd recommend adding SPARK-36145 as blocker for this, so we can proceed cleanly with {{from __future__ import annotations}} and avoid all the quoting in inlined annotations. > Inline type hint files > -- > > Key: SPARK-36845 > URL: https://issues.apache.org/jira/browse/SPARK-36845 > Project: Spark > Issue Type: Umbrella > Components: PySpark, SQL >Affects Versions: 3.3.0 >Reporter: Takuya Ueshin >Priority: Major > > Currently there are type hint stub files ({{*.pyi}}) to show the expected > types for functions, but we can also take advantage of static type checking > within the functions by inlining the type hints. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36845) Inline type hint files
[ https://issues.apache.org/jira/browse/SPARK-36845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422371#comment-17422371 ] Maciej Szymkiewicz commented on SPARK-36845: I'd recommend adding SPARK-36145 as blocker for this, so we can proceed cleanly with {{from __future__ import annotations}} and avoid all the quoting in inlined annotations. > Inline type hint files > -- > > Key: SPARK-36845 > URL: https://issues.apache.org/jira/browse/SPARK-36845 > Project: Spark > Issue Type: Umbrella > Components: PySpark, SQL >Affects Versions: 3.3.0 >Reporter: Takuya Ueshin >Priority: Major > > Currently there are type hint stub files ({{*.pyi}}) to show the expected > types for functions, but we can also take advantage of static type checking > within the functions by inlining the type hints. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
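The difference the {{\_\_future\_\_}} import makes can be sketched in a few lines ({{DataFrame}} here is an illustrative stand-in, not PySpark's actual class):

```python
# With PEP 563 semantics, annotations are stored as plain strings and
# never evaluated at definition time, so self-referential return types
# need no quoting. (DataFrame below is a stand-in, not pyspark's class.)
from __future__ import annotations

class DataFrame:
    # Without the __future__ import, the bare DataFrame annotation would
    # raise NameError here: the class name is not bound yet while the
    # class body executes, so it would have to be written as "DataFrame".
    def alias(self, name: str) -> DataFrame:
        return self

# Annotations are kept lazily, as strings, until something resolves them:
assert DataFrame.alias.__annotations__["return"] == "DataFrame"
assert DataFrame.alias.__annotations__["name"] == "str"
```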
[jira] [Commented] (SPARK-36830) Read/write dataframes with ANSI intervals from/to JSON files
[ https://issues.apache.org/jira/browse/SPARK-36830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422341#comment-17422341 ] Max Gekk commented on SPARK-36830: -- [~sarutak] FYI, I don't plan to work on this. Please, feel free to take this. > Read/write dataframes with ANSI intervals from/to JSON files > > > Key: SPARK-36830 > URL: https://issues.apache.org/jira/browse/SPARK-36830 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > Implement writing and reading ANSI intervals (year-month and day-time > intervals) columns in dataframes to JSON datasources. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36883) Upgrade R version to 4.1.1 in CI images
[ https://issues.apache.org/jira/browse/SPARK-36883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-36883. --- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34138 [https://github.com/apache/spark/pull/34138] > Upgrade R version to 4.1.1 in CI images > --- > > Key: SPARK-36883 > URL: https://issues.apache.org/jira/browse/SPARK-36883 > Project: Spark > Issue Type: Test > Components: Tests >Affects Versions: 3.3.0 >Reporter: Hyukjin Kwon >Assignee: Dongjoon Hyun >Priority: Major > Fix For: 3.3.0 > > > https://developer.r-project.org/#:~:text=Release%20plans,on%202021%2D08%2D10. > R 4.1.1 is released. We might be better off testing the latest version of R with > SparkR. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36883) Upgrade R version to 4.1.1 in CI images
[ https://issues.apache.org/jira/browse/SPARK-36883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-36883: - Assignee: Dongjoon Hyun > Upgrade R version to 4.1.1 in CI images > --- > > Key: SPARK-36883 > URL: https://issues.apache.org/jira/browse/SPARK-36883 > Project: Spark > Issue Type: Test > Components: Tests >Affects Versions: 3.3.0 >Reporter: Hyukjin Kwon >Assignee: Dongjoon Hyun >Priority: Major > > https://developer.r-project.org/#:~:text=Release%20plans,on%202021%2D08%2D10. > R 4.1.1 is released. We might be better off testing the latest version of R with > SparkR. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36831) Read/write dataframes with ANSI intervals from/to CSV files
[ https://issues.apache.org/jira/browse/SPARK-36831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk resolved SPARK-36831. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 34142 [https://github.com/apache/spark/pull/34142] > Read/write dataframes with ANSI intervals from/to CSV files > --- > > Key: SPARK-36831 > URL: https://issues.apache.org/jira/browse/SPARK-36831 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Assignee: Kousuke Saruta >Priority: Major > Fix For: 3.3.0 > > > Implement writing and reading ANSI intervals (year-month and day-time > intervals) columns in dataframes to CSV datasources. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36831) Read/write dataframes with ANSI intervals from/to CSV files
[ https://issues.apache.org/jira/browse/SPARK-36831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk reassigned SPARK-36831: Assignee: Kousuke Saruta > Read/write dataframes with ANSI intervals from/to CSV files > --- > > Key: SPARK-36831 > URL: https://issues.apache.org/jira/browse/SPARK-36831 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Assignee: Kousuke Saruta >Priority: Major > > Implement writing and reading ANSI intervals (year-month and day-time > intervals) columns in dataframes to CSV datasources. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36891) Add new test suite to cover Parquet decoding
Chao Sun created SPARK-36891: Summary: Add new test suite to cover Parquet decoding Key: SPARK-36891 URL: https://issues.apache.org/jira/browse/SPARK-36891 Project: Spark Issue Type: Test Components: SQL Affects Versions: 3.3.0 Reporter: Chao Sun Add a new test suite to add more coverage for Parquet vectorized decoding, focusing on different combinations of Parquet column index, dictionary, batch size, page size, etc. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36869) Spark job fails due to java.io.InvalidClassException: scala.collection.mutable.WrappedArray$ofRef; local class incompatible
[ https://issues.apache.org/jira/browse/SPARK-36869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422294#comment-17422294 ] Dongjoon Hyun commented on SPARK-36869: --- Thank you for your confirmation, [~hamidelmaazouz]. As you noticed at the RC number `6`, Apache Spark 3.2.0 is almost ready. > However, Scala 2.12.15 against the RC6 jars work fine > Spark job fails due to java.io.InvalidClassException: > scala.collection.mutable.WrappedArray$ofRef; local class incompatible > --- > > Key: SPARK-36869 > URL: https://issues.apache.org/jira/browse/SPARK-36869 > Project: Spark > Issue Type: Bug > Components: Input/Output >Affects Versions: 3.1.2 > Environment: * RHEL 8.4 > * Java 11.0.12 > * Spark 3.1.2 (only prebuilt with *2.12.10)* > * Scala *2.12.14* for the application code >Reporter: Hamid EL MAAZOUZ >Priority: Blocker > Labels: scala, serialization, spark > > This is a Scala problem. It has been already reported here > [https://github.com/scala/bug/issues/5046] and a fix has been merged here > [https://github.com/scala/scala/pull/9166.|https://github.com/scala/scala/pull/9166] > According to > [https://github.com/scala/bug/issues/5046#issuecomment-928108088], the *fix* > is available on *Scala 2.12.14*, but *Spark 3.0+* is only pre-built with > Scala *2.12.10*. 
> > * Stacktrace of the failure: (Taken from stderr of a worker process) > {code:java} > Spark Executor Command: "/usr/java/jdk-11.0.12/bin/java" "-cp" > "/opt/apache/spark-3.1.2-bin-hadoop3.2/conf/:/opt/apache/spark-3.1.2-bin-hadoop3.2/jars/*" > "-Xmx1024M" "-Dspark.driver.port=45887" > "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" > "spark://CoarseGrainedScheduler@192.168.0.191:45887" "--executor-id" "0" > "--hostname" "192.168.0.191" "--cores" "12" "--app-id" > "app-20210927231035-" "--worker-url" "spark://Worker@192.168.0.191:35261" > Using Spark's default log4j profile: > org/apache/spark/log4j-defaults.properties > 21/09/27 23:10:36 INFO CoarseGrainedExecutorBackend: Started daemon with > process name: 18957@localhost > 21/09/27 23:10:36 INFO SignalUtils: Registering signal handler for TERM > 21/09/27 23:10:36 INFO SignalUtils: Registering signal handler for HUP > 21/09/27 23:10:36 INFO SignalUtils: Registering signal handler for INT > 21/09/27 23:10:36 WARN Utils: Your hostname, localhost resolves to a loopback > address: 127.0.0.1; using 192.168.0.191 instead (on interface wlp82s0) > 21/09/27 23:10:36 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to > another address > WARNING: An illegal reflective access operation has occurred > WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform > (file:/opt/apache/spark-3.1.2-bin-hadoop3.2/jars/spark-unsafe_2.12-3.1.2.jar) > to constructor java.nio.DirectByteBuffer(long,int) > WARNING: Please consider reporting this to the maintainers of > org.apache.spark.unsafe.Platform > WARNING: Use --illegal-access=warn to enable warnings of further illegal > reflective access operations > WARNING: All illegal access operations will be denied in a future release > 21/09/27 23:10:36 WARN NativeCodeLoader: Unable to load native-hadoop library > for your platform... 
using builtin-java classes where applicable > 21/09/27 23:10:36 INFO SecurityManager: Changing view acls to: hamidelmaazouz > 21/09/27 23:10:36 INFO SecurityManager: Changing modify acls to: > hamidelmaazouz > 21/09/27 23:10:36 INFO SecurityManager: Changing view acls groups to: > 21/09/27 23:10:36 INFO SecurityManager: Changing modify acls groups to: > 21/09/27 23:10:36 INFO SecurityManager: SecurityManager: authentication > disabled; ui acls disabled; users with view permissions: > Set(hamidelmaazouz); groups with view permissions: Set(); users with modify > permissions: Set(hamidelmaazouz); groups with modify permissions: Set() > 21/09/27 23:10:37 INFO TransportClientFactory: Successfully created > connection to /192.168.0.191:45887 after 44 ms (0 ms spent in bootstraps) > 21/09/27 23:10:37 WARN TransportChannelHandler: Exception in connection from > /192.168.0.191:45887 > java.io.InvalidClassException: scala.collection.mutable.WrappedArray$ofRef; > local class incompatible: stream classdesc serialVersionUID = > 3456489343829468865, local class serialVersionUID = 1028182004549731694 > at > java.base/java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:689) > at > java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2012) > at > java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1862) > at > java.base/java.io.ObjectInputStream
[jira] [Commented] (SPARK-36890) Websocket timeouts to K8s-API
[ https://issues.apache.org/jira/browse/SPARK-36890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422252#comment-17422252 ] Apache Spark commented on SPARK-36890: -- User 'Reamer' has created a pull request for this issue: https://github.com/apache/spark/pull/34143 > Websocket timeouts to K8s-API > - > > Key: SPARK-36890 > URL: https://issues.apache.org/jira/browse/SPARK-36890 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.0, 2.4.1, 2.4.2, > 2.4.3, 2.4.4, 2.4.5, 2.4.6, 2.4.7, 2.4.8, 3.0.0, 3.0.1, 3.0.2, 3.0.3, 3.1.0, > 3.1.1, 3.1.2 >Reporter: Philipp Dallig >Priority: Major > > If you access the Kubernetes API via a load balancer (e.g. HAProxy) and have > set a tunnel timeout, the following error message is thrown exactly after > each timeout. > {code} > >>> 21/09/27 15:35:19 WARN WatchConnectionManager: Exec Failure > java.io.EOFException > at okio.RealBufferedSource.require(RealBufferedSource.java:61) > at okio.RealBufferedSource.readByte(RealBufferedSource.java:74) > at > okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:117) > at > okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:101) > at > okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274) > at > okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214) > at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203) > at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} > This exception is quite annoying when working interactively with a paused > pySpark shell where the driver component runs locally but the executors run > in Kubernetes. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36890) Websocket timeouts to K8s-API
[ https://issues.apache.org/jira/browse/SPARK-36890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36890: Assignee: (was: Apache Spark) > Websocket timeouts to K8s-API > - > > Key: SPARK-36890 > URL: https://issues.apache.org/jira/browse/SPARK-36890 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.0, 2.4.1, 2.4.2, > 2.4.3, 2.4.4, 2.4.5, 2.4.6, 2.4.7, 2.4.8, 3.0.0, 3.0.1, 3.0.2, 3.0.3, 3.1.0, > 3.1.1, 3.1.2 >Reporter: Philipp Dallig >Priority: Major > > If you access the Kubernetes API via a load balancer (e.g. HAProxy) and have > set a tunnel timeout, the following error message is thrown exactly after > each timeout. > {code} > >>> 21/09/27 15:35:19 WARN WatchConnectionManager: Exec Failure > java.io.EOFException > at okio.RealBufferedSource.require(RealBufferedSource.java:61) > at okio.RealBufferedSource.readByte(RealBufferedSource.java:74) > at > okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:117) > at > okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:101) > at > okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274) > at > okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214) > at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203) > at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} > This exception is quite annoying when working interactively with a paused > pySpark shell where the driver component runs locally but the executors run > in Kubernetes. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36890) Websocket timeouts to K8s-API
[ https://issues.apache.org/jira/browse/SPARK-36890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422253#comment-17422253 ] Apache Spark commented on SPARK-36890: -- User 'Reamer' has created a pull request for this issue: https://github.com/apache/spark/pull/34143 > Websocket timeouts to K8s-API > - > > Key: SPARK-36890 > URL: https://issues.apache.org/jira/browse/SPARK-36890 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.0, 2.4.1, 2.4.2, > 2.4.3, 2.4.4, 2.4.5, 2.4.6, 2.4.7, 2.4.8, 3.0.0, 3.0.1, 3.0.2, 3.0.3, 3.1.0, > 3.1.1, 3.1.2 >Reporter: Philipp Dallig >Priority: Major > > If you access the Kubernetes API via a load balancer (e.g. HAProxy) and have > set a tunnel timeout, the following error message is thrown exactly after > each timeout. > {code} > >>> 21/09/27 15:35:19 WARN WatchConnectionManager: Exec Failure > java.io.EOFException > at okio.RealBufferedSource.require(RealBufferedSource.java:61) > at okio.RealBufferedSource.readByte(RealBufferedSource.java:74) > at > okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:117) > at > okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:101) > at > okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274) > at > okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214) > at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203) > at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} > This exception is quite annoying when working interactively with a paused > pySpark shell where the driver component runs locally but the executors run > in Kubernetes. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36890) Websocket timeouts to K8s-API
[ https://issues.apache.org/jira/browse/SPARK-36890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36890: Assignee: Apache Spark > Websocket timeouts to K8s-API > - > > Key: SPARK-36890 > URL: https://issues.apache.org/jira/browse/SPARK-36890 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.0, 2.4.1, 2.4.2, > 2.4.3, 2.4.4, 2.4.5, 2.4.6, 2.4.7, 2.4.8, 3.0.0, 3.0.1, 3.0.2, 3.0.3, 3.1.0, > 3.1.1, 3.1.2 >Reporter: Philipp Dallig >Assignee: Apache Spark >Priority: Major > > If you access the Kubernetes API via a load balancer (e.g. HAProxy) and have > set a tunnel timeout, the following error message is thrown exactly after > each timeout. > {code} > >>> 21/09/27 15:35:19 WARN WatchConnectionManager: Exec Failure > java.io.EOFException > at okio.RealBufferedSource.require(RealBufferedSource.java:61) > at okio.RealBufferedSource.readByte(RealBufferedSource.java:74) > at > okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:117) > at > okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:101) > at > okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274) > at > okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214) > at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203) > at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} > This exception is quite annoying when working interactively with a paused > pySpark shell where the driver component runs locally but the executors run > in Kubernetes. 
[jira] [Created] (SPARK-36890) Websocket timeouts to K8s-API
Philipp Dallig created SPARK-36890: -- Summary: Websocket timeouts to K8s-API Key: SPARK-36890 URL: https://issues.apache.org/jira/browse/SPARK-36890 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 3.1.2, 3.1.1, 3.1.0, 3.0.3, 3.0.2, 3.0.1, 3.0.0, 2.4.8, 2.4.7, 2.4.6, 2.4.5, 2.4.4, 2.4.3, 2.4.2, 2.4.1, 2.4.0, 2.3.4, 2.3.3, 2.3.2, 2.3.1, 2.3.0 Reporter: Philipp Dallig If you access the Kubernetes API via a load balancer (e.g. HAProxy) and have set a tunnel timeout, the following error message is thrown exactly after each timeout. {code} >>> 21/09/27 15:35:19 WARN WatchConnectionManager: Exec Failure java.io.EOFException at okio.RealBufferedSource.require(RealBufferedSource.java:61) at okio.RealBufferedSource.readByte(RealBufferedSource.java:74) at okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:117) at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:101) at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274) at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214) at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203) at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) {code} This exception is quite annoying when working interactively with a paused pySpark shell where the driver component runs locally but the executors run in Kubernetes.
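A common client-side mitigation for such idle-timeout disconnects is to wrap the watch in a reconnect loop that treats an EOF as a cue to re-establish the session rather than surface a warning. A minimal Python sketch of the pattern follows; `open_watch`, the retry limits, and the stub below are hypothetical illustrations, not part of Spark's Kubernetes client:

```python
import time

def watch_with_reconnect(open_watch, max_retries=5, backoff_s=0.01):
    """Re-establish a watch connection whenever the remote side drops it.

    `open_watch` is a caller-supplied function that blocks while the watch
    is healthy and raises EOFError when the connection is cut (e.g. by a
    load balancer's tunnel timeout).
    """
    retries = 0
    while True:
        try:
            return open_watch()
        except EOFError:
            retries += 1
            if retries > max_retries:
                raise
            # Back off briefly before reconnecting to avoid hammering the API.
            time.sleep(backoff_s)

# Stub that fails twice before succeeding, simulating two tunnel timeouts.
attempts = {"n": 0}
def flaky_watch():
    attempts["n"] += 1
    if attempts["n"] <= 2:
        raise EOFError("websocket closed by load balancer")
    return "watch established"
```

With the stub above, `watch_with_reconnect(flaky_watch)` absorbs the two simulated disconnects and returns on the third attempt.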
[jira] [Commented] (SPARK-36869) Spark job fails due to java.io.InvalidClassException: scala.collection.mutable.WrappedArray$ofRef; local class incompatible
[ https://issues.apache.org/jira/browse/SPARK-36869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422240#comment-17422240 ] Hamid EL MAAZOUZ commented on SPARK-36869: -- Testing Scala 2.12.15 against the old Spark jars (3.1.2 pulled from Maven repos) fails similarly (serialVersionUID mismatch for a different class). However, Scala 2.12.15 against the RC6 jars works fine :) > Spark job fails due to java.io.InvalidClassException: > scala.collection.mutable.WrappedArray$ofRef; local class incompatible > --- > > Key: SPARK-36869 > URL: https://issues.apache.org/jira/browse/SPARK-36869 > Project: Spark > Issue Type: Bug > Components: Input/Output >Affects Versions: 3.1.2 > Environment: * RHEL 8.4 > * Java 11.0.12 > * Spark 3.1.2 (only prebuilt with *2.12.10)* > * Scala *2.12.14* for the application code >Reporter: Hamid EL MAAZOUZ >Priority: Blocker > Labels: scala, serialization, spark > > This is a Scala problem. It has already been reported here > [https://github.com/scala/bug/issues/5046] and a fix has been merged here > [https://github.com/scala/scala/pull/9166.|https://github.com/scala/scala/pull/9166] > According to > [https://github.com/scala/bug/issues/5046#issuecomment-928108088], the *fix* > is available on *Scala 2.12.14*, but *Spark 3.0+* is only pre-built with > Scala *2.12.10*. 
> > * Stacktrace of the failure: (Taken from stderr of a worker process) > {code:java} > Spark Executor Command: "/usr/java/jdk-11.0.12/bin/java" "-cp" > "/opt/apache/spark-3.1.2-bin-hadoop3.2/conf/:/opt/apache/spark-3.1.2-bin-hadoop3.2/jars/*" > "-Xmx1024M" "-Dspark.driver.port=45887" > "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" > "spark://CoarseGrainedScheduler@192.168.0.191:45887" "--executor-id" "0" > "--hostname" "192.168.0.191" "--cores" "12" "--app-id" > "app-20210927231035-" "--worker-url" "spark://Worker@192.168.0.191:35261" > Using Spark's default log4j profile: > org/apache/spark/log4j-defaults.properties > 21/09/27 23:10:36 INFO CoarseGrainedExecutorBackend: Started daemon with > process name: 18957@localhost > 21/09/27 23:10:36 INFO SignalUtils: Registering signal handler for TERM > 21/09/27 23:10:36 INFO SignalUtils: Registering signal handler for HUP > 21/09/27 23:10:36 INFO SignalUtils: Registering signal handler for INT > 21/09/27 23:10:36 WARN Utils: Your hostname, localhost resolves to a loopback > address: 127.0.0.1; using 192.168.0.191 instead (on interface wlp82s0) > 21/09/27 23:10:36 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to > another address > WARNING: An illegal reflective access operation has occurred > WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform > (file:/opt/apache/spark-3.1.2-bin-hadoop3.2/jars/spark-unsafe_2.12-3.1.2.jar) > to constructor java.nio.DirectByteBuffer(long,int) > WARNING: Please consider reporting this to the maintainers of > org.apache.spark.unsafe.Platform > WARNING: Use --illegal-access=warn to enable warnings of further illegal > reflective access operations > WARNING: All illegal access operations will be denied in a future release > 21/09/27 23:10:36 WARN NativeCodeLoader: Unable to load native-hadoop library > for your platform... 
using builtin-java classes where applicable > 21/09/27 23:10:36 INFO SecurityManager: Changing view acls to: hamidelmaazouz > 21/09/27 23:10:36 INFO SecurityManager: Changing modify acls to: > hamidelmaazouz > 21/09/27 23:10:36 INFO SecurityManager: Changing view acls groups to: > 21/09/27 23:10:36 INFO SecurityManager: Changing modify acls groups to: > 21/09/27 23:10:36 INFO SecurityManager: SecurityManager: authentication > disabled; ui acls disabled; users with view permissions: > Set(hamidelmaazouz); groups with view permissions: Set(); users with modify > permissions: Set(hamidelmaazouz); groups with modify permissions: Set() > 21/09/27 23:10:37 INFO TransportClientFactory: Successfully created > connection to /192.168.0.191:45887 after 44 ms (0 ms spent in bootstraps) > 21/09/27 23:10:37 WARN TransportChannelHandler: Exception in connection from > /192.168.0.191:45887 > java.io.InvalidClassException: scala.collection.mutable.WrappedArray$ofRef; > local class incompatible: stream classdesc serialVersionUID = > 3456489343829468865, local class serialVersionUID = 1028182004549731694 > at > java.base/java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:689) > at > java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2012) > at > java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1862) > at > java
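The InvalidClassException above is Java serialization's version check at work: every Serializable class carries a serialVersionUID, derived from the class structure unless declared explicitly, and deserialization fails when the stream's UID differs from the local class's. The referenced Scala fix pins the UID explicitly so recompiled classes remain stream-compatible. The mechanism can be illustrated with a simplified Python analogue; the fingerprinting scheme here is a stand-in, not Java's actual algorithm:

```python
import hashlib

def structural_uid(cls):
    """Derive a version id from the class name and its declared fields,
    loosely mimicking Java's default serialVersionUID computation."""
    desc = cls.__name__ + ":" + ",".join(sorted(cls.__annotations__))
    return int.from_bytes(hashlib.sha256(desc.encode()).digest()[:8], "big")

def check_compatible(stream_uid, local_cls):
    """Mimic ObjectInputStream's check: prefer an explicitly pinned UID,
    otherwise recompute one from the local class's current structure."""
    local_uid = getattr(local_cls, "SERIAL_VERSION_UID", None) or structural_uid(local_cls)
    if stream_uid != local_uid:
        raise ValueError(
            f"local class incompatible: stream uid={stream_uid}, local uid={local_uid}")

class RecordV1:
    x: int

class RecordV2:           # "recompiled" with an extra field -> new default uid
    x: int
    y: int

class RecordV2Pinned:     # explicit uid, like the Scala fix for WrappedArray
    SERIAL_VERSION_UID = 42
    x: int
    y: int
```

A stream written against `RecordV1`'s default UID is rejected by a structurally different class, while a pinned UID survives the field change.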
[jira] [Resolved] (SPARK-36624) When application killed, sc should not exit with code 0
[ https://issues.apache.org/jira/browse/SPARK-36624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-36624. --- Fix Version/s: 3.3.0 Assignee: angerszhu Resolution: Fixed > When application killed, sc should not exit with code 0 > --- > > Key: SPARK-36624 > URL: https://issues.apache.org/jira/browse/SPARK-36624 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, YARN >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: angerszhu >Priority: Major > Fix For: 3.3.0 > > > When application killed, sc should not exit with code 0
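Concretely, the fix means the process exit code must reflect the application's final state instead of unconditionally returning 0 on shutdown, so external schedulers can tell a kill apart from a clean finish. An illustrative Python sketch; the state names and code values are assumptions for the example, not Spark's actual constants:

```python
# Illustrative mapping only; Spark's real exit-code handling lives in its
# YARN client and SparkContext shutdown path and is not reproduced here.
EXIT_SUCCESS, EXIT_KILLED, EXIT_FAILED = 0, 143, 1

def exit_code_for(final_state: str) -> int:
    """Pick a process exit code from the application's final state.

    The bug being fixed is the "KILLED" case: exiting with 0 there makes a
    killed application indistinguishable from a successful one.
    """
    return {
        "FINISHED": EXIT_SUCCESS,
        "KILLED": EXIT_KILLED,   # must be non-zero
        "FAILED": EXIT_FAILED,
    }.get(final_state, EXIT_FAILED)
```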
[jira] [Commented] (SPARK-36869) Spark job fails due to java.io.InvalidClassException: scala.collection.mutable.WrappedArray$ofRef; local class incompatible
[ https://issues.apache.org/jira/browse/SPARK-36869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422215#comment-17422215 ] Dongjoon Hyun commented on SPARK-36869: --- You can test Scala 2.12.15 with Apache Spark 3.2.0 RC6 binaries. - https://dist.apache.org/repos/dist/dev/spark/v3.2.0-rc6-bin/ > Spark job fails due to java.io.InvalidClassException: > scala.collection.mutable.WrappedArray$ofRef; local class incompatible > --- > > Key: SPARK-36869 > URL: https://issues.apache.org/jira/browse/SPARK-36869 > Project: Spark > Issue Type: Bug > Components: Input/Output >Affects Versions: 3.1.2 > Environment: * RHEL 8.4 > * Java 11.0.12 > * Spark 3.1.2 (only prebuilt with *2.12.10)* > * Scala *2.12.14* for the application code >Reporter: Hamid EL MAAZOUZ >Priority: Blocker > Labels: scala, serialization, spark > > This is a Scala problem. It has been already reported here > [https://github.com/scala/bug/issues/5046] and a fix has been merged here > [https://github.com/scala/scala/pull/9166.|https://github.com/scala/scala/pull/9166] > According to > [https://github.com/scala/bug/issues/5046#issuecomment-928108088], the *fix* > is available on *Scala 2.12.14*, but *Spark 3.0+* is only pre-built with > Scala *2.12.10*. 
> > * Stacktrace of the failure: (Taken from stderr of a worker process) > {code:java} > Spark Executor Command: "/usr/java/jdk-11.0.12/bin/java" "-cp" > "/opt/apache/spark-3.1.2-bin-hadoop3.2/conf/:/opt/apache/spark-3.1.2-bin-hadoop3.2/jars/*" > "-Xmx1024M" "-Dspark.driver.port=45887" > "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" > "spark://CoarseGrainedScheduler@192.168.0.191:45887" "--executor-id" "0" > "--hostname" "192.168.0.191" "--cores" "12" "--app-id" > "app-20210927231035-" "--worker-url" "spark://Worker@192.168.0.191:35261" > Using Spark's default log4j profile: > org/apache/spark/log4j-defaults.properties > 21/09/27 23:10:36 INFO CoarseGrainedExecutorBackend: Started daemon with > process name: 18957@localhost > 21/09/27 23:10:36 INFO SignalUtils: Registering signal handler for TERM > 21/09/27 23:10:36 INFO SignalUtils: Registering signal handler for HUP > 21/09/27 23:10:36 INFO SignalUtils: Registering signal handler for INT > 21/09/27 23:10:36 WARN Utils: Your hostname, localhost resolves to a loopback > address: 127.0.0.1; using 192.168.0.191 instead (on interface wlp82s0) > 21/09/27 23:10:36 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to > another address > WARNING: An illegal reflective access operation has occurred > WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform > (file:/opt/apache/spark-3.1.2-bin-hadoop3.2/jars/spark-unsafe_2.12-3.1.2.jar) > to constructor java.nio.DirectByteBuffer(long,int) > WARNING: Please consider reporting this to the maintainers of > org.apache.spark.unsafe.Platform > WARNING: Use --illegal-access=warn to enable warnings of further illegal > reflective access operations > WARNING: All illegal access operations will be denied in a future release > 21/09/27 23:10:36 WARN NativeCodeLoader: Unable to load native-hadoop library > for your platform... 
using builtin-java classes where applicable > 21/09/27 23:10:36 INFO SecurityManager: Changing view acls to: hamidelmaazouz > 21/09/27 23:10:36 INFO SecurityManager: Changing modify acls to: > hamidelmaazouz > 21/09/27 23:10:36 INFO SecurityManager: Changing view acls groups to: > 21/09/27 23:10:36 INFO SecurityManager: Changing modify acls groups to: > 21/09/27 23:10:36 INFO SecurityManager: SecurityManager: authentication > disabled; ui acls disabled; users with view permissions: > Set(hamidelmaazouz); groups with view permissions: Set(); users with modify > permissions: Set(hamidelmaazouz); groups with modify permissions: Set() > 21/09/27 23:10:37 INFO TransportClientFactory: Successfully created > connection to /192.168.0.191:45887 after 44 ms (0 ms spent in bootstraps) > 21/09/27 23:10:37 WARN TransportChannelHandler: Exception in connection from > /192.168.0.191:45887 > java.io.InvalidClassException: scala.collection.mutable.WrappedArray$ofRef; > local class incompatible: stream classdesc serialVersionUID = > 3456489343829468865, local class serialVersionUID = 1028182004549731694 > at > java.base/java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:689) > at > java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2012) > at > java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1862) > at > java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2169) >
[jira] [Assigned] (SPARK-36831) Read/write dataframes with ANSI intervals from/to CSV files
[ https://issues.apache.org/jira/browse/SPARK-36831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36831: Assignee: (was: Apache Spark) > Read/write dataframes with ANSI intervals from/to CSV files > --- > > Key: SPARK-36831 > URL: https://issues.apache.org/jira/browse/SPARK-36831 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > Implement writing and reading ANSI intervals (year-month and day-time > intervals) columns in dataframes to CSV datasources.
[jira] [Assigned] (SPARK-36831) Read/write dataframes with ANSI intervals from/to CSV files
[ https://issues.apache.org/jira/browse/SPARK-36831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36831: Assignee: Apache Spark > Read/write dataframes with ANSI intervals from/to CSV files > --- > > Key: SPARK-36831 > URL: https://issues.apache.org/jira/browse/SPARK-36831 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Assignee: Apache Spark >Priority: Major > > Implement writing and reading ANSI intervals (year-month and day-time > intervals) columns in dataframes to CSV datasources.
[jira] [Commented] (SPARK-36831) Read/write dataframes with ANSI intervals from/to CSV files
[ https://issues.apache.org/jira/browse/SPARK-36831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422117#comment-17422117 ] Apache Spark commented on SPARK-36831: -- User 'sarutak' has created a pull request for this issue: https://github.com/apache/spark/pull/34142 > Read/write dataframes with ANSI intervals from/to CSV files > --- > > Key: SPARK-36831 > URL: https://issues.apache.org/jira/browse/SPARK-36831 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > Implement writing and reading ANSI intervals (year-month and day-time > intervals) columns in dataframes to CSV datasources.
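These CSV/JSON subtasks revolve around the textual form of ANSI intervals: a year-month interval, for instance, serializes as a literal like INTERVAL '1-2' YEAR TO MONTH. As a rough illustration of the round-trip such a task implements, here is a self-contained Python sketch; `format_ym`/`parse_ym` are hypothetical helpers covering only the year-month case, not Spark APIs:

```python
import csv, io, re

def format_ym(months: int) -> str:
    """Render a month count as an ANSI year-month interval literal."""
    sign = "-" if months < 0 else ""
    y, m = divmod(abs(months), 12)
    return f"INTERVAL '{sign}{y}-{m}' YEAR TO MONTH"

_YM = re.compile(r"INTERVAL '(-?)(\d+)-(\d+)' YEAR TO MONTH")

def parse_ym(text: str) -> int:
    """Parse the literal back into a signed month count."""
    sign, y, m = _YM.fullmatch(text).groups()
    months = int(y) * 12 + int(m)
    return -months if sign else months

# Round-trip one interval column through an in-memory CSV file.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["id", "period"])
writer.writerow([1, format_ym(14)])
buf.seek(0)
rows = list(csv.reader(buf))
```

Reading the cell back with `parse_ym(rows[1][1])` recovers the original 14 months.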
[jira] [Assigned] (SPARK-36550) Propagation cause when UDF reflection fails
[ https://issues.apache.org/jira/browse/SPARK-36550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen reassigned SPARK-36550: Assignee: dzcxzl > Propagation cause when UDF reflection fails > --- > > Key: SPARK-36550 > URL: https://issues.apache.org/jira/browse/SPARK-36550 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.2 >Reporter: dzcxzl >Assignee: dzcxzl >Priority: Trivial > > Now when UDF reflection fails, InvocationTargetException is thrown, but it is > not a specific exception. > {code:java} > Error in query: No handler for Hive UDF 'XXX': > java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > {code}
[jira] [Resolved] (SPARK-36550) Propagation cause when UDF reflection fails
[ https://issues.apache.org/jira/browse/SPARK-36550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-36550. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 33796 [https://github.com/apache/spark/pull/33796] > Propagation cause when UDF reflection fails > --- > > Key: SPARK-36550 > URL: https://issues.apache.org/jira/browse/SPARK-36550 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.1.2 >Reporter: dzcxzl >Assignee: dzcxzl >Priority: Trivial > Fix For: 3.3.0 > > > Now when UDF reflection fails, InvocationTargetException is thrown, but it is > not a specific exception. > {code:java} > Error in query: No handler for Hive UDF 'XXX': > java.lang.reflect.InvocationTargetException > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > {code}
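The improvement is essentially about unwrapping the reflective wrapper and propagating the underlying cause, i.e. what InvocationTargetException.getCause() carries in Java, so the error message names the real failure. A Python analogue of the pattern; `instantiate_udf` and the error text are illustrative, not Spark code:

```python
def instantiate_udf(ctor, *args):
    """Call a UDF constructor and, on failure, raise an error that names the
    underlying cause instead of a bare reflection wrapper."""
    try:
        return ctor(*args)
    except Exception as cause:
        # Chain the original exception so its message and traceback survive,
        # analogous to rethrowing InvocationTargetException.getCause().
        raise RuntimeError(f"No handler for Hive UDF: {cause}") from cause

class BrokenUDF:
    """Stand-in for a UDF whose constructor fails at load time."""
    def __init__(self):
        raise ValueError("missing required jar")
```

Calling `instantiate_udf(BrokenUDF)` now surfaces "missing required jar" in the top-level error, with the original exception attached as the cause.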
[jira] [Commented] (SPARK-36831) Read/write dataframes with ANSI intervals from/to CSV files
[ https://issues.apache.org/jira/browse/SPARK-36831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422108#comment-17422108 ] Kousuke Saruta commented on SPARK-36831: Thank you. I'll open a PR. > Read/write dataframes with ANSI intervals from/to CSV files > --- > > Key: SPARK-36831 > URL: https://issues.apache.org/jira/browse/SPARK-36831 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > Implement writing and reading ANSI intervals (year-month and day-time > intervals) columns in dataframes to CSV datasources.
[jira] [Commented] (SPARK-36831) Read/write dataframes with ANSI intervals from/to CSV files
[ https://issues.apache.org/jira/browse/SPARK-36831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17422093#comment-17422093 ] Max Gekk commented on SPARK-36831: -- [~sarutak] No, feel free to take this. > Read/write dataframes with ANSI intervals from/to CSV files > --- > > Key: SPARK-36831 > URL: https://issues.apache.org/jira/browse/SPARK-36831 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > Implement writing and reading ANSI intervals (year-month and day-time > intervals) columns in dataframes to CSV datasources.
[jira] [Updated] (SPARK-36424) Support eliminate limits in AQE Optimizer
[ https://issues.apache.org/jira/browse/SPARK-36424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] XiDuo You updated SPARK-36424: -- Parent: SPARK-33828 Issue Type: Sub-task (was: Improvement) > Support eliminate limits in AQE Optimizer > - > > Key: SPARK-36424 > URL: https://issues.apache.org/jira/browse/SPARK-36424 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: XiDuo You >Priority: Major > Fix For: 3.3.0 > > > In ad-hoc scenarios, we always add a limit to the query if the user has not > specified one, but not every limit is necessary. > With the power of AQE, we can eliminate limits using runtime statistics.
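The optimization described above can be sketched as a tiny plan-rewrite rule: once AQE has run a stage and knows its exact row count, a limit at least as large as that count cannot reduce the output and can be dropped. A hedged Python sketch; `Stage` and `Limit` are simplified stand-ins for Spark's logical plan nodes, not its actual classes:

```python
from dataclasses import dataclass

@dataclass
class Stage:
    rows: int              # exact row count, known after the stage has run

@dataclass
class Limit:
    n: int
    child: Stage

def eliminate_limit(plan):
    """Drop a Limit that cannot reduce the output, using runtime statistics.

    If the materialized child produces at most n rows, the Limit is a no-op
    and the child can replace it in the plan.
    """
    if isinstance(plan, Limit) and plan.child.rows <= plan.n:
        return plan.child
    return plan
```

A limit of 100 over a stage known to produce 10 rows is eliminated, while a limit of 5 over the same stage is kept.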