[jira] [Comment Edited] (SPARK-33325) Spark executors pod are not shutting down when losing driver connection
[ https://issues.apache.org/jira/browse/SPARK-33325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17453536#comment-17453536 ]

wineternity edited comment on SPARK-33325 at 12/5/21, 7:20 AM:
---------------------------------------------------------------
Fixed in https://issues.apache.org/jira/browse/SPARK-36532

was (Author: yimo_yym):
Fixed in https://issues.apache.org/jira/browse/SPARK-36532

> Spark executors pod are not shutting down when losing driver connection
> -----------------------------------------------------------------------
>
> Key: SPARK-33325
> URL: https://issues.apache.org/jira/browse/SPARK-33325
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 3.0.1
> Reporter: Hadrien Kohl
> Priority: Major
>
> In situations where the executors lose contact with the driver, the Java process does not die. I am looking at what on the Kubernetes cluster could prevent proper clean-up.
> The Spark driver is started in its own pod in client mode (a PySpark shell started by Jupyter). It works fine most of the time, but if the driver process crashes (OOM or a kill signal, for instance) the executor complains about the connection reset by peer and then hangs.
> Here's the log from an executor pod that hangs:
> {code:java}
> 20/11/03 07:35:30 WARN TransportChannelHandler: Exception in connection from /10.17.0.152:37161
> java.io.IOException: Connection reset by peer
>   at java.base/sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>   at java.base/sun.nio.ch.SocketDispatcher.read(Unknown Source)
>   at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
>   at java.base/sun.nio.ch.IOUtil.read(Unknown Source)
>   at java.base/sun.nio.ch.IOUtil.read(Unknown Source)
>   at java.base/sun.nio.ch.SocketChannelImpl.read(Unknown Source)
>   at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:253)
>   at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1133)
>   at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:350)
>   at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148)
>   at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
>   at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
>   at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
>   at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>   at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>   at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.base/java.lang.Thread.run(Unknown Source)
> 20/11/03 07:35:30 ERROR CoarseGrainedExecutorBackend: Executor self-exiting due to : Driver 10.17.0.152:37161 disassociated! Shutting down.
> 20/11/03 07:35:31 INFO MemoryStore: MemoryStore cleared
> 20/11/03 07:35:31 INFO BlockManager: BlockManager stopped
> {code}
> When I start a shell in the pod I can see the processes are still running:
> {code:java}
> UID  PID  PPID  C      SZ    RSS PSR STIME TTY   TIME     CMD
> 185  125     0  0    5045   3968   2 10:07 pts/0 00:00:00 /bin/bash
> 185  166   125  0    9019   3364   1 10:39 pts/0 00:00:00  \_ ps -AF --forest
> 185    1     0  0    1130    768   0 07:34 ?     00:00:00 /usr/bin/tini -s -- /opt/java/openjdk/
> 185   14     1  0 1935527 493976   3 07:34 ?     00:00:21 /opt/java/openjdk/bin/java -Dspark.dri
> {code}
> Here's the full command used to start the executor:
> {code:java}
> /opt/java/openjdk/bin/java -Dspark.driver.port=37161 -Xms4g -Xmx4g -cp :/opt/spark/jars/*: org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@10.17.0.152:37161 --executor-id 1 --cores 1 --app-id spark-application-1604388891044 --hostname 10.17.2.151
> {code}

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
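SPARK-36532 fixed this inside Spark itself, but the underlying hazard is generic: after the RPC link drops, any lingering non-daemon thread keeps the executor JVM (and therefore the pod) alive. A minimal sketch of the usual defense is a heartbeat watchdog that escalates to a hard halt once the driver goes quiet. Everything below (`DriverWatchdog`, the timeout value) is illustrative, not Spark's actual shutdown code; a real watchdog would call `Runtime.getRuntime().halt(1)` where this demo only reports.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical watchdog: track the last driver heartbeat and report (or, in a
// real executor, forcibly halt) once the timeout elapses. halt() skips
// shutdown hooks, so a wedged non-daemon thread cannot keep the JVM alive.
public class DriverWatchdog {
    private final AtomicLong lastHeartbeatNanos = new AtomicLong(System.nanoTime());
    private final long timeoutNanos;

    public DriverWatchdog(long timeoutMillis) {
        this.timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }

    // Called whenever a message from the driver arrives.
    public void heartbeat() {
        lastHeartbeatNanos.set(System.nanoTime());
    }

    // True once the driver has been silent longer than the timeout.
    public boolean expired() {
        return System.nanoTime() - lastHeartbeatNanos.get() > timeoutNanos;
    }

    public static void main(String[] args) throws Exception {
        DriverWatchdog watchdog = new DriverWatchdog(50);
        watchdog.heartbeat();
        System.out.println("right after heartbeat, expired = " + watchdog.expired());
        Thread.sleep(200); // simulate the driver pod dying
        System.out.println("after silence, expired = " + watchdog.expired());
        // A real executor-side watchdog would now do:
        //   Runtime.getRuntime().halt(1);
        // so the container exits even if graceful shutdown hangs.
    }
}
```

Note that tini (PID 1 in the ps output above) only forwards signals and reaps zombies; it will not kill a JVM that never exits on its own, which is why forcing the exit from inside the process matters here.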
[jira] [Resolved] (SPARK-37545) V2 CreateTableAsSelect command should qualify location
[ https://issues.apache.org/jira/browse/SPARK-37545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Huaxin Gao resolved SPARK-37545.
--------------------------------
Fix Version/s: 3.3.0
     Assignee: Terry Kim
   Resolution: Fixed

> V2 CreateTableAsSelect command should qualify location
> ------------------------------------------------------
>
> Key: SPARK-37545
> URL: https://issues.apache.org/jira/browse/SPARK-37545
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Terry Kim
> Assignee: Terry Kim
> Priority: Major
> Fix For: 3.3.0
>
> V2 CreateTableAsSelect command should qualify location. Currently,
>
> {code:java}
> spark.sql("CREATE TABLE testcat.t USING foo LOCATION '/tmp/foo' AS SELECT id FROM source")
> spark.sql("DESCRIBE EXTENDED testcat.t").show(false)
> {code}
> displays the location as `/tmp/foo` whereas the V1 command displays/stores it as qualified (`file:/tmp/foo`).
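The "qualify" in the summary means attaching a filesystem scheme to a bare path, as the V1 command already does. Below is a plain-Java sketch of the idea, using only `java.nio` rather than the Hadoop path machinery Spark itself relies on; note that `java.nio` renders the local scheme as `file:///tmp/foo`, while the Hive-style display quoted above shows `file:/tmp/foo` (the same location, rendered differently).

```java
import java.net.URI;
import java.nio.file.Paths;

// Minimal illustration of location qualification: a bare path gets a scheme
// attached; an already-qualified URI is returned unchanged. Not Spark's code.
public class QualifyLocation {
    public static String qualify(String location) {
        URI uri = URI.create(location);
        // Already has a scheme such as file: or hdfs:? Leave it alone.
        if (uri.getScheme() != null) {
            return location;
        }
        // Otherwise qualify against the local filesystem.
        return Paths.get(location).toUri().toString();
    }

    public static void main(String[] args) {
        System.out.println(qualify("/tmp/foo"));     // file:///tmp/foo on a Unix-like host
        System.out.println(qualify("hdfs://nn/x"));  // unchanged
    }
}
```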
[jira] [Resolved] (SPARK-37548) Add Java17 SparkR daily test coverage
[ https://issues.apache.org/jira/browse/SPARK-37548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun resolved SPARK-37548.
-----------------------------------
Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved by pull request 34808
[https://github.com/apache/spark/pull/34808]

> Add Java17 SparkR daily test coverage
> -------------------------------------
>
> Key: SPARK-37548
> URL: https://issues.apache.org/jira/browse/SPARK-37548
> Project: Spark
> Issue Type: Sub-task
> Components: Project Infra, SparkR, Tests
> Affects Versions: 3.3.0
> Reporter: Dongjoon Hyun
> Assignee: Dongjoon Hyun
> Priority: Major
> Fix For: 3.3.0
>
[jira] [Assigned] (SPARK-37548) Add Java17 SparkR daily test coverage
[ https://issues.apache.org/jira/browse/SPARK-37548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun reassigned SPARK-37548:
-------------------------------------
Assignee: Dongjoon Hyun

> Add Java17 SparkR daily test coverage
> -------------------------------------
>
> Key: SPARK-37548
> URL: https://issues.apache.org/jira/browse/SPARK-37548
> Project: Spark
> Issue Type: Sub-task
> Components: Project Infra, SparkR, Tests
> Affects Versions: 3.3.0
> Reporter: Dongjoon Hyun
> Assignee: Dongjoon Hyun
> Priority: Major
>
[jira] [Commented] (SPARK-37548) Add Java17 SparkR daily test coverage
[ https://issues.apache.org/jira/browse/SPARK-37548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17453503#comment-17453503 ]

Apache Spark commented on SPARK-37548:
--------------------------------------
User 'dongjoon-hyun' has created a pull request for this issue:
https://github.com/apache/spark/pull/34808

> Add Java17 SparkR daily test coverage
> -------------------------------------
>
> Key: SPARK-37548
> URL: https://issues.apache.org/jira/browse/SPARK-37548
> Project: Spark
> Issue Type: Sub-task
> Components: Project Infra, SparkR, Tests
> Affects Versions: 3.3.0
> Reporter: Dongjoon Hyun
> Priority: Major
>
[jira] [Assigned] (SPARK-37548) Add Java17 SparkR daily test coverage
[ https://issues.apache.org/jira/browse/SPARK-37548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-37548:
------------------------------------
Assignee: (was: Apache Spark)

> Add Java17 SparkR daily test coverage
> -------------------------------------
>
> Key: SPARK-37548
> URL: https://issues.apache.org/jira/browse/SPARK-37548
> Project: Spark
> Issue Type: Sub-task
> Components: Project Infra, SparkR, Tests
> Affects Versions: 3.3.0
> Reporter: Dongjoon Hyun
> Priority: Major
>
[jira] [Assigned] (SPARK-37548) Add Java17 SparkR daily test coverage
[ https://issues.apache.org/jira/browse/SPARK-37548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-37548:
------------------------------------
Assignee: Apache Spark

> Add Java17 SparkR daily test coverage
> -------------------------------------
>
> Key: SPARK-37548
> URL: https://issues.apache.org/jira/browse/SPARK-37548
> Project: Spark
> Issue Type: Sub-task
> Components: Project Infra, SparkR, Tests
> Affects Versions: 3.3.0
> Reporter: Dongjoon Hyun
> Assignee: Apache Spark
> Priority: Major
>
[jira] [Created] (SPARK-37548) Add Java17 SparkR daily test coverage
Dongjoon Hyun created SPARK-37548:
----------------------------------
Summary: Add Java17 SparkR daily test coverage
Key: SPARK-37548
URL: https://issues.apache.org/jira/browse/SPARK-37548
Project: Spark
Issue Type: Sub-task
Components: Project Infra, SparkR, Tests
Affects Versions: 3.3.0
Reporter: Dongjoon Hyun
[jira] [Commented] (SPARK-37546) V2 ReplaceTableAsSelect command should qualify location
[ https://issues.apache.org/jira/browse/SPARK-37546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17453501#comment-17453501 ]

Apache Spark commented on SPARK-37546:
--------------------------------------
User 'huaxingao' has created a pull request for this issue:
https://github.com/apache/spark/pull/34807

> V2 ReplaceTableAsSelect command should qualify location
> -------------------------------------------------------
>
> Key: SPARK-37546
> URL: https://issues.apache.org/jira/browse/SPARK-37546
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Huaxin Gao
> Priority: Major
>
> V2 ReplaceTableAsSelect command should qualify location. Currently,
> {code:java}
> spark.sql("REPLACE TABLE testcat.t USING foo LOCATION '/tmp/foo' AS SELECT id FROM source")
> spark.sql("DESCRIBE EXTENDED testcat.t").show(false)
> {code}
> displays the location as `/tmp/foo` whereas V1 command displays/stores it as qualified (`file:/tmp/foo`).
[jira] [Created] (SPARK-37547) Unexpected NullPointerException when Aggregator.finish returns null
Andrei created SPARK-37547:
---------------------------
Summary: Unexpected NullPointerException when Aggregator.finish returns null
Key: SPARK-37547
URL: https://issues.apache.org/jira/browse/SPARK-37547
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 3.2.0, 3.1.2
Reporter: Andrei

I'm migrating existing code (Java 8) from Spark 2.4 to Spark 3 and I see a NullPointerException when an Aggregator returns null from its finish method for a custom class. I've created a simple snippet to reproduce the issue.

{code:java}
public class SparkTest {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("name").setMaster("local[*]");
    SparkSession spark = SparkSession.builder().config(conf).getOrCreate();
    List<String> data = Arrays.asList("1", "2", "3");
    Dataset<String> dataset = spark.createDataset(data, Encoders.STRING());
    Dataset<Row> aggDataset = dataset.groupBy("value").agg(new EntityAggregator().toColumn().name("agg"));
    aggDataset.show();
  }
}
{code}
{code:java}
public class EntityAggregator extends Aggregator<Row, EntityAgg, EntityAgg> {
  public EntityAgg zero() { return new EntityAgg(0L); }
  public EntityAgg reduce(EntityAgg agg, Row row) { return agg; }
  public EntityAgg merge(EntityAgg e1, EntityAgg e2) { return e1; }
  public Encoder<EntityAgg> bufferEncoder() { return Encoders.bean(EntityAgg.class); }
  public Encoder<EntityAgg> outputEncoder() { return Encoders.bean(EntityAgg.class); }
  public EntityAgg finish(EntityAgg reduction) { return null; }
}
{code}
{code:java}
public class EntityAgg {
  private long field;
  public EntityAgg() { }
  public EntityAgg(long field) { this.field = field; }
  public long getField() { return field; }
  public void setField(long field) { this.field = field; }
}
{code}
Expected behavior is to print a table like this:
{noformat}
+-----+----+
|value| agg|
+-----+----+
|    3|null|
|    1|null|
|    2|null|
+-----+----+
{noformat}
This code works fine on 2.4 but fails with the following stack trace on Spark 3 (I tested 3.1.2 and 3.2.0):
{noformat}
Caused by: java.lang.NullPointerException
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(generated.java:49)
	at org.apache.spark.sql.execution.aggregate.AggregationIterator.$anonfun$generateResultProjection$5(AggregationIterator.scala:259)
	at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.next(ObjectAggregationIterator.scala:85)
	at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.next(ObjectAggregationIterator.scala:32)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:346)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:131)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
{noformat}
Another observation: if I change EntityAgg to String in the Aggregator, it works fine. I've found a test on GitHub that should check for this behavior:
[https://github.com/apache/spark/blob/branch-3.1/sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala#L338]
I haven't found a similar issue, so please point me to an open ticket if there is one.
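The stack trace bottoms out in generated code because the bean encoder builds the output row by calling getters on the object that finish returned; when that object is null there is nothing to call them on. The plain-Java sketch below imitates that one step and reproduces the same NullPointerException without Spark; the `serialize` method is an illustration of what the generated projection does around generated.java:49, not Spark's actual codegen.

```java
// EntityAgg mirrors the bean from the report; serialize() stands in for the
// generated projection that reads the bean's fields via its getters.
public class NullFinishRepro {
    public static class EntityAgg {
        private long field;
        public long getField() { return field; }
        public void setField(long field) { this.field = field; }
    }

    // As in the report: finish() legitimately returns null.
    public static EntityAgg finish(EntityAgg reduction) {
        return null;
    }

    // Stand-in for the generated projection: it assumes a non-null bean and
    // dereferences it to read each field.
    public static long serialize(EntityAgg agg) {
        return agg.getField(); // NPE when agg is null
    }

    public static void main(String[] args) {
        try {
            serialize(finish(new EntityAgg()));
        } catch (NullPointerException e) {
            System.out.println("NullPointerException, as in the Spark 3 stack trace");
        }
    }
}
```

This also fits the reporter's observation that a String output type works: a null-tolerant serializer avoids the dereference, whereas the getter-based bean path cannot (that last clause is an inference from the trace, not something the ticket states).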
[jira] [Assigned] (SPARK-37546) V2 ReplaceTableAsSelect command should qualify location
[ https://issues.apache.org/jira/browse/SPARK-37546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-37546:
------------------------------------
Assignee: Apache Spark

> V2 ReplaceTableAsSelect command should qualify location
> -------------------------------------------------------
>
> Key: SPARK-37546
> URL: https://issues.apache.org/jira/browse/SPARK-37546
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Huaxin Gao
> Assignee: Apache Spark
> Priority: Major
>
> V2 ReplaceTableAsSelect command should qualify location. Currently,
> {code:java}
> spark.sql("REPLACE TABLE testcat.t USING foo LOCATION '/tmp/foo' AS SELECT id FROM source")
> spark.sql("DESCRIBE EXTENDED testcat.t").show(false)
> {code}
> displays the location as `/tmp/foo` whereas V1 command displays/stores it as qualified (`file:/tmp/foo`).
[jira] [Assigned] (SPARK-37546) V2 ReplaceTableAsSelect command should qualify location
[ https://issues.apache.org/jira/browse/SPARK-37546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-37546:
------------------------------------
Assignee: (was: Apache Spark)

> V2 ReplaceTableAsSelect command should qualify location
> -------------------------------------------------------
>
> Key: SPARK-37546
> URL: https://issues.apache.org/jira/browse/SPARK-37546
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Huaxin Gao
> Priority: Major
>
> V2 ReplaceTableAsSelect command should qualify location. Currently,
> {code:java}
> spark.sql("REPLACE TABLE testcat.t USING foo LOCATION '/tmp/foo' AS SELECT id FROM source")
> spark.sql("DESCRIBE EXTENDED testcat.t").show(false)
> {code}
> displays the location as `/tmp/foo` whereas V1 command displays/stores it as qualified (`file:/tmp/foo`).
[jira] [Resolved] (SPARK-37529) Support K8s integration tests for Java 17
[ https://issues.apache.org/jira/browse/SPARK-37529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dongjoon Hyun resolved SPARK-37529.
-----------------------------------
Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved by pull request 34790
[https://github.com/apache/spark/pull/34790]

> Support K8s integration tests for Java 17
> -----------------------------------------
>
> Key: SPARK-37529
> URL: https://issues.apache.org/jira/browse/SPARK-37529
> Project: Spark
> Issue Type: Sub-task
> Components: Kubernetes, Tests
> Affects Versions: 3.3.0
> Reporter: Kousuke Saruta
> Assignee: Kousuke Saruta
> Priority: Major
> Fix For: 3.3.0
>
> Now that we can build a container image for Java 17, let's support K8s integration tests for Java 17.
[jira] [Created] (SPARK-37546) V2 ReplaceTableAsSelect command should qualify location
Huaxin Gao created SPARK-37546:
-------------------------------
Summary: V2 ReplaceTableAsSelect command should qualify location
Key: SPARK-37546
URL: https://issues.apache.org/jira/browse/SPARK-37546
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.3.0
Reporter: Huaxin Gao

V2 ReplaceTableAsSelect command should qualify location. Currently,
{code:java}
spark.sql("REPLACE TABLE testcat.t USING foo LOCATION '/tmp/foo' AS SELECT id FROM source")
spark.sql("DESCRIBE EXTENDED testcat.t").show(false)
{code}
displays the location as `/tmp/foo` whereas V1 command displays/stores it as qualified (`file:/tmp/foo`).
[jira] [Resolved] (SPARK-37533) New SQL function: try_element_at
[ https://issues.apache.org/jira/browse/SPARK-37533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gengliang Wang resolved SPARK-37533.
------------------------------------
Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved by pull request 34796
[https://github.com/apache/spark/pull/34796]

> New SQL function: try_element_at
> --------------------------------
>
> Key: SPARK-37533
> URL: https://issues.apache.org/jira/browse/SPARK-37533
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: Gengliang Wang
> Assignee: Gengliang Wang
> Priority: Major
> Fix For: 3.3.0
>
> Add a new SQL function `try_element_at`, which is identical to `element_at` except that it returns null if an error occurs.
>
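The contract can be sketched without Spark: for arrays, `try_element_at` matches `element_at` (1-based indexing, with negative indexes counting from the end) but returns null where `element_at` would raise an error for an out-of-range index. The Java below is an illustrative model, not Spark's implementation, and its treatment of index 0 (`element_at` rejects 0 outright; here it yields null) is an assumption rather than behavior documented in this ticket.

```java
// Model of try_element_at semantics for an int array. Indexing is 1-based;
// a negative index counts from the end; anything out of range yields null
// instead of an error.
public class TryElementAt {
    public static Integer tryElementAt(int[] arr, int index) {
        if (index == 0) {
            return null; // assumption: treat the invalid index 0 as "no result"
        }
        int i = index > 0 ? index - 1 : arr.length + index;
        if (i < 0 || i >= arr.length) {
            return null; // element_at would error here; try_element_at does not
        }
        return arr[i];
    }

    public static void main(String[] args) {
        int[] xs = {10, 20, 30};
        System.out.println(tryElementAt(xs, 2));   // 20
        System.out.println(tryElementAt(xs, -1));  // 30
        System.out.println(tryElementAt(xs, 5));   // null
    }
}
```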
[jira] [Updated] (SPARK-37521) insert overwrite table but the partition information stored in Metastore was not changed
[ https://issues.apache.org/jira/browse/SPARK-37521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jingxiong zhong updated SPARK-37521:
------------------------------------
Issue Type: Bug  (was: New Bugzilla Project)

> insert overwrite table but the partition information stored in Metastore was not changed
> ----------------------------------------------------------------------------------------
>
> Key: SPARK-37521
> URL: https://issues.apache.org/jira/browse/SPARK-37521
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.2.0
> Environment: spark3.2.0
> hive2.3.9
> metastore2.3.9
> Reporter: jingxiong zhong
> Priority: Major
>
> I create a partitioned table in SparkSQL, insert a data entry, add a regular field, and finally insert a new data entry into the partition. The query is normal in SparkSQL, but the return value of the newly inserted field is NULL in Hive 2.3.9.
> For example:
> create table updata_col_test1(a int) partitioned by (dt string);
> insert overwrite table updata_col_test1 partition(dt='20200101') values(1);
> insert overwrite table updata_col_test1 partition(dt='20200102') values(1);
> insert overwrite table updata_col_test1 partition(dt='20200103') values(1);
> alter table updata_col_test1 add columns (b int);
> insert overwrite table updata_col_test1 partition(dt) values(1, 2, '20200101'); -- fails
> insert overwrite table updata_col_test1 partition(dt='20200101') values(1, 2); -- fails
> insert overwrite table updata_col_test1 partition(dt='20200104') values(1, 2); -- succeeds
[jira] [Updated] (SPARK-37521) insert overwrite table but the partition information stored in Metastore was not changed
[ https://issues.apache.org/jira/browse/SPARK-37521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

jingxiong zhong updated SPARK-37521:
------------------------------------
Issue Type: New Bugzilla Project  (was: Question)

> insert overwrite table but the partition information stored in Metastore was not changed
> ----------------------------------------------------------------------------------------
>
> Key: SPARK-37521
> URL: https://issues.apache.org/jira/browse/SPARK-37521
> Project: Spark
> Issue Type: New Bugzilla Project
> Components: SQL
> Affects Versions: 3.2.0
> Environment: spark3.2.0
> hive2.3.9
> metastore2.3.9
> Reporter: jingxiong zhong
> Priority: Major
>
> I create a partitioned table in SparkSQL, insert a data entry, add a regular field, and finally insert a new data entry into the partition. The query is normal in SparkSQL, but the return value of the newly inserted field is NULL in Hive 2.3.9.
> For example:
> create table updata_col_test1(a int) partitioned by (dt string);
> insert overwrite table updata_col_test1 partition(dt='20200101') values(1);
> insert overwrite table updata_col_test1 partition(dt='20200102') values(1);
> insert overwrite table updata_col_test1 partition(dt='20200103') values(1);
> alter table updata_col_test1 add columns (b int);
> insert overwrite table updata_col_test1 partition(dt) values(1, 2, '20200101'); -- fails
> insert overwrite table updata_col_test1 partition(dt='20200101') values(1, 2); -- fails
> insert overwrite table updata_col_test1 partition(dt='20200104') values(1, 2); -- succeeds