[jira] [Comment Edited] (SPARK-33325) Spark executors pod are not shutting down when losing driver connection

2021-12-04 Thread wineternity (Jira)


[ https://issues.apache.org/jira/browse/SPARK-33325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453536#comment-17453536 ]

wineternity edited comment on SPARK-33325 at 12/5/21, 7:20 AM:
---

Fixed in https://issues.apache.org/jira/browse/SPARK-36532


was (Author: yimo_yym):
Fixed in https://issues.apache.org/jira/browse/SPARK-36532

> Spark executors pod are not shutting down when losing driver connection
> ---
>
> Key: SPARK-33325
> URL: https://issues.apache.org/jira/browse/SPARK-33325
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.0.1
>Reporter: Hadrien Kohl
>Priority: Major
>
> In situations where the executors lose contact with the driver, the Java
> process does not die. I am looking at what on the Kubernetes cluster could
> prevent proper clean-up.
> The Spark driver is started in its own pod in client mode (a PySpark shell
> started by Jupyter). It works fine most of the time, but if the driver process
> crashes (an OOM or a kill signal, for instance) the executor complains about
> the connection being reset by peer and then hangs.
> Here's the log from an executor pod that hangs:
> {code:java}
> 20/11/03 07:35:30 WARN TransportChannelHandler: Exception in connection from /10.17.0.152:37161
> java.io.IOException: Connection reset by peer
>   at java.base/sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>   at java.base/sun.nio.ch.SocketDispatcher.read(Unknown Source)
>   at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
>   at java.base/sun.nio.ch.IOUtil.read(Unknown Source)
>   at java.base/sun.nio.ch.IOUtil.read(Unknown Source)
>   at java.base/sun.nio.ch.SocketChannelImpl.read(Unknown Source)
>   at io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:253)
>   at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1133)
>   at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:350)
>   at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148)
>   at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
>   at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
>   at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
>   at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>   at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>   at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.base/java.lang.Thread.run(Unknown Source)
> 20/11/03 07:35:30 ERROR CoarseGrainedExecutorBackend: Executor self-exiting due to : Driver 10.17.0.152:37161 disassociated! Shutting down.
> 20/11/03 07:35:31 INFO MemoryStore: MemoryStore cleared
> 20/11/03 07:35:31 INFO BlockManager: BlockManager stopped
> {code}
> When I start a shell in the pod I can see the processes are still running:
> {code:java}
> UID        PID  PPID  C      SZ    RSS PSR STIME TTY      TIME     CMD
> 185        125     0  0    5045   3968   2 10:07 pts/0    00:00:00 /bin/bash
> 185        166   125  0    9019   3364   1 10:39 pts/0    00:00:00  \_ ps -AF --forest
> 185          1     0  0    1130    768   0 07:34 ?        00:00:00 /usr/bin/tini -s -- /opt/java/openjdk/
> 185         14     1  0 1935527 493976   3 07:34 ?        00:00:21 /opt/java/openjdk/bin/java -Dspark.dri
> {code}
> Here's the full command used to start the executor:
> {code:java}
> /opt/java/openjdk/bin/java -Dspark.driver.port=37161 -Xms4g -Xmx4g -cp :/opt/spark/jars/*: org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@10.17.0.152:37161 --executor-id 1 --cores 1 --app-id spark-application-1604388891044 --hostname 10.17.2.151
> {code}
>  
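A minimal sketch follows, not Spark's actual shutdown code and untested against this scenario, of why a JVM like the one above can linger after announcing "Shutting down": System.exit() waits on shutdown hooks, so a wedged hook (or lingering non-daemon thread before exit is even called) keeps the process alive. A hard Runtime.halt() fallback on a watchdog thread is one defensive pattern:

{code:java}
// Hypothetical sketch only: mimics the "Driver ... disassociated! Shutting down."
// self-exit seen in the log above. If System.exit(1) blocks on a stuck shutdown
// hook, the watchdog thread falls back to Runtime.halt(), which terminates the
// JVM immediately without running any remaining hooks.
final class SelfExitWatchdog {
    static void onDriverDisassociated(String driverAddress) {
        System.err.println("Driver " + driverAddress + " disassociated! Shutting down.");
        Thread watchdog = new Thread(() -> {
            try {
                Thread.sleep(10_000L); // grace period for a clean exit
            } catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();
            }
            Runtime.getRuntime().halt(1); // hard stop, bypasses shutdown hooks
        }, "self-exit-watchdog");
        watchdog.setDaemon(true);
        watchdog.start();
        System.exit(1); // normal path: runs shutdown hooks, may block
    }
}
{code}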



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-33325) Spark executors pod are not shutting down when losing driver connection

2021-12-04 Thread wineternity (Jira)


[ https://issues.apache.org/jira/browse/SPARK-33325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453536#comment-17453536 ]

wineternity commented on SPARK-33325:
-

Fixed in https://issues.apache.org/jira/browse/SPARK-36532







[jira] [Resolved] (SPARK-37545) V2 CreateTableAsSelect command should qualify location

2021-12-04 Thread Huaxin Gao (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huaxin Gao resolved SPARK-37545.

Fix Version/s: 3.3.0
 Assignee: Terry Kim
   Resolution: Fixed

> V2 CreateTableAsSelect command should qualify location
> --
>
> Key: SPARK-37545
> URL: https://issues.apache.org/jira/browse/SPARK-37545
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Terry Kim
>Assignee: Terry Kim
>Priority: Major
> Fix For: 3.3.0
>
>
> V2 CreateTableAsSelect command should qualify location. Currently,
> {code:java}
> spark.sql("CREATE TABLE testcat.t USING foo LOCATION '/tmp/foo' AS SELECT id FROM source")
> spark.sql("DESCRIBE EXTENDED testcat.t").show(false)
> {code}
> displays the location as `/tmp/foo` whereas the V1 command displays/stores it
> as qualified (`file:/tmp/foo`).
>  
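For readers unfamiliar with path qualification, here is a minimal, hedged illustration of what "qualifying" means, using the plain Hadoop FileSystem API rather than Spark's internal code path (class name hypothetical):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: a bare path such as '/tmp/foo' gains its scheme (and, where the
// filesystem has one, its authority) once qualified against a FileSystem.
public class QualifyLocation {
    public static void main(String[] args) throws Exception {
        Path raw = new Path("/tmp/foo");
        FileSystem fs = raw.getFileSystem(new Configuration());
        System.out.println(fs.makeQualified(raw)); // e.g. file:/tmp/foo on a local FS
    }
}
{code}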






[jira] [Resolved] (SPARK-37548) Add Java17 SparkR daily test coverage

2021-12-04 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-37548.
---
Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved by pull request 34808
[https://github.com/apache/spark/pull/34808]

> Add Java17 SparkR daily test coverage
> -
>
> Key: SPARK-37548
> URL: https://issues.apache.org/jira/browse/SPARK-37548
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra, SparkR, Tests
>Affects Versions: 3.3.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
> Fix For: 3.3.0
>
>







[jira] [Assigned] (SPARK-37548) Add Java17 SparkR daily test coverage

2021-12-04 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-37548:
-

Assignee: Dongjoon Hyun








[jira] [Commented] (SPARK-37548) Add Java17 SparkR daily test coverage

2021-12-04 Thread Apache Spark (Jira)


[ https://issues.apache.org/jira/browse/SPARK-37548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453503#comment-17453503 ]

Apache Spark commented on SPARK-37548:
--

User 'dongjoon-hyun' has created a pull request for this issue:
https://github.com/apache/spark/pull/34808








[jira] [Assigned] (SPARK-37548) Add Java17 SparkR daily test coverage

2021-12-04 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37548:


Assignee: (was: Apache Spark)








[jira] [Assigned] (SPARK-37548) Add Java17 SparkR daily test coverage

2021-12-04 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37548:


Assignee: Apache Spark








[jira] [Created] (SPARK-37548) Add Java17 SparkR daily test coverage

2021-12-04 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-37548:
-

 Summary: Add Java17 SparkR daily test coverage
 Key: SPARK-37548
 URL: https://issues.apache.org/jira/browse/SPARK-37548
 Project: Spark
  Issue Type: Sub-task
  Components: Project Infra, SparkR, Tests
Affects Versions: 3.3.0
Reporter: Dongjoon Hyun









[jira] [Commented] (SPARK-37546) V2 ReplaceTableAsSelect command should qualify location

2021-12-04 Thread Apache Spark (Jira)


[ https://issues.apache.org/jira/browse/SPARK-37546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453501#comment-17453501 ]

Apache Spark commented on SPARK-37546:
--

User 'huaxingao' has created a pull request for this issue:
https://github.com/apache/spark/pull/34807

> V2 ReplaceTableAsSelect command should qualify location
> ---
>
> Key: SPARK-37546
> URL: https://issues.apache.org/jira/browse/SPARK-37546
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Huaxin Gao
>Priority: Major
>
> V2 ReplaceTableAsSelect command should qualify location. Currently,
> {code:java}
> spark.sql("REPLACE TABLE testcat.t USING foo LOCATION '/tmp/foo' AS SELECT id FROM source")
> spark.sql("DESCRIBE EXTENDED testcat.t").show(false)
> {code}
> displays the location as `/tmp/foo` whereas the V1 command displays/stores it
> as qualified (`file:/tmp/foo`).






[jira] [Created] (SPARK-37547) Unexpected NullPointerException when Aggregator.finish returns null

2021-12-04 Thread Andrei (Jira)
Andrei created SPARK-37547:
--

 Summary: Unexpected NullPointerException when Aggregator.finish 
returns null
 Key: SPARK-37547
 URL: https://issues.apache.org/jira/browse/SPARK-37547
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 3.2.0, 3.1.2
Reporter: Andrei


I'm migrating existing code (Java 8) from Spark 2.4 to Spark 3, and I see a
NullPointerException when an Aggregator returns null from its finish method for
a custom class.

I've created a simple snippet to reproduce the issue.
{code:java}
public class SparkTest {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("name").setMaster("local[*]");
    SparkSession spark = SparkSession.builder().config(conf).getOrCreate();
    List<String> data = Arrays.asList("1", "2", "3");
    Dataset<String> dataset = spark.createDataset(data, Encoders.STRING());
    Dataset<Row> aggDataset = dataset.groupBy("value").agg(new EntityAggregator().toColumn().name("agg"));
    aggDataset.show();
  }
}
{code}
{code:java}
public class EntityAggregator extends Aggregator<Row, EntityAgg, EntityAgg> {
  public EntityAgg zero() { return new EntityAgg(0L); }
  public EntityAgg reduce(EntityAgg agg, Row row) { return agg; }
  public EntityAgg merge(EntityAgg e1, EntityAgg e2) { return e1; }
  public Encoder<EntityAgg> bufferEncoder() { return Encoders.bean(EntityAgg.class); }
  public Encoder<EntityAgg> outputEncoder() { return Encoders.bean(EntityAgg.class); }
  public EntityAgg finish(EntityAgg reduction) { return null; }
}
{code}
{code:java}
public class EntityAgg {
  private long field;
  public EntityAgg() { }
  public EntityAgg(long field) { this.field = field; }
  public long getField() { return field; }
  public void setField(long field) { this.field = field; }
}
{code}
Expected behavior is to print a table like this:
{noformat}
+-----+----+
|value| agg|
+-----+----+
|    3|null|
|    1|null|
|    2|null|
+-----+----+
{noformat}
This code works fine on Spark 2.4 but fails with the following stack trace on
Spark 3 (I tested 3.1.2 and 3.2.0):
{noformat}
Caused by: java.lang.NullPointerException
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(generated.java:49)
    at org.apache.spark.sql.execution.aggregate.AggregationIterator.$anonfun$generateResultProjection$5(AggregationIterator.scala:259)
    at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.next(ObjectAggregationIterator.scala:85)
    at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.next(ObjectAggregationIterator.scala:32)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:346)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:131)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
{noformat}
Another observation: if I change the output type from EntityAgg to String in
the Aggregator, it works fine.
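
Building only on that observation, here is a hedged workaround sketch (class name hypothetical, untested) of a String-returning variant whose finish can yield null without the NPE:

{code:java}
import org.apache.spark.sql.Encoder;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.expressions.Aggregator;

// Hypothetical workaround sketch: keep the bean as the buffer type, but emit a
// String so the output encoder (Encoders.STRING()) tolerates a null result.
public class EntityStringAggregator extends Aggregator<Row, EntityAgg, String> {
  public EntityAgg zero() { return new EntityAgg(0L); }
  public EntityAgg reduce(EntityAgg agg, Row row) { return agg; }
  public EntityAgg merge(EntityAgg e1, EntityAgg e2) { return e1; }
  public Encoder<EntityAgg> bufferEncoder() { return Encoders.bean(EntityAgg.class); }
  public Encoder<String> outputEncoder() { return Encoders.STRING(); }
  public String finish(EntityAgg reduction) { return null; } // null survives with a STRING encoder
}
{code}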

I've found a test on GitHub that should check for this behavior:
[https://github.com/apache/spark/blob/branch-3.1/sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala#L338]

I haven't found a similar issue, so please point me to an open ticket if there is one.






[jira] [Commented] (SPARK-37546) V2 ReplaceTableAsSelect command should qualify location

2021-12-04 Thread Apache Spark (Jira)


[ https://issues.apache.org/jira/browse/SPARK-37546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453500#comment-17453500 ]

Apache Spark commented on SPARK-37546:
--

User 'huaxingao' has created a pull request for this issue:
https://github.com/apache/spark/pull/34807







[jira] [Assigned] (SPARK-37546) V2 ReplaceTableAsSelect command should qualify location

2021-12-04 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37546:


Assignee: Apache Spark







[jira] [Assigned] (SPARK-37546) V2 ReplaceTableAsSelect command should qualify location

2021-12-04 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-37546:


Assignee: (was: Apache Spark)







[jira] [Resolved] (SPARK-37529) Support K8s integration tests for Java 17

2021-12-04 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-37529.
---
Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved by pull request 34790
[https://github.com/apache/spark/pull/34790]

> Support K8s integration tests for Java 17
> -
>
> Key: SPARK-37529
> URL: https://issues.apache.org/jira/browse/SPARK-37529
> Project: Spark
>  Issue Type: Sub-task
>  Components: Kubernetes, Tests
>Affects Versions: 3.3.0
>Reporter: Kousuke Saruta
>Assignee: Kousuke Saruta
>Priority: Major
> Fix For: 3.3.0
>
>
> Now that we can build a container image for Java 17, let's support K8s
> integration tests for Java 17.






[jira] [Created] (SPARK-37546) V2 ReplaceTableAsSelect command should qualify location

2021-12-04 Thread Huaxin Gao (Jira)
Huaxin Gao created SPARK-37546:
--

 Summary: V2 ReplaceTableAsSelect command should qualify location
 Key: SPARK-37546
 URL: https://issues.apache.org/jira/browse/SPARK-37546
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.3.0
Reporter: Huaxin Gao


V2 ReplaceTableAsSelect command should qualify location. Currently,

{code:java}
spark.sql("REPLACE TABLE testcat.t USING foo LOCATION '/tmp/foo' AS SELECT id FROM source")
spark.sql("DESCRIBE EXTENDED testcat.t").show(false)
{code}

displays the location as `/tmp/foo` whereas the V1 command displays/stores it
as qualified (`file:/tmp/foo`).







[jira] [Resolved] (SPARK-37533) New SQL function: try_element_at

2021-12-04 Thread Gengliang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gengliang Wang resolved SPARK-37533.

Fix Version/s: 3.3.0
   Resolution: Fixed

Issue resolved by pull request 34796
[https://github.com/apache/spark/pull/34796]

> New SQL function: try_element_at
> 
>
> Key: SPARK-37533
> URL: https://issues.apache.org/jira/browse/SPARK-37533
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
> Fix For: 3.3.0
>
>
> Add a new SQL function, `try_element_at`, which is identical to `element_at`
> except that it returns null if an error occurs.
>  
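A short usage sketch of the described behavior, assuming a SparkSession named `spark`; the index and key are chosen for illustration:

{code:java}
// Illustrative only: where element_at on a missing index/key can raise an
// error (e.g. under ANSI mode), try_element_at is described as returning null.
spark.sql("SELECT try_element_at(array(1, 2, 3), 5)").show();        // NULL
spark.sql("SELECT try_element_at(map('a', 1, 'b', 2), 'c')").show(); // NULL
{code}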






[jira] [Updated] (SPARK-37521) insert overwrite table but the partition information stored in Metastore was not changed

2021-12-04 Thread jingxiong zhong (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jingxiong zhong updated SPARK-37521:

Issue Type: Bug  (was: New Bugzilla Project)

> insert overwrite table but the partition information stored in Metastore was 
> not changed
> 
>
> Key: SPARK-37521
> URL: https://issues.apache.org/jira/browse/SPARK-37521
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.0
> Environment: spark3.2.0
> hive2.3.9
> metastore2.3.9
>Reporter: jingxiong zhong
>Priority: Major
>
> I created a partitioned table in Spark SQL, inserted a row, added a regular
> column, and finally inserted a new row into an existing partition. The query
> is normal in Spark SQL, but the value of the newly added column comes back as
> NULL in Hive 2.3.9.
> For example:
> create table updata_col_test1(a int) partitioned by (dt string);
> insert overwrite table updata_col_test1 partition(dt='20200101') values(1);
> insert overwrite table updata_col_test1 partition(dt='20200102') values(1);
> insert overwrite table updata_col_test1 partition(dt='20200103') values(1);
> alter table updata_col_test1 add columns (b int);
> insert overwrite table updata_col_test1 partition(dt) values(1, 2, '20200101'); -- fails
> insert overwrite table updata_col_test1 partition(dt='20200101') values(1, 2); -- fails
> insert overwrite table updata_col_test1 partition(dt='20200104') values(1, 2); -- succeeds
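
A hedged workaround sketch, building only on the report's own observation that writes to a brand-new partition (dt='20200104') succeed: dropping the stale partition before overwriting should force it to be re-created with the post-ALTER schema. This is an untested assumption, not a confirmed fix:

{code:java}
// Hypothetical and untested: discard the partition metadata that may still
// carry the pre-ALTER schema, then overwrite so the partition is re-created.
spark.sql("ALTER TABLE updata_col_test1 DROP IF EXISTS PARTITION (dt='20200101')");
spark.sql("INSERT OVERWRITE TABLE updata_col_test1 PARTITION (dt='20200101') VALUES (1, 2)");
{code}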






[jira] [Updated] (SPARK-37521) insert overwrite table but the partition information stored in Metastore was not changed

2021-12-04 Thread jingxiong zhong (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jingxiong zhong updated SPARK-37521:

Issue Type: New Bugzilla Project  (was: Question)



