[jira] [Updated] (SPARK-39526) Remove redundant null check in SparkSubmitCommandBuilder#isThriftServer
[ https://issues.apache.org/jira/browse/SPARK-39526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qian updated SPARK-39526: - Description: {code:java} private boolean isThriftServer(String mainClass) { return (mainClass != null && mainClass.equals("org.apache.spark.sql.hive.thriftserver.HiveThriftServer2")); } {code} There is no scenario in which *mainClass* is null, because the callers already contain defensive checks. {code:java} if (isExample && !isSpecialCommand) { checkArgument(mainClass != null, "Missing example class name."); } if (mainClass != null) { args.add(parser.CLASS); args.add(mainClass); } {code} was: {code:java} private boolean isThriftServer(String mainClass) { return (mainClass != null && mainClass.equals("org.apache.spark.sql.hive.thriftserver.HiveThriftServer2")); } {code} There is no scenario in which *mainClass* is null, because the callers already contain defensive checks. {code:java} if (isExample && !isSpecialCommand) { checkArgument(mainClass != null, "Missing example class name."); } if (mainClass != null) { args.add(parser.CLASS); args.add(mainClass); } {code} > Remove redundant null check in > SparkSubmitCommandBuilder#isThriftServer > - > > Key: SPARK-39526 > URL: https://issues.apache.org/jira/browse/SPARK-39526 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: qian >Priority: Minor > > {code:java} > private boolean isThriftServer(String mainClass) { > return (mainClass != null && > mainClass.equals("org.apache.spark.sql.hive.thriftserver.HiveThriftServer2")); > } {code} > There is no scenario in which *mainClass* is null, because the callers already contain defensive checks. > {code:java} > if (isExample && !isSpecialCommand) { > checkArgument(mainClass != null, "Missing example class name."); > } > if (mainClass != null) { > args.add(parser.CLASS); > args.add(mainClass); > } {code} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39526) Remove redundant null check in SparkSubmitCommandBuilder#isThriftServer
qian created SPARK-39526: Summary: Remove redundant null check in SparkSubmitCommandBuilder#isThriftServer Key: SPARK-39526 URL: https://issues.apache.org/jira/browse/SPARK-39526 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.3.0 Reporter: qian {code:java} private boolean isThriftServer(String mainClass) { return (mainClass != null && mainClass.equals("org.apache.spark.sql.hive.thriftserver.HiveThriftServer2")); } {code} There is no scenario in which *mainClass* is null, because the callers already contain defensive checks. {code:java} if (isExample && !isSpecialCommand) { checkArgument(mainClass != null, "Missing example class name."); } if (mainClass != null) { args.add(parser.CLASS); args.add(mainClass); } {code} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
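For illustration, a minimal sketch of what the simplified helper could look like once the redundant check is dropped. This is an assumed rewrite based on the description above, not the merged patch; note that invoking equals on the string constant keeps the method null-safe even without the explicit check.

{code:java}
// Sketch of the proposed simplification: the callers shown above already
// guarantee a non-null mainClass, so the null check can be removed.
// Calling equals on the constant keeps this null-safe regardless.
private boolean isThriftServer(String mainClass) {
    return "org.apache.spark.sql.hive.thriftserver.HiveThriftServer2".equals(mainClass);
}
{code}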
[jira] [Updated] (SPARK-39504) Invoke equals on a non-null object to avoid NPE
[ https://issues.apache.org/jira/browse/SPARK-39504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qian updated SPARK-39504: - Component/s: SQL > Invoke equals on a non-null object to avoid NPE > > > Key: SPARK-39504 > URL: https://issues.apache.org/jira/browse/SPARK-39504 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL >Affects Versions: 3.2.1 >Reporter: qian >Priority: Major > > Since a {{NullPointerException}} can be thrown while calling the > _equals_ method of {{Object}}, _equals_ should be invoked on a constant > or an object that is definitely not _null_. > {quote}Positive example: {{"test".equals(object);}} > Counter example: {{object.equals("test");}} > {quote} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39504) Invoke equals on a non-null object to avoid NPE
qian created SPARK-39504: Summary: Invoke equals on a non-null object to avoid NPE Key: SPARK-39504 URL: https://issues.apache.org/jira/browse/SPARK-39504 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.2.1 Reporter: qian Since a {{NullPointerException}} can be thrown while calling the _equals_ method of {{Object}}, _equals_ should be invoked on a constant or an object that is definitely not _null_. {quote}Positive example: {{"test".equals(object);}} Counter example: {{object.equals("test");}} {quote} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
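A small self-contained Java illustration of the guideline (the property key is invented for this example and is not a real Spark configuration):

{code:java}
import java.util.Objects;

public class NullSafeEquals {
    public static void main(String[] args) {
        // May be null when the (hypothetical) property is unset.
        String mode = System.getProperty("example.deploy.mode");

        // Counter example: throws NullPointerException when mode is null.
        // boolean unsafe = mode.equals("cluster");

        // Positive example: the non-null constant receives the call.
        boolean safe = "cluster".equals(mode);

        // Equally null-safe alternative available since Java 7.
        boolean alsoSafe = Objects.equals(mode, "cluster");

        System.out.println(safe + " " + alsoSafe);
    }
}
{code}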
[jira] [Updated] (SPARK-39428) Use code block for `Coalesce Hints for SQL Queries`
[ https://issues.apache.org/jira/browse/SPARK-39428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qian updated SPARK-39428: - Attachment: Coalesce.png > Use code block for `Coalesce Hints for SQL Queries` > --- > > Key: SPARK-39428 > URL: https://issues.apache.org/jira/browse/SPARK-39428 > Project: Spark > Issue Type: Improvement > Components: Documentation >Affects Versions: 3.0.0, 3.0.1, 3.0.2, 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.2.1 >Reporter: qian >Priority: Minor > Attachments: Coalesce.png > > > Use a code block for `Coalesce Hints for SQL Queries`; right now it is rendered as a plain text block. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39428) Use code block for `Coalesce Hints for SQL Queries`
qian created SPARK-39428: Summary: Use code block for `Coalesce Hints for SQL Queries` Key: SPARK-39428 URL: https://issues.apache.org/jira/browse/SPARK-39428 Project: Spark Issue Type: Improvement Components: Documentation Affects Versions: 3.2.1, 3.2.0, 3.1.2, 3.1.1, 3.1.0, 3.0.2, 3.0.1, 3.0.0 Reporter: qian Use a code block for `Coalesce Hints for SQL Queries`; right now it is rendered as a plain text block. !image-2022-06-09-16-33-00-359.png! -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-39428) Use code block for `Coalesce Hints for SQL Queries`
[ https://issues.apache.org/jira/browse/SPARK-39428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qian updated SPARK-39428: - Description: Use a code block for `Coalesce Hints for SQL Queries`; right now it is rendered as a plain text block. (was: Use a code block for `Coalesce Hints for SQL Queries`; right now it is rendered as a plain text block. !image-2022-06-09-16-33-00-359.png!) > Use code block for `Coalesce Hints for SQL Queries` > --- > > Key: SPARK-39428 > URL: https://issues.apache.org/jira/browse/SPARK-39428 > Project: Spark > Issue Type: Improvement > Components: Documentation >Affects Versions: 3.0.0, 3.0.1, 3.0.2, 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.2.1 >Reporter: qian >Priority: Minor > > Use a code block for `Coalesce Hints for SQL Queries`; right now it is rendered as a plain text block. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39390) Hide and optimize `viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` from INFO log
qian created SPARK-39390: Summary: Hide and optimize `viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` from INFO log Key: SPARK-39390 URL: https://issues.apache.org/jira/browse/SPARK-39390 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.3.0 Reporter: qian This issue aims to hide and optimize `viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` in the INFO log. {code:java} 2022-06-02 22:02:48.328 - stderr> 22/06/03 05:02:48 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set(){code} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
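One possible shape of the change, sketched only (this is not the merged patch): keep a one-line summary at INFO and move the ACL sets down to DEBUG. The snippet below uses println as a stand-in for Spark's logInfo/logDebug, and the values are invented; in SecurityManager they come from the Spark configuration.

{code:scala}
object AclLogSketch extends App {
  // Stand-in values for illustration only.
  val authOn = false
  val viewAcls = Set("root")
  val viewAclsGroups = Set.empty[String]

  // A short summary stays at INFO level.
  println(s"INFO SecurityManager: authentication ${if (authOn) "enabled" else "disabled"}")
  // The verbose, potentially sensitive ACL details move to DEBUG level.
  println(s"DEBUG SecurityManager: users with view permissions: $viewAcls; " +
    s"groups with view permissions: $viewAclsGroups")
}
{code}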
[jira] [Created] (SPARK-39289) Replace map.getOrElse(false/true) with exists/forall
qian created SPARK-39289: Summary: Replace map.getOrElse(false/true) with exists/forall Key: SPARK-39289 URL: https://issues.apache.org/jira/browse/SPARK-39289 Project: Spark Issue Type: Improvement Components: Spark Core, SQL, Structured Streaming Affects Versions: 3.3.0 Reporter: qian Replace _map(_.toBoolean).getOrElse(false)_ with _exists(_.toBoolean)_ Replace _map(_.toBoolean).getOrElse(true)_ with _forall(_.toBoolean)_ -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
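A minimal Scala sketch of both rewrites (the Option value is invented for illustration). On a None, exists yields false and forall yields true, so each rewrite preserves the original getOrElse default:

{code:scala}
object ExistsForallSketch extends App {
  val flag: Option[String] = Some("true") // invented example value

  // Before: map to Boolean, then supply a default.
  val enabledBefore = flag.map(_.toBoolean).getOrElse(false)
  val allowedBefore = flag.map(_.toBoolean).getOrElse(true)

  // After: exists defaults to false on None, forall defaults to true.
  val enabledAfter = flag.exists(_.toBoolean)
  val allowedAfter = flag.forall(_.toBoolean)

  assert(enabledBefore == enabledAfter && allowedBefore == allowedAfter)
}
{code}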
[jira] [Created] (SPARK-39196) Replace getOrElse(null) with orNull
qian created SPARK-39196: Summary: Replace getOrElse(null) with orNull Key: SPARK-39196 URL: https://issues.apache.org/jira/browse/SPARK-39196 Project: Spark Issue Type: Improvement Components: Kubernetes, Spark Core Affects Versions: 3.3.0 Reporter: qian Code Simplification. Replace _getOrElse(null)_ with _orNull_ -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
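A minimal Scala sketch of the rewrite (the Option value is invented for illustration); orNull is the idiomatic equivalent for Options holding reference types:

{code:scala}
object OrNullSketch extends App {
  val maybeName: Option[String] = None // invented example value

  // Before: explicit null default.
  val nameBefore: String = maybeName.getOrElse(null)

  // After: idiomatic equivalent for AnyRef-valued Options.
  val nameAfter: String = maybeName.orNull

  assert(nameBefore == nameAfter) // both are null here
}
{code}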
[jira] [Commented] (SPARK-39111) Mark overridden methods with `@Override` annotation
[ https://issues.apache.org/jira/browse/SPARK-39111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17533176#comment-17533176 ] qian commented on SPARK-39111: -- [~bjornjorgensen] I use the Alibaba Java Coding Guidelines. Please refer to https://plugins.jetbrains.com/plugin/10046-alibaba-java-coding-guidelines > Mark overridden methods with `@Override` annotation > -- > > Key: SPARK-39111 > URL: https://issues.apache.org/jira/browse/SPARK-39111 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 2.3.0 >Reporter: qian >Assignee: qian >Priority: Minor > Fix For: 3.4.0 > > Attachments: override.png > > > An overridden method from an interface or abstract class must be marked with > the {{@Override}} annotation. To accurately determine whether the override is > successful, an {{@Override}} annotation is necessary. Meanwhile, once the > method signature in the abstract class changes, the implementation class > will immediately report a compile-time error. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-39111) Mark overridden methods with `@Override` annotation
[ https://issues.apache.org/jira/browse/SPARK-39111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qian updated SPARK-39111: - Summary: Mark overridden methods with `@Override` annotation (was: Add @Override annotation for handleExtraArgs#AbstractLauncher) > Mark overridden methods with `@Override` annotation > -- > > Key: SPARK-39111 > URL: https://issues.apache.org/jira/browse/SPARK-39111 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 2.3.0 >Reporter: qian >Priority: Minor > > An overridden method from an interface or abstract class must be marked with > the {{@Override}} annotation. To accurately determine whether the override is > successful, an {{@Override}} annotation is necessary. Meanwhile, once the > method signature in the abstract class changes, the implementation class > will immediately report a compile-time error. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39111) Add @Override annotation for handleExtraArgs#AbstractLauncher
qian created SPARK-39111: Summary: Add @Override annotation for handleExtraArgs#AbstractLauncher Key: SPARK-39111 URL: https://issues.apache.org/jira/browse/SPARK-39111 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 2.3.0 Reporter: qian An overridden method from an interface or abstract class must be marked with the {{@Override}} annotation. To accurately determine whether the override is successful, an {{@Override}} annotation is necessary. Meanwhile, once the method signature in the abstract class changes, the implementation class will immediately report a compile-time error. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
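A small hypothetical Java example of the rationale (the class names are invented and do not reproduce the actual AbstractLauncher signature): with {{@Override}} present, renaming or re-typing the abstract method turns a silent non-override into a compile-time error.

{code:java}
import java.util.List;

abstract class BaseLauncher {
    abstract void handleExtraArgs(List<String> extra);
}

class MyLauncher extends BaseLauncher {
    // If BaseLauncher#handleExtraArgs is renamed or its signature changes,
    // this annotation makes the compiler flag the mismatch right here.
    @Override
    void handleExtraArgs(List<String> extra) {
        // no-op, for illustration only
    }
}
{code}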
[jira] [Updated] (SPARK-39006) Show a directional error message for PVC Dynamic Allocation Failure
[ https://issues.apache.org/jira/browse/SPARK-39006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qian updated SPARK-39006: - Description: When a Spark application requires multiple executors and the PVC claimName is not set to OnDemand or SPARK_EXECUTOR_ID, the driver keeps failing to create executor pods, because the PVC has already been created by the first executor pod. {noformat} 22/04/22 08:55:47 WARN ExecutorPodsSnapshotsStoreImpl: Exception when notifying snapshot subscriber. io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://kubernetes.default.svc/api/v1/namespaces/default/persistentvolumeclaims. Message: persistentvolumeclaims "test-1" already exists. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=persistentvolumeclaims, name=test-1, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=persistentvolumeclaims "test-1" already exists, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=AlreadyExists, status=Failure, additionalProperties={}). at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:697) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:676) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:629) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:566) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:527) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:315) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:651) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:91) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:61) ~[kubernetes-client-5.10.1.jar:?] at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$3(ExecutorPodsAllocator.scala:415) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at scala.collection.immutable.List.foreach(List.scala:431) ~[scala-library-2.12.15.jar:?] at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$1(ExecutorPodsAllocator.scala:408) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) ~[scala-library-2.12.15.jar:?] 
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.requestNewExecutors(ExecutorPodsAllocator.scala:385) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$35(ExecutorPodsAllocator.scala:349) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$35$adapted(ExecutorPodsAllocator.scala:342) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) ~[scala-library-2.12.15.jar:?] at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) ~[scala-library-2.12.15.jar:?] at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) ~[scala-library-2.12.15.jar:?] at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.onNewSnapshots(ExecutorPodsAllocator.scala:342) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3(ExecutorPodsAllocator.scala:120) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3$adapted(ExecutorPodsAllocator.scala:120) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber.org$apache$spark$scheduler$cluster$k8s$ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber$$processSnapshotsInternal(ExecutorPodsSnapshotsStoreImpl.scala:138) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at
[jira] [Updated] (SPARK-39006) Show a directional error message for PVC Dynamic Allocation Failure
[ https://issues.apache.org/jira/browse/SPARK-39006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qian updated SPARK-39006: - Summary: Show a directional error message for PVC Dynamic Allocation Failure (was: Check PVC claimName must be OnDemand when multiple executors are required) > Show a directional error message for PVC Dynamic Allocation Failure > --- > > Key: SPARK-39006 > URL: https://issues.apache.org/jira/browse/SPARK-39006 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 3.1.0 >Reporter: qian >Priority: Major > > When a Spark application requires multiple executors and the PVC claimName is not > set to OnDemand, the driver keeps failing to create executor pods, because the PVC > has already been created by the first executor pod. > {noformat} > 22/04/22 08:55:47 WARN ExecutorPodsSnapshotsStoreImpl: Exception when > notifying snapshot subscriber. > io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: > POST at: > https://kubernetes.default.svc/api/v1/namespaces/default/persistentvolumeclaims. > Message: persistentvolumeclaims "test-1" already exists. Received status: > Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, > kind=persistentvolumeclaims, name=test-1, retryAfterSeconds=null, uid=null, > additionalProperties={}), kind=Status, message=persistentvolumeclaims > "test-1" already exists, metadata=ListMeta(_continue=null, > remainingItemCount=null, resourceVersion=null, selfLink=null, > additionalProperties={}), reason=AlreadyExists, status=Failure, > additionalProperties={}). > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:697) > ~[kubernetes-client-5.10.1.jar:?] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:676) > ~[kubernetes-client-5.10.1.jar:?] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:629) > ~[kubernetes-client-5.10.1.jar:?] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:566) > ~[kubernetes-client-5.10.1.jar:?] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:527) > ~[kubernetes-client-5.10.1.jar:?] > at > io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:315) > ~[kubernetes-client-5.10.1.jar:?] > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:651) > ~[kubernetes-client-5.10.1.jar:?] > at > io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:91) > ~[kubernetes-client-5.10.1.jar:?] > at > io.fabric8.kubernetes.client.dsl.base.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:61) > ~[kubernetes-client-5.10.1.jar:?] > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$3(ExecutorPodsAllocator.scala:415) > ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] > at scala.collection.immutable.List.foreach(List.scala:431) > ~[scala-library-2.12.15.jar:?] > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$1(ExecutorPodsAllocator.scala:408) > ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] > at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) > ~[scala-library-2.12.15.jar:?] 
> at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.requestNewExecutors(ExecutorPodsAllocator.scala:385) > ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$35(ExecutorPodsAllocator.scala:349) > ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$35$adapted(ExecutorPodsAllocator.scala:342) > ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] > at > scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > ~[scala-library-2.12.15.jar:?] > at > scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > ~[scala-library-2.12.15.jar:?] > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) > ~[scala-library-2.12.15.jar:?] > at > org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.onNewSnapshots(ExecutorPodsAllocator.scala:342) > ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] > at >
[jira] [Created] (SPARK-39026) Add k8s tolerations support for Apache Spark
qian created SPARK-39026: Summary: Add k8s tolerations support for Apache Spark Key: SPARK-39026 URL: https://issues.apache.org/jira/browse/SPARK-39026 Project: Spark Issue Type: Sub-task Components: Kubernetes Affects Versions: 3.2.1 Reporter: qian Fix For: 3.4.0 [https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/] As the document above shows, this issue aims to add tolerations support to Apache Spark. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-39025) Add k8s Scheduling, Preemption and Eviction feature to Apache Spark
[ https://issues.apache.org/jira/browse/SPARK-39025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528081#comment-17528081 ] qian commented on SPARK-39025: -- I am working on this. Please assign it to me [~dongjoon] [~hyukjin.kwon] > Add k8s Scheduling, Preemption and Eviction feature to Apache Spark > --- > > Key: SPARK-39025 > URL: https://issues.apache.org/jira/browse/SPARK-39025 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 3.2.1 >Reporter: qian >Priority: Major > Fix For: 3.4.0 > > > As [https://kubernetes.io/docs/concepts/scheduling-eviction/] shows, Apache > Spark lacks support for k8s scheduling, preemption and eviction. > This issue aims to support toleration/priorityClass/runtimeClass etc.; for more > information, please refer to > [https://kubernetes.io/docs/concepts/scheduling-eviction/] > [https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/] > [https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/] > [https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/] > [https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/] > [https://kubernetes.io/docs/concepts/scheduling-eviction/node-pressure-eviction/] > [https://kubernetes.io/docs/concepts/scheduling-eviction/api-eviction/] > [https://kubernetes.io/docs/concepts/scheduling-eviction/resource-bin-packing/] > [https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/] > https://kubernetes.io/docs/concepts/scheduling-eviction/scheduler-perf-tuning/ -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39025) Add k8s Scheduling, Preemption and Eviction feature to Apache Spark
qian created SPARK-39025: Summary: Add k8s Scheduling, Preemption and Eviction feature to Apache Spark Key: SPARK-39025 URL: https://issues.apache.org/jira/browse/SPARK-39025 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 3.2.1 Reporter: qian Fix For: 3.4.0 As [https://kubernetes.io/docs/concepts/scheduling-eviction/] shows, Apache Spark lacks support for k8s scheduling, preemption and eviction. This issue aims to support toleration/priorityClass/runtimeClass etc.; for more information, please refer to [https://kubernetes.io/docs/concepts/scheduling-eviction/] [https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/] [https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/] [https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/] [https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/] [https://kubernetes.io/docs/concepts/scheduling-eviction/node-pressure-eviction/] [https://kubernetes.io/docs/concepts/scheduling-eviction/api-eviction/] [https://kubernetes.io/docs/concepts/scheduling-eviction/resource-bin-packing/] [https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/] https://kubernetes.io/docs/concepts/scheduling-eviction/scheduler-perf-tuning/ -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39006) Check PVC claimName must be OnDemand when multiple executors are required
qian created SPARK-39006: Summary: Check PVC claimName must be OnDemand when multiple executors are required Key: SPARK-39006 URL: https://issues.apache.org/jira/browse/SPARK-39006 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 3.1.0 Reporter: qian Fix For: 3.4.0 When a Spark application requires multiple executors and the PVC claimName is not set to OnDemand, the driver keeps failing to create executor pods, because the PVC has already been created by the first executor pod. {noformat} 22/04/22 08:55:47 WARN ExecutorPodsSnapshotsStoreImpl: Exception when notifying snapshot subscriber. io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://kubernetes.default.svc/api/v1/namespaces/default/persistentvolumeclaims. Message: persistentvolumeclaims "test-1" already exists. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=persistentvolumeclaims, name=test-1, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=persistentvolumeclaims "test-1" already exists, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=AlreadyExists, status=Failure, additionalProperties={}). at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:697) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:676) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:629) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:566) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:527) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:315) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:651) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:91) ~[kubernetes-client-5.10.1.jar:?] at io.fabric8.kubernetes.client.dsl.base.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:61) ~[kubernetes-client-5.10.1.jar:?] at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$3(ExecutorPodsAllocator.scala:415) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at scala.collection.immutable.List.foreach(List.scala:431) ~[scala-library-2.12.15.jar:?] at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$1(ExecutorPodsAllocator.scala:408) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) ~[scala-library-2.12.15.jar:?] 
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.requestNewExecutors(ExecutorPodsAllocator.scala:385) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$35(ExecutorPodsAllocator.scala:349) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$35$adapted(ExecutorPodsAllocator.scala:342) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) ~[scala-library-2.12.15.jar:?] at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) ~[scala-library-2.12.15.jar:?] at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) ~[scala-library-2.12.15.jar:?] at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.onNewSnapshots(ExecutorPodsAllocator.scala:342) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3(ExecutorPodsAllocator.scala:120) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3$adapted(ExecutorPodsAllocator.scala:120) ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT] at
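For reference, the documented way to avoid this collision is to let Spark create one PVC per executor, either with the special claimName value OnDemand or by embedding the SPARK_EXECUTOR_ID placeholder in the claim name. The volume name `data` and the sizes below are arbitrary examples:

{noformat}
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName=OnDemand
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.storageClass=standard
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.sizeLimit=1Gi
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.path=/data
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.readOnly=false
{noformat}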
[jira] [Created] (SPARK-38968) Remove hadoopConf from KerberosConfDriverFeatureStep
qian created SPARK-38968: Summary: Remove hadoopConf from KerberosConfDriverFeatureStep Key: SPARK-38968 URL: https://issues.apache.org/jira/browse/SPARK-38968 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 3.2.1 Reporter: qian Fix For: 3.4.0 Remove the unused hadoopConf from KerberosConfDriverFeatureStep. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38945) Simplify KEYTAB and PRINCIPAL in KerberosConfDriverFeatureStep
qian created SPARK-38945: Summary: Simplify KEYTAB and PRINCIPAL in KerberosConfDriverFeatureStep Key: SPARK-38945 URL: https://issues.apache.org/jira/browse/SPARK-38945 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 3.2.1 Reporter: qian Fix For: 3.4.0 Simplify KEYTAB and PRINCIPAL in KerberosConfDriverFeatureStep, because they are already imported. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38925) Update guava to 30.1.1-jre
qian created SPARK-38925: Summary: Update guava to 30.1.1-jre Key: SPARK-38925 URL: https://issues.apache.org/jira/browse/SPARK-38925 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 3.2.1, 3.2.0, 3.1.2, 3.1.1, 3.1.0, 3.0.2, 3.0.1, 3.0.0 Reporter: qian Fix For: 3.4.0 Update guava to 30.1.1-jre. guava 14.0.1 carries known security risks: * [CVE-2020-8908|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-8908] * [CVE-2018-10237|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-10237] -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38770) Simplify steps to rewrite primary resource in k8s Spark application
[ https://issues.apache.org/jira/browse/SPARK-38770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qian updated SPARK-38770: - Fix Version/s: 3.4.0 (was: 3.3.0) > Simplify steps to rewrite primary resource in k8s Spark application > -- > > Key: SPARK-38770 > URL: https://issues.apache.org/jira/browse/SPARK-38770 > Project: Spark > Issue Type: Improvement > Components: Kubernetes >Affects Versions: 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.2.1 >Reporter: qian >Priority: Major > Fix For: 3.4.0 > > > Rewriting the primary resource invokes the renameMainAppResource method twice, and > the second invocation has no effect. So, simplify the steps to rewrite the primary > resource in a k8s Spark application. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38770) Simplify steps to rewrite primary resource in k8s Spark application
qian created SPARK-38770: Summary: Simplify steps to rewrite primary resource in k8s Spark application Key: SPARK-38770 URL: https://issues.apache.org/jira/browse/SPARK-38770 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 3.2.1, 3.2.0, 3.1.2, 3.1.1, 3.1.0 Reporter: qian Fix For: 3.3.0 Rewriting the primary resource invokes the renameMainAppResource method twice, and the second invocation has no effect. So, simplify the steps to rewrite the primary resource in a k8s Spark application. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
[ https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17514390#comment-17514390 ] qian commented on SPARK-38652: -- [~dongjoon] Hi. DepsTestsSuite has tests such as the following: * Launcher client dependencies * SPARK-33615: Launcher client archives * SPARK-33748: Launcher python client respecting PYSPARK_PYTHON * ... These tests use the spark-submit command, so I think that is why DepsTestsSuite blocks. Could you please check whether these tests run? Maybe the `-Dtest.exclude.tags` option doesn't need the `minikube` value. > K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2 > -- > > Key: SPARK-38652 > URL: https://issues.apache.org/jira/browse/SPARK-38652 > Project: Spark > Issue Type: Bug > Components: Kubernetes, Tests >Affects Versions: 3.3.0 >Reporter: qian >Priority: Major > > DepsTestsSuite in k8s IT test is blocked with PathIOException in > hadoop-aws-3.3.2. The exception message is as follows > {code:java} > Exception in thread "main" org.apache.spark.SparkException: Uploading file > /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar > failed... > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277) > > at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) > > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) > at scala.collection.TraversableLike.map(TraversableLike.scala:286) > at scala.collection.TraversableLike.map$(TraversableLike.scala:279) > at scala.collection.AbstractTraversable.map(Traversable.scala:108) > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275) > > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187) > > at scala.collection.immutable.List.foreach(List.scala:431) > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178) > > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86) > at > scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) > > at > scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) > > at scala.collection.immutable.List.foldLeft(List.scala:91) > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84) > > at > org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242) > at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738) > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214) > > at > 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958) > > at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) > > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) > at > org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) > > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: > org.apache.spark.SparkException: Error uploading file > spark-examples_2.12-3.4.0-SNAPSHOT.jar > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328) > > ... 30 more > Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path > for >
[jira] [Commented] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
[ https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17512846#comment-17512846 ] qian commented on SPARK-38652: -- [~ste...@apache.org] Hi, I ran the same suite against a real AWS S3 endpoint and got the same exception. I think we can rule out the minio deployment as the cause. {noformat} $ bin/spark-submit --deploy-mode cluster --class org.apache.spark.examples.SparkRemoteFileTest --master k8s://https://192.168.64.87:8443/ --conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem --conf spark.testing=false --conf spark.hadoop.fs.s3a.access.key=XXX --conf spark.kubernetes.driver.label.spark-app-locator=a8937b5fdf6a444a806ee1c3ecac37fc --conf spark.kubernetes.file.upload.path=s3a://dcoliversun --conf spark.authenticate=true --conf spark.executor.instances=1 --conf spark.kubernetes.submission.waitAppCompletion=false --conf spark.kubernetes.executor.label.spark-app-locator=a8937b5fdf6a444a806ee1c3ecac37fc --conf spark.kubernetes.namespace=spark-job --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark --conf spark.hadoop.fs.s3a.secret.key=XXX --conf spark.executor.extraJavaOptions=-Dlog4j2.debug --conf spark.hadoop.fs.s3a.endpoint=https://s3.ap-southeast-1.amazonaws.com --conf spark.app.name=spark-test-app --conf spark.files=/tmp/tmp7013228683780235449.txt --conf spark.ui.enabled=true --conf spark.driver.extraJavaOptions=-Dlog4j2.debug --conf spark.kubernetes.container.image=registry.cn-hangzhou.aliyuncs.com/smart-spark/spark:test --conf spark.executor.cores=1 --conf spark.hadoop.fs.s3a.connection.ssl.enabled=false /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar tmp7013228683780235449.txt 22/03/27 15:16:27 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 22/03/27 15:16:28 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file 22/03/27 15:16:28 INFO KerberosConfDriverFeatureStep: You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image. 22/03/27 15:16:29 INFO KubernetesUtils: sq-isLocalAndResolvable => resource is file:/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar 22/03/27 15:16:29 INFO KubernetesUtils: sq => uri is file:/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar, uri scheme is file 22/03/27 15:16:29 INFO KubernetesUtils: sq-uploadAndTransformFileUris, uri is file:/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar 22/03/27 15:16:29 WARN MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties 22/03/27 15:16:29 INFO MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s). 22/03/27 15:16:29 INFO MetricsSystemImpl: s3a-file-system metrics system started 22/03/27 15:16:31 INFO KubernetesUtils: Uploading file: /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar to dest: s3a://dcoliversun/spark-upload-eb20e2da-17b6-4dcd-b4f1-8e47bc80c1e9/spark-examples_2.12-3.4.0-SNAPSHOT.jar... 
22/03/27 15:16:31 INFO S3AFileSystem: Copying local file from /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar to s3a://dcoliversun/spark-upload-eb20e2da-17b6-4dcd-b4f1-8e47bc80c1e9/spark-examples_2.12-3.4.0-SNAPSHOT.jar 22/03/27 15:16:31 INFO CopyFromLocalOperation: Copying local file from /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar to s3a://dcoliversun/spark-upload-eb20e2da-17b6-4dcd-b4f1-8e47bc80c1e9/spark-examples_2.12-3.4.0-SNAPSHOT.jar 22/03/27 15:16:31 INFO CopyFromLocalOperation: execute#CopyFromLocalOperation, sourceFile is /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar 22/03/27 15:16:31 INFO CopyFromLocalOperation: uploadSourceFromFS#CopyFromLocalOperation, localFile 1: path is LocatedFileStatus{path=file:/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar; isDirectory=false; length=1567474; replication=1; blocksize=33554432; modification_time=1647874074000; access_time=1647874074000; owner=hengzhen.sq; group=staff; permission=rw-r--r--; isSymlink=false; hasAcl=false; isEncrypted=false; isErasureCoded=false} 22/03/27 15:16:31 INFO CopyFromLocalOperation: getFinalPath#CopyFromLocalOperation, src is file:/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar, source is /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar Exception in thread "main"
[jira] [Commented] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
[ https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17512657#comment-17512657 ] qian commented on SPARK-38652: -- [~ste...@apache.org] No. I can do it, which will help us confirm whether the cause of the problem is minio or hadoop-aws-3.3.2. I will share the test results here. > K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2 > -- > > Key: SPARK-38652 > URL: https://issues.apache.org/jira/browse/SPARK-38652 > Project: Spark > Issue Type: Bug > Components: Kubernetes, Tests >Affects Versions: 3.3.0 >Reporter: qian >Priority: Major > > DepsTestsSuite in k8s IT test is blocked with PathIOException in > hadoop-aws-3.3.2. The exception message is as follows > {code:java} > Exception in thread "main" org.apache.spark.SparkException: Uploading file > /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar > failed... > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277) > > at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) > > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) > at scala.collection.TraversableLike.map(TraversableLike.scala:286) > at scala.collection.TraversableLike.map$(TraversableLike.scala:279) > at scala.collection.AbstractTraversable.map(Traversable.scala:108) > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275) > > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187) > > at scala.collection.immutable.List.foreach(List.scala:431) > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178) > > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86) > at > scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) > > at > scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) > > at scala.collection.immutable.List.foldLeft(List.scala:91) > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84) > > at > org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242) > at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738) > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214) > > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958) > > at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) > > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) > at 
org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) > at > org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) > > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: > org.apache.spark.SparkException: Error uploading file > spark-examples_2.12-3.4.0-SNAPSHOT.jar > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328) > > ... 30 more > Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path > for > URI:file:///Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar': > Input/output error > at > org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.getFinalPath(CopyFromLocalOperation.java:365) >
[jira] [Updated] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
[ https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qian updated SPARK-38652: - Description: DepsTestsSuite in k8s IT test is blocked with PathIOException in hadoop-aws-3.3.2. The exception message is as follows {code:java} Exception in thread "main" org.apache.spark.SparkException: Uploading file /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar failed... at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332) at org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at scala.collection.TraversableLike.map(TraversableLike.scala:286) at scala.collection.TraversableLike.map$(TraversableLike.scala:279) at scala.collection.AbstractTraversable.map(Traversable.scala:108) at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275) at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187) at scala.collection.immutable.List.foreach(List.scala:431) at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178) at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86) at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) at scala.collection.immutable.List.foldLeft(List.scala:91) at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84) at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242) at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958) at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: org.apache.spark.SparkException: Error uploading file spark-examples_2.12-3.4.0-SNAPSHOT.jar at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355) at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328) ... 
30 more Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path for URI:file:///Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar': Input/output error at org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.getFinalPath(CopyFromLocalOperation.java:365) at org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.uploadSourceFromFS(CopyFromLocalOperation.java:226) at org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.execute(CopyFromLocalOperation.java:170) at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$copyFromLocalFile$25(S3AFileSystem.java:3920) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:499) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:444) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2337) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2356) at
[jira] [Updated] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
[ https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qian updated SPARK-38652: - Description: DepsTestsSuite in k8s IT test is blocked with PathIOException in hadoop-aws-3.3.2. Exception Message is as follow {code:java} Exception in thread "main" org.apache.spark.SparkException: Uploading file /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar failed... at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332) at org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at scala.collection.TraversableLike.map(TraversableLike.scala:286) at scala.collection.TraversableLike.map$(TraversableLike.scala:279) at scala.collection.AbstractTraversable.map(Traversable.scala:108) at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275) at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187) at scala.collection.immutable.List.foreach(List.scala:431) at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178) at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86) at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) at scala.collection.immutable.List.foldLeft(List.scala:91) at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84) at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242) at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958) at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: org.apache.spark.SparkException: Error uploading file spark-examples_2.12-3.4.0-SNAPSHOT.jar at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355) at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328) ... 
30 more Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path for URI:file:///Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar': Input/output error at org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.getFinalPath(CopyFromLocalOperation.java:365) at org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.uploadSourceFromFS(CopyFromLocalOperation.java:226) at org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.execute(CopyFromLocalOperation.java:170) at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$copyFromLocalFile$25(S3AFileSystem.java:3920) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:499) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:444) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2337) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2356) at
[jira] [Commented] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
[ https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17512163#comment-17512163 ] qian commented on SPARK-38652: -- I am working on it. cc [~chaosun] & [~dongjoon] > K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2 > -- > > Key: SPARK-38652 > URL: https://issues.apache.org/jira/browse/SPARK-38652 > Project: Spark > Issue Type: Bug > Components: Kubernetes, Tests >Affects Versions: 3.3.0 >Reporter: qian >Priority: Major > > DepsTestsSuite in k8s IT test is blocked with PathIOException in > hadoop-aws-3.3.2. Exception Message is as follow > {code:java} > Exception in thread "main" org.apache.spark.SparkException: Uploading file > /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar > failed... > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277) > > at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) > > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) > at scala.collection.TraversableLike.map(TraversableLike.scala:286) > at scala.collection.TraversableLike.map$(TraversableLike.scala:279) > at scala.collection.AbstractTraversable.map(Traversable.scala:108) > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275) > > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187) > > at scala.collection.immutable.List.foreach(List.scala:431) > at > org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178) > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86) > at > scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) > > at > scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) > > at scala.collection.immutable.List.foldLeft(List.scala:91) > at > org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84) > > at > org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242) > at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242) > > at > org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214) > > at > org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958) > > at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) > > at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) > at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) > at > 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) > > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055) > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: > org.apache.spark.SparkException: Error uploading file > spark-examples_2.12-3.4.0-SNAPSHOT.jar > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355) > > at > org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328) > > ... 30 more > Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path > for > URI:file:///Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar': > Input/output errorat > org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.getFinalPath(CopyFromLocalOperation.java:365) > > at >
[jira] [Created] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
qian created SPARK-38652: Summary: K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2 Key: SPARK-38652 URL: https://issues.apache.org/jira/browse/SPARK-38652 Project: Spark Issue Type: Bug Components: Kubernetes, Tests Affects Versions: 3.3.0 Reporter: qian DepsTestsSuite in k8s IT test is blocked with PathIOException in hadoop-aws-3.3.2. The exception message is as follows: {code:java} Exception in thread "main" org.apache.spark.SparkException: Uploading file /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar failed... at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332) at org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277) at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at scala.collection.TraversableLike.map(TraversableLike.scala:286) at scala.collection.TraversableLike.map$(TraversableLike.scala:279) at scala.collection.AbstractTraversable.map(Traversable.scala:108) at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275) at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187) at scala.collection.immutable.List.foreach(List.scala:431) at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178) at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86) at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) at scala.collection.immutable.List.foldLeft(List.scala:91) at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84) at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242) at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242) at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958) at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: org.apache.spark.SparkException: Error uploading file spark-examples_2.12-3.4.0-SNAPSHOT.jar at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355) at 
org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328) ... 30 more Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path for URI:file:///Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar': Input/output error at org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.getFinalPath(CopyFromLocalOperation.java:365) at org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.uploadSourceFromFS(CopyFromLocalOperation.java:226) at org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.execute(CopyFromLocalOperation.java:170) at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$copyFromLocalFile$25(S3AFileSystem.java:3920) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:499) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:444) at
[jira] [Created] (SPARK-38582) Introduce `buildEnvVarsWithKV` and `buildEnvVarsWithFieldRef` for `KubernetesUtils` to eliminate duplicate code pattern
qian created SPARK-38582: Summary: Introduce `buildEnvVarsWithKV` and `buildEnvVarsWithFieldRef` for `KubernetesUtils` to eliminate duplicate code pattern Key: SPARK-38582 URL: https://issues.apache.org/jira/browse/SPARK-38582 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 3.2.1 Reporter: qian There are many duplicated code patterns in the Spark codebase: {code:java} new EnvVarBuilder() .withName(key) .withValue(value) .build() {code} {code:java} new EnvVarBuilder() .withName(name) .withValueFrom(new EnvVarSourceBuilder() .withNewFieldRef(version, field) .build()) .build() {code} [The assignment statements for executor environment variables | https://github.com/apache/spark/blob/branch-3.3/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala#L123-L185] span 63 lines. We could introduce _buildEnvVarsWithKV_ and _buildEnvVarsWithFieldRef_ functions to simplify the above code patterns. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
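A minimal sketch of the proposed helpers, assuming they simply wrap the two fabric8 builder patterns quoted above (the parameter shapes and return types here are assumptions, not committed signatures):
{code:scala}
import io.fabric8.kubernetes.api.model.{EnvVar, EnvVarBuilder, EnvVarSourceBuilder}

// Sketch only: builds plain key/value environment variables in one call.
def buildEnvVarsWithKV(envs: Seq[(String, String)]): Seq[EnvVar] =
  envs.map { case (key, value) =>
    new EnvVarBuilder()
      .withName(key)
      .withValue(value)
      .build()
  }

// Sketch only: builds environment variables whose values come from pod
// field references, e.g. ("SPARK_EXECUTOR_POD_IP", "v1", "status.podIP").
def buildEnvVarsWithFieldRef(envs: Seq[(String, String, String)]): Seq[EnvVar] =
  envs.map { case (name, apiVersion, fieldPath) =>
    new EnvVarBuilder()
      .withName(name)
      .withValueFrom(new EnvVarSourceBuilder()
        .withNewFieldRef(apiVersion, fieldPath)
        .build())
      .build()
  }
{code}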
[jira] [Created] (SPARK-38546) replace deprecated ChiSqSelector with UnivariateFeatureSelector
qian created SPARK-38546: Summary: replace deprecated ChiSqSelector with UnivariateFeatureSelector Key: SPARK-38546 URL: https://issues.apache.org/jira/browse/SPARK-38546 Project: Spark Issue Type: Improvement Components: Examples Affects Versions: 3.2.1, 3.2.0, 3.1.2 Reporter: qian UnivariateFeatureSelector was added and ChiSqSelector was deprecated in SPARK-34080, so we should replace the deprecated ChiSqSelector with UnivariateFeatureSelector. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
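For context, the migration in an example might look like the following sketch (the column names and the top-50 threshold are illustrative; with categorical features and a categorical label, UnivariateFeatureSelector performs the same chi-squared selection):
{code:scala}
import org.apache.spark.ml.feature.{ChiSqSelector, UnivariateFeatureSelector}

// Deprecated since SPARK-34080:
val chiSq = new ChiSqSelector()
  .setNumTopFeatures(50)
  .setFeaturesCol("features")
  .setLabelCol("label")
  .setOutputCol("selectedFeatures")

// Replacement with equivalent behavior:
val selector = new UnivariateFeatureSelector()
  .setFeatureType("categorical")
  .setLabelType("categorical")
  .setSelectionMode("numTopFeatures")
  .setSelectionThreshold(50)
  .setFeaturesCol("features")
  .setLabelCol("label")
  .setOutputCol("selectedFeatures")
{code}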
[jira] [Commented] (SPARK-38507) DataFrame withColumn method not adding or replacing columns when alias is used
[ https://issues.apache.org/jira/browse/SPARK-38507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504971#comment-17504971 ] qian commented on SPARK-38507: -- The *select()* method treats an input argument like _xx.xx_ as {_}table.column{_}; this is by design. I don't agree that this is actually a bug. If you still disagree, you could raise the case on the Spark user mailing list. :) > DataFrame withColumn method not adding or replacing columns when alias is used > -- > > Key: SPARK-38507 > URL: https://issues.apache.org/jira/browse/SPARK-38507 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.2 >Reporter: Alexandros Mavrommatis >Priority: Major > Labels: SQL, catalyst > > I have an input DataFrame *df* created as follows: > {code:java} > import spark.implicits._ > val df = List((5, 10), (6, 20)).toDF("field1", "field2").alias("df") {code} > When I execute either this command: > {code:java} > df.select("df.field2").show(2) {code} > or that one: > {code:java} > df.withColumn("df.field2", lit(0)).select("df.field2").show(2) {code} > I get the same result: > {code:java} > +--+ > |field2| > +--+ > | 10| > | 20| > +--+ {code} > Additionally, when I execute the following command: > {code:java} > df.withColumn("df.field3", lit(0)).select("df.field3").show(2){code} > I get this exception: > {code:java} > org.apache.spark.sql.AnalysisException: cannot resolve '`df.field3`' given > input columns: [df.field3, df.field1, df.field2]; 'Project ['df.field3] +- > Project [field1#7, field2#8, 0 AS df.field3#31] +- SubqueryAlias df > +- Project [_1#2 AS field1#7, _2#3 AS field2#8] +- LocalRelation > [_1#2, _2#3] at > org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:155) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:152) > at > org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$2(TreeNode.scala:342) > at > org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74) > at > org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:342) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsUp$1(QueryPlan.scala:104) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:116) > at > org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:116) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:127) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$3(QueryPlan.scala:132) > at > scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238) > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at > scala.collection.TraversableLike.map(TraversableLike.scala:238) at > scala.collection.TraversableLike.map$(TraversableLike.scala:231) at > scala.collection.AbstractTraversable.map(Traversable.scala:108) at > org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:132) > at > 
org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:137) > at > org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:244) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:137) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUp(QueryPlan.scala:104) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1(CheckAnalysis.scala:152) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1$adapted(CheckAnalysis.scala:93) > at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:184) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:93) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis$(CheckAnalysis.scala:90) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:155) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:176) > at
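One note for readers following this thread: when a column name literally contains a dot, it can still be referenced by escaping the name with backticks, which stops the analyzer from parsing it as _table.column_. An illustrative sketch against the reporter's *df*:
{code:scala}
import org.apache.spark.sql.functions.lit

// "df.field3" is a single column whose name contains a dot; the
// backticks in select() prevent it from being parsed as df.field3.
df.withColumn("df.field3", lit(0))
  .select("`df.field3`")
  .show(2)
{code}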
[jira] [Commented] (SPARK-38507) DataFrame withColumn method not adding or replacing columns when alias is used
[ https://issues.apache.org/jira/browse/SPARK-38507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504703#comment-17504703 ] qian commented on SPARK-38507: -- Hi [~amavrommatis], this happens because you aliased the DataFrame *df* as *df*, which creates a schema conflict. You can try this command instead: {code:scala} df.withColumn("field3", lit(0)).select("field3").show(2) {code} The following command also runs, but the result is not what you expect: {code:scala} df.withColumn("df.field2", lit(0)).select("df.field2").show(2) {code} The result is the original column *field2*, not your new column *df.field2* whose value is 0. > DataFrame withColumn method not adding or replacing columns when alias is used > -- > > Key: SPARK-38507 > URL: https://issues.apache.org/jira/browse/SPARK-38507 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.2 >Reporter: Alexandros Mavrommatis >Priority: Major > Labels: SQL, catalyst > > I have an input DataFrame *df* created as follows: > {code:java} > import spark.implicits._ > val df = List((5, 10), (6, 20)).toDF("field1", "field2").alias("df") {code} > When I execute either this command: > {code:java} > df.select("df.field2").show(2) {code} > or that one: > {code:java} > df.withColumn("df.field2", lit(0)).select("df.field2").show(2) {code} > I get the same result: > {code:java} > +--+ > |field2| > +--+ > | 10| > | 20| > +--+ {code} > Additionally, when I execute the following command: > {code:java} > df.withColumn("df.field3", lit(0)).select("df.field3").show(2){code} > I get this exception: > {code:java} > org.apache.spark.sql.AnalysisException: cannot resolve '`df.field3`' given > input columns: [df.field3, df.field1, df.field2]; 'Project ['df.field3] +- > Project [field1#7, field2#8, 0 AS df.field3#31] +- SubqueryAlias df > +- Project [_1#2 AS field1#7, _2#3 AS field2#8] +- LocalRelation > [_1#2, _2#3] at > org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:155) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:152) > at > org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$2(TreeNode.scala:342) > at > org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74) > at > org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:342) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsUp$1(QueryPlan.scala:104) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:116) > at > org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:116) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:127) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$3(QueryPlan.scala:132) > at > scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238) > at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) > at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) at > scala.collection.TraversableLike.map(TraversableLike.scala:238) at > 
scala.collection.TraversableLike.map$(TraversableLike.scala:231) at > scala.collection.AbstractTraversable.map(Traversable.scala:108) at > org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:132) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:137) > at > org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:244) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:137) > at > org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUp(QueryPlan.scala:104) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1(CheckAnalysis.scala:152) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1$adapted(CheckAnalysis.scala:93) > at > org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:184) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:93) > at > org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis$(CheckAnalysis.scala:90)
[jira] [Updated] (SPARK-38439) Add Braces with if,else,for,do and while statements
[ https://issues.apache.org/jira/browse/SPARK-38439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qian updated SPARK-38439: - Priority: Trivial (was: Minor) > Add Braces with if,else,for,do and while statements > --- > > Key: SPARK-38439 > URL: https://issues.apache.org/jira/browse/SPARK-38439 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL >Affects Versions: 3.2.0, 3.2.1 >Reporter: qian >Priority: Trivial > > Braces are used with {_}if{_}, {_}else{_}, {_}for{_}, _do_ and _while_ > statements, even if the body contains only a single statement. Avoid using > the following example: > {code:java} > if (condition) statements; > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38439) Add Braces with if,else,for,do and while statements
[ https://issues.apache.org/jira/browse/SPARK-38439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17503989#comment-17503989 ] qian commented on SPARK-38439: -- This is useless. Please ignore it > Add Braces with if,else,for,do and while statements > --- > > Key: SPARK-38439 > URL: https://issues.apache.org/jira/browse/SPARK-38439 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL >Affects Versions: 3.2.0, 3.2.1 >Reporter: qian >Priority: Minor > > Braces are used with {_}if{_}, {_}else{_}, {_}for{_}, _do_ and _while_ > statements, even if the body contains only a single statement. Avoid using > the following example: > {code:java} > if (condition) statements; > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38439) Add Braces with if,else,for,do and while statements
qian created SPARK-38439: Summary: Add Braces with if,else,for,do and while statements Key: SPARK-38439 URL: https://issues.apache.org/jira/browse/SPARK-38439 Project: Spark Issue Type: Improvement Components: Spark Core, SQL Affects Versions: 3.2.1, 3.2.0 Reporter: qian Braces should be used with {_}if{_}, {_}else{_}, {_}for{_}, _do_ and _while_ statements, even if the body contains only a single statement. Avoid the following style: {code:java} if (condition) statements; {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
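For contrast, the braced form the issue advocates (a minimal self-contained sketch, shown here in Scala; the same rule applies to the Java snippet above):
{code:scala}
val condition = true

// Preferred: braces even when the body is a single statement.
if (condition) {
  println("condition holds")
} else {
  println("condition does not hold")
}
{code}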
[jira] [Commented] (SPARK-38302) Use Java 17 in K8S integration tests when setting spark-tgz
[ https://issues.apache.org/jira/browse/SPARK-38302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497833#comment-17497833 ] qian commented on SPARK-38302: -- [~dongjoon] Thanks for your work :) > Use Java 17 in K8S integration tests when setting spark-tgz > --- > > Key: SPARK-38302 > URL: https://issues.apache.org/jira/browse/SPARK-38302 > Project: Spark > Issue Type: Sub-task > Components: Kubernetes, Tests >Affects Versions: 3.3.0 >Reporter: qian >Assignee: qian >Priority: Minor > > When setting parameters `spark-tgz` during integration tests, the error that > `resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile.java17` > cannot be found occurs. This is due to the default value of > `spark.kubernetes.test.dockerFile` being > `resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile.java17`. > When using the tgz, the working directory is > `${spark.kubernetes.test.unpackSparkDir}`, and the relative path > `resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile.java17` > is invalid. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38302) Dockerfile.java17 can't be used in K8s integration tests when
qian created SPARK-38302: Summary: Dockerfile.java17 can't be used in K8s integration tests when Key: SPARK-38302 URL: https://issues.apache.org/jira/browse/SPARK-38302 Project: Spark Issue Type: Improvement Components: Kubernetes, Tests Affects Versions: 3.3.0 Reporter: qian When setting the `spark-tgz` parameter during integration tests, an error occurs because `resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile.java17` cannot be found. This is because the default value of `spark.kubernetes.test.dockerFile` is the relative path `resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile.java17`. When using the tgz, the working directory is `${spark.kubernetes.test.unpackSparkDir}`, so that relative path is invalid. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-37713) No namespace assigned in Executor Pod ConfigMap
qian created SPARK-37713: Summary: No namespace assigned in Executor Pod ConfigMap Key: SPARK-37713 URL: https://issues.apache.org/jira/browse/SPARK-37713 Project: Spark Issue Type: Bug Components: Kubernetes Affects Versions: 3.2.0, 3.1.2, 3.1.1 Reporter: qian Fix For: 3.3.0 Since Spark 3.x, each executor pod needs a separate executor ConfigMap, but no namespace is assigned to the ConfigMap when it is built. K8s treats a ConfigMap without a namespace as a global resource, so once pod access is restricted (global resources cannot be read), the executor cannot obtain its own ConfigMap. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
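A sketch of the kind of fix this implies when the ConfigMap is built with the fabric8 client (the helper name and shape are hypothetical; the real change lives in Spark's K8s submission code):
{code:scala}
import io.fabric8.kubernetes.api.model.{ConfigMap, ConfigMapBuilder}

// Hypothetical helper: assigning the namespace explicitly keeps the
// ConfigMap scoped to the pod's namespace instead of being treated
// as a global resource.
def buildConfigMap(name: String, namespace: String, key: String, value: String): ConfigMap =
  new ConfigMapBuilder()
    .withNewMetadata()
      .withName(name)
      .withNamespace(namespace) // previously missing, per the description above
    .endMetadata()
    .addToData(key, value)
    .build()
{code}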
[jira] [Created] (SPARK-37645) Word spell error - "labeled" spells as "labled"
qian created SPARK-37645: Summary: Word spell error - "labeled" spells as "labled" Key: SPARK-37645 URL: https://issues.apache.org/jira/browse/SPARK-37645 Project: Spark Issue Type: Improvement Components: Kubernetes Affects Versions: 3.2.0, 3.1.1, 3.1.0 Reporter: qian Fix For: 3.3.0 The word "labeled" is misspelled as "labled". -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-17317) Add package vignette to SparkR
[ https://issues.apache.org/jira/browse/SPARK-17317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15450013#comment-15450013 ] Junyang Qian commented on SPARK-17317: -- WIP > Add package vignette to SparkR > -- > > Key: SPARK-17317 > URL: https://issues.apache.org/jira/browse/SPARK-17317 > Project: Spark > Issue Type: Improvement >Reporter: Junyang Qian > > In publishing SparkR to CRAN, it would be nice to have a vignette as a user > guide that > * describes the big picture > * introduces the use of various methods > This is important for new users because they may not even know which method > to look up. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-17317) Add package vignette to SparkR
Junyang Qian created SPARK-17317: Summary: Add package vignette to SparkR Key: SPARK-17317 URL: https://issues.apache.org/jira/browse/SPARK-17317 Project: Spark Issue Type: Improvement Reporter: Junyang Qian In publishing SparkR to CRAN, it would be nice to have a vignette as a user guide that * describes the big picture * introduces the use of various methods This is important for new users because they may not even know which method to look up. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-17315) Add Kolmogorov-Smirnov Test to SparkR
Junyang Qian created SPARK-17315: Summary: Add Kolmogorov-Smirnov Test to SparkR Key: SPARK-17315 URL: https://issues.apache.org/jira/browse/SPARK-17315 Project: Spark Issue Type: New Feature Reporter: Junyang Qian The Kolmogorov-Smirnov test is a popular nonparametric test of the equality of distributions. There is an implementation in MLlib; it would be nice to expose it in SparkR. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
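For reference, a sketch of the existing MLlib API that a SparkR wrapper would expose (a one-sample, two-sided test against a standard normal; the sample data and the existing SparkContext `sc` are assumptions for illustration):
{code:scala}
import org.apache.spark.mllib.stat.Statistics
import org.apache.spark.rdd.RDD

// Illustrative sample; sc is an existing SparkContext.
val data: RDD[Double] = sc.parallelize(Seq(0.1, 0.15, 0.2, 0.3, 0.25))

// Kolmogorov-Smirnov test of the sample against N(0, 1).
val testResult = Statistics.kolmogorovSmirnovTest(data, "norm", 0.0, 1.0)
println(testResult)
{code}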
[jira] [Commented] (SPARK-17241) SparkR spark.glm should have configurable regularization parameter
[ https://issues.apache.org/jira/browse/SPARK-17241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437717#comment-15437717 ] Junyang Qian commented on SPARK-17241: -- I'll take a closer look and see if we can add it easily. > SparkR spark.glm should have configurable regularization parameter > -- > > Key: SPARK-17241 > URL: https://issues.apache.org/jira/browse/SPARK-17241 > Project: Spark > Issue Type: Improvement >Reporter: Junyang Qian > > Spark has configurable L2 regularization parameter for generalized linear > regression. It is very important to have them in SparkR so that users can run > ridge regression. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-17241) SparkR spark.glm should have configurable regularization parameter
[ https://issues.apache.org/jira/browse/SPARK-17241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437692#comment-15437692 ] Junyang Qian commented on SPARK-17241: -- [~shivaram] It seems that spark has it for linear regression but not for glm. > SparkR spark.glm should have configurable regularization parameter > -- > > Key: SPARK-17241 > URL: https://issues.apache.org/jira/browse/SPARK-17241 > Project: Spark > Issue Type: Improvement >Reporter: Junyang Qian > > Spark has configurable L2 regularization parameter for generalized linear > regression. It is very important to have them in SparkR so that users can run > ridge regression. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-17241) SparkR spark.glm should have configurable regularization parameter
[ https://issues.apache.org/jira/browse/SPARK-17241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junyang Qian updated SPARK-17241: - Summary: SparkR spark.glm should have configurable regularization parameter (was: SparkR spark.glm should have configurable regularization parameter(s)) > SparkR spark.glm should have configurable regularization parameter > -- > > Key: SPARK-17241 > URL: https://issues.apache.org/jira/browse/SPARK-17241 > Project: Spark > Issue Type: Improvement >Reporter: Junyang Qian > > Spark has configurable L2 regularization parameter for linear regression and > an additional elastic-net parameter for generalized linear model. It is very > important to have them in SparkR so that users can run ridge regression and > elastic-net. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-17241) SparkR spark.glm should have configurable regularization parameter
[ https://issues.apache.org/jira/browse/SPARK-17241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junyang Qian updated SPARK-17241: - Description: Spark has a configurable L2 regularization parameter for generalized linear regression. It is very important to have it in SparkR so that users can run ridge regression. (was: Spark has configurable L2 regularization parameter for linear regression and an additional elastic-net parameter for generalized linear model. It is very important to have them in SparkR so that users can run ridge regression and elastic-net.) > SparkR spark.glm should have configurable regularization parameter > -- > > Key: SPARK-17241 > URL: https://issues.apache.org/jira/browse/SPARK-17241 > Project: Spark > Issue Type: Improvement >Reporter: Junyang Qian > > Spark has configurable L2 regularization parameter for generalized linear > regression. It is very important to have them in SparkR so that users can run > ridge regression. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-17241) SparkR spark.glm should have configurable regularization parameter(s)
Junyang Qian created SPARK-17241: Summary: SparkR spark.glm should have configurable regularization parameter(s) Key: SPARK-17241 URL: https://issues.apache.org/jira/browse/SPARK-17241 Project: Spark Issue Type: Improvement Reporter: Junyang Qian Spark has a configurable L2 regularization parameter for linear regression and an additional elastic-net parameter for generalized linear models. It is very important to have these in SparkR so that users can run ridge regression and elastic-net. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
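For context, a sketch of the Scala-side knob this would surface (`training` is an assumed DataFrame with "features" and "label" columns):
{code:scala}
import org.apache.spark.ml.regression.GeneralizedLinearRegression

// A Gaussian family with identity link plus an L2 penalty via regParam
// gives ridge-style regularization; this is the parameter SparkR's
// spark.glm would need to expose.
val glr = new GeneralizedLinearRegression()
  .setFamily("gaussian")
  .setLink("identity")
  .setRegParam(0.3)

val model = glr.fit(training)
{code}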
[jira] [Commented] (SPARK-16508) Fix documentation warnings found by R CMD check
[ https://issues.apache.org/jira/browse/SPARK-16508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15411132#comment-15411132 ] Junyang Qian commented on SPARK-16508: -- Sounds good. I'll be working on the undocumented/duplicated argument warnings. > Fix documentation warnings found by R CMD check > --- > > Key: SPARK-16508 > URL: https://issues.apache.org/jira/browse/SPARK-16508 > Project: Spark > Issue Type: Sub-task > Components: SparkR >Reporter: Shivaram Venkataraman > > A full list of warnings after the fixes in SPARK-16507 is at > https://gist.github.com/shivaram/62866c4ca59c5d34b8963939cf04b5eb -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-16508) Fix documentation warnings found by R CMD check
[ https://issues.apache.org/jira/browse/SPARK-16508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410296#comment-15410296 ] Junyang Qian commented on SPARK-16508: -- It seems that there are still some warnings in my local check, e.g. the undocumented arguments "row.names" and "optional" in as.data.frame. I was wondering whether I missed something or whether we should deal with those. > Fix documentation warnings found by R CMD check > --- > > Key: SPARK-16508 > URL: https://issues.apache.org/jira/browse/SPARK-16508 > Project: Spark > Issue Type: Sub-task > Components: SparkR >Reporter: Shivaram Venkataraman > > A full list of warnings after the fixes in SPARK-16507 is at > https://gist.github.com/shivaram/62866c4ca59c5d34b8963939cf04b5eb -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-16727) SparkR unit test fails - incorrect expected output
Junyang Qian created SPARK-16727: Summary: SparkR unit test fails - incorrect expected output Key: SPARK-16727 URL: https://issues.apache.org/jira/browse/SPARK-16727 Project: Spark Issue Type: Bug Reporter: Junyang Qian https://github.com/apache/spark/blob/master/R/pkg/inst/tests/testthat/test_sparkSQL.R#L1827 When I run spark/R/run-tests.sh, the tests fail with the following message: 1. Failure (at test_sparkSQL.R#1827): describe() and summarize() on a DataFrame collect(stats)[4, "name"] not equal to "Andy" target is NULL, current is character 2. Failure (at test_sparkSQL.R#1831): describe() and summarize() on a DataFrame collect(stats2)[4, "name"] not equal to "Andy" target is NULL, current is character Error: Test failures Execution halted -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-16579) Add a spark install function
[ https://issues.apache.org/jira/browse/SPARK-16579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384628#comment-15384628 ] Junyang Qian commented on SPARK-16579: -- If we find Spark home and the JARs missing, do we want to still install to a cache dir and then redirect Spark home to that dir? > Add a spark install function > > > Key: SPARK-16579 > URL: https://issues.apache.org/jira/browse/SPARK-16579 > Project: Spark > Issue Type: Sub-task > Components: SparkR >Reporter: Shivaram Venkataraman >Assignee: Junyang Qian > > As described in the design doc we need to introduce a function to install > Spark in case the user directly downloads SparkR from CRAN. > To do that we can introduce a install_spark function that takes in the > following arguments > {code} > hadoop_version > url_to_use # defaults to apache > local_dir # defaults to a cache dir > {code} > Further more I think we can automatically run this from sparkR.init if we > find Spark home and the JARs missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org