[jira] [Updated] (SPARK-39526) Remove no-null conditional statements in SparkSubmitCommandBuilder#isThriftServer

2022-06-19 Thread qian (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qian updated SPARK-39526:
-
Description: 
{code:java}
private boolean isThriftServer(String mainClass) {
  return (mainClass != null &&
    mainClass.equals("org.apache.spark.sql.hive.thriftserver.HiveThriftServer2"));
} {code}
There is no scenario in which *mainClass* is null, because the callers already have defensive code:
{code:java}
if (isExample && !isSpecialCommand) {
  checkArgument(mainClass != null, "Missing example class name.");
}

if (mainClass != null) {
  args.add(parser.CLASS);
  args.add(mainClass);
} {code}
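Given those guards, a minimal sketch of the simplified method (assuming all callers keep the defensive checks above):
{code:java}
// A sketch of the proposed simplification: the null check is dropped because
// the callers shown above already guarantee a non-null mainClass.
private boolean isThriftServer(String mainClass) {
  return mainClass.equals(
    "org.apache.spark.sql.hive.thriftserver.HiveThriftServer2");
} {code}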

  was:
{code:java}
private boolean isThriftServer(String mainClass) {
  return (mainClass != null &&
    mainClass.equals("org.apache.spark.sql.hive.thriftserver.HiveThriftServer2"));
} {code}
There is no scenario in which *mainClass* is null, because the callers already have defensive code:
{code:java}
if (isExample && !isSpecialCommand) {
  checkArgument(mainClass != null, "Missing example class name.");
}

if (mainClass != null) {
  args.add(parser.CLASS);
  args.add(mainClass);
} {code}


> Remove no-null conditional statements in 
> SparkSubmitCommandBuilder#isThriftServer
> -
>
> Key: SPARK-39526
> URL: https://issues.apache.org/jira/browse/SPARK-39526
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.3.0
>Reporter: qian
>Priority: Minor
>
> {code:java}
> private boolean isThriftServer(String mainClass) {
>   return (mainClass != null &&
>     mainClass.equals("org.apache.spark.sql.hive.thriftserver.HiveThriftServer2"));
> } {code}
> There is no scenario in which *mainClass* is null, because the callers already have defensive code.
> {code:java}
> if (isExample && !isSpecialCommand) {
>   checkArgument(mainClass != null, "Missing example class name.");
> }
> if (mainClass != null) {
>   args.add(parser.CLASS);
>   args.add(mainClass);
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39526) Remove no-null conditional statements in SparkSubmitCommandBuilder#isThriftServer

2022-06-19 Thread qian (Jira)
qian created SPARK-39526:


 Summary: Remove no-null conditional statements in 
SparkSubmitCommandBuilder#isThriftServer
 Key: SPARK-39526
 URL: https://issues.apache.org/jira/browse/SPARK-39526
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.3.0
Reporter: qian


{code:java}
private boolean isThriftServer(String mainClass) {
  return (mainClass != null &&
    mainClass.equals("org.apache.spark.sql.hive.thriftserver.HiveThriftServer2"));
} {code}
There is no scenario in which *mainClass* is null, because the callers already have defensive code:
{code:java}
if (isExample && !isSpecialCommand) {
  checkArgument(mainClass != null, "Missing example class name.");
}

if (mainClass != null) {
  args.add(parser.CLASS);
  args.add(mainClass);
} {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-39504) No-null object invoke equal to avoid NPE

2022-06-17 Thread qian (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qian updated SPARK-39504:
-
Component/s: SQL

> No-null object invoke equal to avoid NPE
> 
>
> Key: SPARK-39504
> URL: https://issues.apache.org/jira/browse/SPARK-39504
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, SQL
>Affects Versions: 3.2.1
>Reporter: qian
>Priority: Major
>
> Since a {{NullPointerException}} can be thrown while calling the 
> _equals_ method of {{{}Object{}}}, _equals_ should be invoked on a constant 
> or an object that is definitely not {_}null{_}.
> {quote}Positive example: {{"test".equals(object);}}
> Counter example: {{object.equals("test");}}
> {quote}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39504) No-null object invoke equal to avoid NPE

2022-06-17 Thread qian (Jira)
qian created SPARK-39504:


 Summary: No-null object invoke equal to avoid NPE
 Key: SPARK-39504
 URL: https://issues.apache.org/jira/browse/SPARK-39504
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.2.1
Reporter: qian


Since a {{NullPointerException}} can be thrown while calling the 
_equals_ method of {{{}Object{}}}, _equals_ should be invoked on a constant or 
an object that is definitely not {_}null{_}.
{quote}Positive example: {{"test".equals(object);}}
Counter example: {{object.equals("test");}}
{quote}
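A minimal, self-contained sketch of the difference (the class and variable names are illustrative):
{code:java}
public class EqualsNpeDemo {
  public static void main(String[] args) {
    String object = null;
    // Constant on the left: returns false safely even though object is null.
    System.out.println("test".equals(object)); // prints: false
    // Possibly-null object on the left: throws NullPointerException.
    System.out.println(object.equals("test"));
  }
} {code}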



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-39428) use code block for `Coalesce Hints for SQL Queries`

2022-06-09 Thread qian (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qian updated SPARK-39428:
-
Attachment: Coalesce.png

> use code block for `Coalesce Hints for SQL Queries`
> ---
>
> Key: SPARK-39428
> URL: https://issues.apache.org/jira/browse/SPARK-39428
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 3.0.0, 3.0.1, 3.0.2, 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.2.1
>Reporter: qian
>Priority: Minor
> Attachments: Coalesce.png
>
>
> Use a code block for `Coalesce Hints for SQL Queries`; it is currently a plain block.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39428) use code block for `Coalesce Hints for SQL Queries`

2022-06-09 Thread qian (Jira)
qian created SPARK-39428:


 Summary: use code block for `Coalesce Hints for SQL Queries`
 Key: SPARK-39428
 URL: https://issues.apache.org/jira/browse/SPARK-39428
 Project: Spark
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 3.2.1, 3.2.0, 3.1.2, 3.1.1, 3.1.0, 3.0.2, 3.0.1, 3.0.0
Reporter: qian


Use a code block for `Coalesce Hints for SQL Queries`; it is currently a plain block.

!image-2022-06-09-16-33-00-359.png!



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-39428) use code block for `Coalesce Hints for SQL Queries`

2022-06-09 Thread qian (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qian updated SPARK-39428:
-
Description: Use a code block for `Coalesce Hints for SQL Queries`; it is 
currently a plain block.  (was: Use a code block for `Coalesce Hints for SQL 
Queries`; it is currently a plain block.

!image-2022-06-09-16-33-00-359.png!)

> use code block for `Coalesce Hints for SQL Queries`
> ---
>
> Key: SPARK-39428
> URL: https://issues.apache.org/jira/browse/SPARK-39428
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 3.0.0, 3.0.1, 3.0.2, 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.2.1
>Reporter: qian
>Priority: Minor
>
> Use a code block for `Coalesce Hints for SQL Queries`; it is currently a plain block.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39390) Hide and optimize `viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` from INFO log

2022-06-06 Thread qian (Jira)
qian created SPARK-39390:


 Summary: Hide and optimize 
`viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` from INFO log
 Key: SPARK-39390
 URL: https://issues.apache.org/jira/browse/SPARK-39390
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.3.0
Reporter: qian


This issue aims to hide and optimize 
`viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` from the INFO log.
{code:java}
2022-06-02 22:02:48.328 - stderr> 22/06/03 05:02:48 INFO SecurityManager: 
SecurityManager: authentication disabled; ui acls disabled; users  with view 
permissions: Set(root); groups with view permissions: Set(); users  with modify 
permissions: Set(root); groups with modify permissions: Set(){code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39289) Replace map.getOrElse(false/true) with exists/forall

2022-05-25 Thread qian (Jira)
qian created SPARK-39289:


 Summary: Replace map.getOrElse(false/true) with exists/forall
 Key: SPARK-39289
 URL: https://issues.apache.org/jira/browse/SPARK-39289
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core, SQL, Structured Streaming
Affects Versions: 3.3.0
Reporter: qian


Replace _map(_.toBoolean).getOrElse(false)_ with _exists(_.toBoolean)_

Replace _map(_.toBoolean).getOrElse(true)_ with _forall(_.toBoolean)_



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39196) Replace getOrElse(null) with orNull

2022-05-16 Thread qian (Jira)
qian created SPARK-39196:


 Summary: Replace getOrElse(null) with orNull
 Key: SPARK-39196
 URL: https://issues.apache.org/jira/browse/SPARK-39196
 Project: Spark
  Issue Type: Improvement
  Components: Kubernetes, Spark Core
Affects Versions: 3.3.0
Reporter: qian


Code simplification: replace _getOrElse(null)_ with _orNull_.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-39111) Mark overridden methods with `@Override` annotation

2022-05-06 Thread qian (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17533176#comment-17533176
 ] 

qian commented on SPARK-39111:
--

[~bjornjorgensen] I use the Alibaba Java Coding Guidelines. Please refer to 
https://plugins.jetbrains.com/plugin/10046-alibaba-java-coding-guidelines

> Mark overridden methods with `@Override` annotation
> --
>
> Key: SPARK-39111
> URL: https://issues.apache.org/jira/browse/SPARK-39111
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: qian
>Assignee: qian
>Priority: Minor
> Fix For: 3.4.0
>
> Attachments: override.png
>
>
> An overridden method from an interface or abstract class must be marked with 
> the {{@Override}} annotation. To accurately determine whether the override is 
> correct, the {{@Override}} annotation is necessary. Moreover, once the method 
> signature in the abstract class is changed, the implementation class will 
> report a compile-time error immediately.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-39111) Mark overridden methods with `@Override` annotation

2022-05-06 Thread qian (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qian updated SPARK-39111:
-
Summary: Mark overridden methods with `@Override` annotation  (was: Add 
override annotation for handleExtraArgs#AbstractLauncher)

> Mark overridden methods with `@Override` annotation
> --
>
> Key: SPARK-39111
> URL: https://issues.apache.org/jira/browse/SPARK-39111
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.3.0
>Reporter: qian
>Priority: Minor
>
> An overridden method from an interface or abstract class must be marked with 
> the {{@Override}} annotation. To accurately determine whether the override is 
> correct, the {{@Override}} annotation is necessary. Moreover, once the method 
> signature in the abstract class is changed, the implementation class will 
> report a compile-time error immediately.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39111) Add override annotation for handleExtraArgs#AbstractLauncher

2022-05-05 Thread qian (Jira)
qian created SPARK-39111:


 Summary: Add override annotation for 
handleExtraArgs#AbstractLauncher
 Key: SPARK-39111
 URL: https://issues.apache.org/jira/browse/SPARK-39111
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 2.3.0
Reporter: qian


An overridden method from an interface or abstract class must be marked with 
the {{@Override}} annotation. To accurately determine whether the override is 
correct, the {{@Override}} annotation is necessary. Moreover, once the method 
signature in the abstract class is changed, the implementation class will 
report a compile-time error immediately.
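A minimal illustrative sketch (the class names are made up for the example) of why this matters: if the abstract signature changes, the annotated subclass method stops compiling instead of silently becoming a new, unrelated method:
{code:java}
import java.util.List;

abstract class BaseLauncher {
  abstract void handleExtraArgs(List<String> extra);
}

class MyLauncher extends BaseLauncher {
  @Override // compile-time error here if BaseLauncher changes the signature
  void handleExtraArgs(List<String> extra) {
    // handle launcher-specific arguments
  }
} {code}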



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-39006) Show a directional error message for PVC Dynamic Allocation Failure

2022-04-27 Thread qian (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qian updated SPARK-39006:
-
Description: 
When a Spark application requires multiple executors and the PVC claimName is 
not set with onDemand or SPARK_EXECUTOR_ID, executor pod creation fails 
repeatedly, because the PVC has already been created by the first executor pod. 
A configuration sketch and the resulting exception follow.
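For illustration only, a hedged sketch of the kind of setting this refers to 
(the volume name {{data}} is hypothetical; per the description, {{OnDemand}} or 
a claim name containing {{SPARK_EXECUTOR_ID}} gives each executor its own 
claim):
{noformat}
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName=OnDemand
--conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName=pvc-SPARK_EXECUTOR_ID
{noformat}
Without such a setting, every executor requests the same claim name and the 
following exception is thrown: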
{noformat}
22/04/22 08:55:47 WARN ExecutorPodsSnapshotsStoreImpl: Exception when notifying 
snapshot subscriber.
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST 
at: 
https://kubernetes.default.svc/api/v1/namespaces/default/persistentvolumeclaims.
 Message: persistentvolumeclaims "test-1" already exists. Received status: 
Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, 
kind=persistentvolumeclaims, name=test-1, retryAfterSeconds=null, uid=null, 
additionalProperties={}), kind=Status, message=persistentvolumeclaims "test-1" 
already exists, metadata=ListMeta(_continue=null, remainingItemCount=null, 
resourceVersion=null, selfLink=null, additionalProperties={}), 
reason=AlreadyExists, status=Failure, additionalProperties={}).
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:697)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:676)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:629)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:566)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:527)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:315)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:651)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:91)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:61)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$3(ExecutorPodsAllocator.scala:415)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at scala.collection.immutable.List.foreach(List.scala:431) 
~[scala-library-2.12.15.jar:?]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$1(ExecutorPodsAllocator.scala:408)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) 
~[scala-library-2.12.15.jar:?]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.requestNewExecutors(ExecutorPodsAllocator.scala:385)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$35(ExecutorPodsAllocator.scala:349)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$35$adapted(ExecutorPodsAllocator.scala:342)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at 
scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) 
~[scala-library-2.12.15.jar:?]
        at 
scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) 
~[scala-library-2.12.15.jar:?]
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) 
~[scala-library-2.12.15.jar:?]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.onNewSnapshots(ExecutorPodsAllocator.scala:342)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3(ExecutorPodsAllocator.scala:120)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3$adapted(ExecutorPodsAllocator.scala:120)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber.org$apache$spark$scheduler$cluster$k8s$ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber$$processSnapshotsInternal(ExecutorPodsSnapshotsStoreImpl.scala:138)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at 

[jira] [Updated] (SPARK-39006) Show a directional error message for PVC Dynamic Allocation Failure

2022-04-27 Thread qian (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-39006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qian updated SPARK-39006:
-
Summary: Show a directional error message for PVC Dynamic Allocation 
Failure  (was: Check PVC claimName must be OnDemand when multiple executor 
required)

> Show a directional error message for PVC Dynamic Allocation Failure
> ---
>
> Key: SPARK-39006
> URL: https://issues.apache.org/jira/browse/SPARK-39006
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 3.1.0
>Reporter: qian
>Priority: Major
>
> When a Spark application requires multiple executors and the PVC claimName is 
> not set to onDemand, executor pod creation fails repeatedly, because the PVC 
> has already been created by the first executor pod.
> {noformat}
> 22/04/22 08:55:47 WARN ExecutorPodsSnapshotsStoreImpl: Exception when 
> notifying snapshot subscriber.
> io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: 
> POST at: 
> https://kubernetes.default.svc/api/v1/namespaces/default/persistentvolumeclaims.
>  Message: persistentvolumeclaims "test-1" already exists. Received status: 
> Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, 
> kind=persistentvolumeclaims, name=test-1, retryAfterSeconds=null, uid=null, 
> additionalProperties={}), kind=Status, message=persistentvolumeclaims 
> "test-1" already exists, metadata=ListMeta(_continue=null, 
> remainingItemCount=null, resourceVersion=null, selfLink=null, 
> additionalProperties={}), reason=AlreadyExists, status=Failure, 
> additionalProperties={}).
>         at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:697)
>  ~[kubernetes-client-5.10.1.jar:?]
>         at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:676)
>  ~[kubernetes-client-5.10.1.jar:?]
>         at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:629)
>  ~[kubernetes-client-5.10.1.jar:?]
>         at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:566)
>  ~[kubernetes-client-5.10.1.jar:?]
>         at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:527)
>  ~[kubernetes-client-5.10.1.jar:?]
>         at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:315)
>  ~[kubernetes-client-5.10.1.jar:?]
>         at 
> io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:651)
>  ~[kubernetes-client-5.10.1.jar:?]
>         at 
> io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:91)
>  ~[kubernetes-client-5.10.1.jar:?]
>         at 
> io.fabric8.kubernetes.client.dsl.base.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:61)
>  ~[kubernetes-client-5.10.1.jar:?]
>         at 
> org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$3(ExecutorPodsAllocator.scala:415)
>  ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
>         at scala.collection.immutable.List.foreach(List.scala:431) 
> ~[scala-library-2.12.15.jar:?]
>         at 
> org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$1(ExecutorPodsAllocator.scala:408)
>  ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
>         at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) 
> ~[scala-library-2.12.15.jar:?]
>         at 
> org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.requestNewExecutors(ExecutorPodsAllocator.scala:385)
>  ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
>         at 
> org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$35(ExecutorPodsAllocator.scala:349)
>  ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
>         at 
> org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$35$adapted(ExecutorPodsAllocator.scala:342)
>  ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
>         at 
> scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) 
> ~[scala-library-2.12.15.jar:?]
>         at 
> scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) 
> ~[scala-library-2.12.15.jar:?]
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) 
> ~[scala-library-2.12.15.jar:?]
>         at 
> org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.onNewSnapshots(ExecutorPodsAllocator.scala:342)
>  ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
>         at 
> 

[jira] [Created] (SPARK-39026) Add k8s tolerations support for apache spark

2022-04-26 Thread qian (Jira)
qian created SPARK-39026:


 Summary: Add k8s tolerations support for apache spark
 Key: SPARK-39026
 URL: https://issues.apache.org/jira/browse/SPARK-39026
 Project: Spark
  Issue Type: Sub-task
  Components: Kubernetes
Affects Versions: 3.2.1
Reporter: qian
 Fix For: 3.4.0


[https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/]

As the document shows, this issue aims to support tolerations for Apache Spark.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-39025) Add k8s Scheduling, Preemption and Eviction feature to apache spark

2022-04-26 Thread qian (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528081#comment-17528081
 ] 

qian commented on SPARK-39025:
--

I am working on this. Please assign to me [~dongjoon]  [~hyukjin.kwon] 

> Add k8s Scheduling, Preemption and Eviction feature to apache spark
> ---
>
> Key: SPARK-39025
> URL: https://issues.apache.org/jira/browse/SPARK-39025
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 3.2.1
>Reporter: qian
>Priority: Major
> Fix For: 3.4.0
>
>
> As [https://kubernetes.io/docs/concepts/scheduling-eviction/] shows, Apache 
> Spark lacks support for k8s scheduling, preemption, and eviction.
> This issue aims to support toleration/priorityClass/runtimeClass, etc. For 
> more information, please refer to:
> [https://kubernetes.io/docs/concepts/scheduling-eviction/] 
> [https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/]
> [https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/]
> [https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/]
> [https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/]
> [https://kubernetes.io/docs/concepts/scheduling-eviction/node-pressure-eviction/]
> [https://kubernetes.io/docs/concepts/scheduling-eviction/api-eviction/]
> [https://kubernetes.io/docs/concepts/scheduling-eviction/resource-bin-packing/]
> [https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/]
> https://kubernetes.io/docs/concepts/scheduling-eviction/scheduler-perf-tuning/



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39025) Add k8s Scheduling, Preemption and Eviction feature to apache spark

2022-04-26 Thread qian (Jira)
qian created SPARK-39025:


 Summary: Add k8s Scheduling, Preemption and Eviction feature to 
apache spark
 Key: SPARK-39025
 URL: https://issues.apache.org/jira/browse/SPARK-39025
 Project: Spark
  Issue Type: Improvement
  Components: Kubernetes
Affects Versions: 3.2.1
Reporter: qian
 Fix For: 3.4.0


As [https://kubernetes.io/docs/concepts/scheduling-eviction/] shows, Apache 
Spark lacks support for k8s scheduling, preemption, and eviction.

This issue aims to support toleration/priorityClass/runtimeClass, etc. For more 
information, please refer to:

[https://kubernetes.io/docs/concepts/scheduling-eviction/] 

[https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/]

[https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/]

[https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/]

[https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/]

[https://kubernetes.io/docs/concepts/scheduling-eviction/node-pressure-eviction/]

[https://kubernetes.io/docs/concepts/scheduling-eviction/api-eviction/]

[https://kubernetes.io/docs/concepts/scheduling-eviction/resource-bin-packing/]

[https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/]

https://kubernetes.io/docs/concepts/scheduling-eviction/scheduler-perf-tuning/



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-39006) Check PVC claimName must be OnDemand when multiple executor required

2022-04-24 Thread qian (Jira)
qian created SPARK-39006:


 Summary: Check PVC claimName must be OnDemand when multiple 
executor required
 Key: SPARK-39006
 URL: https://issues.apache.org/jira/browse/SPARK-39006
 Project: Spark
  Issue Type: Improvement
  Components: Kubernetes
Affects Versions: 3.1.0
Reporter: qian
 Fix For: 3.4.0


When a Spark application requires multiple executors and the PVC claimName is 
not set to onDemand, executor pod creation fails repeatedly, because the PVC 
has already been created by the first executor pod.
{noformat}
22/04/22 08:55:47 WARN ExecutorPodsSnapshotsStoreImpl: Exception when notifying 
snapshot subscriber.
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST 
at: 
https://kubernetes.default.svc/api/v1/namespaces/default/persistentvolumeclaims.
 Message: persistentvolumeclaims "test-1" already exists. Received status: 
Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, 
kind=persistentvolumeclaims, name=test-1, retryAfterSeconds=null, uid=null, 
additionalProperties={}), kind=Status, message=persistentvolumeclaims "test-1" 
already exists, metadata=ListMeta(_continue=null, remainingItemCount=null, 
resourceVersion=null, selfLink=null, additionalProperties={}), 
reason=AlreadyExists, status=Failure, additionalProperties={}).
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:697)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:676)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:629)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:566)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:527)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:315)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:651)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:91)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
io.fabric8.kubernetes.client.dsl.base.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:61)
 ~[kubernetes-client-5.10.1.jar:?]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$3(ExecutorPodsAllocator.scala:415)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at scala.collection.immutable.List.foreach(List.scala:431) 
~[scala-library-2.12.15.jar:?]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$1(ExecutorPodsAllocator.scala:408)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158) 
~[scala-library-2.12.15.jar:?]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.requestNewExecutors(ExecutorPodsAllocator.scala:385)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$35(ExecutorPodsAllocator.scala:349)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$35$adapted(ExecutorPodsAllocator.scala:342)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at 
scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) 
~[scala-library-2.12.15.jar:?]
        at 
scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55) 
~[scala-library-2.12.15.jar:?]
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49) 
~[scala-library-2.12.15.jar:?]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.onNewSnapshots(ExecutorPodsAllocator.scala:342)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3(ExecutorPodsAllocator.scala:120)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at 
org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3$adapted(ExecutorPodsAllocator.scala:120)
 ~[spark-kubernetes_2.12-3.3.0-SNAPSHOT.jar:3.3.0-SNAPSHOT]
        at 

[jira] [Created] (SPARK-38968) remove hadoopConf from KerberosConfDriverFeatureStep

2022-04-20 Thread qian (Jira)
qian created SPARK-38968:


 Summary: remove hadoopConf from KerberosConfDriverFeatureStep
 Key: SPARK-38968
 URL: https://issues.apache.org/jira/browse/SPARK-38968
 Project: Spark
  Issue Type: Improvement
  Components: Kubernetes
Affects Versions: 3.2.1
Reporter: qian
 Fix For: 3.4.0


Remove the unused hadoopConf from KerberosConfDriverFeatureStep.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-38945) simplify KEYTAB and PRINCIPAL in KerberosConfDriverFeatureStep

2022-04-19 Thread qian (Jira)
qian created SPARK-38945:


 Summary: simplify KEYTAB and PRINCIPAL in 
KerberosConfDriverFeatureStep
 Key: SPARK-38945
 URL: https://issues.apache.org/jira/browse/SPARK-38945
 Project: Spark
  Issue Type: Improvement
  Components: Kubernetes
Affects Versions: 3.2.1
Reporter: qian
 Fix For: 3.4.0


Simplify KEYTAB and PRINCIPAL in KerberosConfDriverFeatureStep, because they 
are already imported.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-38925) update guava to 30.1.1-jre

2022-04-16 Thread qian (Jira)
qian created SPARK-38925:


 Summary: update guava to 30.1.1-jre
 Key: SPARK-38925
 URL: https://issues.apache.org/jira/browse/SPARK-38925
 Project: Spark
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.2.1, 3.2.0, 3.1.2, 3.1.1, 3.1.0, 3.0.2, 3.0.1, 3.0.0
Reporter: qian
 Fix For: 3.4.0


Update guava to 30.1.1-jre

guava 14.0.1 has known security risks:
 * [CVE-2020-8908|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-8908]
 * [CVE-2018-10237|https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-10237]

 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-38770) Simplify steps to rewrite primary resource in k8s spark application

2022-04-01 Thread qian (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qian updated SPARK-38770:
-
Fix Version/s: 3.4.0
   (was: 3.3.0)

> Simplify steps to rewrite primary resource in k8s spark application
> --
>
> Key: SPARK-38770
> URL: https://issues.apache.org/jira/browse/SPARK-38770
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.2.1
>Reporter: qian
>Priority: Major
> Fix For: 3.4.0
>
>
> Rewriting the primary resource uses the renameMainAppResource method twice, 
> and the second usage has no effect. So, simplify the steps to rewrite the 
> primary resource in a k8s Spark application.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-38770) Simplify steps to rewrite primary resource in k8s spark application

2022-04-01 Thread qian (Jira)
qian created SPARK-38770:


 Summary: Simplify steps to rewrite primary resource in k8s spark 
application
 Key: SPARK-38770
 URL: https://issues.apache.org/jira/browse/SPARK-38770
 Project: Spark
  Issue Type: Improvement
  Components: Kubernetes
Affects Versions: 3.2.1, 3.2.0, 3.1.2, 3.1.1, 3.1.0
Reporter: qian
 Fix For: 3.3.0


Rewriting the primary resource uses the renameMainAppResource method twice, and 
the second usage has no effect. So, simplify the steps to rewrite the primary 
resource in a k8s Spark application.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2

2022-03-29 Thread qian (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17514390#comment-17514390
 ] 

qian commented on SPARK-38652:
--

[~dongjoon] Hi. DepsTestsSuite has tests as follows:
 * Launcher client dependencies
 * SPARK-33615: Launcher client archives
 * SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
 * ...

The spark-submit command is used by these tests, so I think that is why 
DepsTestsSuite blocks.

Could you please check whether these tests run? Maybe the `-Dtest.exclude.tags` 
option doesn't need the `minikube` value.

> K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
> --
>
> Key: SPARK-38652
> URL: https://issues.apache.org/jira/browse/SPARK-38652
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes, Tests
>Affects Versions: 3.3.0
>Reporter: qian
>Priority: Major
>
> DepsTestsSuite in the k8s IT tests is blocked with a PathIOException in 
> hadoop-aws-3.3.2. The exception message is as follows:
> {code:java}
> Exception in thread "main" org.apache.spark.SparkException: Uploading file 
> /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
>  failed...
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332)
> 
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277)
> 
> at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) 
>
> at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)   
>  
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)  
>   
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
> at scala.collection.TraversableLike.map(TraversableLike.scala:286)
> at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
> at scala.collection.AbstractTraversable.map(Traversable.scala:108)
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275)
> 
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187)
>
> at scala.collection.immutable.List.foreach(List.scala:431)
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86)
> at 
> scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
> 
> at 
> scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)   
>  
> at scala.collection.immutable.List.foldLeft(List.scala:91)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84)
> 
> at 
> org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242)
> at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214)
> 
> at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
> 
> at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) 
>
> at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)  
>   
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: 
> org.apache.spark.SparkException: Error uploading file 
> spark-examples_2.12-3.4.0-SNAPSHOT.jar
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355)
> 
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328)
> 
> ... 30 more
> Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path 
> for 
> 

[jira] [Commented] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2

2022-03-27 Thread qian (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17512846#comment-17512846
 ] 

qian commented on SPARK-38652:
--

[~ste...@apache.org] Hi, I ran the same suite against a real AWS S3 endpoint 
and got the same exception. I think we can rule out the minio deployment as the 
cause.
{noformat}
$ bin/spark-submit --deploy-mode cluster --class 
org.apache.spark.examples.SparkRemoteFileTest --master 
k8s://https://192.168.64.87:8443/ --conf 
spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem --conf 
spark.testing=false  --conf spark.hadoop.fs.s3a.access.key=XXX --conf 
spark.kubernetes.driver.label.spark-app-locator=a8937b5fdf6a444a806ee1c3ecac37fc
 --conf spark.kubernetes.file.upload.path=s3a://dcoliversun --conf 
spark.authenticate=true --conf spark.executor.instances=1 --conf 
spark.kubernetes.submission.waitAppCompletion=false --conf 
spark.kubernetes.executor.label.spark-app-locator=a8937b5fdf6a444a806ee1c3ecac37fc
 --conf spark.kubernetes.namespace=spark-job --conf 
spark.kubernetes.authenticate.driver.serviceAccountName=spark --conf 
spark.hadoop.fs.s3a.secret.key=XXX --conf 
spark.executor.extraJavaOptions=-Dlog4j2.debug --conf 
spark.hadoop.fs.s3a.endpoint=https://s3.ap-southeast-1.amazonaws.com --conf 
spark.app.name=spark-test-app --conf 
spark.files=/tmp/tmp7013228683780235449.txt --conf spark.ui.enabled=true --conf 
spark.driver.extraJavaOptions=-Dlog4j2.debug --conf 
spark.kubernetes.container.image=registry.cn-hangzhou.aliyuncs.com/smart-spark/spark:test
 --conf spark.executor.cores=1 --conf 
spark.hadoop.fs.s3a.connection.ssl.enabled=false 
/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
 tmp7013228683780235449.txt


22/03/27 15:16:27 WARN NativeCodeLoader: Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable
22/03/27 15:16:28 INFO SparkKubernetesClientFactory: Auto-configuring K8S 
client using current context from users K8S config file
22/03/27 15:16:28 INFO KerberosConfDriverFeatureStep: You have not specified a 
krb5.conf file locally or via a ConfigMap. Make sure that you have the 
krb5.conf locally on the driver image.
22/03/27 15:16:29 INFO KubernetesUtils: sq-isLocalAndResolvable => resource is 
file:/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
22/03/27 15:16:29 INFO KubernetesUtils: sq => uri is 
file:/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar,
 uri scheme is file
22/03/27 15:16:29 INFO KubernetesUtils: sq-uploadAndTransformFileUris, uri is 
file:/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
22/03/27 15:16:29 WARN MetricsConfig: Cannot locate configuration: tried 
hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
22/03/27 15:16:29 INFO MetricsSystemImpl: Scheduled Metric snapshot period at 
10 second(s).
22/03/27 15:16:29 INFO MetricsSystemImpl: s3a-file-system metrics system started
22/03/27 15:16:31 INFO KubernetesUtils: Uploading file: 
/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
 to dest: 
s3a://dcoliversun/spark-upload-eb20e2da-17b6-4dcd-b4f1-8e47bc80c1e9/spark-examples_2.12-3.4.0-SNAPSHOT.jar...
22/03/27 15:16:31 INFO S3AFileSystem: Copying local file from 
/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
 to 
s3a://dcoliversun/spark-upload-eb20e2da-17b6-4dcd-b4f1-8e47bc80c1e9/spark-examples_2.12-3.4.0-SNAPSHOT.jar
22/03/27 15:16:31 INFO CopyFromLocalOperation: Copying local file from 
/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
 to 
s3a://dcoliversun/spark-upload-eb20e2da-17b6-4dcd-b4f1-8e47bc80c1e9/spark-examples_2.12-3.4.0-SNAPSHOT.jar
22/03/27 15:16:31 INFO CopyFromLocalOperation: execute#CopyFromLocalOperation, 
sourceFile is 
/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
22/03/27 15:16:31 INFO CopyFromLocalOperation: 
uploadSourceFromFS#CopyFromLocalOperation, localFile 1: path is 
LocatedFileStatus{path=file:/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar;
 isDirectory=false; length=1567474; replication=1; blocksize=33554432; 
modification_time=1647874074000; access_time=1647874074000; owner=hengzhen.sq; 
group=staff; permission=rw-r--r--; isSymlink=false; hasAcl=false; 
isEncrypted=false; isErasureCoded=false}
22/03/27 15:16:31 INFO CopyFromLocalOperation: 
getFinalPath#CopyFromLocalOperation, src is 
file:/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar,
 source is 
/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
Exception in thread "main" 

[jira] [Commented] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2

2022-03-25 Thread qian (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17512657#comment-17512657
 ] 

qian commented on SPARK-38652:
--

[~ste...@apache.org] No. I can do it, which will help us confirm whether the 
cause of the problem is minio or hadoop-aws-3.3.2. I will share the test result 
here.

> K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
> --
>
> Key: SPARK-38652
> URL: https://issues.apache.org/jira/browse/SPARK-38652
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes, Tests
>Affects Versions: 3.3.0
>Reporter: qian
>Priority: Major
>
> DepsTestsSuite in the k8s IT tests is blocked with a PathIOException in 
> hadoop-aws-3.3.2. The exception message is as follows:
> {code:java}
> Exception in thread "main" org.apache.spark.SparkException: Uploading file 
> /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
>  failed...
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332)
> 
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277)
> 
> at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) 
>
> at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)   
>  
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)  
>   
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
> at scala.collection.TraversableLike.map(TraversableLike.scala:286)
> at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
> at scala.collection.AbstractTraversable.map(Traversable.scala:108)
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275)
> 
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187)
>
> at scala.collection.immutable.List.foreach(List.scala:431)
> at 
> org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86)
> at 
> scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
> 
> at 
> scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)   
>  
> at scala.collection.immutable.List.foldLeft(List.scala:91)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84)
> 
> at 
> org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242)
> at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738)
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242)
> 
> at 
> org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214)
> 
> at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
> 
> at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) 
>
> at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)  
>   
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: 
> org.apache.spark.SparkException: Error uploading file 
> spark-examples_2.12-3.4.0-SNAPSHOT.jar
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355)
> 
> at 
> org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328)
> 
> ... 30 more
> Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path 
> for 
> URI:file:///Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar':
>  Input/output error
> at 
> org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.getFinalPath(CopyFromLocalOperation.java:365)
>  

[jira] [Updated] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2

2022-03-24 Thread qian (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qian updated SPARK-38652:
-
Description: 
DepsTestsSuite in the k8s IT tests is blocked with a PathIOException in 
hadoop-aws-3.3.2. The exception message is as follows:
{code:java}
Exception in thread "main" org.apache.spark.SparkException: Uploading file 
/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
 failed...
at 
org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332)

at 
org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277)

at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)   
 
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) 
   
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)

at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at scala.collection.TraversableLike.map(TraversableLike.scala:286)
at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
at scala.collection.AbstractTraversable.map(Traversable.scala:108)
at 
org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275)

at 
org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187)
   
at scala.collection.immutable.List.foreach(List.scala:431)
at 
org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178)

at 
org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86)
at 
scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)  
  
at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)  
  
at scala.collection.immutable.List.foldLeft(List.scala:91)
at 
org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84)

at 
org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104)

at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248)

at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242)
at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738)
at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242)

at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214)

at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)

at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)   
 
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) 
   
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: 
org.apache.spark.SparkException: Error uploading file 
spark-examples_2.12-3.4.0-SNAPSHOT.jar
at 
org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355)

at 
org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328)

... 30 more
Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path for 
URI:file:///Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar':
 Input/output error
at 
org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.getFinalPath(CopyFromLocalOperation.java:365)

at 
org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.uploadSourceFromFS(CopyFromLocalOperation.java:226)

at 
org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.execute(CopyFromLocalOperation.java:170)

at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$copyFromLocalFile$25(S3AFileSystem.java:3920)

at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:499)
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:444)

at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2337)

at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2356)

at 

[jira] [Updated] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2

2022-03-24 Thread qian (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qian updated SPARK-38652:
-
Description: 
DepsTestsSuite in the k8s IT tests is blocked with a PathIOException in 
hadoop-aws-3.3.2. The exception message is as follows:
{code:java}
Exception in thread "main" org.apache.spark.SparkException: Uploading file 
/Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar
 failed...
at 
org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332)

at 
org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277)

at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)   
 
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62) 
   
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)

at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at scala.collection.TraversableLike.map(TraversableLike.scala:286)
at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
at scala.collection.AbstractTraversable.map(Traversable.scala:108)
at 
org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275)

at 
org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187)
   
at scala.collection.immutable.List.foreach(List.scala:431)
at 
org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178)

at 
org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86)
at 
scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)  
  
at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)  
  
at scala.collection.immutable.List.foldLeft(List.scala:91)
at 
org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84)

at 
org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104)

at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248)

at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242)
at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738)   
 
at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242)

at 
org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214)

at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)

at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)   
 
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046) 
   
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)Caused by: 
org.apache.spark.SparkException: Error uploading file 
spark-examples_2.12-3.4.0-SNAPSHOT.jar
at 
org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355)

at 
org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328)

... 30 more
Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path for 
URI:file:///Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar':
 Input/output errorat 
org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.getFinalPath(CopyFromLocalOperation.java:365)

at 
org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.uploadSourceFromFS(CopyFromLocalOperation.java:226)

at 
org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.execute(CopyFromLocalOperation.java:170)

at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$copyFromLocalFile$25(S3AFileSystem.java:3920)

at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:499)
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:444)

at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2337)

at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2356)

at 

[jira] [Commented] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2

2022-03-24 Thread qian (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-38652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17512163#comment-17512163
 ] 

qian commented on SPARK-38652:
--

I am working on it.

cc [~chaosun]  & [~dongjoon] 

> K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2
> --
>
> Key: SPARK-38652
> URL: https://issues.apache.org/jira/browse/SPARK-38652
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes, Tests
>Affects Versions: 3.3.0
>Reporter: qian
>Priority: Major
>
> DepsTestsSuite in the K8s integration tests is blocked by a PathIOException from hadoop-aws 3.3.2. The exception message is as follows:
> {code:java}
> Exception in thread "main" org.apache.spark.SparkException: Uploading file /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar failed...
> at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332)
> at org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277)
> at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
> at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
> at scala.collection.TraversableLike.map(TraversableLike.scala:286)
> at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
> at scala.collection.AbstractTraversable.map(Traversable.scala:108)
> at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275)
> at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187)
> at scala.collection.immutable.List.foreach(List.scala:431)
> at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178)
> at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86)
> at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
> at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
> at scala.collection.immutable.List.foldLeft(List.scala:91)
> at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84)
> at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104)
> at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248)
> at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242)
> at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738)
> at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242)
> at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214)
> at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
> at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
> at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
> at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
> at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: org.apache.spark.SparkException: Error uploading file spark-examples_2.12-3.4.0-SNAPSHOT.jar
> at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355)
> at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328)
> ... 30 more
> Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path for URI:file:///Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar': Input/output error
> at org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.getFinalPath(CopyFromLocalOperation.java:365)
> at

[jira] [Created] (SPARK-38652) K8S IT Test DepsTestsSuite blocks with PathIOException in hadoop-aws-3.3.2

2022-03-24 Thread qian (Jira)
qian created SPARK-38652:


 Summary: K8S IT Test DepsTestsSuite blocks with PathIOException in 
hadoop-aws-3.3.2
 Key: SPARK-38652
 URL: https://issues.apache.org/jira/browse/SPARK-38652
 Project: Spark
  Issue Type: Bug
  Components: Kubernetes, Tests
Affects Versions: 3.3.0
Reporter: qian


DepsTestsSuite in the K8s integration tests is blocked by a PathIOException from hadoop-aws 3.3.2. The exception message is as follows:
{code:java}
Exception in thread "main" org.apache.spark.SparkException: Uploading file /Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar failed...
at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:332)
at org.apache.spark.deploy.k8s.KubernetesUtils$.$anonfun$uploadAndTransformFileUris$1(KubernetesUtils.scala:277)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at scala.collection.TraversableLike.map(TraversableLike.scala:286)
at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
at scala.collection.AbstractTraversable.map(Traversable.scala:108)
at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadAndTransformFileUris(KubernetesUtils.scala:275)
at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.$anonfun$getAdditionalPodSystemProperties$1(BasicDriverFeatureStep.scala:187)
at scala.collection.immutable.List.foreach(List.scala:431)
at org.apache.spark.deploy.k8s.features.BasicDriverFeatureStep.getAdditionalPodSystemProperties(BasicDriverFeatureStep.scala:178)
at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.$anonfun$buildFromFeatures$5(KubernetesDriverBuilder.scala:86)
at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
at scala.collection.immutable.List.foldLeft(List.scala:91)
at org.apache.spark.deploy.k8s.submit.KubernetesDriverBuilder.buildFromFeatures(KubernetesDriverBuilder.scala:84)
at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:104)
at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5(KubernetesClientApplication.scala:248)
at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$5$adapted(KubernetesClientApplication.scala:242)
at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2738)
at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:242)
at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:214)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Error uploading file spark-examples_2.12-3.4.0-SNAPSHOT.jar
at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileToHadoopCompatibleFS(KubernetesUtils.scala:355)
at org.apache.spark.deploy.k8s.KubernetesUtils$.uploadFileUri(KubernetesUtils.scala:328)
... 30 more
Caused by: org.apache.hadoop.fs.PathIOException: `Cannot get relative path for URI:file:///Users/hengzhen.sq/IdeaProjects/spark/dist/examples/jars/spark-examples_2.12-3.4.0-SNAPSHOT.jar': Input/output error
at org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.getFinalPath(CopyFromLocalOperation.java:365)
at org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.uploadSourceFromFS(CopyFromLocalOperation.java:226)
at org.apache.hadoop.fs.s3a.impl.CopyFromLocalOperation.execute(CopyFromLocalOperation.java:170)
at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$copyFromLocalFile$25(S3AFileSystem.java:3920)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:499)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:444)
at

[jira] [Created] (SPARK-38582) Introduce `buildEnvVarsWithKV` and `buildEnvVarsWithFieldRef` for `KubernetesUtils` to eliminate duplicate code pattern

2022-03-17 Thread qian (Jira)
qian created SPARK-38582:


 Summary: Introduce `buildEnvVarsWithKV` and 
`buildEnvVarsWithFieldRef` for `KubernetesUtils` to eliminate duplicate code 
pattern
 Key: SPARK-38582
 URL: https://issues.apache.org/jira/browse/SPARK-38582
 Project: Spark
  Issue Type: Improvement
  Components: Kubernetes
Affects Versions: 3.2.1
Reporter: qian


There are many duplicated code patterns in the Spark codebase:
{code:java}
new EnvVarBuilder()
  .withName(key)
  .withValue(value)
  .build() {code}
{code:java}
new EnvVarBuilder()
  .withName(name)
  .withValueFrom(new EnvVarSourceBuilder()
    .withNewFieldRef(version, field)
    .build())
  .build()
{code}
 

[The assignment statements for executor env vars | https://github.com/apache/spark/blob/branch-3.3/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala#L123-L185] span 63 lines. We could introduce _buildEnvVarsWithKV_ and _buildEnvVarsWithFieldRef_ functions to simplify the above code patterns, as sketched below.
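A minimal sketch of what the two helpers might look like (fabric8 model classes; the exact signatures here are assumptions, not the committed API):
{code:scala}
import io.fabric8.kubernetes.api.model.{EnvVar, EnvVarBuilder, EnvVarSourceBuilder}

// Assumed helper: build EnvVars from plain key/value pairs.
def buildEnvVarsWithKV(env: Seq[(String, String)]): Seq[EnvVar] =
  env.map { case (key, value) =>
    new EnvVarBuilder()
      .withName(key)
      .withValue(value)
      .build()
  }

// Assumed helper: build EnvVars whose values come from pod field references.
def buildEnvVarsWithFieldRef(env: Seq[(String, String, String)]): Seq[EnvVar] =
  env.map { case (name, apiVersion, fieldPath) =>
    new EnvVarBuilder()
      .withName(name)
      .withValueFrom(new EnvVarSourceBuilder()
        .withNewFieldRef(apiVersion, fieldPath)
        .build())
      .build()
  }
{code}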



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-38546) replace deprecated ChiSqSelector with UnivariateFeatureSelector

2022-03-14 Thread qian (Jira)
qian created SPARK-38546:


 Summary: replace deprecated ChiSqSelector with 
UnivariateFeatureSelector
 Key: SPARK-38546
 URL: https://issues.apache.org/jira/browse/SPARK-38546
 Project: Spark
  Issue Type: Improvement
  Components: Examples
Affects Versions: 3.2.1, 3.2.0, 3.1.2
Reporter: qian


UnivariateFeatureSelector was added and ChiSqSelector was marked as deprecated in SPARK-34080, so we should replace the deprecated ChiSqSelector with UnivariateFeatureSelector in the examples.
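A rough sketch of the replacement (the DataFrame, column names, and selector settings below are illustrative assumptions, not taken from the existing examples):
{code:scala}
import org.apache.spark.ml.feature.UnivariateFeatureSelector
import org.apache.spark.sql.DataFrame

// Chi-squared selection maps to categorical features with a categorical
// label in the unified UnivariateFeatureSelector API.
def selectTopFeatures(df: DataFrame): DataFrame = {
  val selector = new UnivariateFeatureSelector()
    .setFeatureType("categorical")
    .setLabelType("categorical")
    .setSelectionMode("numTopFeatures")
    .setSelectionThreshold(50)          // keep the 50 best features
    .setFeaturesCol("features")
    .setLabelCol("label")
    .setOutputCol("selectedFeatures")
  selector.fit(df).transform(df)
}
{code}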



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-38507) DataFrame withColumn method not adding or replacing columns when alias is used

2022-03-11 Thread qian (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-38507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504971#comment-17504971
 ] 

qian commented on SPARK-38507:
--

[~amavrommatis] 

The *select()* method treats an input argument like _xx.xx_ as {_}table.column{_}, which is by design. I don't think this is actually a bug. If you still disagree, you could raise the case on the Spark user mailing list. :)
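For reference, a column whose name literally contains a dot can still be addressed by quoting it with backticks. A minimal sketch (assuming the Scala API and the *df* from the report):
{code:scala}
import org.apache.spark.sql.functions.lit

// Backticks make the analyzer treat "df.field2" as a single literal column
// name instead of resolving it as table.column.
df.withColumn("df.field2", lit(0)).select("`df.field2`").show(2)
// Both rows now show the new column's value, 0.
{code}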

> DataFrame withColumn method not adding or replacing columns when alias is used
> --
>
> Key: SPARK-38507
> URL: https://issues.apache.org/jira/browse/SPARK-38507
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.1.2
>Reporter: Alexandros Mavrommatis
>Priority: Major
>  Labels: SQL, catalyst
>
> I have an input DataFrame *df* created as follows:
> {code:java}
> import spark.implicits._
> val df = List((5, 10), (6, 20)).toDF("field1", "field2").alias("df") {code}
> When I execute either this command:
> {code:java}
> df.select("df.field2").show(2) {code}
> or that one:
> {code:java}
> df.withColumn("df.field2", lit(0)).select("df.field2").show(2) {code}
> I get the same result:
> {code:java}
> +--+
> |field2|
> +--+
> |    10|
> |    20|
> +--+ {code}
> Additionally, when I execute the following command:
> {code:java}
> df.withColumn("df.field3", lit(0)).select("df.field3").show(2){code}
> I get this exception:
> {code:java}
> org.apache.spark.sql.AnalysisException: cannot resolve '`df.field3`' given input columns: [df.field3, df.field1, df.field2];
> 'Project ['df.field3]
> +- Project [field1#7, field2#8, 0 AS df.field3#31]
>    +- SubqueryAlias df
>       +- Project [_1#2 AS field1#7, _2#3 AS field2#8]
>          +- LocalRelation [_1#2, _2#3]
> at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:155)
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:152)
> at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$2(TreeNode.scala:342)
> at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74)
> at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:342)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsUp$1(QueryPlan.scala:104)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:116)
> at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:116)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:127)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$3(QueryPlan.scala:132)
> at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
> at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
> at scala.collection.TraversableLike.map(TraversableLike.scala:238)
> at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
> at scala.collection.AbstractTraversable.map(Traversable.scala:108)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:132)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:137)
> at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:244)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:137)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUp(QueryPlan.scala:104)
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1(CheckAnalysis.scala:152)
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1$adapted(CheckAnalysis.scala:93)
> at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:184)
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:93)
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis$(CheckAnalysis.scala:90)
> at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:155)
> at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:176)
> at

[jira] [Commented] (SPARK-38507) DataFrame withColumn method not adding or replacing columns when alias is used

2022-03-10 Thread qian (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-38507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504703#comment-17504703
 ] 

qian commented on SPARK-38507:
--

Hi [~amavrommatis]
The reason for this problem is that you aliased the DataFrame *df* as *df*, which creates a schema conflict.

You can try this command:

{code:scala}
df.withColumn("field3", lit(0)).select("field3").show(2)
{code}

The following command also runs, but the result is not what you might expect:
{code:scala}
df.withColumn("df.field2", lit(0)).select("df.field2").show(2) 
{code}

The result is the original column *field2*, not your new column *df.field2* (whose value is 0).

> DataFrame withColumn method not adding or replacing columns when alias is used
> --
>
> Key: SPARK-38507
> URL: https://issues.apache.org/jira/browse/SPARK-38507
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.1.2
>Reporter: Alexandros Mavrommatis
>Priority: Major
>  Labels: SQL, catalyst
>
> I have an input DataFrame *df* created as follows:
> {code:java}
> import spark.implicits._
> val df = List((5, 10), (6, 20)).toDF("field1", "field2").alias("df") {code}
> When I execute either this command:
> {code:java}
> df.select("df.field2").show(2) {code}
> or that one:
> {code:java}
> df.withColumn("df.field2", lit(0)).select("df.field2").show(2) {code}
> I get the same result:
> {code:java}
> +--+
> |field2|
> +--+
> |    10|
> |    20|
> +--+ {code}
> Additionally, when I execute the following command:
> {code:java}
> df.withColumn("df.field3", lit(0)).select("df.field3").show(2){code}
> I get this exception:
> {code:java}
> org.apache.spark.sql.AnalysisException: cannot resolve '`df.field3`' given input columns: [df.field3, df.field1, df.field2];
> 'Project ['df.field3]
> +- Project [field1#7, field2#8, 0 AS df.field3#31]
>    +- SubqueryAlias df
>       +- Project [_1#2 AS field1#7, _2#3 AS field2#8]
>          +- LocalRelation [_1#2, _2#3]
> at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:155)
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:152)
> at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$2(TreeNode.scala:342)
> at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74)
> at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:342)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$transformExpressionsUp$1(QueryPlan.scala:104)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$1(QueryPlan.scala:116)
> at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:74)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpression$1(QueryPlan.scala:116)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:127)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$3(QueryPlan.scala:132)
> at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
> at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
> at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
> at scala.collection.TraversableLike.map(TraversableLike.scala:238)
> at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
> at scala.collection.AbstractTraversable.map(Traversable.scala:108)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.recursiveTransform$1(QueryPlan.scala:132)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$mapExpressions$4(QueryPlan.scala:137)
> at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:244)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.mapExpressions(QueryPlan.scala:137)
> at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUp(QueryPlan.scala:104)
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1(CheckAnalysis.scala:152)
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1$adapted(CheckAnalysis.scala:93)
> at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:184)
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:93)
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis$(CheckAnalysis.scala:90)
[jira] [Updated] (SPARK-38439) Add Braces with if,else,for,do and while statements

2022-03-09 Thread qian (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-38439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

qian updated SPARK-38439:
-
Priority: Trivial  (was: Minor)

> Add Braces with if,else,for,do and while statements
> ---
>
> Key: SPARK-38439
> URL: https://issues.apache.org/jira/browse/SPARK-38439
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, SQL
>Affects Versions: 3.2.0, 3.2.1
>Reporter: qian
>Priority: Trivial
>
> Braces should be used with {_}if{_}, {_}else{_}, {_}for{_}, _do_ and _while_
> statements, even when the body contains only a single statement. Avoid the
> following pattern:
> {code:java}
> if (condition) statements;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-38439) Add Braces with if,else,for,do and while statements

2022-03-09 Thread qian (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-38439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17503989#comment-17503989
 ] 

qian commented on SPARK-38439:
--

This is useless. Please ignore it

> Add Braces with if,else,for,do and while statements
> ---
>
> Key: SPARK-38439
> URL: https://issues.apache.org/jira/browse/SPARK-38439
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, SQL
>Affects Versions: 3.2.0, 3.2.1
>Reporter: qian
>Priority: Minor
>
> Braces should be used with {_}if{_}, {_}else{_}, {_}for{_}, _do_ and _while_
> statements, even when the body contains only a single statement. Avoid the
> following pattern:
> {code:java}
> if (condition) statements;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-38439) Add Braces with if,else,for,do and while statements

2022-03-07 Thread qian (Jira)
qian created SPARK-38439:


 Summary: Add Braces with if,else,for,do and while statements
 Key: SPARK-38439
 URL: https://issues.apache.org/jira/browse/SPARK-38439
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core, SQL
Affects Versions: 3.2.1, 3.2.0
Reporter: qian


Braces should be used with {_}if{_}, {_}else{_}, {_}for{_}, _do_ and _while_ statements, even when the body contains only a single statement. Avoid the following pattern:
{code:java}
if (condition) statements;
{code}
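For contrast, the braced form this issue asks for (a trivial sketch):
{code:java}
if (condition) {
  statements;
}
{code}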



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-38302) Use Java 17 in K8S integration tests when setting spark-tgz

2022-02-24 Thread qian (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-38302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497833#comment-17497833
 ] 

qian commented on SPARK-38302:
--

[~dongjoon] Thanks for your work :)

> Use Java 17 in K8S integration tests when setting spark-tgz
> ---
>
> Key: SPARK-38302
> URL: https://issues.apache.org/jira/browse/SPARK-38302
> Project: Spark
>  Issue Type: Sub-task
>  Components: Kubernetes, Tests
>Affects Versions: 3.3.0
>Reporter: qian
>Assignee: qian
>Priority: Minor
>
> When the `spark-tgz` parameter is set during integration tests, an error occurs because
> `resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile.java17`
> cannot be found. This happens because the default value of
> `spark.kubernetes.test.dockerFile` is that relative path: when a tgz is used, the
> working directory is `${spark.kubernetes.test.unpackSparkDir}`, so the relative path
> no longer resolves.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-38302) Dockerfile.java17 can't be used in K8s integration tests when

2022-02-23 Thread qian (Jira)
qian created SPARK-38302:


 Summary: Dockerfile.java17 can't be used in K8s integration tests 
when 
 Key: SPARK-38302
 URL: https://issues.apache.org/jira/browse/SPARK-38302
 Project: Spark
  Issue Type: Improvement
  Components: Kubernetes, Tests
Affects Versions: 3.3.0
Reporter: qian


When the `spark-tgz` parameter is set during integration tests, an error occurs because `resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile.java17` cannot be found. This happens because the default value of `spark.kubernetes.test.dockerFile` is that relative path: when a tgz is used, the working directory is `${spark.kubernetes.test.unpackSparkDir}`, so the relative path no longer resolves.
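A sketch of the mismatch using the values from this description; the absolute path shown is a hypothetical workaround, not a committed fix:
{code}
# Default value (relative to the Spark source checkout):
spark.kubernetes.test.dockerFile=resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile.java17

# With spark-tgz set, the working directory is ${spark.kubernetes.test.unpackSparkDir},
# so the relative default no longer resolves. An absolute path would:
spark.kubernetes.test.dockerFile=/absolute/path/to/Dockerfile.java17
{code}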



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37713) No namespace assigned in Executor Pod ConfigMap

2021-12-22 Thread qian (Jira)
qian created SPARK-37713:


 Summary: No namespace assigned in Executor Pod ConfigMap
 Key: SPARK-37713
 URL: https://issues.apache.org/jira/browse/SPARK-37713
 Project: Spark
  Issue Type: Bug
  Components: Kubernetes
Affects Versions: 3.2.0, 3.1.2, 3.1.1
Reporter: qian
 Fix For: 3.3.0


Since Spark 3.x, each executor pod needs its own executor ConfigMap, but no namespace is assigned to the ConfigMap when it is built. K8s treats a ConfigMap without a namespace as a global resource, so once pod access is restricted (global resources cannot be read), the executor cannot obtain its ConfigMap.
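A minimal sketch of the kind of fix implied here (fabric8 client; the name and data values are hypothetical placeholders):
{code:scala}
import io.fabric8.kubernetes.api.model.ConfigMapBuilder

// The key point is withNamespace(): the reported code path omits it when
// building the executor ConfigMap.
val executorConfigMap = new ConfigMapBuilder()
  .withNewMetadata()
    .withName("spark-exec-conf")       // placeholder name
    .withNamespace("spark-namespace")  // the missing namespace assignment
  .endMetadata()
  .addToData("spark.properties", "spark.executor.id=1")  // placeholder data
  .build()
{code}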



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37645) Word spell error - "labeled" spells as "labled"

2021-12-14 Thread qian (Jira)
qian created SPARK-37645:


 Summary: Word spell error - "labeled" spells as "labled"
 Key: SPARK-37645
 URL: https://issues.apache.org/jira/browse/SPARK-37645
 Project: Spark
  Issue Type: Improvement
  Components: Kubernetes
Affects Versions: 3.2.0, 3.1.1, 3.1.0
Reporter: qian
 Fix For: 3.3.0


Spelling error: "labeled" is misspelled as "labled".



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17317) Add package vignette to SparkR

2016-08-30 Thread Junyang Qian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15450013#comment-15450013
 ] 

Junyang Qian commented on SPARK-17317:
--

WIP

> Add package vignette to SparkR
> --
>
> Key: SPARK-17317
> URL: https://issues.apache.org/jira/browse/SPARK-17317
> Project: Spark
>  Issue Type: Improvement
>Reporter: Junyang Qian
>
> In publishing SparkR to CRAN, it would be nice to have a vignette as a user 
> guide that
> * describes the big picture
> * introduces the use of various methods
> This is important for new users because they may not even know which method 
> to look up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-17317) Add package vignette to SparkR

2016-08-30 Thread Junyang Qian (JIRA)
Junyang Qian created SPARK-17317:


 Summary: Add package vignette to SparkR
 Key: SPARK-17317
 URL: https://issues.apache.org/jira/browse/SPARK-17317
 Project: Spark
  Issue Type: Improvement
Reporter: Junyang Qian


In publishing SparkR to CRAN, it would be nice to have a vignette as a user 
guide that
* describes the big picture
* introduces the use of various methods

This is important for new users because they may not even know which method to 
look up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-17315) Add Kolmogorov-Smirnov Test to SparkR

2016-08-30 Thread Junyang Qian (JIRA)
Junyang Qian created SPARK-17315:


 Summary: Add Kolmogorov-Smirnov Test to SparkR
 Key: SPARK-17315
 URL: https://issues.apache.org/jira/browse/SPARK-17315
 Project: Spark
  Issue Type: New Feature
Reporter: Junyang Qian


The Kolmogorov-Smirnov test is a popular nonparametric test of the equality of distributions. There is an implementation in MLlib; it would be nice to expose it in SparkR.
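The existing MLlib entry point that a SparkR wrapper would call; a small sketch, assuming an RDD[Double] named `data` tested against the standard normal distribution:
{code:scala}
import org.apache.spark.mllib.stat.Statistics
import org.apache.spark.rdd.RDD

def ksAgainstStdNormal(data: RDD[Double]): Unit = {
  // One-sample, two-sided KS test against N(0, 1)
  val result = Statistics.kolmogorovSmirnovTest(data, "norm", 0.0, 1.0)
  println(result)  // test statistic, p-value, and null-hypothesis summary
}
{code}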



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17241) SparkR spark.glm should have configurable regularization parameter

2016-08-25 Thread Junyang Qian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437717#comment-15437717
 ] 

Junyang Qian commented on SPARK-17241:
--

I'll take a closer look and see if we can add it easily.

> SparkR spark.glm should have configurable regularization parameter
> --
>
> Key: SPARK-17241
> URL: https://issues.apache.org/jira/browse/SPARK-17241
> Project: Spark
>  Issue Type: Improvement
>Reporter: Junyang Qian
>
> Spark has configurable L2 regularization parameter for generalized linear 
> regression. It is very important to have them in SparkR so that users can run 
> ridge regression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-17241) SparkR spark.glm should have configurable regularization parameter

2016-08-25 Thread Junyang Qian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437692#comment-15437692
 ] 

Junyang Qian commented on SPARK-17241:
--

[~shivaram] It seems that spark has it for linear regression but not for glm. 

> SparkR spark.glm should have configurable regularization parameter
> --
>
> Key: SPARK-17241
> URL: https://issues.apache.org/jira/browse/SPARK-17241
> Project: Spark
>  Issue Type: Improvement
>Reporter: Junyang Qian
>
> Spark has configurable L2 regularization parameter for generalized linear 
> regression. It is very important to have them in SparkR so that users can run 
> ridge regression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-17241) SparkR spark.glm should have configurable regularization parameter

2016-08-25 Thread Junyang Qian (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-17241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junyang Qian updated SPARK-17241:
-
Summary: SparkR spark.glm should have configurable regularization parameter 
 (was: SparkR spark.glm should have configurable regularization parameter(s))

> SparkR spark.glm should have configurable regularization parameter
> --
>
> Key: SPARK-17241
> URL: https://issues.apache.org/jira/browse/SPARK-17241
> Project: Spark
>  Issue Type: Improvement
>Reporter: Junyang Qian
>
> Spark has configurable L2 regularization parameter for linear regression and 
> an additional elastic-net parameter for generalized linear model. It is very 
> important to have them in SparkR so that users can run ridge regression and 
> elastic-net.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-17241) SparkR spark.glm should have configurable regularization parameter

2016-08-25 Thread Junyang Qian (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-17241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junyang Qian updated SPARK-17241:
-
Description: Spark has configurable L2 regularization parameter for 
generalized linear regression. It is very important to have them in SparkR so 
that users can run ridge regression.  (was: Spark has configurable L2 
regularization parameter for linear regression and an additional elastic-net 
parameter for generalized linear model. It is very important to have them in 
SparkR so that users can run ridge regression and elastic-net.)

> SparkR spark.glm should have configurable regularization parameter
> --
>
> Key: SPARK-17241
> URL: https://issues.apache.org/jira/browse/SPARK-17241
> Project: Spark
>  Issue Type: Improvement
>Reporter: Junyang Qian
>
> Spark has configurable L2 regularization parameter for generalized linear 
> regression. It is very important to have them in SparkR so that users can run 
> ridge regression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-17241) SparkR spark.glm should have configurable regularization parameter(s)

2016-08-25 Thread Junyang Qian (JIRA)
Junyang Qian created SPARK-17241:


 Summary: SparkR spark.glm should have configurable regularization 
parameter(s)
 Key: SPARK-17241
 URL: https://issues.apache.org/jira/browse/SPARK-17241
 Project: Spark
  Issue Type: Improvement
Reporter: Junyang Qian


Spark has a configurable L2 regularization parameter for linear regression and an additional elastic-net parameter for the generalized linear model. It is very important to have them in SparkR so that users can run ridge regression and elastic-net.
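On the Scala side, the knob that spark.glm would need to expose is _regParam_ on GeneralizedLinearRegression; a minimal sketch (the settings are illustrative, not from this issue):
{code:scala}
import org.apache.spark.ml.regression.GeneralizedLinearRegression

// Ridge-style fit: gaussian family with an L2 regularization parameter.
val glr = new GeneralizedLinearRegression()
  .setFamily("gaussian")
  .setLink("identity")
  .setRegParam(0.3)  // the regularization parameter SparkR would expose
{code}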



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-16508) Fix documentation warnings found by R CMD check

2016-08-07 Thread Junyang Qian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15411132#comment-15411132
 ] 

Junyang Qian commented on SPARK-16508:
--

Sounds good. I'll be working on the undocumented/duplicated argument warnings. 

> Fix documentation warnings found by R CMD check
> ---
>
> Key: SPARK-16508
> URL: https://issues.apache.org/jira/browse/SPARK-16508
> Project: Spark
>  Issue Type: Sub-task
>  Components: SparkR
>Reporter: Shivaram Venkataraman
>
> A full list of warnings after the fixes in SPARK-16507 is at 
> https://gist.github.com/shivaram/62866c4ca59c5d34b8963939cf04b5eb 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-16508) Fix documentation warnings found by R CMD check

2016-08-05 Thread Junyang Qian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410296#comment-15410296
 ] 

Junyang Qian commented on SPARK-16508:
--

It seems that there are still some warnings in my local check, e.g. 
undocumented arguments in as.data.frame "row.names", "optional". I was 
wondering if I missed something or if we should deal with those?

> Fix documentation warnings found by R CMD check
> ---
>
> Key: SPARK-16508
> URL: https://issues.apache.org/jira/browse/SPARK-16508
> Project: Spark
>  Issue Type: Sub-task
>  Components: SparkR
>Reporter: Shivaram Venkataraman
>
> A full list of warnings after the fixes in SPARK-16507 is at 
> https://gist.github.com/shivaram/62866c4ca59c5d34b8963939cf04b5eb 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-16727) SparkR unit test fails - incorrect expected output

2016-07-25 Thread Junyang Qian (JIRA)
Junyang Qian created SPARK-16727:


 Summary: SparkR unit test fails - incorrect expected output
 Key: SPARK-16727
 URL: https://issues.apache.org/jira/browse/SPARK-16727
 Project: Spark
  Issue Type: Bug
Reporter: Junyang Qian


https://github.com/apache/spark/blob/master/R/pkg/inst/tests/testthat/test_sparkSQL.R#L1827

When I run spark/R/run-tests.sh, the tests fail with the following messages:

1. Failure (at test_sparkSQL.R#1827): describe() and summarize() on a DataFrame 
collect(stats)[4, "name"] not equal to "Andy"
target is NULL, current is character

2. Failure (at test_sparkSQL.R#1831): describe() and summarize() on a DataFrame 
collect(stats2)[4, "name"] not equal to "Andy"
target is NULL, current is character
Error: Test failures
Execution halted



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-16579) Add a spark install function

2016-07-19 Thread Junyang Qian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384628#comment-15384628
 ] 

Junyang Qian commented on SPARK-16579:
--

If we find Spark home and the JARs missing, do we want to still install to a 
cache dir and then redirect Spark home to that dir?

> Add a spark install function
> 
>
> Key: SPARK-16579
> URL: https://issues.apache.org/jira/browse/SPARK-16579
> Project: Spark
>  Issue Type: Sub-task
>  Components: SparkR
>Reporter: Shivaram Venkataraman
>Assignee: Junyang Qian
>
> As described in the design doc we need to introduce a function to install 
> Spark in case the user directly downloads SparkR from CRAN.
> To do that we can introduce a install_spark function that takes in the 
> following arguments
> {code}
> hadoop_version
> url_to_use # defaults to apache
> local_dir # defaults to a cache dir
> {code} 
> Further more I think we can automatically run this from sparkR.init if we 
> find Spark home and the JARs missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org