[jira] [Updated] (SPARK-32991) RESET can clear StaticSQLConfs

2020-12-05 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-32991:

Target Version/s: 3.1.0

> RESET can clear StaticSQLConfs
> --
>
> Key: SPARK-32991
> URL: https://issues.apache.org/jira/browse/SPARK-32991
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.1
>Reporter: Herman van Hövell
>Assignee: Kent Yao
>Priority: Blocker
> Fix For: 3.1.0
>
>
> The RESET command can clear a session's static SQL configurations when that
> static SQL configuration was set on a SparkSession that uses a pre-existing
> SparkContext. Here is a repro:
> {code:java}
> // Blow away any pre-existing session thread locals
> org.apache.spark.sql.SparkSession.clearDefaultSession()
> org.apache.spark.sql.SparkSession.clearActiveSession()
> // Create new session and explicitly set a spark context
> val newSession = org.apache.spark.sql.SparkSession.builder
>  .sparkContext(sc)
>  .config("spark.sql.globalTempDatabase", "bob")
>  .getOrCreate()
> assert(newSession.conf.get("spark.sql.globalTempDatabase") == "bob")
> newSession.sql("reset")
> assert(newSession.conf.get("spark.sql.globalTempDatabase") == "bob") // Boom!
> {code}
> The problem is that RESET assumes it can use the SparkContext's 
> configurations as its default.
>  
>  
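As a hedged sketch of the expected semantics (session and conf names reuse the repro above; the shuffle-partitions line is an illustrative addition): static SQL configurations are fixed for a session's lifetime, so RESET should restore the session's initial values rather than falling back to the SparkContext's configuration:

{code:scala}
// Sketch: RESET should restore session defaults, not SparkContext defaults.
val session = org.apache.spark.sql.SparkSession.builder
  .sparkContext(sc)                               // pre-existing context
  .config("spark.sql.globalTempDatabase", "bob")  // static SQL conf
  .getOrCreate()

session.sql("SET spark.sql.shuffle.partitions=10")  // runtime conf
session.sql("RESET")

// Runtime confs revert to their defaults...
assert(session.conf.get("spark.sql.shuffle.partitions") == "200")
// ...but the static conf must keep the value the session was built with.
assert(session.conf.get("spark.sql.globalTempDatabase") == "bob")
{code}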



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-32991) RESET can clear StaticSQLConfs

2020-12-05 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-32991:

Priority: Blocker  (was: Major)

> RESET can clear StaticSQLConfs
> --
>
> Key: SPARK-32991
> URL: https://issues.apache.org/jira/browse/SPARK-32991
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.1
>Reporter: Herman van Hövell
>Assignee: Kent Yao
>Priority: Blocker
> Fix For: 3.1.0
>
>
> The RESET command can clear a session's static SQL configurations when that
> static SQL configuration was set on a SparkSession that uses a pre-existing
> SparkContext. Here is a repro:
> {code:java}
> // Blow away any pre-existing session thread locals
> org.apache.spark.sql.SparkSession.clearDefaultSession()
> org.apache.spark.sql.SparkSession.clearActiveSession()
> // Create new session and explicitly set a spark context
> val newSession = org.apache.spark.sql.SparkSession.builder
>  .sparkContext(sc)
>  .config("spark.sql.globalTempDatabase", "bob")
>  .getOrCreate()
> assert(newSession.conf.get("spark.sql.globalTempDatabase") == "bob")
> newSession.sql("reset")
> assert(newSession.conf.get("spark.sql.globalTempDatabase") == "bob") // Boom!
> {code}
> The problem is that RESET assumes it can use the SparkContext's 
> configurations as its default.
>  
>  






[jira] [Commented] (SPARK-32991) RESET can clear StaticSQLConfs

2020-12-05 Thread Xiao Li (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-32991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244675#comment-17244675
 ] 

Xiao Li commented on SPARK-32991:
-

[~Qin Yao] I reopened the Jira and set it to Blocker because the follow-up PR
is needed before the 3.1 release.

> RESET can clear StaticSQLConfs
> --
>
> Key: SPARK-32991
> URL: https://issues.apache.org/jira/browse/SPARK-32991
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.1
>Reporter: Herman van Hövell
>Assignee: Kent Yao
>Priority: Blocker
> Fix For: 3.1.0
>
>
> The RESET command can clear a session's static SQL configurations when that
> static SQL configuration was set on a SparkSession that uses a pre-existing
> SparkContext. Here is a repro:
> {code:java}
> // Blow away any pre-existing session thread locals
> org.apache.spark.sql.SparkSession.clearDefaultSession()
> org.apache.spark.sql.SparkSession.clearActiveSession()
> // Create new session and explicitly set a spark context
> val newSession = org.apache.spark.sql.SparkSession.builder
>  .sparkContext(sc)
>  .config("spark.sql.globalTempDatabase", "bob")
>  .getOrCreate()
> assert(newSession.conf.get("spark.sql.globalTempDatabase") == "bob")
> newSession.sql("reset")
> assert(newSession.conf.get("spark.sql.globalTempDatabase") == "bob") // Boom!
> {code}
> The problem is that RESET assumes it can use the SparkContext's 
> configurations as its default.
>  
>  






[jira] [Reopened] (SPARK-32991) RESET can clear StaticSQLConfs

2020-12-05 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li reopened SPARK-32991:
-

> RESET can clear StaticSQLConfs
> --
>
> Key: SPARK-32991
> URL: https://issues.apache.org/jira/browse/SPARK-32991
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.1
>Reporter: Herman van Hövell
>Assignee: Kent Yao
>Priority: Major
> Fix For: 3.1.0
>
>
> The RESET command can clear a session's static SQL configurations when that
> static SQL configuration was set on a SparkSession that uses a pre-existing
> SparkContext. Here is a repro:
> {code:java}
> // Blow away any pre-existing session thread locals
> org.apache.spark.sql.SparkSession.clearDefaultSession()
> org.apache.spark.sql.SparkSession.clearActiveSession()
> // Create new session and explicitly set a spark context
> val newSession = org.apache.spark.sql.SparkSession.builder
>  .sparkContext(sc)
>  .config("spark.sql.globalTempDatabase", "bob")
>  .getOrCreate()
> assert(newSession.conf.get("spark.sql.globalTempDatabase") == "bob")
> newSession.sql("reset")
> assert(newSession.conf.get("spark.sql.globalTempDatabase") == "bob") // Boom!
> {code}
> The problem is that RESET assumes it can use the SparkContext's 
> configurations as its default.
>  
>  






[jira] [Commented] (SPARK-33044) Add a Jenkins build and test job for Scala 2.13

2020-12-05 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244672#comment-17244672
 ] 

Dongjoon Hyun commented on SPARK-33044:
---

I don't think so, [~srowen]. Also, the new Jenkins server rejects my login. Is
it still accessible to you?

> Add a Jenkins build and test job for Scala 2.13
> ---
>
> Key: SPARK-33044
> URL: https://issues.apache.org/jira/browse/SPARK-33044
> Project: Spark
>  Issue Type: Sub-task
>  Components: jenkins
>Affects Versions: 3.1.0
>Reporter: Yang Jie
>Priority: Major
>
> The {{master}} branch seems almost ready for Scala 2.13 now; we need a
> Jenkins test job to verify the current state of the work in CI.






[jira] [Assigned] (SPARK-33668) Fix flaky test "Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties."

2020-12-05 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-33668:
-

Assignee: Prashant Sharma

> Fix flaky test "Verify logging configuration is picked from the provided 
> SPARK_CONF_DIR/log4j.properties."
> --
>
> Key: SPARK-33668
> URL: https://issues.apache.org/jira/browse/SPARK-33668
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes, Tests
>Affects Versions: 3.1.0
>Reporter: Prashant Sharma
>Assignee: Prashant Sharma
>Priority: Major
>
> The test is flaky, with multiple failed instances; the reason for the
> failure has been similar to:
> {code:java}
>   The code passed to eventually never returned normally. Attempted 109 times 
> over 3.007988241397 minutes. Last failure message: Failure executing: GET 
> at: 
> https://192.168.39.167:8443/api/v1/namespaces/b37fc72a991b49baa68a2eaaa1516463/pods/spark-pi-97a9bc76308e7fe3-exec-1/log?pretty=false.
>  Message: pods "spark-pi-97a9bc76308e7fe3-exec-1" not found. Received status: 
> Status(apiVersion=v1, code=404, details=StatusDetails(causes=[], group=null, 
> kind=pods, name=spark-pi-97a9bc76308e7fe3-exec-1, retryAfterSeconds=null, 
> uid=null, additionalProperties={}), kind=Status, message=pods 
> "spark-pi-97a9bc76308e7fe3-exec-1" not found, 
> metadata=ListMeta(_continue=null, remainingItemCount=null, 
> resourceVersion=null, selfLink=null, additionalProperties={}), 
> reason=NotFound, status=Failure, additionalProperties={}).. 
> (KubernetesSuite.scala:402)
> {code}
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36854/console
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36852/console
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36850/console
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36848/console
> From the above failures, it seems that the executor finishes too quickly and
> is removed by Spark before the test can complete.
> So, in order to mitigate this situation, one way is to turn on the flag
> {code}
>"spark.kubernetes.executor.deleteOnTermination"
> {code}
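A hedged sketch of how that flag could be set in the test harness (the exact wiring into the K8s integration tests is an assumption; the 404s above suggest pods are deleted before their logs can be fetched, which setting the flag to {{false}} would avoid):

{code:scala}
import org.apache.spark.SparkConf

// Sketch: keep executor pods around after termination so the test can
// still fetch their logs (assumption: the "pod not found" 404s above come
// from pods being deleted before the log GET).
val conf = new SparkConf()
  .set("spark.kubernetes.executor.deleteOnTermination", "false")
{code}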






[jira] [Resolved] (SPARK-33668) Fix flaky test "Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties."

2020-12-05 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-33668.
---
Fix Version/s: 3.2.0
   Resolution: Fixed

Issue resolved by pull request 30616
[https://github.com/apache/spark/pull/30616]

> Fix flaky test "Verify logging configuration is picked from the provided 
> SPARK_CONF_DIR/log4j.properties."
> --
>
> Key: SPARK-33668
> URL: https://issues.apache.org/jira/browse/SPARK-33668
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes, Tests
>Affects Versions: 3.1.0
>Reporter: Prashant Sharma
>Assignee: Prashant Sharma
>Priority: Major
> Fix For: 3.2.0
>
>
> The test is flaky, with multiple failed instances; the reason for the
> failure has been similar to:
> {code:java}
>   The code passed to eventually never returned normally. Attempted 109 times 
> over 3.007988241397 minutes. Last failure message: Failure executing: GET 
> at: 
> https://192.168.39.167:8443/api/v1/namespaces/b37fc72a991b49baa68a2eaaa1516463/pods/spark-pi-97a9bc76308e7fe3-exec-1/log?pretty=false.
>  Message: pods "spark-pi-97a9bc76308e7fe3-exec-1" not found. Received status: 
> Status(apiVersion=v1, code=404, details=StatusDetails(causes=[], group=null, 
> kind=pods, name=spark-pi-97a9bc76308e7fe3-exec-1, retryAfterSeconds=null, 
> uid=null, additionalProperties={}), kind=Status, message=pods 
> "spark-pi-97a9bc76308e7fe3-exec-1" not found, 
> metadata=ListMeta(_continue=null, remainingItemCount=null, 
> resourceVersion=null, selfLink=null, additionalProperties={}), 
> reason=NotFound, status=Failure, additionalProperties={}).. 
> (KubernetesSuite.scala:402)
> {code}
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36854/console
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36852/console
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36850/console
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36848/console
> From the above failures, it seems that the executor finishes too quickly and
> is removed by Spark before the test can complete.
> So, in order to mitigate this situation, one way is to turn on the flag
> {code}
>"spark.kubernetes.executor.deleteOnTermination"
> {code}






[jira] [Commented] (SPARK-33256) Update contribution guide about NumPy documentation style

2020-12-05 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244649#comment-17244649
 ] 

Apache Spark commented on SPARK-33256:
--

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/30622

> Update contribution guide about NumPy documentation style
> -
>
> Key: SPARK-33256
> URL: https://issues.apache.org/jira/browse/SPARK-33256
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, PySpark
>Affects Versions: 3.1.0
>Reporter: Hyukjin Kwon
>Priority: Major
>
> We should document that PySpark uses NumPy documentation style.
> See also https://github.com/apache/spark/pull/30181#discussion_r517314341






[jira] [Assigned] (SPARK-33256) Update contribution guide about NumPy documentation style

2020-12-05 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33256:


Assignee: (was: Apache Spark)

> Update contribution guide about NumPy documentation style
> -
>
> Key: SPARK-33256
> URL: https://issues.apache.org/jira/browse/SPARK-33256
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, PySpark
>Affects Versions: 3.1.0
>Reporter: Hyukjin Kwon
>Priority: Major
>
> We should document that PySpark uses NumPy documentation style.
> See also https://github.com/apache/spark/pull/30181#discussion_r517314341






[jira] [Assigned] (SPARK-33256) Update contribution guide about NumPy documentation style

2020-12-05 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33256:


Assignee: Apache Spark

> Update contribution guide about NumPy documentation style
> -
>
> Key: SPARK-33256
> URL: https://issues.apache.org/jira/browse/SPARK-33256
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, PySpark
>Affects Versions: 3.1.0
>Reporter: Hyukjin Kwon
>Assignee: Apache Spark
>Priority: Major
>
> We should document that PySpark uses NumPy documentation style.
> See also https://github.com/apache/spark/pull/30181#discussion_r517314341






[jira] [Commented] (SPARK-33256) Update contribution guide about NumPy documentation style

2020-12-05 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244648#comment-17244648
 ] 

Apache Spark commented on SPARK-33256:
--

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/30622

> Update contribution guide about NumPy documentation style
> -
>
> Key: SPARK-33256
> URL: https://issues.apache.org/jira/browse/SPARK-33256
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, PySpark
>Affects Versions: 3.1.0
>Reporter: Hyukjin Kwon
>Priority: Major
>
> We should document that PySpark uses NumPy documentation style.
> See also https://github.com/apache/spark/pull/30181#discussion_r517314341






[jira] [Commented] (SPARK-33674) Show Slowpoke notifications in SBT tests

2020-12-05 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244647#comment-17244647
 ] 

Apache Spark commented on SPARK-33674:
--

User 'gatorsmile' has created a pull request for this issue:
https://github.com/apache/spark/pull/30621

> Show Slowpoke notifications in SBT tests
> 
>
> Key: SPARK-33674
> URL: https://issues.apache.org/jira/browse/SPARK-33674
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 3.0.1
>Reporter: Xiao Li
>Assignee: Xiao Li
>Priority: Major
>
> When the tests or code have a bug and enter an infinite loop, it is hard to
> tell from the log which test cases hit issues, especially when we are
> running the tests in parallel. It would be nice to show the Slowpoke
> notifications.
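For reference, a minimal sketch of enabling Slowpoke notifications through ScalaTest's runner arguments in an SBT build (the 120/60 thresholds are illustrative, not the values chosen for Spark):

{code:scala}
// build.sbt (sketch): pass ScalaTest's -W <delay> <period> flag, which
// prints a "slowpoke" notification for any test still running after
// <delay> seconds, repeating every <period> seconds.
Test / testOptions += Tests.Argument(
  TestFrameworks.ScalaTest,
  "-W", "120", "60"  // warn after 120s, then every 60s
)
{code}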






[jira] [Commented] (SPARK-33674) Show Slowpoke notifications in SBT tests

2020-12-05 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244646#comment-17244646
 ] 

Apache Spark commented on SPARK-33674:
--

User 'gatorsmile' has created a pull request for this issue:
https://github.com/apache/spark/pull/30621

> Show Slowpoke notifications in SBT tests
> 
>
> Key: SPARK-33674
> URL: https://issues.apache.org/jira/browse/SPARK-33674
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 3.0.1
>Reporter: Xiao Li
>Assignee: Xiao Li
>Priority: Major
>
> When the tests or code have a bug and enter an infinite loop, it is hard to
> tell from the log which test cases hit issues, especially when we are
> running the tests in parallel. It would be nice to show the Slowpoke
> notifications.






[jira] [Assigned] (SPARK-33674) Show Slowpoke notifications in SBT tests

2020-12-05 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33674:


Assignee: Xiao Li  (was: Apache Spark)

> Show Slowpoke notifications in SBT tests
> 
>
> Key: SPARK-33674
> URL: https://issues.apache.org/jira/browse/SPARK-33674
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 3.0.1
>Reporter: Xiao Li
>Assignee: Xiao Li
>Priority: Major
>
> When the tests or code have a bug and enter an infinite loop, it is hard to
> tell from the log which test cases hit issues, especially when we are
> running the tests in parallel. It would be nice to show the Slowpoke
> notifications.






[jira] [Assigned] (SPARK-33674) Show Slowpoke notifications in SBT tests

2020-12-05 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33674:


Assignee: Apache Spark  (was: Xiao Li)

> Show Slowpoke notifications in SBT tests
> 
>
> Key: SPARK-33674
> URL: https://issues.apache.org/jira/browse/SPARK-33674
> Project: Spark
>  Issue Type: Test
>  Components: Tests
>Affects Versions: 3.0.1
>Reporter: Xiao Li
>Assignee: Apache Spark
>Priority: Major
>
> When the tests or code have a bug and enter an infinite loop, it is hard to
> tell from the log which test cases hit issues, especially when we are
> running the tests in parallel. It would be nice to show the Slowpoke
> notifications.






[jira] [Commented] (SPARK-33256) Update contribution guide about NumPy documentation style

2020-12-05 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244645#comment-17244645
 ] 

Hyukjin Kwon commented on SPARK-33256:
--

We should have a page like 
https://pandas.pydata.org/docs/development/contributing_docstring.html but I 
will leave it to the future work.

> Update contribution guide about NumPy documentation style
> -
>
> Key: SPARK-33256
> URL: https://issues.apache.org/jira/browse/SPARK-33256
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, PySpark
>Affects Versions: 3.1.0
>Reporter: Hyukjin Kwon
>Priority: Major
>
> We should document that PySpark uses NumPy documentation style.
> See also https://github.com/apache/spark/pull/30181#discussion_r517314341






[jira] [Created] (SPARK-33674) Show Slowpoke notifications in SBT tests

2020-12-05 Thread Xiao Li (Jira)
Xiao Li created SPARK-33674:
---

 Summary: Show Slowpoke notifications in SBT tests
 Key: SPARK-33674
 URL: https://issues.apache.org/jira/browse/SPARK-33674
 Project: Spark
  Issue Type: Test
  Components: Tests
Affects Versions: 3.0.1
Reporter: Xiao Li
Assignee: Xiao Li


When the tests or code have a bug and enter an infinite loop, it is hard to
tell from the log which test cases hit issues, especially when we are running
the tests in parallel. It would be nice to show the Slowpoke notifications.






[jira] [Resolved] (SPARK-33637) Spark sql drop non-existent table will not report an error after failure

2020-12-05 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-33637.
--
Resolution: Invalid

There was no feedback from the reporter.

> Spark sql drop non-existent table will not report an error after failure
> 
>
> Key: SPARK-33637
> URL: https://issues.apache.org/jira/browse/SPARK-33637
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: guihuawen
>Priority: Major
>
> When migrating Hive SQL workloads to Spark SQL, we want to reduce user-side
> changes. Spark SQL reports an error when dropping a table that does not
> exist, but Hive SQL does not. To ensure such statements execute normally, no
> error should be reported when dropping a non-existent table.
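For comparison, Spark SQL already supports the standard {{IF EXISTS}} clause, which suppresses the error without a behavior change (a workaround sketch with a hypothetical table name, not the reporter's proposal):

{code:scala}
// Errors if the table is missing (hypothetical name for illustration):
spark.sql("DROP TABLE tmp_db.missing_table")  // throws AnalysisException

// Succeeds silently whether or not the table exists:
spark.sql("DROP TABLE IF EXISTS tmp_db.missing_table")
{code}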






[jira] [Commented] (SPARK-33639) The external table does not specify a location

2020-12-05 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244640#comment-17244640
 ] 

Hyukjin Kwon commented on SPARK-33639:
--

[~guihuawen] what is the actual result and expected result? Can you share a 
self-contained reproducer with outputs?

> The external table does not specify a location
> ---
>
> Key: SPARK-33639
> URL: https://issues.apache.org/jira/browse/SPARK-33639
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: guihuawen
>Priority: Major
>
> If the location of the external table is not specified, an error will be 
> reported. If the partition is not specified, the default is the same as the 
> specified partition of the internal table.






[jira] [Resolved] (SPARK-33648) Moving file stage failure caused duplicated data

2020-12-05 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-33648.
--
Resolution: Invalid

[~blackpig] please fill in the JIRA description.

> Moving file stage failure caused duplicated data
> ---
>
> Key: SPARK-33648
> URL: https://issues.apache.org/jira/browse/SPARK-33648
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.1
>Reporter: Tongwei
>Priority: Major
>







[jira] [Commented] (SPARK-33661) Unable to load RandomForestClassificationModel trained in Spark 2.x

2020-12-05 Thread Hyukjin Kwon (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244638#comment-17244638
 ] 

Hyukjin Kwon commented on SPARK-33661:
--

There was a major version bump from 2 to 3, so a breaking change is legitimate
here.

> Unable to load RandomForestClassificationModel trained in Spark 2.x
> ---
>
> Key: SPARK-33661
> URL: https://issues.apache.org/jira/browse/SPARK-33661
> Project: Spark
>  Issue Type: Bug
>  Components: ML
>Affects Versions: 3.0.1
>Reporter: Marcus Levine
>Priority: Major
>
> When attempting to use Spark 3.x to load a RandomForestClassificationModel
> that was trained in Spark 2.x, an exception is raised:
> {code:python}
> ...
> RandomForestClassificationModel.load('/path/to/my/model')
>   File "/usr/spark/python/lib/pyspark.zip/pyspark/ml/util.py", line 330, in 
> load
>   File "/usr/spark/python/lib/pyspark.zip/pyspark/ml/pipeline.py", line 291, 
> in load
>   File "/usr/spark/python/lib/pyspark.zip/pyspark/ml/util.py", line 280, in 
> load
>   File "/usr/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 
> 1305, in __call__
>   File "/usr/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 134, in 
> deco
>   File "", line 3, in raise_from
> pyspark.sql.utils.AnalysisException: No such struct field rawCount in id, 
> prediction, impurity, impurityStats, gain, leftChild, rightChild, split;
> {code}
> There seems to be a schema incompatibility between the trained model data
> saved by Spark 2.x and the data expected for a model trained in Spark 3.x.
> If this issue is not resolved, users will be forced to retrain, using Spark
> 3.x, any existing random forest models they trained in Spark 2.x before they
> can upgrade.






[jira] [Resolved] (SPARK-33661) Unable to load RandomForestClassificationModel trained in Spark 2.x

2020-12-05 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-33661.
--
Resolution: Not A Problem

> Unable to load RandomForestClassificationModel trained in Spark 2.x
> ---
>
> Key: SPARK-33661
> URL: https://issues.apache.org/jira/browse/SPARK-33661
> Project: Spark
>  Issue Type: Bug
>  Components: ML
>Affects Versions: 3.0.1
>Reporter: Marcus Levine
>Priority: Major
>
> When attempting to use Spark 3.x to load a RandomForestClassificationModel
> that was trained in Spark 2.x, an exception is raised:
> {code:python}
> ...
> RandomForestClassificationModel.load('/path/to/my/model')
>   File "/usr/spark/python/lib/pyspark.zip/pyspark/ml/util.py", line 330, in 
> load
>   File "/usr/spark/python/lib/pyspark.zip/pyspark/ml/pipeline.py", line 291, 
> in load
>   File "/usr/spark/python/lib/pyspark.zip/pyspark/ml/util.py", line 280, in 
> load
>   File "/usr/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 
> 1305, in __call__
>   File "/usr/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 134, in 
> deco
>   File "", line 3, in raise_from
> pyspark.sql.utils.AnalysisException: No such struct field rawCount in id, 
> prediction, impurity, impurityStats, gain, leftChild, rightChild, split;
> {code}
> There seems to be a schema incompatibility between the trained model data
> saved by Spark 2.x and the data expected for a model trained in Spark 3.x.
> If this issue is not resolved, users will be forced to retrain, using Spark
> 3.x, any existing random forest models they trained in Spark 2.x before they
> can upgrade.






[jira] [Commented] (SPARK-33673) Do not push down partition filters to ParquetScan for DataSourceV2

2020-12-05 Thread Yuming Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244624#comment-17244624
 ] 

Yuming Wang commented on SPARK-33673:
-

Yes, you can verify it by:
{code:sh}
mvn -Dtest=none 
-DwildcardSuites=org.apache.spark.sql.execution.datasources.parquet.ParquetV2SchemaPruningSuite
 -Dparquet.version=1.11.1 test
{code}


> Do not push down partition filters to ParquetScan for DataSourceV2
> --
>
> Key: SPARK-33673
> URL: https://issues.apache.org/jira/browse/SPARK-33673
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Yuming Wang
>Priority: Major
>
> {noformat}
> - Spark vectorized reader - with partition data column - select a single 
> complex field array and its parent struct array *** FAILED ***
> - Non-vectorized reader - with partition data column - select a single 
> complex field array and its parent struct array *** FAILED ***
> - Spark vectorized reader - with partition data column - select a single 
> complex field from a map entry and its parent map entry *** FAILED ***
> - Non-vectorized reader - with partition data column - select a single 
> complex field from a map entry and its parent map entry *** FAILED ***
> - Spark vectorized reader - with partition data column - partial schema 
> intersection - select missing subfield *** FAILED ***
> - Non-vectorized reader - with partition data column - partial schema 
> intersection - select missing subfield *** FAILED ***
> - Spark vectorized reader - with partition data column - no unnecessary 
> schema pruning *** FAILED ***
> - Non-vectorized reader - with partition data column - no unnecessary schema 
> pruning *** FAILED ***
> - Spark vectorized reader - with partition data column - empty schema 
> intersection *** FAILED ***
> - Non-vectorized reader - with partition data column - empty schema 
> intersection *** FAILED ***
> - Spark vectorized reader - with partition data column - select a single 
> complex field and in where clause *** FAILED ***
> - Non-vectorized reader - with partition data column - select a single 
> complex field and in where clause *** FAILED ***
> - Spark vectorized reader - with partition data column - select nullable 
> complex field and having is not null predicate *** FAILED ***
> - Non-vectorized reader - with partition data column - select nullable 
> complex field and having is not null predicate *** FAILED ***
> {noformat}
> These tests fail because Parquet returns empty results for non-existent 
> columns since PARQUET-1765.






[jira] [Commented] (SPARK-33673) Do not push down partition filters to ParquetScan for DataSourceV2

2020-12-05 Thread L. C. Hsieh (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244615#comment-17244615
 ] 

L. C. Hsieh commented on SPARK-33673:
-

You mean these tests failed after upgrading to Parquet 1.11.1?

> Do not push down partition filters to ParquetScan for DataSourceV2
> --
>
> Key: SPARK-33673
> URL: https://issues.apache.org/jira/browse/SPARK-33673
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Yuming Wang
>Priority: Major
>
> {noformat}
> - Spark vectorized reader - with partition data column - select a single 
> complex field array and its parent struct array *** FAILED ***
> - Non-vectorized reader - with partition data column - select a single 
> complex field array and its parent struct array *** FAILED ***
> - Spark vectorized reader - with partition data column - select a single 
> complex field from a map entry and its parent map entry *** FAILED ***
> - Non-vectorized reader - with partition data column - select a single 
> complex field from a map entry and its parent map entry *** FAILED ***
> - Spark vectorized reader - with partition data column - partial schema 
> intersection - select missing subfield *** FAILED ***
> - Non-vectorized reader - with partition data column - partial schema 
> intersection - select missing subfield *** FAILED ***
> - Spark vectorized reader - with partition data column - no unnecessary 
> schema pruning *** FAILED ***
> - Non-vectorized reader - with partition data column - no unnecessary schema 
> pruning *** FAILED ***
> - Spark vectorized reader - with partition data column - empty schema 
> intersection *** FAILED ***
> - Non-vectorized reader - with partition data column - empty schema 
> intersection *** FAILED ***
> - Spark vectorized reader - with partition data column - select a single 
> complex field and in where clause *** FAILED ***
> - Non-vectorized reader - with partition data column - select a single 
> complex field and in where clause *** FAILED ***
> - Spark vectorized reader - with partition data column - select nullable 
> complex field and having is not null predicate *** FAILED ***
> - Non-vectorized reader - with partition data column - select nullable 
> complex field and having is not null predicate *** FAILED ***
> {noformat}
> These tests fail because Parquet returns empty results for non-existent 
> columns since PARQUET-1765.






[jira] [Commented] (SPARK-33673) Do not push down partition filters to ParquetScan for DataSourceV2

2020-12-05 Thread Yuming Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244612#comment-17244612
 ] 

Yuming Wang commented on SPARK-33673:
-

[~cloud_fan] [~rdblue] [~viirya]

> Do not push down partition filters to ParquetScan for DataSourceV2
> --
>
> Key: SPARK-33673
> URL: https://issues.apache.org/jira/browse/SPARK-33673
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.2.0
>Reporter: Yuming Wang
>Priority: Major
>
> {noformat}
> - Spark vectorized reader - with partition data column - select a single 
> complex field array and its parent struct array *** FAILED ***
> - Non-vectorized reader - with partition data column - select a single 
> complex field array and its parent struct array *** FAILED ***
> - Spark vectorized reader - with partition data column - select a single 
> complex field from a map entry and its parent map entry *** FAILED ***
> - Non-vectorized reader - with partition data column - select a single 
> complex field from a map entry and its parent map entry *** FAILED ***
> - Spark vectorized reader - with partition data column - partial schema 
> intersection - select missing subfield *** FAILED ***
> - Non-vectorized reader - with partition data column - partial schema 
> intersection - select missing subfield *** FAILED ***
> - Spark vectorized reader - with partition data column - no unnecessary 
> schema pruning *** FAILED ***
> - Non-vectorized reader - with partition data column - no unnecessary schema 
> pruning *** FAILED ***
> - Spark vectorized reader - with partition data column - empty schema 
> intersection *** FAILED ***
> - Non-vectorized reader - with partition data column - empty schema 
> intersection *** FAILED ***
> - Spark vectorized reader - with partition data column - select a single 
> complex field and in where clause *** FAILED ***
> - Non-vectorized reader - with partition data column - select a single 
> complex field and in where clause *** FAILED ***
> - Spark vectorized reader - with partition data column - select nullable 
> complex field and having is not null predicate *** FAILED ***
> - Non-vectorized reader - with partition data column - select nullable 
> complex field and having is not null predicate *** FAILED ***
> {noformat}
> These tests fail because Parquet returns empty results for non-existent 
> columns since PARQUET-1765.






[jira] [Created] (SPARK-33673) Do not push down partition filters to ParquetScan for DataSourceV2

2020-12-05 Thread Yuming Wang (Jira)
Yuming Wang created SPARK-33673:
---

 Summary: Do not push down partition filters to ParquetScan for 
DataSourceV2
 Key: SPARK-33673
 URL: https://issues.apache.org/jira/browse/SPARK-33673
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.2.0
Reporter: Yuming Wang



{noformat}
- Spark vectorized reader - with partition data column - select a single 
complex field array and its parent struct array *** FAILED ***
- Non-vectorized reader - with partition data column - select a single complex 
field array and its parent struct array *** FAILED ***
- Spark vectorized reader - with partition data column - select a single 
complex field from a map entry and its parent map entry *** FAILED ***
- Non-vectorized reader - with partition data column - select a single complex 
field from a map entry and its parent map entry *** FAILED ***
- Spark vectorized reader - with partition data column - partial schema 
intersection - select missing subfield *** FAILED ***
- Non-vectorized reader - with partition data column - partial schema 
intersection - select missing subfield *** FAILED ***
- Spark vectorized reader - with partition data column - no unnecessary schema 
pruning *** FAILED ***
- Non-vectorized reader - with partition data column - no unnecessary schema 
pruning *** FAILED ***
- Spark vectorized reader - with partition data column - empty schema 
intersection *** FAILED ***
- Non-vectorized reader - with partition data column - empty schema 
intersection *** FAILED ***
- Spark vectorized reader - with partition data column - select a single 
complex field and in where clause *** FAILED ***
- Non-vectorized reader - with partition data column - select a single complex 
field and in where clause *** FAILED ***
- Spark vectorized reader - with partition data column - select nullable 
complex field and having is not null predicate *** FAILED ***
- Non-vectorized reader - with partition data column - select nullable complex 
field and having is not null predicate *** FAILED ***
{noformat}
These tests fail because Parquet returns empty results for non-existent 
columns since PARQUET-1765.






[jira] [Updated] (SPARK-33499) Enable `-Wunused:imports` in Scala 2.13 SBT

2020-12-05 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen updated SPARK-33499:
-
Parent: (was: SPARK-25075)
Issue Type: Improvement  (was: Sub-task)

> Enable `-Wunused:imports` in Scala 2.13 SBT
> ---
>
> Key: SPARK-33499
> URL: https://issues.apache.org/jira/browse/SPARK-33499
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.1.0
> Environment: !image-2020-11-20-18-49-16-384.png!
> Scala 2.13 treats this as an {{unused import}}, but Scala 2.12 fails to 
> compile without it, so {{-Wunused:imports}} is commented out for Scala 2.13 
> and a TODO was left in this PR
>  
>Reporter: Yang Jie
>Priority: Minor
> Attachments: image-2020-11-20-18-50-21-567.png
>
>
> As the image shows, Scala 2.13 treats `scala.language.higherKinds` as an 
> {{unused import}}, but Scala 2.12 fails to compile without it, so 
> {{-Wunused:imports}} is commented out for Scala 2.13 and a TODO was left. 
> We should enable `-Wunused:imports` once Scala 2.12 is no longer supported.






[jira] [Resolved] (SPARK-33499) Enable `-Wunused:imports` in Scala 2.13 SBT

2020-12-05 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen resolved SPARK-33499.
--
Resolution: Not A Problem

OK, so we don't remove it. I don't see the issue.

> Enable `-Wunused:imports` in Scala 2.13 SBT
> ---
>
> Key: SPARK-33499
> URL: https://issues.apache.org/jira/browse/SPARK-33499
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.1.0
> Environment: !image-2020-11-20-18-49-16-384.png!
> Scala 2.13 treats this as an {{unused import}}, but Scala 2.12 fails to 
> compile without it, so {{-Wunused:imports}} is commented out for Scala 2.13 
> and a TODO was left in this PR
>  
>Reporter: Yang Jie
>Priority: Minor
> Attachments: image-2020-11-20-18-50-21-567.png
>
>
> As the image shows, Scala 2.13 treats `scala.language.higherKinds` as an 
> {{unused import}}, but Scala 2.12 fails to compile without it, so 
> {{-Wunused:imports}} is commented out for Scala 2.13 and a TODO was left. 
> We should enable `-Wunused:imports` once Scala 2.12 is no longer supported.






[jira] [Updated] (SPARK-33348) Use scala.jdk.CollectionConverters replace scala.collection.JavaConverters

2020-12-05 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen updated SPARK-33348:
-
Parent: (was: SPARK-25075)
Issue Type: Improvement  (was: Sub-task)

> Use scala.jdk.CollectionConverters replace scala.collection.JavaConverters
> --
>
> Key: SPARK-33348
> URL: https://issues.apache.org/jira/browse/SPARK-33348
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: Yang Jie
>Priority: Minor
>
> `scala.collection.JavaConverters` is deprecated in Scala 2.13 and produces 
> many compilation warnings; we should replace it with 
> `scala.jdk.CollectionConverters`. 
> However, `scala.jdk.CollectionConverters` is only available in Scala 2.13.
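A minimal sketch of what the replacement looks like once the codebase is on Scala 2.13 (hypothetical example, not Spark code); `scala.jdk.CollectionConverters` exposes the same `.asScala`/`.asJava` extension methods as the deprecated package:

```scala
// Scala 2.13+: scala.jdk.CollectionConverters replaces the deprecated
// scala.collection.JavaConverters with identical extension methods.
import scala.jdk.CollectionConverters._

object ConvertersExample {
  def main(args: Array[String]): Unit = {
    val javaList = java.util.Arrays.asList("a", "b", "c")
    // Convert a java.util.List to a Scala mutable.Buffer
    val scalaBuffer = javaList.asScala
    assert(scalaBuffer.mkString(",") == "a,b,c")
    // And convert back to a java.util.List
    val roundTripped = scalaBuffer.asJava
    assert(roundTripped.size() == 3)
  }
}
```

On Scala 2.12 the same call sites compile only with `scala.collection.JavaConverters`, which is why the migration has to wait until 2.12 support is dropped.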






[jira] [Updated] (SPARK-33345) Batch fix compilation warnings about "Widening conversion from XXX to XXX is deprecated"

2020-12-05 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen updated SPARK-33345:
-
Parent: (was: SPARK-25075)
Issue Type: Improvement  (was: Sub-task)

>  Batch fix compilation warnings about "Widening conversion from XXX to XXX is 
> deprecated"
> -
>
> Key: SPARK-33345
> URL: https://issues.apache.org/jira/browse/SPARK-33345
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: Yang Jie
>Priority: Minor
>
> There is a batch of compilation warnings in Scala 2.13 as follows:
> {code:java}
> [WARNING] [Warn] 
> /spark/core/src/main/scala/org/apache/spark/input/FixedLengthBinaryInputFormat.scala:77:
>  Widening conversion from Long to Double is deprecated because it loses 
> precision. Write `.toDouble` instead.
> {code}
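As a hedged illustration (hypothetical values, not the actual FixedLengthBinaryInputFormat code), the fix the warning suggests is simply to make the Long-to-Double conversion explicit:

```scala
object WideningExample {
  def main(args: Array[String]): Unit = {
    val bytes: Long = 1024L
    // Scala 2.13 deprecates the implicit Long -> Double widening because it
    // can lose precision for large Longs; write `.toDouble` explicitly.
    val mb: Double = bytes.toDouble / (1024.0 * 1024.0)
    assert(math.abs(mb - 0.0009765625) < 1e-12)
  }
}
```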






[jira] [Updated] (SPARK-33355) Batch fix compilation warnings about "method copyArrayToImmutableIndexedSeq in class LowPriorityImplicits2 is deprecated (since 2.13.0)"

2020-12-05 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen updated SPARK-33355:
-
Parent: (was: SPARK-25075)
Issue Type: Improvement  (was: Sub-task)

> Batch fix compilation warnings about "method copyArrayToImmutableIndexedSeq 
> in class LowPriorityImplicits2 is deprecated (since 2.13.0)"
> 
>
> Key: SPARK-33355
> URL: https://issues.apache.org/jira/browse/SPARK-33355
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: Yang Jie
>Priority: Minor
>
> Compilation warnings as follows:
> {code:java}
> [WARNING] [Warn] 
> /spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala:1711:
>  method copyArrayToImmutableIndexedSeq in class LowPriorityImplicits2 is 
> deprecated (since 2.13.0): Implicit conversions from Array to 
> immutable.IndexedSeq are implemented by copying; Use the more efficient 
> non-copying ArraySeq.unsafeWrapArray or an explicit toIndexedSeq call
> {code}
> This Jira is a placeholder for now; wait until Scala 2.12 is no longer supported.
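A minimal Scala 2.13 sketch (hypothetical values, not the actual Cast.scala code) of the non-copying replacement the warning recommends:

```scala
import scala.collection.immutable.ArraySeq

object WrapArrayExample {
  def main(args: Array[String]): Unit = {
    val arr = Array(1, 2, 3)
    // The implicit Array -> immutable.IndexedSeq conversion copies the array
    // (the deprecated path). ArraySeq.unsafeWrapArray wraps it without
    // copying; this is safe as long as the array is not mutated afterwards.
    val seq: IndexedSeq[Int] = ArraySeq.unsafeWrapArray(arr)
    assert(seq == Seq(1, 2, 3))
  }
}
```

`ArraySeq.unsafeWrapArray` only exists in the 2.13 collections library, which is why this fix also waits for the end of Scala 2.12 support.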






[jira] [Commented] (SPARK-33348) Use scala.jdk.CollectionConverters replace scala.collection.JavaConverters

2020-12-05 Thread Sean R. Owen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244607#comment-17244607
 ] 

Sean R. Owen commented on SPARK-33348:
--

Yes, this isn't required for 2.13, and can't be done for 2.12, right?

> Use scala.jdk.CollectionConverters replace scala.collection.JavaConverters
> --
>
> Key: SPARK-33348
> URL: https://issues.apache.org/jira/browse/SPARK-33348
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: Yang Jie
>Priority: Minor
>
> `scala.collection.JavaConverters` is deprecated in Scala 2.13 and produces 
> many compilation warnings; we should replace it with 
> `scala.jdk.CollectionConverters`. 
> However, `scala.jdk.CollectionConverters` is only available in Scala 2.13.






[jira] [Updated] (SPARK-33344) Fix Compilation warnings of "multiarg infix syntax looks like a tuple and will be deprecated" in Scala 2.13

2020-12-05 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen updated SPARK-33344:
-
Summary: Fix Compilation warnings of "multiarg infix syntax looks like a 
tuple and will be deprecated" in Scala 2.13  (was: Fix Compilation warings of 
"multiarg infix syntax looks like a tuple and will be deprecated" in Scala 2.13)

Because this isn't required for 2.13, I moved it out of the umbrella

> Fix Compilation warnings of "multiarg infix syntax looks like a tuple and 
> will be deprecated" in Scala 2.13
> ---
>
> Key: SPARK-33344
> URL: https://issues.apache.org/jira/browse/SPARK-33344
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: Yang Jie
>Priority: Minor
>
> There is a batch of compilation warnings in Scala 2.13 as follows:
> {code:java}
> [WARNING] [Warn] 
> /spark/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala:656: 
> multiarg infix syntax looks like a tuple and will be deprecated
> {code}
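A hedged illustration of the warning (hypothetical example, not the actual SparkSubmit.scala line): calling a multi-argument method with infix syntax looks like passing a single tuple argument, and the fix is ordinary dot notation:

```scala
import scala.collection.mutable

object InfixExample {
  def main(args: Array[String]): Unit = {
    val m = mutable.Map.empty[String, Int]
    // Warns in Scala 2.13: "multiarg infix syntax looks like a tuple"
    //   m update ("answer", 42)
    // Preferred: a plain dot-and-parentheses call
    m.update("answer", 42)
    assert(m("answer") == 42)
  }
}
```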






[jira] [Updated] (SPARK-33344) Fix Compilation warings of "multiarg infix syntax looks like a tuple and will be deprecated" in Scala 2.13

2020-12-05 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen updated SPARK-33344:
-
Parent: (was: SPARK-25075)
Issue Type: Improvement  (was: Sub-task)

> Fix Compilation warings of "multiarg infix syntax looks like a tuple and will 
> be deprecated" in Scala 2.13
> --
>
> Key: SPARK-33344
> URL: https://issues.apache.org/jira/browse/SPARK-33344
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: Yang Jie
>Priority: Minor
>
> There is a batch of compilation warnings in Scala 2.13 as follows:
> {code:java}
> [WARNING] [Warn] 
> /spark/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala:656: 
> multiarg infix syntax looks like a tuple and will be deprecated
> {code}






[jira] [Commented] (SPARK-33285) Too many "Auto-application to `()` is deprecated." related compilation warnings

2020-12-05 Thread Sean R. Owen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244605#comment-17244605
 ] 

Sean R. Owen commented on SPARK-33285:
--

I moved this out from the 2.13 umbrella as it is not required for 2.13

> Too many "Auto-application to `()` is deprecated."  related compilation 
> warnings
> 
>
> Key: SPARK-33285
> URL: https://issues.apache.org/jira/browse/SPARK-33285
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: Yang Jie
>Priority: Minor
>
> There are too many "Auto-application to `()` is deprecated." related 
> compilation warnings when compiling with Scala 2.13, such as:
> {code:java}
> [WARNING] [Warn] 
> /spark-src/core/src/test/scala/org/apache/spark/PartitioningSuite.scala:246: 
> Auto-application to `()` is deprecated. Supply the empty argument list `()` 
> explicitly to invoke method stdev,
> or remove the empty argument list from its definition (Java-defined methods 
> are exempt).
> In Scala 3, an unapplied method like this will be eta-expanded into a 
> function.
> {code}
> There are a lot of them, but they are easy to fix.
> Given a definition as follows:
> {code:java}
> class Foo {
>   def bar(): Unit = {}
> }
> val foo = new Foo
> {code}
> the call should be
> {code:java}
> foo.bar()
> {code}
> not
> {code:java}
> foo.bar
> {code}






[jira] [Updated] (SPARK-33285) Too many "Auto-application to `()` is deprecated." related compilation warnings

2020-12-05 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen updated SPARK-33285:
-
Parent: (was: SPARK-25075)
Issue Type: Improvement  (was: Sub-task)

> Too many "Auto-application to `()` is deprecated."  related compilation 
> warnings
> 
>
> Key: SPARK-33285
> URL: https://issues.apache.org/jira/browse/SPARK-33285
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.1.0
>Reporter: Yang Jie
>Priority: Minor
>
> There are too many "Auto-application to `()` is deprecated." related 
> compilation warnings when compiling with Scala 2.13, such as:
> {code:java}
> [WARNING] [Warn] 
> /spark-src/core/src/test/scala/org/apache/spark/PartitioningSuite.scala:246: 
> Auto-application to `()` is deprecated. Supply the empty argument list `()` 
> explicitly to invoke method stdev,
> or remove the empty argument list from its definition (Java-defined methods 
> are exempt).
> In Scala 3, an unapplied method like this will be eta-expanded into a 
> function.
> {code}
> There are a lot of them, but they are easy to fix.
> Given a definition as follows:
> {code:java}
> class Foo {
>   def bar(): Unit = {}
> }
> val foo = new Foo
> {code}
> the call should be
> {code:java}
> foo.bar()
> {code}
> not
> {code:java}
> foo.bar
> {code}






[jira] [Commented] (SPARK-33044) Add a Jenkins build and test job for Scala 2.13

2020-12-05 Thread Sean R. Owen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244604#comment-17244604
 ] 

Sean R. Owen commented on SPARK-33044:
--

[~dongjoon] This is done now right?

> Add a Jenkins build and test job for Scala 2.13
> ---
>
> Key: SPARK-33044
> URL: https://issues.apache.org/jira/browse/SPARK-33044
> Project: Spark
>  Issue Type: Sub-task
>  Components: jenkins
>Affects Versions: 3.1.0
>Reporter: Yang Jie
>Priority: Major
>
> The {{master}} branch seems almost ready for Scala 2.13 now; we need a 
> Jenkins test job to verify the current work and CI.






[jira] [Resolved] (SPARK-32348) Get tests working for Scala 2.13 build

2020-12-05 Thread Sean R. Owen (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-32348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean R. Owen resolved SPARK-32348.
--
Fix Version/s: 3.1.0
   Resolution: Fixed

This should be largely done now, or at least, superseded by a few much more 
specific issues

> Get tests working for Scala 2.13 build
> --
>
> Key: SPARK-32348
> URL: https://issues.apache.org/jira/browse/SPARK-32348
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML, Spark Core, SQL, Structured Streaming
>Affects Versions: 3.0.0
>Reporter: Sean R. Owen
>Assignee: Sean R. Owen
>Priority: Major
> Fix For: 3.1.0
>
>
> This is a placeholder for the general task of getting the tests to pass in 
> the Scala 2.13 build, after it compiles.






[jira] [Commented] (SPARK-33672) Check SQLContext.tables() for V2 session catalog

2020-12-05 Thread Maxim Gekk (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244542#comment-17244542
 ] 

Maxim Gekk commented on SPARK-33672:


[~cloud_fan] FYI

> Check SQLContext.tables() for V2 session catalog
> 
>
> Key: SPARK-33672
> URL: https://issues.apache.org/jira/browse/SPARK-33672
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Maxim Gekk
>Priority: Major
>
> V1 ShowTablesCommand is hard coded in SQLContext:
> https://github.com/apache/spark/blob/a088a801ed8c17171545c196a3f26ce415de0cd1/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala#L671
> The ticket aims to check the tables() behavior for the V2 session catalog.






[jira] [Created] (SPARK-33672) Check SQLContext.tables() for V2 session catalog

2020-12-05 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-33672:
--

 Summary: Check SQLContext.tables() for V2 session catalog
 Key: SPARK-33672
 URL: https://issues.apache.org/jira/browse/SPARK-33672
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.1.0
Reporter: Maxim Gekk


V1 ShowTablesCommand is hard coded in SQLContext:
https://github.com/apache/spark/blob/a088a801ed8c17171545c196a3f26ce415de0cd1/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala#L671
The ticket aims to check the tables() behavior for the V2 session catalog.






[jira] [Commented] (SPARK-33671) Remove VIEW checks from V1 table commands

2020-12-05 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244538#comment-17244538
 ] 

Apache Spark commented on SPARK-33671:
--

User 'MaxGekk' has created a pull request for this issue:
https://github.com/apache/spark/pull/30620

> Remove VIEW checks from V1 table commands
> -
>
> Key: SPARK-33671
> URL: https://issues.apache.org/jira/browse/SPARK-33671
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.2, 3.1.0
>Reporter: Maxim Gekk
>Priority: Major
>
> Checking of VIEWs is performed earlier, see 
> https://github.com/apache/spark/pull/30461 . So, the checks can be removed 
> from some V1 commands.






[jira] [Commented] (SPARK-33671) Remove VIEW checks from V1 table commands

2020-12-05 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244536#comment-17244536
 ] 

Apache Spark commented on SPARK-33671:
--

User 'MaxGekk' has created a pull request for this issue:
https://github.com/apache/spark/pull/30620

> Remove VIEW checks from V1 table commands
> -
>
> Key: SPARK-33671
> URL: https://issues.apache.org/jira/browse/SPARK-33671
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.2, 3.1.0
>Reporter: Maxim Gekk
>Priority: Major
>
> Checking of VIEWs is performed earlier, see 
> https://github.com/apache/spark/pull/30461 . So, the checks can be removed 
> from some V1 commands.






[jira] [Assigned] (SPARK-33671) Remove VIEW checks from V1 table commands

2020-12-05 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33671:


Assignee: (was: Apache Spark)

> Remove VIEW checks from V1 table commands
> -
>
> Key: SPARK-33671
> URL: https://issues.apache.org/jira/browse/SPARK-33671
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.2, 3.1.0
>Reporter: Maxim Gekk
>Priority: Major
>
> Checking of VIEWs is performed earlier, see 
> https://github.com/apache/spark/pull/30461 . So, the checks can be removed 
> from some V1 commands.






[jira] [Assigned] (SPARK-33671) Remove VIEW checks from V1 table commands

2020-12-05 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33671:


Assignee: Apache Spark

> Remove VIEW checks from V1 table commands
> -
>
> Key: SPARK-33671
> URL: https://issues.apache.org/jira/browse/SPARK-33671
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.2, 3.1.0
>Reporter: Maxim Gekk
>Assignee: Apache Spark
>Priority: Major
>
> Checking of VIEWs is performed earlier, see 
> https://github.com/apache/spark/pull/30461 . So, the checks can be removed 
> from some V1 commands.






[jira] [Created] (SPARK-33671) Remove VIEW checks from V1 table commands

2020-12-05 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-33671:
--

 Summary: Remove VIEW checks from V1 table commands
 Key: SPARK-33671
 URL: https://issues.apache.org/jira/browse/SPARK-33671
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.2, 3.1.0
Reporter: Maxim Gekk


Checking of VIEWs is performed earlier, see 
https://github.com/apache/spark/pull/30461 . So, the checks can be removed from 
some V1 commands.






[jira] [Commented] (SPARK-31711) Register the executor source with the metrics system when running in local mode.

2020-12-05 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244497#comment-17244497
 ] 

Apache Spark commented on SPARK-31711:
--

User 'LucaCanali' has created a pull request for this issue:
https://github.com/apache/spark/pull/30619

> Register the executor source with the metrics system when running in local 
> mode.
> 
>
> Key: SPARK-31711
> URL: https://issues.apache.org/jira/browse/SPARK-31711
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Luca Canali
>Assignee: Luca Canali
>Priority: Minor
> Fix For: 3.1.0
>
>
> The Apache Spark metrics system provides many useful insights into the Spark 
> workload. In particular, the executor source metrics 
> (https://github.com/apache/spark/blob/master/docs/monitoring.md#component-instance--executor)
>  provide detailed information, including the number of active tasks, some I/O 
> metrics, and task metrics details. Unlike other sources (for example, the 
> ExecutorMetrics source), executor source metrics are not yet available when 
> running in local mode.
> This JIRA proposes to register the executor source with the Spark metrics 
> system when running in local mode, as this can be very useful when testing 
> and troubleshooting Spark workloads.
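
As context for how these metrics would surface once registered: the metrics system routes every registered source to the configured sinks. A minimal, illustrative `metrics.properties` sketch (the `ConsoleSink` class and property names are as documented for Spark's metrics system; the period is an arbitrary choice) that would print executor source metrics to stdout in local mode:

```
# Illustrative metrics.properties sketch: route all registered sources,
# including the executor source once it is registered in local mode,
# to a console sink that reports every 10 seconds.
*.sink.console.class=org.apache.spark.metrics.sink.ConsoleSink
*.sink.console.period=10
*.sink.console.unit=seconds
```

This file would typically be referenced via `--conf spark.metrics.conf=metrics.properties` when launching with `--master local[*]`.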



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-31711) Register the executor source with the metrics system when running in local mode.

2020-12-05 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244496#comment-17244496
 ] 

Apache Spark commented on SPARK-31711:
--

User 'LucaCanali' has created a pull request for this issue:
https://github.com/apache/spark/pull/30619

> Register the executor source with the metrics system when running in local 
> mode.
> 
>
> Key: SPARK-31711
> URL: https://issues.apache.org/jira/browse/SPARK-31711
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Luca Canali
>Assignee: Luca Canali
>Priority: Minor
> Fix For: 3.1.0
>
>
> The Apache Spark metrics system provides many useful insights into the Spark 
> workload. In particular, the executor source metrics 
> (https://github.com/apache/spark/blob/master/docs/monitoring.md#component-instance--executor)
>  provide detailed information, including the number of active tasks, some I/O 
> metrics, and task metrics details. Unlike other sources (for example, the 
> ExecutorMetrics source), executor source metrics are not yet available when 
> running in local mode.
> This JIRA proposes to register the executor source with the Spark metrics 
> system when running in local mode, as this can be very useful when testing 
> and troubleshooting Spark workloads.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-33670) Verify the partition provider is Hive in v1 SHOW TABLE EXTENDED

2020-12-05 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244486#comment-17244486
 ] 

Apache Spark commented on SPARK-33670:
--

User 'MaxGekk' has created a pull request for this issue:
https://github.com/apache/spark/pull/30618

> Verify the partition provider is Hive in v1 SHOW TABLE EXTENDED
> ---
>
> Key: SPARK-33670
> URL: https://issues.apache.org/jira/browse/SPARK-33670
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.2, 3.1.0
>Reporter: Maxim Gekk
>Priority: Major
>
> Invoke the check verifyPartitionProviderIsHive() from the v1 implementation of 
> SHOW TABLE EXTENDED.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-33670) Verify the partition provider is Hive in v1 SHOW TABLE EXTENDED

2020-12-05 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33670:


Assignee: (was: Apache Spark)

> Verify the partition provider is Hive in v1 SHOW TABLE EXTENDED
> ---
>
> Key: SPARK-33670
> URL: https://issues.apache.org/jira/browse/SPARK-33670
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.2, 3.1.0
>Reporter: Maxim Gekk
>Priority: Major
>
> Invoke the check verifyPartitionProviderIsHive() from the v1 implementation of 
> SHOW TABLE EXTENDED.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-33670) Verify the partition provider is Hive in v1 SHOW TABLE EXTENDED

2020-12-05 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33670:


Assignee: Apache Spark

> Verify the partition provider is Hive in v1 SHOW TABLE EXTENDED
> ---
>
> Key: SPARK-33670
> URL: https://issues.apache.org/jira/browse/SPARK-33670
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.0.2, 3.1.0
>Reporter: Maxim Gekk
>Assignee: Apache Spark
>Priority: Major
>
> Invoke the check verifyPartitionProviderIsHive() from the v1 implementation of 
> SHOW TABLE EXTENDED.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-33670) Verify the partition provider is Hive in v1 SHOW TABLE EXTENDED

2020-12-05 Thread Maxim Gekk (Jira)
Maxim Gekk created SPARK-33670:
--

 Summary: Verify the partition provider is Hive in v1 SHOW TABLE 
EXTENDED
 Key: SPARK-33670
 URL: https://issues.apache.org/jira/browse/SPARK-33670
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.0.2, 3.1.0
Reporter: Maxim Gekk


Invoke the check verifyPartitionProviderIsHive() from the v1 implementation of
SHOW TABLE EXTENDED.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-33669) Wrong error message from YARN application state monitor when sc.stop in yarn client mode

2020-12-05 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244456#comment-17244456
 ] 

Apache Spark commented on SPARK-33669:
--

User 'sqlwindspeaker' has created a pull request for this issue:
https://github.com/apache/spark/pull/30617

> Wrong error message from YARN application state monitor when sc.stop in yarn 
> client mode
> 
>
> Key: SPARK-33669
> URL: https://issues.apache.org/jira/browse/SPARK-33669
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.4.3, 3.0.1
>Reporter: Su Qilong
>Priority: Minor
>
> In YARN client mode, stopping YarnClientSchedulerBackend first tries to 
> interrupt the YARN application monitor thread. MonitorThread.run() catches 
> InterruptedException so it can respond gracefully to the stop request.
> However, client.monitorApplication can also throw InterruptedIOException while 
> a Hadoop RPC call is in flight. In that case MonitorThread does not realize it 
> was interrupted: it reports the YARN application as failed, and "Failed to 
> contact YARN for application x; YARN application has exited unexpectedly 
> with state x" is logged at error level, which is very confusing for users.
> We should also handle InterruptedIOException here, giving it the same 
> behavior as InterruptedException.
> {code:java}
> private class MonitorThread extends Thread {
>   private var allowInterrupt = true
>   override def run() {
> try {
>   val YarnAppReport(_, state, diags) =
> client.monitorApplication(appId.get, logApplicationReport = false)
>   logError(s"YARN application has exited unexpectedly with state $state! 
> " +
> "Check the YARN application logs for more details.")
>   diags.foreach { err =>
> logError(s"Diagnostics message: $err")
>   }
>   allowInterrupt = false
>   sc.stop()
> } catch {
>   case e: InterruptedException => logInfo("Interrupting monitor thread")
> }
>   }
>   
> {code}
> {code:java}
> // wrong error message
> 2020-12-05 03:06:58,000 ERROR [YARN application state monitor]: 
> org.apache.spark.deploy.yarn.Client(91) - Failed to contact YARN for 
> application application_1605868815011_1154961. 
> java.io.InterruptedIOException: Call interrupted
> at org.apache.hadoop.ipc.Client.call(Client.java:1466)
> at org.apache.hadoop.ipc.Client.call(Client.java:1409)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> at com.sun.proxy.$Proxy38.getApplicationReport(Unknown Source)
> at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:187)
> at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy39.getApplicationReport(Unknown Source)
> at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:408)
> at 
> org.apache.spark.deploy.yarn.Client.getApplicationReport(Client.scala:327)
> at 
> org.apache.spark.deploy.yarn.Client.monitorApplication(Client.scala:1039)
> at 
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:116)
> 2020-12-05 03:06:58,000 ERROR [YARN application state monitor]: 
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend(70) - YARN 
> application has exited unexpectedly with state FAILED! Check the YARN 
> application logs for more details. 
> 2020-12-05 03:06:58,001 ERROR [YARN application state monitor]: 
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend(70) - 
> Diagnostics message: Failed to contact YARN for application 
> application_1605868815011_1154961.
> {code}
>  
> {code:java}
> // hadoop ipc code
> public Writable call(RPC.RpcKind rpcKind, Writable rpcRequest,
> ConnectionId remoteId, int serviceClass,
> AtomicBoolean fallbackToSimpleAuth) throws IOException {
>   final Call call = createCall(rpcKind, rpcRequest);
>   Connection connection = getConnection(remoteId, call, serviceClass,
> fallbackToSimpleAuth);
>   try {
> connection.sendRpcRequest(call); // send the rpc request
>   } catch (RejectedExecutionException e) {
> throw new IOException("connection has been closed", e);
>   } catch (InterruptedException e) {
> Thread.currentThread().interrupt();
> LOG.warn("interrupted waiting to send rpc request to server", e);
> throw new IOException(e);
>   }
> {code}

[jira] [Commented] (SPARK-33669) Wrong error message from YARN application state monitor when sc.stop in yarn client mode

2020-12-05 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244455#comment-17244455
 ] 

Apache Spark commented on SPARK-33669:
--

User 'sqlwindspeaker' has created a pull request for this issue:
https://github.com/apache/spark/pull/30617

> Wrong error message from YARN application state monitor when sc.stop in yarn 
> client mode
> 
>
> Key: SPARK-33669
> URL: https://issues.apache.org/jira/browse/SPARK-33669
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.4.3, 3.0.1
>Reporter: Su Qilong
>Priority: Minor
>
> In YARN client mode, stopping YarnClientSchedulerBackend first tries to 
> interrupt the YARN application monitor thread. MonitorThread.run() catches 
> InterruptedException so it can respond gracefully to the stop request.
> However, client.monitorApplication can also throw InterruptedIOException while 
> a Hadoop RPC call is in flight. In that case MonitorThread does not realize it 
> was interrupted: it reports the YARN application as failed, and "Failed to 
> contact YARN for application x; YARN application has exited unexpectedly 
> with state x" is logged at error level, which is very confusing for users.
> We should also handle InterruptedIOException here, giving it the same 
> behavior as InterruptedException.
> {code:java}
> private class MonitorThread extends Thread {
>   private var allowInterrupt = true
>   override def run() {
> try {
>   val YarnAppReport(_, state, diags) =
> client.monitorApplication(appId.get, logApplicationReport = false)
>   logError(s"YARN application has exited unexpectedly with state $state! 
> " +
> "Check the YARN application logs for more details.")
>   diags.foreach { err =>
> logError(s"Diagnostics message: $err")
>   }
>   allowInterrupt = false
>   sc.stop()
> } catch {
>   case e: InterruptedException => logInfo("Interrupting monitor thread")
> }
>   }
>   
> {code}
> {code:java}
> // wrong error message
> 2020-12-05 03:06:58,000 ERROR [YARN application state monitor]: 
> org.apache.spark.deploy.yarn.Client(91) - Failed to contact YARN for 
> application application_1605868815011_1154961. 
> java.io.InterruptedIOException: Call interrupted
> at org.apache.hadoop.ipc.Client.call(Client.java:1466)
> at org.apache.hadoop.ipc.Client.call(Client.java:1409)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> at com.sun.proxy.$Proxy38.getApplicationReport(Unknown Source)
> at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:187)
> at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy39.getApplicationReport(Unknown Source)
> at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:408)
> at 
> org.apache.spark.deploy.yarn.Client.getApplicationReport(Client.scala:327)
> at 
> org.apache.spark.deploy.yarn.Client.monitorApplication(Client.scala:1039)
> at 
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:116)
> 2020-12-05 03:06:58,000 ERROR [YARN application state monitor]: 
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend(70) - YARN 
> application has exited unexpectedly with state FAILED! Check the YARN 
> application logs for more details. 
> 2020-12-05 03:06:58,001 ERROR [YARN application state monitor]: 
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend(70) - 
> Diagnostics message: Failed to contact YARN for application 
> application_1605868815011_1154961.
> {code}
>  
> {code:java}
> // hadoop ipc code
> public Writable call(RPC.RpcKind rpcKind, Writable rpcRequest,
> ConnectionId remoteId, int serviceClass,
> AtomicBoolean fallbackToSimpleAuth) throws IOException {
>   final Call call = createCall(rpcKind, rpcRequest);
>   Connection connection = getConnection(remoteId, call, serviceClass,
> fallbackToSimpleAuth);
>   try {
> connection.sendRpcRequest(call); // send the rpc request
>   } catch (RejectedExecutionException e) {
> throw new IOException("connection has been closed", e);
>   } catch (InterruptedException e) {
> Thread.currentThread().interrupt();
> LOG.warn("interrupted waiting to send rpc request to server", e);
> throw new IOException(e);
>   }
> {code}

[jira] [Assigned] (SPARK-33669) Wrong error message from YARN application state monitor when sc.stop in yarn client mode

2020-12-05 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33669:


Assignee: (was: Apache Spark)

> Wrong error message from YARN application state monitor when sc.stop in yarn 
> client mode
> 
>
> Key: SPARK-33669
> URL: https://issues.apache.org/jira/browse/SPARK-33669
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.4.3, 3.0.1
>Reporter: Su Qilong
>Priority: Minor
>
> In YARN client mode, stopping YarnClientSchedulerBackend first tries to 
> interrupt the YARN application monitor thread. MonitorThread.run() catches 
> InterruptedException so it can respond gracefully to the stop request.
> However, client.monitorApplication can also throw InterruptedIOException while 
> a Hadoop RPC call is in flight. In that case MonitorThread does not realize it 
> was interrupted: it reports the YARN application as failed, and "Failed to 
> contact YARN for application x; YARN application has exited unexpectedly 
> with state x" is logged at error level, which is very confusing for users.
> We should also handle InterruptedIOException here, giving it the same 
> behavior as InterruptedException.
> {code:java}
> private class MonitorThread extends Thread {
>   private var allowInterrupt = true
>   override def run() {
> try {
>   val YarnAppReport(_, state, diags) =
> client.monitorApplication(appId.get, logApplicationReport = false)
>   logError(s"YARN application has exited unexpectedly with state $state! 
> " +
> "Check the YARN application logs for more details.")
>   diags.foreach { err =>
> logError(s"Diagnostics message: $err")
>   }
>   allowInterrupt = false
>   sc.stop()
> } catch {
>   case e: InterruptedException => logInfo("Interrupting monitor thread")
> }
>   }
>   
> {code}
> {code:java}
> // wrong error message
> 2020-12-05 03:06:58,000 ERROR [YARN application state monitor]: 
> org.apache.spark.deploy.yarn.Client(91) - Failed to contact YARN for 
> application application_1605868815011_1154961. 
> java.io.InterruptedIOException: Call interrupted
> at org.apache.hadoop.ipc.Client.call(Client.java:1466)
> at org.apache.hadoop.ipc.Client.call(Client.java:1409)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> at com.sun.proxy.$Proxy38.getApplicationReport(Unknown Source)
> at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:187)
> at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy39.getApplicationReport(Unknown Source)
> at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:408)
> at 
> org.apache.spark.deploy.yarn.Client.getApplicationReport(Client.scala:327)
> at 
> org.apache.spark.deploy.yarn.Client.monitorApplication(Client.scala:1039)
> at 
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:116)
> 2020-12-05 03:06:58,000 ERROR [YARN application state monitor]: 
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend(70) - YARN 
> application has exited unexpectedly with state FAILED! Check the YARN 
> application logs for more details. 
> 2020-12-05 03:06:58,001 ERROR [YARN application state monitor]: 
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend(70) - 
> Diagnostics message: Failed to contact YARN for application 
> application_1605868815011_1154961.
> {code}
>  
> {code:java}
> // hadoop ipc code
> public Writable call(RPC.RpcKind rpcKind, Writable rpcRequest,
> ConnectionId remoteId, int serviceClass,
> AtomicBoolean fallbackToSimpleAuth) throws IOException {
>   final Call call = createCall(rpcKind, rpcRequest);
>   Connection connection = getConnection(remoteId, call, serviceClass,
> fallbackToSimpleAuth);
>   try {
> connection.sendRpcRequest(call); // send the rpc request
>   } catch (RejectedExecutionException e) {
> throw new IOException("connection has been closed", e);
>   } catch (InterruptedException e) {
> Thread.currentThread().interrupt();
> LOG.warn("interrupted waiting to send rpc request to server", e);
> throw new IOException(e);
>   }
> {code}

[jira] [Assigned] (SPARK-33669) Wrong error message from YARN application state monitor when sc.stop in yarn client mode

2020-12-05 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33669:


Assignee: Apache Spark

> Wrong error message from YARN application state monitor when sc.stop in yarn 
> client mode
> 
>
> Key: SPARK-33669
> URL: https://issues.apache.org/jira/browse/SPARK-33669
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.4.3, 3.0.1
>Reporter: Su Qilong
>Assignee: Apache Spark
>Priority: Minor
>
> In YARN client mode, stopping YarnClientSchedulerBackend first tries to 
> interrupt the YARN application monitor thread. MonitorThread.run() catches 
> InterruptedException so it can respond gracefully to the stop request.
> However, client.monitorApplication can also throw InterruptedIOException while 
> a Hadoop RPC call is in flight. In that case MonitorThread does not realize it 
> was interrupted: it reports the YARN application as failed, and "Failed to 
> contact YARN for application x; YARN application has exited unexpectedly 
> with state x" is logged at error level, which is very confusing for users.
> We should also handle InterruptedIOException here, giving it the same 
> behavior as InterruptedException.
> {code:java}
> private class MonitorThread extends Thread {
>   private var allowInterrupt = true
>   override def run() {
> try {
>   val YarnAppReport(_, state, diags) =
> client.monitorApplication(appId.get, logApplicationReport = false)
>   logError(s"YARN application has exited unexpectedly with state $state! 
> " +
> "Check the YARN application logs for more details.")
>   diags.foreach { err =>
> logError(s"Diagnostics message: $err")
>   }
>   allowInterrupt = false
>   sc.stop()
> } catch {
>   case e: InterruptedException => logInfo("Interrupting monitor thread")
> }
>   }
>   
> {code}
> {code:java}
> // wrong error message
> 2020-12-05 03:06:58,000 ERROR [YARN application state monitor]: 
> org.apache.spark.deploy.yarn.Client(91) - Failed to contact YARN for 
> application application_1605868815011_1154961. 
> java.io.InterruptedIOException: Call interrupted
> at org.apache.hadoop.ipc.Client.call(Client.java:1466)
> at org.apache.hadoop.ipc.Client.call(Client.java:1409)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> at com.sun.proxy.$Proxy38.getApplicationReport(Unknown Source)
> at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:187)
> at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy39.getApplicationReport(Unknown Source)
> at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:408)
> at 
> org.apache.spark.deploy.yarn.Client.getApplicationReport(Client.scala:327)
> at 
> org.apache.spark.deploy.yarn.Client.monitorApplication(Client.scala:1039)
> at 
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:116)
> 2020-12-05 03:06:58,000 ERROR [YARN application state monitor]: 
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend(70) - YARN 
> application has exited unexpectedly with state FAILED! Check the YARN 
> application logs for more details. 
> 2020-12-05 03:06:58,001 ERROR [YARN application state monitor]: 
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend(70) - 
> Diagnostics message: Failed to contact YARN for application 
> application_1605868815011_1154961.
> {code}
>  
> {code:java}
> // hadoop ipc code
> public Writable call(RPC.RpcKind rpcKind, Writable rpcRequest,
> ConnectionId remoteId, int serviceClass,
> AtomicBoolean fallbackToSimpleAuth) throws IOException {
>   final Call call = createCall(rpcKind, rpcRequest);
>   Connection connection = getConnection(remoteId, call, serviceClass,
> fallbackToSimpleAuth);
>   try {
> connection.sendRpcRequest(call); // send the rpc request
>   } catch (RejectedExecutionException e) {
> throw new IOException("connection has been closed", e);
>   } catch (InterruptedException e) {
> Thread.currentThread().interrupt();
> LOG.warn("interrupted waiting to send rpc request to server", e);
> throw new IOException(e);
>   }
> {code}

[jira] [Updated] (SPARK-33669) Wrong error message from YARN application state monitor when sc.stop in yarn client mode

2020-12-05 Thread Su Qilong (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Su Qilong updated SPARK-33669:
--
Description: 
In YARN client mode, stopping YarnClientSchedulerBackend first tries to
interrupt the YARN application monitor thread. MonitorThread.run() catches
InterruptedException so it can respond gracefully to the stop request.

However, client.monitorApplication can also throw InterruptedIOException while a
Hadoop RPC call is in flight. In that case MonitorThread does not realize it was
interrupted: it reports the YARN application as failed, and "Failed to contact
YARN for application x; YARN application has exited unexpectedly with state x"
is logged at error level, which is very confusing for users.

We should also handle InterruptedIOException here, giving it the same behavior
as InterruptedException.
{code:java}
private class MonitorThread extends Thread {
  private var allowInterrupt = true

  override def run() {
try {
  val YarnAppReport(_, state, diags) =
client.monitorApplication(appId.get, logApplicationReport = false)
  logError(s"YARN application has exited unexpectedly with state $state! " +
"Check the YARN application logs for more details.")
  diags.foreach { err =>
logError(s"Diagnostics message: $err")
  }
  allowInterrupt = false
  sc.stop()
} catch {
  case e: InterruptedException => logInfo("Interrupting monitor thread")
}
  }

  
{code}
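The mismatch is easy to see outside of Spark: InterruptedIOException extends IOException, not InterruptedException, so a handler that only catches InterruptedException never matches it. A minimal Java sketch (hypothetical class and method names, not Spark code) contrasting the original catch clause with the proposed fix:

```java
import java.io.InterruptedIOException;

public class CatchDemo {
    // Original monitor-thread style handler: only InterruptedException
    // is treated as a graceful shutdown signal.
    static String originalHandler(Exception e) {
        try {
            throw e;
        } catch (InterruptedException ie) {
            return "graceful-shutdown";
        } catch (Exception other) {
            return "reported-as-app-failure"; // what users saw
        }
    }

    // Proposed handler: InterruptedIOException (an interrupted RPC call)
    // is also treated as a graceful shutdown signal.
    static String fixedHandler(Exception e) {
        try {
            throw e;
        } catch (InterruptedException ie) {
            return "graceful-shutdown";
        } catch (InterruptedIOException iio) {
            return "graceful-shutdown";
        } catch (Exception other) {
            return "reported-as-app-failure";
        }
    }

    public static void main(String[] args) {
        // InterruptedIOException falls through the original handler's
        // InterruptedException clause and is misreported as a failure.
        System.out.println(originalHandler(new InterruptedIOException("Call interrupted")));
        System.out.println(fixedHandler(new InterruptedIOException("Call interrupted")));
    }
}
```

With the original handler the interrupted RPC is classified as an application failure; the fixed handler treats it the same as a plain interrupt.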
{code:java}
// wrong error message
2020-12-05 03:06:58,000 ERROR [YARN application state monitor]: 
org.apache.spark.deploy.yarn.Client(91) - Failed to contact YARN for 
application application_1605868815011_1154961. 
java.io.InterruptedIOException: Call interrupted
at org.apache.hadoop.ipc.Client.call(Client.java:1466)
at org.apache.hadoop.ipc.Client.call(Client.java:1409)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy38.getApplicationReport(Unknown Source)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:187)
at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy39.getApplicationReport(Unknown Source)
at 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:408)
at 
org.apache.spark.deploy.yarn.Client.getApplicationReport(Client.scala:327)
at 
org.apache.spark.deploy.yarn.Client.monitorApplication(Client.scala:1039)
at 
org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:116)
2020-12-05 03:06:58,000 ERROR [YARN application state monitor]: 
org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend(70) - YARN 
application has exited unexpectedly with state FAILED! Check the YARN 
application logs for more details. 

2020-12-05 03:06:58,001 ERROR [YARN application state monitor]: 
org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend(70) - Diagnostics 
message: Failed to contact YARN for application 
application_1605868815011_1154961.

{code}
 
{code:java}
// hadoop ipc code
public Writable call(RPC.RpcKind rpcKind, Writable rpcRequest,
ConnectionId remoteId, int serviceClass,
AtomicBoolean fallbackToSimpleAuth) throws IOException {
  final Call call = createCall(rpcKind, rpcRequest);
  Connection connection = getConnection(remoteId, call, serviceClass,
fallbackToSimpleAuth);
  try {
connection.sendRpcRequest(call); // send the rpc request
  } catch (RejectedExecutionException e) {
throw new IOException("connection has been closed", e);
  } catch (InterruptedException e) {
Thread.currentThread().interrupt();
LOG.warn("interrupted waiting to send rpc request to server", e);
throw new IOException(e);
  }

  synchronized (call) {
while (!call.done) {
  try {
call.wait();   // wait for the result
  } catch (InterruptedException ie) {
Thread.currentThread().interrupt();
throw new InterruptedIOException("Call interrupted");
  }
}
{code}
 

  was:
For YarnClient mode, when stopping YarnClientSchedulerBackend, it first tries 
to interrupt Yarn application monitor thread. In MonitorThread.run() it catches 
InterruptedException to gracefully response to stopping request.

But client.monitorApplication method also throws InterruptedIOE

[jira] [Updated] (SPARK-33669) Wrong error message from YARN application state monitor when sc.stop in yarn client mode

2020-12-05 Thread Su Qilong (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Su Qilong updated SPARK-33669:
--
Description: 
In YARN client mode, stopping YarnClientSchedulerBackend first tries to
interrupt the YARN application monitor thread. MonitorThread.run() catches
InterruptedException so it can respond gracefully to the stop request.

However, client.monitorApplication can also throw InterruptedIOException while a
Hadoop RPC call is in flight. In that case MonitorThread does not realize it was
interrupted: it reports the YARN application as failed, and "Failed to contact
YARN for application x; YARN application has exited unexpectedly with state x"
is logged at error level, which is very confusing for users.

We should also handle InterruptedIOException here, giving it the same behavior
as InterruptedException.
{code:java}
private class MonitorThread extends Thread {
  private var allowInterrupt = true

  override def run() {
try {
  val YarnAppReport(_, state, diags) =
client.monitorApplication(appId.get, logApplicationReport = false)
  logError(s"YARN application has exited unexpectedly with state $state! " +
"Check the YARN application logs for more details.")
  diags.foreach { err =>
logError(s"Diagnostics message: $err")
  }
  allowInterrupt = false
  sc.stop()
} catch {
  case e: InterruptedException => logInfo("Interrupting monitor thread")
}
  }
}
{code}
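The fix can be sketched as follows (hypothetical shape, not the actual Spark patch; logInfo/logError are stand-ins for Spark's logging helpers): classify InterruptedIOException together with InterruptedException, so the monitor exits quietly instead of reporting a failed application.

```scala
import java.io.InterruptedIOException

// Stand-ins for Spark's logging helpers (assumption for this sketch).
def logInfo(msg: String): Unit = println(s"INFO  $msg")
def logError(msg: String): Unit = println(s"ERROR $msg")

// Sketch of the monitor's exception handling: both exception types mean
// "interrupted by sc.stop()", not "the YARN application failed".
def handleMonitorException(t: Throwable): String = t match {
  case _: InterruptedException | _: InterruptedIOException =>
    // InterruptedIOException is what the Hadoop RPC layer throws when the
    // thread is interrupted mid-call; treat it like InterruptedException.
    logInfo("Interrupting monitor thread")
    "interrupted"
  case e =>
    logError(s"YARN application has exited unexpectedly: $e")
    "failed"
}
```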
{code:java}
2020-12-05 03:06:58,000 ERROR [YARN application state monitor]: 
org.apache.spark.deploy.yarn.Client(91) - Failed to contact YARN for 
application application_1605868815011_1154961. 
java.io.InterruptedIOException: Call interrupted
at org.apache.hadoop.ipc.Client.call(Client.java:1466)
at org.apache.hadoop.ipc.Client.call(Client.java:1409)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy38.getApplicationReport(Unknown Source)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:187)
at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy39.getApplicationReport(Unknown Source)
at 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:408)
at 
org.apache.spark.deploy.yarn.Client.getApplicationReport(Client.scala:327)
at 
org.apache.spark.deploy.yarn.Client.monitorApplication(Client.scala:1039)
at 
org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend$MonitorThread.run(YarnClientSchedulerBackend.scala:116)
2020-12-05 03:06:58,000 ERROR [YARN application state monitor]: 
org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend(70) - YARN 
application has exited unexpectedly with state FAILED! Check the YARN 
application logs for more details. 

2020-12-05 03:06:58,001 ERROR [YARN application state monitor]: 
org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend(70) - Diagnostics 
message: Failed to contact YARN for application 
application_1605868815011_1154961.

{code}
 
{code:java}
// hadoop ipc code
public Writable call(RPC.RpcKind rpcKind, Writable rpcRequest,
ConnectionId remoteId, int serviceClass,
AtomicBoolean fallbackToSimpleAuth) throws IOException {
  final Call call = createCall(rpcKind, rpcRequest);
  Connection connection = getConnection(remoteId, call, serviceClass,
fallbackToSimpleAuth);
  try {
connection.sendRpcRequest(call); // send the rpc request
  } catch (RejectedExecutionException e) {
throw new IOException("connection has been closed", e);
  } catch (InterruptedException e) {
Thread.currentThread().interrupt();
LOG.warn("interrupted waiting to send rpc request to server", e);
throw new IOException(e);
  }

  synchronized (call) {
while (!call.done) {
  try {
call.wait();   // wait for the result
  } catch (InterruptedException ie) {
Thread.currentThread().interrupt();
throw new InterruptedIOException("Call interrupted");
  }
}
    // ... (error handling and return of the RPC response follow)
  }
}
{code}
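Why the existing catch never fires: the interrupt reaches the monitor thread while it is parked in call.wait(), and the Hadoop client rethrows it as InterruptedIOException, which is an IOException, not an InterruptedException. A self-contained illustration (stand-in code under that assumption, not Hadoop itself):

```scala
import java.io.InterruptedIOException

// Stand-in for a blocking RPC: like ipc.Client.call, it converts an
// interrupt during the wait into InterruptedIOException.
def blockingCall(lock: AnyRef): String = lock.synchronized {
  try {
    lock.wait(10000) // stand-in for call.wait() -- wait for the result
    "response"
  } catch {
    case _: InterruptedException =>
      Thread.currentThread().interrupt()
      throw new InterruptedIOException("Call interrupted")
  }
}

// A monitor that only catches InterruptedException misclassifies the
// interrupt, because what actually arrives is an IOException subtype.
var outcome = ""
val lock = new Object
val monitor = new Thread(() => {
  try { blockingCall(lock); outcome = "done" }
  catch {
    case _: InterruptedException   => outcome = "graceful-stop" // never hit
    case _: InterruptedIOException => outcome = "reported-as-failure"
  }
})
monitor.start()
Thread.sleep(200)   // let the monitor block in wait()
monitor.interrupt() // what stopping the backend effectively does
monitor.join()
```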
 


> Wrong error message from YARN application state monitor when sc.stop in yarn 
> client mode
> 
>
> Key: SPARK-33669
> URL: https://issues.apache.org/jira/browse/SPARK-33669
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.4.3, 3.0.1
>Reporter: Su Qilong
>Priority: Minor
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-33669) Wrong error message from YARN application state monitor when sc.stop in yarn client mode

2020-12-05 Thread Su Qilong (Jira)
Su Qilong created SPARK-33669:
-

 Summary: Wrong error message from YARN application state monitor 
when sc.stop in yarn client mode
 Key: SPARK-33669
 URL: https://issues.apache.org/jira/browse/SPARK-33669
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 3.0.1, 2.4.3
Reporter: Su Qilong


For YarnClient mode, when stopping YarnClientSchedulerBackend, it first tries 
to interrupt Yarn application monitor thread. In MonitorThread.run() it catches 
InterruptedException to gracefully response to stopping request.

But client.monitorApplication method also throws InterruptedIOException when 
the hadoop rpc call is calling. In this case, MonitorThread will not know it is 
interrupted, a Yarn App failed is returned with "Failed to contact YARN for 
application x;  YARN application has exited unexpectedly with state x" 
is logged with error level. which confuse user a lot.

We Should take considerate InterruptedIOException here to make it the same 
behavior with InterruptedException.






[jira] [Assigned] (SPARK-33668) Fix flaky test "Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties."

2020-12-05 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33668:


Assignee: (was: Apache Spark)

> Fix flaky test "Verify logging configuration is picked from the provided 
> SPARK_CONF_DIR/log4j.properties."
> --
>
> Key: SPARK-33668
> URL: https://issues.apache.org/jira/browse/SPARK-33668
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes, Tests
>Affects Versions: 3.1.0
>Reporter: Prashant Sharma
>Priority: Major
>
> The test is flaky; in multiple failed runs the failure message has been 
> similar to:
> {code:java}
>   The code passed to eventually never returned normally. Attempted 109 times 
> over 3.007988241397 minutes. Last failure message: Failure executing: GET 
> at: 
> https://192.168.39.167:8443/api/v1/namespaces/b37fc72a991b49baa68a2eaaa1516463/pods/spark-pi-97a9bc76308e7fe3-exec-1/log?pretty=false.
>  Message: pods "spark-pi-97a9bc76308e7fe3-exec-1" not found. Received status: 
> Status(apiVersion=v1, code=404, details=StatusDetails(causes=[], group=null, 
> kind=pods, name=spark-pi-97a9bc76308e7fe3-exec-1, retryAfterSeconds=null, 
> uid=null, additionalProperties={}), kind=Status, message=pods 
> "spark-pi-97a9bc76308e7fe3-exec-1" not found, 
> metadata=ListMeta(_continue=null, remainingItemCount=null, 
> resourceVersion=null, selfLink=null, additionalProperties={}), 
> reason=NotFound, status=Failure, additionalProperties={}).. 
> (KubernetesSuite.scala:402)
> {code}
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36854/console
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36852/console
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36850/console
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/36848/console
> From the above failures, it seems that the executor finishes too quickly and 
> is removed by Spark before the test can complete.
> One way to mitigate this is to set the flag
> {code}
>"spark.kubernetes.executor.deleteOnTermination"
> {code}
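The mitigation above can be sketched as a plain configuration tweak (hypothetical helper with no Spark dependency; the property name comes from the quoted description, while the value is an assumption here — "false" would retain executor pods after termination so their logs stay fetchable):

```scala
// Hypothetical helper: pin the pod-deletion flag in the test suite's conf.
// "false" keeps executor pods around after they exit (assumed value) so the
// test can still fetch logs from spark-pi-...-exec-1 style pods.
def withExecutorPodsRetained(conf: Map[String, String]): Map[String, String] =
  conf + ("spark.kubernetes.executor.deleteOnTermination" -> "false")

val suiteConf = withExecutorPodsRetained(Map("spark.app.name" -> "spark-pi"))
```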






[jira] [Assigned] (SPARK-33668) Fix flaky test "Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties."

2020-12-05 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-33668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-33668:


Assignee: Apache Spark




[jira] [Commented] (SPARK-33668) Fix flaky test "Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties."

2020-12-05 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-33668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244434#comment-17244434
 ] 

Apache Spark commented on SPARK-33668:
--

User 'ScrapCodes' has created a pull request for this issue:
https://github.com/apache/spark/pull/30616



