[jira] [Commented] (SPARK-26961) Found Java-level deadlock in Spark Driver

2021-01-26 Thread Ajith S (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-26961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272272#comment-17272272
 ] 

Ajith S commented on SPARK-26961:
-

Scala issue : https://github.com/scala/bug/issues/11429

> Found Java-level deadlock in Spark Driver
> -
>
> Key: SPARK-26961
> URL: https://issues.apache.org/jira/browse/SPARK-26961
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.0
>Reporter: Rong Jialei
>Assignee: Ajith S
>Priority: Major
> Fix For: 2.3.4, 2.4.2, 3.0.0
>
> Attachments: image-2019-03-13-19-53-52-390.png
>
>
> Our Spark job usually finishes in minutes; however, we recently found it 
> taking days to run, and we could only kill it when this happened.
> An investigation showed that all worker containers could not connect to the 
> driver after start, and the driver was hanging; using jstack, we found a 
> Java-level deadlock.
>  
> *Jstack output for deadlock part is showing below:*
>  
> Found one Java-level deadlock:
> =
> "SparkUI-907":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> "ForkJoinPool-1-worker-57":
>  waiting to lock monitor 0x7f3860574298 (object 0x0005b7991168, a 
> org.apache.spark.util.MutableURLClassLoader),
>  which is held by "ForkJoinPool-1-worker-7"
> "ForkJoinPool-1-worker-7":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> Java stack information for the threads listed above:
> ===
> "SparkUI-907":
>  at org.apache.hadoop.conf.Configuration.getOverlay(Configuration.java:1328)
>  - waiting to lock <0x0005c0c1e5e0> (a 
> org.apache.hadoop.conf.Configuration)
>  at 
> org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:684)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1088)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145)
>  at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363)
>  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840)
>  at 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74)
>  at java.net.URL.getURLStreamHandler(URL.java:1142)
>  at java.net.URL.<init>(URL.java:599)
>  at java.net.URL.<init>(URL.java:490)
>  at java.net.URL.<init>(URL.java:439)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doRequest(JettyUtils.scala:176)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doGet(JettyUtils.scala:161)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>  at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>  at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:171)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>  at 
> org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
>  at 
> org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>  at 
> org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>  at org.spark_project.jetty.server.Server.handle(Server.java:534)
>  at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>  at 
> org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>  at 
> org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>  at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108)
>  at 
> org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>  at java.lang.Thread.run(Thread.java:748)
> "ForkJoinPool-1-work

[jira] [Created] (SPARK-30621) Dynamic Pruning thread propagates the localProperties to task

2020-01-23 Thread Ajith S (Jira)
Ajith S created SPARK-30621:
---

 Summary: Dynamic Pruning thread propagates the localProperties to 
task
 Key: SPARK-30621
 URL: https://issues.apache.org/jira/browse/SPARK-30621
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Ajith S


Local properties set via sparkContext are not available as TaskContext 
properties when executing parallel jobs and the thread pools contain idle threads.

Explanation:
When executing parallel jobs via SubqueryBroadcastExec, the {{relationFuture}} 
is evaluated on a separate thread. The threads inherit the {{localProperties}} 
from sparkContext because they are its child threads.
These threads are managed via the executionContext (thread pools). Each 
thread pool has a default {{keepAliveSeconds}} of 60 seconds for idle threads.
When the pool has idle threads that are reused for a subsequent new query, the 
thread-local properties are not re-inherited from the Spark context (thread 
properties are inherited only on thread creation), so the reused thread ends up 
with old or no properties. This causes taskset properties to be missing when 
properties are transferred by the child thread.
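
A minimal user-level sketch of the effect and a workaround (this is not the 
actual Spark-internal fix; the property name "user.tag" and the re-apply step 
are illustrative assumptions):

{code:scala}
import java.util.concurrent.Executors
import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future}
import org.apache.spark.{SparkConf, SparkContext, TaskContext}

object LocalPropsWorkaround {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("local-props"))
    // Single-thread pool: its one thread may have been created (and inherited
    // properties) long before the query that later reuses it.
    implicit val ec: ExecutionContext =
      ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(1))

    sc.setLocalProperty("user.tag", "query-1")
    val tag = sc.getLocalProperty("user.tag") // capture in the submitting (parent) thread

    val result = Future {
      // A reused pool thread keeps whatever properties it inherited at creation,
      // so re-apply the captured value before submitting the job from this thread.
      sc.setLocalProperty("user.tag", tag)
      sc.runJob(sc.parallelize(1 to 4, 2),
        (it: Iterator[Int]) => TaskContext.get().getLocalProperty("user.tag"))
    }
    println(Await.result(result, 1.minute).mkString(",")) // expected: query-1,query-1
    sc.stop()
  }
}
{code}

As I understand it, the actual fix follows the same general idea of capturing 
the submitting thread's properties and re-applying them inside the future, but 
at the SQL execution layer rather than in user code.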



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30556) Copy sparkContext.localproperties to child thread inSubqueryExec.executionContext

2020-01-23 Thread Ajith S (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021937#comment-17021937
 ] 

Ajith S commented on SPARK-30556:
-

Yes, it exists in lower versions like 2.3.x too

> Copy sparkContext.localproperties to child thread 
> inSubqueryExec.executionContext
> -
>
> Key: SPARK-30556
> URL: https://issues.apache.org/jira/browse/SPARK-30556
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.4, 3.0.0
>Reporter: Ajith S
>Assignee: Ajith S
>Priority: Major
> Fix For: 3.0.0
>
>
> Local properties set via sparkContext are not available as TaskContext 
> properties when executing  jobs and threadpools have idle threads which are 
> reused
> Explanation:
> When SubqueryExec, the {{relationFuture}} is evaluated via a separate thread. 
> The threads inherit the {{localProperties}} from sparkContext as they are the 
> child threads.
> These threads are controlled via the executionContext (thread pools). Each 
> Thread pool has a default {{keepAliveSeconds}} of 60 seconds for idle threads.
> Scenarios where the thread pool has threads which are idle and reused for a 
> subsequent new query, the thread local properties will not be inherited from 
> spark context (thread properties are inherited only on thread creation) hence 
> end up having old or no properties set. This will cause taskset properties to 
> be missing when properties are transferred by child thread via 
> {{sparkContext.runJob/submitJob}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30556) Copy sparkContext.localproperties to child thread inSubqueryExec.executionContext

2020-01-23 Thread Ajith S (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021936#comment-17021936
 ] 

Ajith S commented on SPARK-30556:
-

Raised backport PR for branch 2.4 [https://github.com/apache/spark/pull/27340]

> Copy sparkContext.localproperties to child thread 
> inSubqueryExec.executionContext
> -
>
> Key: SPARK-30556
> URL: https://issues.apache.org/jira/browse/SPARK-30556
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.4, 3.0.0
>Reporter: Ajith S
>Assignee: Ajith S
>Priority: Major
> Fix For: 3.0.0
>
>
> Local properties set via sparkContext are not available as TaskContext 
> properties when executing  jobs and threadpools have idle threads which are 
> reused
> Explanation:
> When SubqueryExec, the {{relationFuture}} is evaluated via a separate thread. 
> The threads inherit the {{localProperties}} from sparkContext as they are the 
> child threads.
> These threads are controlled via the executionContext (thread pools). Each 
> Thread pool has a default {{keepAliveSeconds}} of 60 seconds for idle threads.
> Scenarios where the thread pool has threads which are idle and reused for a 
> subsequent new query, the thread local properties will not be inherited from 
> spark context (thread properties are inherited only on thread creation) hence 
> end up having old or no properties set. This will cause taskset properties to 
> be missing when properties are transferred by child thread via 
> {{sparkContext.runJob/submitJob}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-30360) Avoid Redact classpath entries in History Server UI

2020-01-22 Thread Ajith S (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-30360:

Description: Currently the Spark history server displays the classpath entries 
in the Environment tab with the classpaths redacted, because the EventLog file 
has the entry values redacted while it is written. But when the same page is 
viewed from a running application's UI, it is not redacted. Redacting classpath 
entries is not needed and can be avoided  (was: Currently SPARK history server 
display the classpath entries in the Environment tab with classpaths redacted, 
this is because EventLog file has the entry values redacted while writing. But 
when same is seen from a running application UI, its seen that it is not 
redacted. )

> Avoid Redact classpath entries in History Server UI
> ---
>
> Key: SPARK-30360
> URL: https://issues.apache.org/jira/browse/SPARK-30360
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Assignee: Ajith S
>Priority: Major
> Fix For: 3.0.0
>
>
> Currently SPARK history server display the classpath entries in the 
> Environment tab with classpaths redacted, this is because EventLog file has 
> the entry values redacted while writing. But when same is seen from a running 
> application UI, its seen that it is not redacted. Classpath entries redact is 
> not needed and can be avoided



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-30360) Avoid Redact classpath entries in History Server UI

2020-01-22 Thread Ajith S (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-30360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-30360:

Summary: Avoid Redact classpath entries in History Server UI  (was: Redact 
classpath entries in Spark UI)

> Avoid Redact classpath entries in History Server UI
> ---
>
> Key: SPARK-30360
> URL: https://issues.apache.org/jira/browse/SPARK-30360
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Assignee: Ajith S
>Priority: Major
> Fix For: 3.0.0
>
>
> Currently SPARK history server display the classpath entries in the 
> Environment tab with classpaths redacted, this is because EventLog file has 
> the entry values redacted while writing. But when same is seen from a running 
> application UI, its seen that it is not redacted. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22590) Broadcast thread propagates the localProperties to task

2020-01-17 Thread Ajith S (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-22590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-22590:

Affects Version/s: 3.0.0
   2.4.4

> Broadcast thread propagates the localProperties to task
> ---
>
> Key: SPARK-22590
> URL: https://issues.apache.org/jira/browse/SPARK-22590
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0, 2.4.4, 3.0.0
>Reporter: Ajith S
>Priority: Major
>  Labels: bulk-closed
> Attachments: TestProps.scala
>
>
> Local properties set via sparkContext are not available as TaskContext 
> properties when executing parallel jobs and threadpools have idle threads
> Explanation: 
>  When executing parallel jobs via {{BroadcastExchangeExec}}, the 
> {{relationFuture}} is evaluated via a seperate thread. The threads inherit 
> the {{localProperties}} from sparkContext as they are the child threads.
>  These threads are controlled via the executionContext (thread pools). Each 
> Thread pool has a default {{keepAliveSeconds}} of 60 seconds for idle 
> threads. 
>  Scenarios where the thread pool has threads which are idle and reused for a 
> subsequent new query, the thread local properties will not be inherited from 
> spark context (thread properties are inherited only on thread creation) hence 
> end up having old or no properties set. This will cause taskset properties to 
> be missing when properties are transferred by child thread via 
> {{sparkContext.runJob/submitJob}}
> Attached is a test-case to simulate this behavior



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30556) SubqueryExec passes local properties to SubqueryExec.executionContext

2020-01-17 Thread Ajith S (Jira)
Ajith S created SPARK-30556:
---

 Summary: SubqueryExec passes local properties to 
SubqueryExec.executionContext
 Key: SPARK-30556
 URL: https://issues.apache.org/jira/browse/SPARK-30556
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.4.4, 3.0.0
Reporter: Ajith S


Local properties set via sparkContext are not available as TaskContext 
properties when jobs are executed and the thread pools contain idle threads 
that are reused.

Explanation:
In SubqueryExec, the {{relationFuture}} is evaluated on a separate thread. 
The threads inherit the {{localProperties}} from sparkContext because they are 
its child threads.
These threads are managed via the executionContext (thread pools). Each 
thread pool has a default {{keepAliveSeconds}} of 60 seconds for idle threads.
When the pool has idle threads that are reused for a subsequent new query, the 
thread-local properties are not re-inherited from the Spark context (thread 
properties are inherited only on thread creation), so the reused thread ends up 
with old or no properties. This causes taskset properties to be missing when 
properties are transferred by the child thread via 
{{sparkContext.runJob/submitJob}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22590) Broadcast thread propagates the localProperties to task

2020-01-17 Thread Ajith S (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-22590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-22590:

Summary: Broadcast thread propagates the localProperties to task  (was: 
SparkContext's local properties missing from TaskContext properties)

> Broadcast thread propagates the localProperties to task
> ---
>
> Key: SPARK-22590
> URL: https://issues.apache.org/jira/browse/SPARK-22590
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: Ajith S
>Priority: Major
>  Labels: bulk-closed
> Attachments: TestProps.scala
>
>
> Local properties set via sparkContext are not available as TaskContext 
> properties when executing parallel jobs and threadpools have idle threads
> Explanation: 
>  When executing parallel jobs via {{BroadcastExchangeExec}}, the 
> {{relationFuture}} is evaluated via a seperate thread. The threads inherit 
> the {{localProperties}} from sparkContext as they are the child threads.
>  These threads are controlled via the executionContext (thread pools). Each 
> Thread pool has a default {{keepAliveSeconds}} of 60 seconds for idle 
> threads. 
>  Scenarios where the thread pool has threads which are idle and reused for a 
> subsequent new query, the thread local properties will not be inherited from 
> spark context (thread properties are inherited only on thread creation) hence 
> end up having old or no properties set. This will cause taskset properties to 
> be missing when properties are transferred by child thread via 
> {{sparkContext.runJob/submitJob}}
> Attached is a test-case to simulate this behavior



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22590) SparkContext's local properties missing from TaskContext properties

2020-01-17 Thread Ajith S (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-22590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-22590:

Description: 
Local properties set via sparkContext are not available as TaskContext 
properties when executing parallel jobs and the thread pools contain idle threads.

Explanation: 
 When executing parallel jobs via {{BroadcastExchangeExec}}, the 
{{relationFuture}} is evaluated on a separate thread. The threads inherit the 
{{localProperties}} from sparkContext because they are its child threads.
 These threads are managed via the executionContext (thread pools). Each 
thread pool has a default {{keepAliveSeconds}} of 60 seconds for idle threads. 
 When the pool has idle threads that are reused for a subsequent new query, the 
thread-local properties are not re-inherited from the Spark context (thread 
properties are inherited only on thread creation), so the reused thread ends up 
with old or no properties. This causes taskset properties to be missing when 
properties are transferred by the child thread via 
{{sparkContext.runJob/submitJob}}

Attached is a test-case to simulate this behavior

  was:
Local properties set via sparkContext are not available as TaskContext 
properties when executing parallel jobs and threadpools have idle threads

Explanation:  
When executing parallel jobs via {{BroadcastExchangeExec}} or {{SubqueryExec}}, 
the {{relationFuture}} is evaluated via a seperate thread. The threads inherit 
the {{localProperties}} from sparkContext as they are the child threads.
These threads are controlled via the executionContext (thread pools). Each 
Thread pool has a default {{keepAliveSeconds}} of 60 seconds for idle threads. 
Scenarios where the thread pool has threads which are idle and reused for a 
subsequent new query, the thread local properties will not be inherited from 
spark context (thread properties are inherited only on thread creation) hence 
end up having old or no properties set. This will cause taskset properties to 
be missing when properties are transferred by child thread via 
{{sparkContext.runJob/submitJob}}

Attached is a test-case to simulate this behavior
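
A minimal sketch in the spirit of the attached TestProps.scala (the attachment 
itself is not reproduced here; the property name "tag" and the structure below 
are illustrative assumptions):

{code:scala}
import java.util.concurrent.Executors
import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future}
import org.apache.spark.{SparkConf, SparkContext, TaskContext}

object StalePropsRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("stale-props"))
    implicit val ec: ExecutionContext =
      ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(1))

    // Runs a one-partition job and returns the "tag" local property seen by the task.
    def tagSeenByTask(): String =
      sc.runJob(sc.parallelize(Seq(1), 1),
        (it: Iterator[Int]) => TaskContext.get().getLocalProperty("tag")).head

    sc.setLocalProperty("tag", "first")
    Await.result(Future(tagSeenByTask()), 1.minute)   // pool thread is created here and inherits "first"

    sc.setLocalProperty("tag", "second")              // only visible to the main thread
    val seen = Await.result(Future(tagSeenByTask()), 1.minute)
    println(seen)   // prints "first": the reused pool thread kept the old properties
    sc.stop()
  }
}
{code}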



> SparkContext's local properties missing from TaskContext properties
> ---
>
> Key: SPARK-22590
> URL: https://issues.apache.org/jira/browse/SPARK-22590
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: Ajith S
>Priority: Major
>  Labels: bulk-closed
> Attachments: TestProps.scala
>
>
> Local properties set via sparkContext are not available as TaskContext 
> properties when executing parallel jobs and threadpools have idle threads
> Explanation: 
>  When executing parallel jobs via {{BroadcastExchangeExec}}, the 
> {{relationFuture}} is evaluated via a seperate thread. The threads inherit 
> the {{localProperties}} from sparkContext as they are the child threads.
>  These threads are controlled via the executionContext (thread pools). Each 
> Thread pool has a default {{keepAliveSeconds}} of 60 seconds for idle 
> threads. 
>  Scenarios where the thread pool has threads which are idle and reused for a 
> subsequent new query, the thread local properties will not be inherited from 
> spark context (thread properties are inherited only on thread creation) hence 
> end up having old or no properties set. This will cause taskset properties to 
> be missing when properties are transferred by child thread via 
> {{sparkContext.runJob/submitJob}}
> Attached is a test-case to simulate this behavior



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Reopened] (SPARK-22590) SparkContext's local properties missing from TaskContext properties

2020-01-17 Thread Ajith S (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-22590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S reopened SPARK-22590:
-

Adding Fix

> SparkContext's local properties missing from TaskContext properties
> ---
>
> Key: SPARK-22590
> URL: https://issues.apache.org/jira/browse/SPARK-22590
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: Ajith S
>Priority: Major
>  Labels: bulk-closed
> Attachments: TestProps.scala
>
>
> Local properties set via sparkContext are not available as TaskContext 
> properties when executing parallel jobs and threadpools have idle threads
> Explanation:  
> When executing parallel jobs via {{BroadcastExchangeExec}} or 
> {{SubqueryExec}}, the {{relationFuture}} is evaluated via a seperate thread. 
> The threads inherit the {{localProperties}} from sparkContext as they are the 
> child threads.
> These threads are controlled via the executionContext (thread pools). Each 
> Thread pool has a default {{keepAliveSeconds}} of 60 seconds for idle 
> threads. 
> Scenarios where the thread pool has threads which are idle and reused for a 
> subsequent new query, the thread local properties will not be inherited from 
> spark context (thread properties are inherited only on thread creation) hence 
> end up having old or no properties set. This will cause taskset properties to 
> be missing when properties are transferred by child thread via 
> {{sparkContext.runJob/submitJob}}
> Attached is a test-case to simulate this behavior



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Reopened] (SPARK-23626) DAGScheduler blocked due to JobSubmitted event

2020-01-16 Thread Ajith S (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-23626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S reopened SPARK-23626:
-

Old PR was closed due to inactivity. Reopening with a new PR to conclude this.

>  DAGScheduler blocked due to JobSubmitted event
> ---
>
> Key: SPARK-23626
> URL: https://issues.apache.org/jira/browse/SPARK-23626
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: 2.2.1, 2.3.3, 2.4.3, 3.0.0
>Reporter: Ajith S
>Priority: Major
>
> DAGScheduler becomes a bottleneck in the cluster when multiple JobSubmitted 
> events have to be processed, as DAGSchedulerEventProcessLoop is single threaded 
> and it blocks other events in the queue, like TaskCompletion.
> The JobSubmitted event is time consuming, depending on the nature of the job 
> (for example: calculating parent stage dependencies, shuffle dependencies, 
> partitions), and thus it blocks all other events from being processed.
>  
> I see multiple JIRA referring to this behavior
> https://issues.apache.org/jira/browse/SPARK-2647
> https://issues.apache.org/jira/browse/SPARK-4961
>  
> Similarly, in my cluster some jobs' partition calculation is time consuming 
> (similar to the stack at SPARK-2647), hence it slows down the Spark 
> DAGSchedulerEventProcessLoop, which causes user jobs to slow down, even if 
> their tasks finish within seconds, as TaskCompletion events are processed 
> at a slower rate due to the blockage.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-30517) Support SHOW TABLES EXTENDED

2020-01-15 Thread Ajith S (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17015706#comment-17015706
 ] 

Ajith S edited comment on SPARK-30517 at 1/15/20 8:01 AM:
--

[~srowen] [~dongjoon] [~vanzin] Please let me know your opinions about the 
proposal. I would like to work on it if it's acceptable


was (Author: ajithshetty):
[~srowen] [~dongjoon] [~vanzin] Please let me know about your opinion about 
proposal. I would like to work if its acceptable

> Support SHOW TABLES EXTENDED
> 
>
> Key: SPARK-30517
> URL: https://issues.apache.org/jira/browse/SPARK-30517
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Major
>
> {{Intention is to support show tables with a additional column 'type' where 
> type can be MANAGED,EXTERNAL,VIEW using which user can query only tables of 
> required types, like listing only views or only external tables (using a 
> 'where' clause over 'type' column).}}
> {{Usecase example:}}
> {{Currently its not possible to list all the VIEWS, but other technologies 
> like hive support it using 'SHOW VIEWS', mysql supports it using a more 
> complex query 'SHOW FULL TABLES WHERE table_type = 'VIEW';'}}
> Decide to take mysql approach as it provides more flexibility for querying.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30517) Support SHOW TABLES EXTENDED

2020-01-15 Thread Ajith S (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17015706#comment-17015706
 ] 

Ajith S commented on SPARK-30517:
-

[~srowen] [~dongjoon] [~vanzin] Please let me know your opinion about the 
proposal. I would like to work on it if it's acceptable

> Support SHOW TABLES EXTENDED
> 
>
> Key: SPARK-30517
> URL: https://issues.apache.org/jira/browse/SPARK-30517
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Major
>
> {{Intention is to support show tables with a additional column 'type' where 
> type can be MANAGED,EXTERNAL,VIEW using which user can query only tables of 
> required types, like listing only views or only external tables (using a 
> 'where' clause over 'type' column).}}
> {{Usecase example:}}
> {{Currently its not possible to list all the VIEWS, but other technologies 
> like hive support it using 'SHOW VIEWS', mysql supports it using a more 
> complex query 'SHOW FULL TABLES WHERE table_type = 'VIEW';'}}
> Decide to take mysql approach as it provides more flexibility for querying.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30517) Support SHOW TABLES EXTENDED

2020-01-14 Thread Ajith S (Jira)
Ajith S created SPARK-30517:
---

 Summary: Support SHOW TABLES EXTENDED
 Key: SPARK-30517
 URL: https://issues.apache.org/jira/browse/SPARK-30517
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Affects Versions: 3.0.0
Reporter: Ajith S


The intention is to support SHOW TABLES with an additional column 'type', where 
type can be MANAGED, EXTERNAL, or VIEW, so that a user can query only tables of 
the required types, e.g. listing only views or only external tables (using a 
'where' clause over the 'type' column).

Usecase example:
Currently it is not possible to list all the VIEWs, but other technologies 
support it: Hive has 'SHOW VIEWS', and MySQL supports it using the more complex 
query 'SHOW FULL TABLES WHERE table_type = 'VIEW';'.

Decided to take the MySQL approach as it provides more flexibility for querying.
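
A hedged sketch of the intent: today the closest equivalent is filtering on the 
Catalog API's existing tableType field; the SHOW TABLES form shown in the 
comment below is the proposed syntax only, not current Spark behaviour.

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("list-views").getOrCreate()

// Existing workaround: the Catalog API already exposes a tableType column
// (MANAGED / EXTERNAL / VIEW), so views can be listed by filtering on it.
spark.catalog.listTables("default")
  .filter(_.tableType == "VIEW")
  .show(truncate = false)

// Proposed usage (illustrative only, does not work in current Spark):
// spark.sql("SHOW TABLES EXTENDED WHERE type = 'VIEW'").show()
{code}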



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-30484) Job History Storage Tab does not display RDD Table

2020-01-11 Thread Ajith S (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013656#comment-17013656
 ] 

Ajith S edited comment on SPARK-30484 at 1/12/20 6:36 AM:
--

This is not an issue. SparkListenerBlockUpdated events are filtered out of the 
event log by default for performance reasons; set 
spark.eventLog.logBlockUpdates.enabled=true to record them and view the storage 
information.


was (Author: ajithshetty):
As SparkListenerBlockUpdated is filtered by default for performance reasons, 
set spark.eventLog.logBlockUpdates.enabled=true to view storage information
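
A minimal sketch of the configuration involved (assuming event logging itself 
is already enabled for the application):

{code:scala}
import org.apache.spark.SparkConf

// Block updates are not written to the event log by default, so the History
// Server has nothing to show in the Storage tab; opt in explicitly.
val conf = new SparkConf()
  .set("spark.eventLog.enabled", "true")
  .set("spark.eventLog.logBlockUpdates.enabled", "true") // default: false
{code}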

> Job History Storage Tab does not display RDD Table
> --
>
> Key: SPARK-30484
> URL: https://issues.apache.org/jira/browse/SPARK-30484
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>
> scala> import org.apache.spark.storage.StorageLevel._
> import org.apache.spark.storage.StorageLevel._
> scala> val rdd = sc.range(0, 100, 1, 5).setName("rdd")
> rdd: org.apache.spark.rdd.RDD[Long] = rdd MapPartitionsRDD[1] at range at 
> <console>:27
> scala> rdd.persist(MEMORY_ONLY_SER)
> res0: rdd.type = rdd MapPartitionsRDD[1] at range at <console>:27
> scala> rdd.count
> res1: Long = 100  
>   
> scala> val df = Seq((1, "andy"), (2, "bob"), (2, "andy")).toDF("count", 
> "name")
> df: org.apache.spark.sql.DataFrame = [count: int, name: string]
> scala> df.persist(DISK_ONLY)
> res2: df.type = [count: int, name: string]
> scala> df.count
> res3: Long = 3
> Open Storage Tab under Incomplete Jobs in Job History Page
> UI will not display the RDD Table.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30484) Job History Storage Tab does not display RDD Table

2020-01-11 Thread Ajith S (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013656#comment-17013656
 ] 

Ajith S commented on SPARK-30484:
-

As SparkListenerBlockUpdated is filtered by default for performance reasons, 
set spark.eventLog.logBlockUpdates.enabled=true to view storage information

> Job History Storage Tab does not display RDD Table
> --
>
> Key: SPARK-30484
> URL: https://issues.apache.org/jira/browse/SPARK-30484
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>
> scala> import org.apache.spark.storage.StorageLevel._
> import org.apache.spark.storage.StorageLevel._
> scala> val rdd = sc.range(0, 100, 1, 5).setName("rdd")
> rdd: org.apache.spark.rdd.RDD[Long] = rdd MapPartitionsRDD[1] at range at 
> <console>:27
> scala> rdd.persist(MEMORY_ONLY_SER)
> res0: rdd.type = rdd MapPartitionsRDD[1] at range at <console>:27
> scala> rdd.count
> res1: Long = 100  
>   
> scala> val df = Seq((1, "andy"), (2, "bob"), (2, "andy")).toDF("count", 
> "name")
> df: org.apache.spark.sql.DataFrame = [count: int, name: string]
> scala> df.persist(DISK_ONLY)
> res2: df.type = [count: int, name: string]
> scala> df.count
> res3: Long = 3
> Open Storage Tab under Incomplete Jobs in Job History Page
> UI will not display the RDD Table.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30488) Deadlock between block-manager-slave-async-thread-pool and spark context cleaner

2020-01-11 Thread Ajith S (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013651#comment-17013651
 ] 

Ajith S commented on SPARK-30488:
-

From the log, I see `sbt` classes in the deadlocked threads; this is related to 
internal classloaders in sbt, which was fixed in sbt 1.3.3 by marking the 
classloaders as parallel capable: [https://github.com/sbt/sbt/pull/5131]. A 
similar issue is [https://github.com/sbt/sbt/issues/5116].

 

[~rohit21agrawal] Thanks for reporting this. One question: can you also 
please mention how the SparkContext was created?

 

> Deadlock between block-manager-slave-async-thread-pool and spark context 
> cleaner
> 
>
> Key: SPARK-30488
> URL: https://issues.apache.org/jira/browse/SPARK-30488
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.3
>Reporter: Rohit Agrawal
>Priority: Major
>
> Deadlock happens while cleaning up the spark context. Here is the full thread 
> dump:
>  
>   
>   2020-01-10T20:13:16.2884057Z Full thread dump Java HotSpot(TM) 64-Bit 
> Server VM (25.221-b11 mixed mode):
> 2020-01-10T20:13:16.2884392Z 
> 2020-01-10T20:13:16.2884660Z "SIGINT handler" #488 daemon prio=9 os_prio=2 
> tid=0x111fa000 nid=0x4794 waiting for monitor entry 
> [0x1c86e000]
> 2020-01-10T20:13:16.2884807Z java.lang.Thread.State: BLOCKED (on object 
> monitor)
> 2020-01-10T20:13:16.2884879Z at java.lang.Shutdown.exit(Shutdown.java:212)
> 2020-01-10T20:13:16.2885693Z - waiting to lock <0xc0155de0> (a 
> java.lang.Class for java.lang.Shutdown)
> 2020-01-10T20:13:16.2885840Z at 
> java.lang.Terminator$1.handle(Terminator.java:52)
> 2020-01-10T20:13:16.2885965Z at sun.misc.Signal$1.run(Signal.java:212)
> 2020-01-10T20:13:16.2886329Z at java.lang.Thread.run(Thread.java:748)
> 2020-01-10T20:13:16.2886430Z 
> 2020-01-10T20:13:16.2886752Z "Thread-3" #108 prio=5 os_prio=0 
> tid=0x111f7800 nid=0x48cc waiting for monitor entry 
> [0x2c33f000]
> 2020-01-10T20:13:16.2886881Z java.lang.Thread.State: BLOCKED (on object 
> monitor)
> 2020-01-10T20:13:16.2886999Z at 
> org.apache.hadoop.util.ShutdownHookManager.getShutdownHooksInOrder(ShutdownHookManager.java:273)
> 2020-01-10T20:13:16.2887107Z at 
> org.apache.hadoop.util.ShutdownHookManager.executeShutdown(ShutdownHookManager.java:121)
> 2020-01-10T20:13:16.2887212Z at 
> org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:95)
> 2020-01-10T20:13:16.2887421Z 
> 2020-01-10T20:13:16.2887798Z "block-manager-slave-async-thread-pool-81" #486 
> daemon prio=5 os_prio=0 tid=0x111fe800 nid=0x2e34 waiting for monitor 
> entry [0x2bf3d000]
> 2020-01-10T20:13:16.2889192Z java.lang.Thread.State: BLOCKED (on object 
> monitor)
> 2020-01-10T20:13:16.2889305Z at 
> java.lang.ClassLoader.loadClass(ClassLoader.java:404)
> 2020-01-10T20:13:16.2889405Z - waiting to lock <0xc1f359f0> (a 
> sbt.internal.LayeredClassLoader)
> 2020-01-10T20:13:16.2889482Z at 
> java.lang.ClassLoader.loadClass(ClassLoader.java:411)
> 2020-01-10T20:13:16.2889582Z - locked <0xca33e4c8> (a 
> sbt.internal.ManagedClassLoader$ZombieClassLoader)
> 2020-01-10T20:13:16.2889659Z at 
> java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> 2020-01-10T20:13:16.2890881Z at 
> org.apache.spark.storage.BlockManagerSlaveEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$3.apply$mcZ$sp(BlockManagerSlaveEndpoint.scala:58)
> 2020-01-10T20:13:16.2891006Z at 
> org.apache.spark.storage.BlockManagerSlaveEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$3.apply(BlockManagerSlaveEndpoint.scala:57)
> 2020-01-10T20:13:16.2891142Z at 
> org.apache.spark.storage.BlockManagerSlaveEndpoint$$anonfun$receiveAndReply$1$$anonfun$applyOrElse$3.apply(BlockManagerSlaveEndpoint.scala:57)
> 2020-01-10T20:13:16.2891260Z at 
> org.apache.spark.storage.BlockManagerSlaveEndpoint$$anonfun$1.apply(BlockManagerSlaveEndpoint.scala:86)
> 2020-01-10T20:13:16.2891375Z at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> 2020-01-10T20:13:16.2891624Z at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> 2020-01-10T20:13:16.2891737Z at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 2020-01-10T20:13:16.2891833Z at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 2020-01-10T20:13:16.2891925Z at java.lang.Thread.run(Thread.java:748)
> 2020-01-10T20:13:16.2891967Z 
> 2020-01-10T20:13:16.2892066Z "pool-31-thread-16" #335 prio=5 os_prio=0 
> tid=0x153b2000 nid=0x1aac waiting on condition [0x4b2ff000]
> 2020-01-10T20:13:16.2892147Z java.lang.Thread.State: WAITING (parking)
> 2020-01

[jira] [Commented] (SPARK-30440) Flaky test: org.apache.spark.scheduler.TaskSetManagerSuite.reset

2020-01-06 Thread Ajith S (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009433#comment-17009433
 ] 

Ajith S commented on SPARK-30440:
-

Found a race in the test case between reviveOffers in 
org.apache.spark.scheduler.TaskSchedulerImpl#submitTasks and 
org.apache.spark.scheduler.TaskSetManager#resourceOffer; made a PR for the same:

[https://github.com/apache/spark/pull/27115]

> Flaky test: org.apache.spark.scheduler.TaskSetManagerSuite.reset
> 
>
> Key: SPARK-30440
> URL: https://issues.apache.org/jira/browse/SPARK-30440
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, Tests
>Affects Versions: 3.0.0
>Reporter: Jungtaek Lim
>Priority: Major
>
> [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116126/testReport]
> [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/116159/testReport/]
> {noformat}
>  org.apache.spark.scheduler.TaskSetManagerSuite.reset Error 
> Detailsorg.scalatest.exceptions.TestFailedException: task0.isDefined was 
> true, but task1.isDefined was false Stack Tracesbt.ForkMain$ForkError: 
> org.scalatest.exceptions.TestFailedException: task0.isDefined was true, but 
> task1.isDefined was false
>   at 
> org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530)
>   at 
> org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529)
>   at 
> org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1560)
>   at 
> org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:503)
>   at 
> org.apache.spark.scheduler.TaskSetManagerSuite.$anonfun$new$107(TaskSetManagerSuite.scala:1933)
>   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at org.scalatest.Transformer.apply(Transformer.scala:20)
>   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
>   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:149)
>   at 
> org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
>   at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
>   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:286)
>   at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
>   at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
>   at 
> org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:56)
>   at 
> org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:221)
>   at 
> org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:214)
>   at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:56)
>   at 
> org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
>   at 
> org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:393)
>   at scala.collection.immutable.List.foreach(List.scala:392)
>   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:381)
>   at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:376)
>   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:458)
>   at org.scalatest.FunSuiteLike.runTests(FunSuiteLike.scala:229)
>   at org.scalatest.FunSuiteLike.runTests$(FunSuiteLike.scala:228)
>   at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
>   at org.scalatest.Suite.run(Suite.scala:1124)
>   at org.scalatest.Suite.run$(Suite.scala:1106)
>   at 
> org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
>   at org.scalatest.FunSuiteLike.$anonfun$run$1(FunSuiteLike.scala:233)
>   at org.scalatest.SuperEngine.runImpl(Engine.scala:518)
>   at org.scalatest.FunSuiteLike.run(FunSuiteLike.scala:233)
>   at org.scalatest.FunSuiteLike.run$(FunSuiteLike.scala:232)
>   at 
> org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:56)
>   at 
> org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
>   at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
>   at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
>   at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:56)
>   at 
> org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:317)
>   at 
> org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:510)
>   at sbt.ForkMain$Run$2.call(ForkMain.java:296)
>   at sbt.ForkMain$Run$2.call(ForkMain.java:286)
>   at java.util.co

[jira] [Created] (SPARK-30406) OneForOneStreamManager use AtomicLong

2020-01-01 Thread Ajith S (Jira)
Ajith S created SPARK-30406:
---

 Summary: OneForOneStreamManager use AtomicLong
 Key: SPARK-30406
 URL: https://issues.apache.org/jira/browse/SPARK-30406
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Ajith S


Compound operations, as well as increments and decrements, on primitive 
fields are not atomic operations. Here, when a volatile primitive field is 
incremented or decremented, we can lose updates if threads interleave in the 
steps of the update. 

 

Refer: 
[https://wiki.sei.cmu.edu/confluence/display/java/VNA02-J.+Ensure+that+compound+operations+on+shared+variables+are+atomic]
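
A minimal sketch of the difference (a generic counter example, not the actual 
OneForOneStreamManager code):

{code:scala}
import java.util.concurrent.atomic.AtomicLong

// "count += 1" is a read-modify-write: two threads can read the same value and
// both write back value + 1, losing one increment even if the field is @volatile.
class UnsafeCounter {
  @volatile private var count: Long = 0L
  def next(): Long = { count += 1; count } // not atomic
}

// AtomicLong performs the increment as a single atomic operation.
class SafeCounter {
  private val count = new AtomicLong(0L)
  def next(): Long = count.incrementAndGet()
}
{code}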



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30405) ArrayKeyIndexType should use Arrays.hashCode

2020-01-01 Thread Ajith S (Jira)
Ajith S created SPARK-30405:
---

 Summary: ArrayKeyIndexType should use Arrays.hashCode
 Key: SPARK-30405
 URL: https://issues.apache.org/jira/browse/SPARK-30405
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.4.4, 2.3.4
Reporter: Ajith S


hashCode on an array returns the array's identity hash and does not reflect the 
array's content. This can be corrected by using Arrays.hashCode(array).
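
A minimal sketch of the underlying behaviour (plain Java array semantics, not 
the ArrayKeyIndexType code itself):

{code:scala}
import java.util.Arrays

val a = Array(1, 2, 3)
val b = Array(1, 2, 3)

// Array hashCode is identity-based, so arrays with equal contents usually differ.
println(a.hashCode == b.hashCode)                 // false (almost always)

// Arrays.hashCode is computed from the contents, so equal contents hash equally.
println(Arrays.hashCode(a) == Arrays.hashCode(b)) // true
{code}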



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30382) start-thriftserver throws ClassNotFoundException

2019-12-29 Thread Ajith S (Jira)
Ajith S created SPARK-30382:
---

 Summary: start-thriftserver throws ClassNotFoundException
 Key: SPARK-30382
 URL: https://issues.apache.org/jira/browse/SPARK-30382
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 3.0.0
Reporter: Ajith S


start-thriftserver.sh --help throws

{code}
.
 

Thrift server options:
Exception in thread "main" java.lang.NoClassDefFoundError: 
org/apache/logging/log4j/spi/LoggerContextFactory
at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:167)
at 
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:82)
at 
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
Caused by: java.lang.ClassNotFoundException: 
org.apache.logging.log4j.spi.LoggerContextFactory
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 3 more


{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25061) Spark SQL Thrift Server fails to not pick up hiveconf passing parameter

2019-12-29 Thread Ajith S (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-25061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004736#comment-17004736
 ] 

Ajith S commented on SPARK-25061:
-

I could reproduce this. As per the documentation, 
https://spark.apache.org/docs/latest/sql-distributed-sql-engine.html, --hiveconf 
can be used to pass Hive properties to the Thrift server. Raising a PR to fix 
this.

>  Spark SQL Thrift Server fails to not pick up hiveconf passing parameter
> 
>
> Key: SPARK-25061
> URL: https://issues.apache.org/jira/browse/SPARK-25061
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Zineng Yuan
>Priority: Major
>
> Spark thrift server should use passing parameter value and overwrites the 
> same conf from hive-site.xml. For example,  the server should overwrite what 
> exists in hive-site.xml.
>  ./sbin/start-thriftserver.sh --master yarn-client ...
> --hiveconf 
> "hive.server2.authentication.kerberos.principal=" ...
> 
> hive.server2.authentication.kerberos.principal
> hive/_HOST@
> 
> However, the server takes what in hive-site.xml.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25030) SparkSubmit.doSubmit will not return result if the mainClass submitted creates a Timer()

2019-12-26 Thread Ajith S (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-25030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17003932#comment-17003932
 ] 

Ajith S commented on SPARK-25030:
-

[~jiangxb1987] is there a reproducer for this? I just tried it and it seems 
to work fine.

> SparkSubmit.doSubmit will not return result if the mainClass submitted 
> creates a Timer()
> 
>
> Key: SPARK-25030
> URL: https://issues.apache.org/jira/browse/SPARK-25030
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.1
>Reporter: Xingbo Jiang
>Priority: Major
>  Labels: bulk-closed
>
> Create a Timer() in the mainClass submitted to SparkSubmit makes it unable to 
> fetch result, it is very easy to reproduce the issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30361) Monitoring URL do not redact information about environment

2019-12-26 Thread Ajith S (Jira)
Ajith S created SPARK-30361:
---

 Summary: Monitoring URL do not redact information about environment
 Key: SPARK-30361
 URL: https://issues.apache.org/jira/browse/SPARK-30361
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Ajith S


UI and event logs redact sensitive information. But the monitoring URL,  
https://spark.apache.org/docs/latest/monitoring.html#rest-api , specifically  
/applications/[app-id]/environment does not, which is a security issue. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-30360) Redact classpath entries in Spark UI

2019-12-26 Thread Ajith S (Jira)
Ajith S created SPARK-30360:
---

 Summary: Redact classpath entries in Spark UI
 Key: SPARK-30360
 URL: https://issues.apache.org/jira/browse/SPARK-30360
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 3.0.0
Reporter: Ajith S


Currently the Spark history server displays the classpath entries in the Environment 
tab with the classpaths redacted, because the EventLog file has the entry 
values redacted while it is written. But when the same page is viewed from a running 
application's UI, it is not redacted. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27719) Set maxDisplayLogSize for spark history server

2019-11-29 Thread Ajith S (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985021#comment-16985021
 ] 

Ajith S commented on SPARK-27719:
-

Currently our production also encounters this issue and I would like to work on 
it. As per the suggestion of [~hao.li], is the idea acceptable, [~dongjoon]?

> Set maxDisplayLogSize for spark history server
> --
>
> Key: SPARK-27719
> URL: https://issues.apache.org/jira/browse/SPARK-27719
> Project: Spark
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 3.0.0
>Reporter: hao.li
>Priority: Minor
>
> Sometimes a very large eventllog may be useless, and parses it may waste many 
> resources.
> It may be useful to  avoid parse large enventlogs by setting a configuration 
> spark.history.fs.maxDisplayLogSize.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29174) LOCAL is not supported in INSERT OVERWRITE DIRECTORY to data source

2019-09-19 Thread Ajith S (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933203#comment-16933203
 ] 

Ajith S commented on SPARK-29174:
-

Had checked with the original author about this: 
https://github.com/apache/spark/pull/18975#issuecomment-523261355

> LOCAL is not supported in INSERT OVERWRITE DIRECTORY to data source
> ---
>
> Key: SPARK-29174
> URL: https://issues.apache.org/jira/browse/SPARK-29174
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>
> *using does not work for insert overwrite when in local  but works when 
> insert overwrite in HDFS directory*
>  ** 
>  
> 0: jdbc:hive2://10.18.18.214:23040/default> insert overwrite directory 
> '/user/trash2/' using parquet select * from trash1 a where a.country='PAK';
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.448 seconds)
> 0: jdbc:hive2://10.18.18.214:23040/default> insert overwrite local directory 
> '/opt/trash2/' using parquet select * from trash1 a where a.country='PAK';
> Error: org.apache.spark.sql.catalyst.parser.ParseException:
> LOCAL is not supported in INSERT OVERWRITE DIRECTORY to data source(line 1, 
> pos 0)
>  
> == SQL ==
> insert overwrite local directory '/opt/trash2/' using parquet select * from 
> trash1 a where a.country='PAK'
> ^^^ (state=,code=0)
> 0: jdbc:hive2://10.18.18.214:23040/default> insert overwrite local directory 
> '/opt/trash2/' stored as parquet select * from trash1 a where a.country='PAK';
> +-+--+
> | Result  |
> +-+--+
> | | |
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-28848) insert overwrite local directory stored as parquet does not creates snappy.parquet data file at local directory path

2019-08-22 Thread Ajith S (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-28848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S resolved SPARK-28848.
-
Resolution: Duplicate

Will be fixed as part of SPARK-28659

> insert overwrite local directory  stored as parquet does not creates 
> snappy.parquet data file at local directory path
> ---
>
> Key: SPARK-28848
> URL: https://issues.apache.org/jira/browse/SPARK-28848
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>
> {code}
> 0: jdbc:hive2://10.18.18.214:23040/func> insert overwrite local directory 
> '/opt/trash4/' stored as parquet select * from trash1 a where a.country='PAK';
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (1.368 seconds)
> {code}
> Data file at local directory path:
> {code}
> vm1:/opt/trash4 # ll
> total 12
> -rw-r--r-- 1 root root   8 Aug 22 14:30 ._SUCCESS.crc
> -rw-r--r-- 1 root root  16 Aug 22 14:30 
> .part-1-2b17ec6a-ef7e-4b45-927e-f93b88ff4f65-c000.crc
> -rw-r--r-- 1 root root   0 Aug 22 14:30 _SUCCESS
> -rw-r--r-- 1 root root 619 Aug 22 14:30 
> part-1-2b17ec6a-ef7e-4b45-927e-f93b88ff4f65-c000
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28848) insert overwrite local directory stored as parquet does not creates snappy.parquet data file at local directory path

2019-08-21 Thread Ajith S (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-28848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16913027#comment-16913027
 ] 

Ajith S commented on SPARK-28848:
-

Thanks for reporting. Will look into this. Initial thought: in 
org.apache.spark.sql.hive.execution.HiveFileFormat#prepareWrite, 
OutputWriterFactory.getFileExtension may not be returning the right file 
extension in the 'stored as' case.

> insert overwrite local directory  stored as parquet does not creates 
> snappy.parquet data file at local directory path
> ---
>
> Key: SPARK-28848
> URL: https://issues.apache.org/jira/browse/SPARK-28848
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.3
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
>
> 0: jdbc:hive2://10.18.18.214:23040/func> insert overwrite local directory 
> '/opt/trash4/' stored as parquet select * from trash1 a where a.country='PAK';
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (1.368 seconds)
> Data file at local directory path:
> vm1:/opt/trash4 # ll
> total 12
> -rw-r--r-- 1 root root   8 Aug 22 14:30 ._SUCCESS.crc
> -rw-r--r-- 1 root root  16 Aug 22 14:30 
> .part-1-2b17ec6a-ef7e-4b45-927e-f93b88ff4f65-c000.crc
> -rw-r--r-- 1 root root   0 Aug 22 14:30 _SUCCESS
> -rw-r--r-- 1 root root 619 Aug 22 14:30 
> part-1-2b17ec6a-ef7e-4b45-927e-f93b88ff4f65-c000



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28726) Spark with DynamicAllocation always got connect rest by peers

2019-08-14 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907112#comment-16907112
 ] 

Ajith S commented on SPARK-28726:
-

As I see it, this is the driver trying to clean up RDDs, broadcasts etc. from the 
expiring executor while the executor has already gone down, which is why such 
exceptions are logged as warnings. Does the issue occur with higher timeouts too?
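
A minimal sketch of how a larger idle timeout could be set to verify this (the 60s value below is only an example, not a recommendation):

{code:scala}
// Sketch: raise the dynamic-allocation idle timeout and re-run the workload to
// check whether the "Connection reset by peer" warnings still appear.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("idle-timeout-check")
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.shuffle.service.enabled", "true")
  .config("spark.dynamicAllocation.executorIdleTimeout", "60s") // instead of 5s
  .getOrCreate()
{code}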

> Spark with DynamicAllocation always got connect rest by peers
> -
>
> Key: SPARK-28726
> URL: https://issues.apache.org/jira/browse/SPARK-28726
> Project: Spark
>  Issue Type: Wish
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: angerszhu
>Priority: Major
>
> When using Spark with dynamic allocation, we set the idle time to 5s.
> We always get netty 'Connection reset by peer' exceptions.
>  
> I suspect this is because the 5s idle time is too small: by the time the 
> BlockManager makes the netty IO call, the executor has already been removed 
> because of the timeout, but the driver's BlockManager was not notified in time.
> {code:java}
> 19/08/14 00:00:46 WARN 
> org.apache.spark.network.server.TransportChannelHandler: "Exception in 
> connection from /host:port"
> java.io.IOException: Connection reset by peer
>  at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>  at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>  at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>  at sun.nio.ch.IOUtil.read(IOUtil.java:192)
>  at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
>  at 
> io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:288)
>  at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1106)
>  at 
> io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:343)
>  at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:123)
>  at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
>  at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
>  at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
>  at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
>  at 
> io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
>  at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
> --
> 19/08/14 00:00:46 WARN org.apache.spark.storage.BlockManagerMasterEndpoint: 
> "Error trying to remove broadcast 67 from block manager BlockManagerId(967, 
> host, port, None)"
> java.io.IOException: Connection reset by peer
>  at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>  at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>  at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>  at sun.nio.ch.IOUtil.read(IOUtil.java:192)
>  at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
>  at 
> io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:288)
>  at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1106)
>  at 
> io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:343)
>  at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:123)
>  at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
>  at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
>  at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
>  at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
>  at 
> io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
>  at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
> --
> 19/08/14 00:00:46 INFO org.apache.spark.ContextCleaner: "Cleaned accumulator 
> 162174"
> 19/08/14 00:00:46 WARN org.apache.spark.storage.BlockManagerMaster: "Failed 
> to remove shuffle 22 - Connection reset by peer"
> java.io.IOException: Connection reset by peer
>  at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>  at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39){code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28696) create database _; allowing in Spark but Hive throws Parse Exception

2019-08-13 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906881#comment-16906881
 ] 

Ajith S commented on SPARK-28696:
-

[~hyukjin.kwon] okay, should we disallow table names starting with _ rather 
than throwing a path error, as in SPARK-28697?

> create database _; allowing in Spark but Hive throws Parse Exception
> 
>
> Key: SPARK-28696
> URL: https://issues.apache.org/jira/browse/SPARK-28696
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Minor
>
> In Spark
> {code}
> spark-sql> create database _;
> Time taken: 0.062 seconds
> spark-sql> show databases;
> _
> adap
> adaptive
> adaptive_tc8
> {code}
> In Hive
> {code}
> 0: jdbc:hive2://10.18.98.147:21066/> create database _;
> Error: Error while compiling statement: FAILED: ParseException line 1:16 
> cannot recognize input near '_--0' '' '' in create database 
> statement (state=42000,code=4)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28696) create database _; allowing in Spark but Hive throws Parse Exception

2019-08-12 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905196#comment-16905196
 ] 

Ajith S commented on SPARK-28696:
-

I think we should disallow identifiers starting with _ for create database 
and create table.
We can partially see its effect in SPARK-28697: when the table name starts 
with _ (like _sampleTable), the FileFormat assumes it to be a hidden folder 
and does not list it, which causes unusual behavior.
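
A hypothetical sketch of the kind of validation being proposed (validateIdentifier is an illustrative helper, not an existing Spark API):

{code:scala}
// Hypothetical helper: reject identifiers that file-based data sources would
// treat as hidden paths (names starting with "_" or ".").
def validateIdentifier(name: String): Unit = {
  if (name.startsWith("_") || name.startsWith(".")) {
    throw new IllegalArgumentException(
      s"Identifier '$name' starts with '_' or '.', which file listing treats as hidden")
  }
}
{code}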

> create database _; allowing in Spark but Hive throws Parse Exception
> 
>
> Key: SPARK-28696
> URL: https://issues.apache.org/jira/browse/SPARK-28696
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Minor
>
> In Spark
> spark-sql> create database _;
> Time taken: 0.062 seconds
> spark-sql> show databases;
> _
> adap
> adaptive
> adaptive_tc8
> In Hive
> 0: jdbc:hive2://10.18.98.147:21066/> create database _;
> Error: Error while compiling statement: FAILED: ParseException line 1:16 
> cannot recognize input near '_--0' '' '' in create database 
> statement (state=42000,code=4)



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-28697) select * from _; throws InvalidInputException and says path does not exists at HDFS side

2019-08-12 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905144#comment-16905144
 ] 

Ajith S edited comment on SPARK-28697 at 8/12/19 1:28 PM:
--

 !screenshot-1.png! 

Here, due to 
org.apache.hadoop.fs.FileSystem#globStatus(org.apache.hadoop.fs.Path, 
org.apache.hadoop.fs.PathFilter), it returns null matches when the folder name 
starts with _.

This in fact comes from Hadoop code, namely 
org.apache.hadoop.mapred.FileInputFormat#hiddenFileFilter, which filters out 
paths that start with _.
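
A simplified sketch of that filter behaviour (it mirrors Hadoop's hiddenFileFilter; this is an illustration, not the actual Hadoop source):

{code:scala}
// Paths whose last component starts with "_" or "." are treated as hidden and
// skipped during input listing, which is why the table directory "_" is not found.
import org.apache.hadoop.fs.{Path, PathFilter}

val hiddenFileFilter: PathFilter = new PathFilter {
  override def accept(p: Path): Boolean = {
    val name = p.getName
    !name.startsWith("_") && !name.startsWith(".")
  }
}
{code}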


was (Author: ajithshetty):
 !screenshot-1.png! 

Here due to 
org.apache.hadoop.fs.FileSystem#globStatus(org.apache.hadoop.fs.Path, 
org.apache.hadoop.fs.PathFilter), it return null matches when folder name 
starts with _

> select * from _;  throws InvalidInputException and says path does not exists 
> at HDFS side
> -
>
> Key: SPARK-28697
> URL: https://issues.apache.org/jira/browse/SPARK-28697
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
> Attachments: screenshot-1.png
>
>
> spark-sql> create database func1;
> Time taken: 0.095 seconds
> spark-sql> use func1;
> Time taken: 0.031 seconds
> spark-sql> create table _(id int);
> Time taken: 0.351 seconds
> spark-sql> insert into _ values(1);
> Time taken: 3.148 seconds
> spark-sql> select * from _;
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
> hdfs://hacluster/user/sparkhive/warehouse/func1.db/_
> at 
> org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
> But at HDFS side it is present
> vm1:/opt/HA/C10/install/hadoop/nodemanager/bin # ./hdfs dfs -ls 
> /user/sparkhive/warehouse/func1.db
> Found 2 items
> drwxr-xr-x   - root hadoop  0 2019-08-12 20:02 
> /user/sparkhive/warehouse/func1.db/_



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-28697) select * from _; throws InvalidInputException and says path does not exists at HDFS side

2019-08-12 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905144#comment-16905144
 ] 

Ajith S edited comment on SPARK-28697 at 8/12/19 12:44 PM:
---

 !screenshot-1.png! 

Here due to 
org.apache.hadoop.fs.FileSystem#globStatus(org.apache.hadoop.fs.Path, 
org.apache.hadoop.fs.PathFilter), it return null matches when folder name 
starts with _


was (Author: ajithshetty):
 !screenshot-1.png! 

Here due to 
org.apache.hadoop.fs.FileSystem#globStatus(org.apache.hadoop.fs.Path, 
org.apache.hadoop.fs.PathFilter), it return 0 matches when folder name starts 
with _

> select * from _;  throws InvalidInputException and says path does not exists 
> at HDFS side
> -
>
> Key: SPARK-28697
> URL: https://issues.apache.org/jira/browse/SPARK-28697
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
> Attachments: screenshot-1.png
>
>
> spark-sql> create database func1;
> Time taken: 0.095 seconds
> spark-sql> use func1;
> Time taken: 0.031 seconds
> spark-sql> create table _(id int);
> Time taken: 0.351 seconds
> spark-sql> insert into _ values(1);
> Time taken: 3.148 seconds
> spark-sql> select * from _;
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
> hdfs://hacluster/user/sparkhive/warehouse/func1.db/_
> at 
> org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
> But at HDFS side it is present
> vm1:/opt/HA/C10/install/hadoop/nodemanager/bin # ./hdfs dfs -ls 
> /user/sparkhive/warehouse/func1.db
> Found 2 items
> drwxr-xr-x   - root hadoop  0 2019-08-12 20:02 
> /user/sparkhive/warehouse/func1.db/_



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-28697) select * from _; throws InvalidInputException and says path does not exists at HDFS side

2019-08-12 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905144#comment-16905144
 ] 

Ajith S commented on SPARK-28697:
-

 !screenshot-1.png! 

Here due to 
org.apache.hadoop.fs.FileSystem#globStatus(org.apache.hadoop.fs.Path, 
org.apache.hadoop.fs.PathFilter), it return 0 matches when folder name starts 
with _

> select * from _;  throws InvalidInputException and says path does not exists 
> at HDFS side
> -
>
> Key: SPARK-28697
> URL: https://issues.apache.org/jira/browse/SPARK-28697
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
> Attachments: screenshot-1.png
>
>
> spark-sql> create database func1;
> Time taken: 0.095 seconds
> spark-sql> use func1;
> Time taken: 0.031 seconds
> spark-sql> create table _(id int);
> Time taken: 0.351 seconds
> spark-sql> insert into _ values(1);
> Time taken: 3.148 seconds
> spark-sql> select * from _;
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
> hdfs://hacluster/user/sparkhive/warehouse/func1.db/_
> at 
> org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
> But at HDFS side it is present
> vm1:/opt/HA/C10/install/hadoop/nodemanager/bin # ./hdfs dfs -ls 
> /user/sparkhive/warehouse/func1.db
> Found 2 items
> drwxr-xr-x   - root hadoop  0 2019-08-12 20:02 
> /user/sparkhive/warehouse/func1.db/_



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28697) select * from _; throws InvalidInputException and says path does not exists at HDFS side

2019-08-12 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-28697:

Attachment: screenshot-1.png

> select * from _;  throws InvalidInputException and says path does not exists 
> at HDFS side
> -
>
> Key: SPARK-28697
> URL: https://issues.apache.org/jira/browse/SPARK-28697
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Major
> Attachments: screenshot-1.png
>
>
> spark-sql> create database func1;
> Time taken: 0.095 seconds
> spark-sql> use func1;
> Time taken: 0.031 seconds
> spark-sql> create table _(id int);
> Time taken: 0.351 seconds
> spark-sql> insert into _ values(1);
> Time taken: 3.148 seconds
> spark-sql> select * from _;
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
> hdfs://hacluster/user/sparkhive/warehouse/func1.db/_
> at 
> org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
> But at HDFS side it is present
> vm1:/opt/HA/C10/install/hadoop/nodemanager/bin # ./hdfs dfs -ls 
> /user/sparkhive/warehouse/func1.db
> Found 2 items
> drwxr-xr-x   - root hadoop  0 2019-08-12 20:02 
> /user/sparkhive/warehouse/func1.db/_



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-28697) select * from _; throws InvalidInputException and says path does not exists at HDFS side

2019-08-12 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905138#comment-16905138
 ] 

Ajith S edited comment on SPARK-28697 at 8/12/19 12:35 PM:
---

Found this even in the single node, local filesystem case.

select * from _;
select * from _table1;

Below is the stack trace:
19/08/12 18:00:18 ERROR SparkSQLDriver: Failed in [select * from _]
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
file:/home/root1/spark/install/spark-3.0.0-SNAPSHOT-bin-custom-spark/bin/spark-warehouse/_
at 
org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:297)
at 
org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:239)
at 
org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:325)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:205)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:256)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:254)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:256)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:254)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:256)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:254)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:256)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:254)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:256)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:254)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2119)
at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:961)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:366)
at org.apache.spark.rdd.RDD.collect(RDD.scala:960)
at 
org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:372)
at 
org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:399)
at 
org.apache.spark.sql.execution.HiveResult$.hiveResultString(HiveResult.scala:52)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.$anonfun$run$1(SparkSQLDriver.scala:65)
at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$4(SQLExecution.scala:100)
at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:87)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:65)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:368)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:273)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:920)
at 
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:179)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:202)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:89)
at 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:999)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1008)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
org.apache.hadoop.mapred.InvalidInputException: Input path does no

[jira] [Comment Edited] (SPARK-28697) select * from _; throws InvalidInputException and says path does not exists at HDFS side

2019-08-12 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905138#comment-16905138
 ] 

Ajith S edited comment on SPARK-28697 at 8/12/19 12:35 PM:
---

Found this even in the single node, local filesystem case. Will work on it.

select * from _;
select * from _table1;

Below is the stack trace:
19/08/12 18:00:18 ERROR SparkSQLDriver: Failed in [select * from _]
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
file:/home/root1/spark/install/spark-3.0.0-SNAPSHOT-bin-custom-spark/bin/spark-warehouse/_
at 
org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:297)
at 
org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:239)
at 
org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:325)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:205)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:256)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:254)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:256)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:254)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:256)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:254)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:256)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:254)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:256)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:254)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2119)
at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:961)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:366)
at org.apache.spark.rdd.RDD.collect(RDD.scala:960)
at 
org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:372)
at 
org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:399)
at 
org.apache.spark.sql.execution.HiveResult$.hiveResultString(HiveResult.scala:52)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.$anonfun$run$1(SparkSQLDriver.scala:65)
at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$4(SQLExecution.scala:100)
at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:87)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:65)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:368)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:273)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:920)
at 
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:179)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:202)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:89)
at 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:999)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1008)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
org.apache.hadoop.mapred.InvalidInputException: In

[jira] [Commented] (SPARK-28697) select * from _; throws InvalidInputException and says path does not exists at HDFS side

2019-08-12 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905138#comment-16905138
 ] 

Ajith S commented on SPARK-28697:
-

Found this even in the single node, local filesystem case.

Below is the stack trace:
19/08/12 18:00:18 ERROR SparkSQLDriver: Failed in [select * from _]
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
file:/home/root1/spark/install/spark-3.0.0-SNAPSHOT-bin-custom-spark/bin/spark-warehouse/_
at 
org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:297)
at 
org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:239)
at 
org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:325)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:205)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:256)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:254)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:256)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:254)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:256)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:254)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:256)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:254)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:256)
at scala.Option.getOrElse(Option.scala:138)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:254)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2119)
at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:961)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:366)
at org.apache.spark.rdd.RDD.collect(RDD.scala:960)
at 
org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:372)
at 
org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:399)
at 
org.apache.spark.sql.execution.HiveResult$.hiveResultString(HiveResult.scala:52)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.$anonfun$run$1(SparkSQLDriver.scala:65)
at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$4(SQLExecution.scala:100)
at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:87)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:65)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:368)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:273)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:920)
at 
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:179)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:202)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:89)
at 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:999)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1008)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
file:/home/root1/spark/install/spark-3.0.0-SNAPSHOT-bin-custom-spark/bin/spark-war

[jira] [Commented] (SPARK-28696) create database _; allowing in Spark but Hive throws Parse Exception

2019-08-12 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-28696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905127#comment-16905127
 ] 

Ajith S commented on SPARK-28696:
-

Thanks for reporting. I can see this is due to 
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4#L1694
 where just "_" is a valid identifier. To keep it compatible with Hive I can 
fix this. Any thoughts, [~srowen] [~dongjoon]?

> create database _; allowing in Spark but Hive throws Parse Exception
> 
>
> Key: SPARK-28696
> URL: https://issues.apache.org/jira/browse/SPARK-28696
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: ABHISHEK KUMAR GUPTA
>Priority: Minor
>
> In Spark
> spark-sql> create database _;
> Time taken: 0.062 seconds
> spark-sql> show databases;
> _
> adap
> adaptive
> adaptive_tc8
> In Hive
> 0: jdbc:hive2://10.18.98.147:21066/> create database _;
> Error: Error while compiling statement: FAILED: ParseException line 1:16 
> cannot recognize input near '_--0' '' '' in create database 
> statement (state=42000,code=4)



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-28676) Avoid Excessive logging from ContextCleaner

2019-08-09 Thread Ajith S (JIRA)
Ajith S created SPARK-28676:
---

 Summary: Avoid Excessive logging from ContextCleaner
 Key: SPARK-28676
 URL: https://issues.apache.org/jira/browse/SPARK-28676
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 2.4.3, 2.3.3, 3.0.0
Reporter: Ajith S


In high workload environments, ContextCleaner seems to produce excessive logging 
at INFO level which does not give much information. In one particular case we see 
that the ``INFO ContextCleaner: Cleaned accumulator`` message makes up 25-30% of 
the generated logs. We can log this cleanup information at DEBUG level instead.
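
An illustrative sketch of the proposed change (not the actual ContextCleaner code; it only shows the per-accumulator message moving from INFO to DEBUG):

{code:scala}
// Sketch: keep the accumulator-cleanup message, but emit it at DEBUG so it no
// longer dominates INFO-level driver logs under high workload.
import org.slf4j.LoggerFactory

object CleanerLogging {
  private val log = LoggerFactory.getLogger("org.apache.spark.ContextCleaner")

  def accumulatorCleaned(accId: Long): Unit =
    log.debug(s"Cleaned accumulator $accId") // previously logged at INFO
}
{code}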



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23626) DAGScheduler blocked due to JobSubmitted event

2019-05-31 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-23626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-23626:

Affects Version/s: 2.4.3

>  DAGScheduler blocked due to JobSubmitted event
> ---
>
> Key: SPARK-23626
> URL: https://issues.apache.org/jira/browse/SPARK-23626
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: 2.2.1, 2.3.3, 3.0.0, 2.4.3
>Reporter: Ajith S
>Priority: Major
>
> DAGScheduler becomes a bottleneck in cluster when multiple JobSubmitted 
> events has to be processed as DAGSchedulerEventProcessLoop is single threaded 
> and it will block other tasks in queue like TaskCompletion.
> The JobSubmitted event is time consuming depending on the nature of the job 
> (Example: calculating parent stage dependencies, shuffle dependencies, 
> partitions) and thus it blocks all the events to be processed.
>  
> I see multiple JIRA referring to this behavior
> https://issues.apache.org/jira/browse/SPARK-2647
> https://issues.apache.org/jira/browse/SPARK-4961
>  
> Similarly in my cluster some jobs partition calculation is time consuming 
> (Similar to stack at SPARK-2647) hence it slows down the spark 
> DAGSchedulerEventProcessLoop which results in user jobs to slowdown, even if 
> its tasks are finished within seconds, as TaskCompletion Events are processed 
> at a slower rate due to blockage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23626) DAGScheduler blocked due to JobSubmitted event

2019-05-31 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-23626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-23626:

Summary:  DAGScheduler blocked due to JobSubmitted event  (was: Spark 
DAGScheduler scheduling performance hindered on JobSubmitted Event)

>  DAGScheduler blocked due to JobSubmitted event
> ---
>
> Key: SPARK-23626
> URL: https://issues.apache.org/jira/browse/SPARK-23626
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: 2.2.1, 2.3.3, 3.0.0
>Reporter: Ajith S
>Priority: Major
>
> DAGScheduler becomes a bottleneck in cluster when multiple JobSubmitted 
> events has to be processed as DAGSchedulerEventProcessLoop is single threaded 
> and it will block other tasks in queue like TaskCompletion.
> The JobSubmitted event is time consuming depending on the nature of the job 
> (Example: calculating parent stage dependencies, shuffle dependencies, 
> partitions) and thus it blocks all the events to be processed.
>  
> I see multiple JIRA referring to this behavior
> https://issues.apache.org/jira/browse/SPARK-2647
> https://issues.apache.org/jira/browse/SPARK-4961
>  
> Similarly in my cluster some jobs partition calculation is time consuming 
> (Similar to stack at SPARK-2647) hence it slows down the spark 
> DAGSchedulerEventProcessLoop which results in user jobs to slowdown, even if 
> its tasks are finished within seconds, as TaskCompletion Events are processed 
> at a slower rate due to blockage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23626) Spark DAGScheduler scheduling performance hindered on JobSubmitted Event

2019-05-31 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-23626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-23626:

Labels:   (was: bulk-closed)

> Spark DAGScheduler scheduling performance hindered on JobSubmitted Event
> 
>
> Key: SPARK-23626
> URL: https://issues.apache.org/jira/browse/SPARK-23626
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: 2.2.1, 2.3.3, 3.0.0
>Reporter: Ajith S
>Priority: Major
>
> DAGScheduler becomes a bottleneck in cluster when multiple JobSubmitted 
> events has to be processed as DAGSchedulerEventProcessLoop is single threaded 
> and it will block other tasks in queue like TaskCompletion.
> The JobSubmitted event is time consuming depending on the nature of the job 
> (Example: calculating parent stage dependencies, shuffle dependencies, 
> partitions) and thus it blocks all the events to be processed.
>  
> I see multiple JIRA referring to this behavior
> https://issues.apache.org/jira/browse/SPARK-2647
> https://issues.apache.org/jira/browse/SPARK-4961
>  
> Similarly in my cluster some jobs partition calculation is time consuming 
> (Similar to stack at SPARK-2647) hence it slows down the spark 
> DAGSchedulerEventProcessLoop which results in user jobs to slowdown, even if 
> its tasks are finished within seconds, as TaskCompletion Events are processed 
> at a slower rate due to blockage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27907) HiveUDAF with 0 rows throw NPE

2019-05-31 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-27907:

Description: 
When a query returns zero rows, HiveUDAFFunction throws an NPE.

CASE 1:
create table abc(a int)
select histogram_numeric(a,2) from abc // NPE

Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most 
recent failure: Lost task 0.0 in stage 1.0 (TID 0, localhost, executor driver): 
java.lang.NullPointerException
at org.apache.spark.sql.hive.HiveUDAFFunction.eval(hiveUDFs.scala:471)
at org.apache.spark.sql.hive.HiveUDAFFunction.eval(hiveUDFs.scala:315)
at 
org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.eval(interfaces.scala:543)
at 
org.apache.spark.sql.execution.aggregate.AggregationIterator.$anonfun$generateResultProjection$5(AggregationIterator.scala:231)
at 
org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.outputForEmptyGroupingKeyWithoutInput(ObjectAggregationIterator.scala:97)
at 
org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec.$anonfun$doExecute$2(ObjectHashAggregateExec.scala:132)
at 
org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec.$anonfun$doExecute$2$adapted(ObjectHashAggregateExec.scala:107)
at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2(RDD.scala:839)
at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2$adapted(RDD.scala:839)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:327)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:291)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:327)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:291)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:327)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:291)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:122)
at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:425)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1350)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:428)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)


CASE 2:

create table abc(a int)
insert into abc values (1)
select histogram_numeric(a,2) from abc where a=3 //NPE

Job aborted due to stage failure: Task 0 in stage 4.0 failed 1 times, most 
recent failure: Lost task 0.0 in stage 4.0 (TID 5, localhost, executor driver): 
java.lang.NullPointerException
at 
org.apache.spark.sql.hive.HiveUDAFFunction.serialize(hiveUDFs.scala:477)
at 
org.apache.spark.sql.hive.HiveUDAFFunction.serialize(hiveUDFs.scala:315)
at 
org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.serializeAggregateBufferInPlace(interfaces.scala:570)
at 
org.apache.spark.sql.execution.aggregate.AggregationIterator.$anonfun$generateResultProjection$6(AggregationIterator.scala:254)
at 
org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.outputForEmptyGroupingKeyWithoutInput(ObjectAggregationIterator.scala:97)
at 
org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec.$anonfun$doExecute$2(ObjectHashAggregateExec.scala:132)
at 
org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec.$anonfun$doExecute$2$adapted(ObjectHashAggregateExec.scala:107)
at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2(RDD.scala:839)
at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2$adapted(RDD.scala:839)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:327)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:291)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:327)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:291)
at 
org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:94)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
at org.apache.spark.scheduler.Task.run(T

[jira] [Updated] (SPARK-27907) HiveUDAF with 0 rows throw NPE

2019-05-31 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-27907:

Summary: HiveUDAF with 0 rows throw NPE  (was: HiveUDAF with 0 rows throw 
NPE when try to serialize)

> HiveUDAF with 0 rows throw NPE
> --
>
> Key: SPARK-27907
> URL: https://issues.apache.org/jira/browse/SPARK-27907
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.3.3, 3.0.0, 2.4.3, 3.1.0
>Reporter: Ajith S
>Priority: Major
>
> When a query returns zero rows, HiveUDAFFunction.serialize throws an NPE.
> create table abc(a int)
> insert into abc values (1)
> insert into abc values (2)
> select histogram_numeric(a,2) from abc where a=3 //NPE
> Job aborted due to stage failure: Task 0 in stage 4.0 failed 1 times, most 
> recent failure: Lost task 0.0 in stage 4.0 (TID 5, localhost, executor 
> driver): java.lang.NullPointerException
>   at 
> org.apache.spark.sql.hive.HiveUDAFFunction.serialize(hiveUDFs.scala:477)
>   at 
> org.apache.spark.sql.hive.HiveUDAFFunction.serialize(hiveUDFs.scala:315)
>   at 
> org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.serializeAggregateBufferInPlace(interfaces.scala:570)
>   at 
> org.apache.spark.sql.execution.aggregate.AggregationIterator.$anonfun$generateResultProjection$6(AggregationIterator.scala:254)
>   at 
> org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.outputForEmptyGroupingKeyWithoutInput(ObjectAggregationIterator.scala:97)
>   at 
> org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec.$anonfun$doExecute$2(ObjectHashAggregateExec.scala:132)
>   at 
> org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec.$anonfun$doExecute$2$adapted(ObjectHashAggregateExec.scala:107)
>   at 
> org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2(RDD.scala:839)
>   at 
> org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2$adapted(RDD.scala:839)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:327)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:291)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:327)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:291)
>   at 
> org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:94)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
>   at org.apache.spark.scheduler.Task.run(Task.scala:122)
>   at 
> org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:425)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1350)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:428)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27907) HiveUDAF with 0 rows throw NPE when try to serialize

2019-05-31 Thread Ajith S (JIRA)
Ajith S created SPARK-27907:
---

 Summary: HiveUDAF with 0 rows throw NPE when try to serialize
 Key: SPARK-27907
 URL: https://issues.apache.org/jira/browse/SPARK-27907
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 2.4.3, 2.3.3, 3.0.0, 3.1.0
Reporter: Ajith S


When a query returns zero rows, HiveUDAFFunction.serialize throws an NPE.

create table abc(a int)
insert into abc values (1)
insert into abc values (2)
select histogram_numeric(a,2) from abc where a=3 //NPE

Job aborted due to stage failure: Task 0 in stage 4.0 failed 1 times, most 
recent failure: Lost task 0.0 in stage 4.0 (TID 5, localhost, executor driver): 
java.lang.NullPointerException
at 
org.apache.spark.sql.hive.HiveUDAFFunction.serialize(hiveUDFs.scala:477)
at 
org.apache.spark.sql.hive.HiveUDAFFunction.serialize(hiveUDFs.scala:315)
at 
org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.serializeAggregateBufferInPlace(interfaces.scala:570)
at 
org.apache.spark.sql.execution.aggregate.AggregationIterator.$anonfun$generateResultProjection$6(AggregationIterator.scala:254)
at 
org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.outputForEmptyGroupingKeyWithoutInput(ObjectAggregationIterator.scala:97)
at 
org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec.$anonfun$doExecute$2(ObjectHashAggregateExec.scala:132)
at 
org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec.$anonfun$doExecute$2$adapted(ObjectHashAggregateExec.scala:107)
at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2(RDD.scala:839)
at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2$adapted(RDD.scala:839)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:327)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:291)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:327)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:291)
at 
org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:94)
at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
at org.apache.spark.scheduler.Task.run(Task.scala:122)
at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:425)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1350)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:428)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
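
An illustrative null-guard sketch of the kind of handling this needs (assumed shape only, not the actual fix in hiveUDFs.scala): when no rows were aggregated, the underlying aggregation buffer can be null, so serialization should not dereference it.

{code:scala}
// Sketch: guard against a null aggregation buffer produced by a zero-row input.
def serializeBuffer(buffer: AnyRef)(doSerialize: AnyRef => Array[Byte]): Array[Byte] =
  if (buffer == null) Array.emptyByteArray else doSerialize(buffer)
{code}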



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23626) Spark DAGScheduler scheduling performance hindered on JobSubmitted Event

2019-05-21 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-23626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-23626:

Affects Version/s: 3.0.0

> Spark DAGScheduler scheduling performance hindered on JobSubmitted Event
> 
>
> Key: SPARK-23626
> URL: https://issues.apache.org/jira/browse/SPARK-23626
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: 2.2.1, 2.3.3, 3.0.0
>Reporter: Ajith S
>Priority: Major
>  Labels: bulk-closed
>
> DAGScheduler becomes a bottleneck in cluster when multiple JobSubmitted 
> events has to be processed as DAGSchedulerEventProcessLoop is single threaded 
> and it will block other tasks in queue like TaskCompletion.
> The JobSubmitted event is time consuming depending on the nature of the job 
> (Example: calculating parent stage dependencies, shuffle dependencies, 
> partitions) and thus it blocks all the events to be processed.
>  
> I see multiple JIRA referring to this behavior
> https://issues.apache.org/jira/browse/SPARK-2647
> https://issues.apache.org/jira/browse/SPARK-4961
>  
> Similarly in my cluster some jobs partition calculation is time consuming 
> (Similar to stack at SPARK-2647) hence it slows down the spark 
> DAGSchedulerEventProcessLoop which results in user jobs to slowdown, even if 
> its tasks are finished within seconds, as TaskCompletion Events are processed 
> at a slower rate due to blockage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23626) Spark DAGScheduler scheduling performance hindered on JobSubmitted Event

2019-05-21 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-23626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-23626:

Affects Version/s: 2.3.3

> Spark DAGScheduler scheduling performance hindered on JobSubmitted Event
> 
>
> Key: SPARK-23626
> URL: https://issues.apache.org/jira/browse/SPARK-23626
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: 2.2.1, 2.3.3
>Reporter: Ajith S
>Priority: Major
>  Labels: bulk-closed
>
> DAGScheduler becomes a bottleneck in cluster when multiple JobSubmitted 
> events has to be processed as DAGSchedulerEventProcessLoop is single threaded 
> and it will block other tasks in queue like TaskCompletion.
> The JobSubmitted event is time consuming depending on the nature of the job 
> (Example: calculating parent stage dependencies, shuffle dependencies, 
> partitions) and thus it blocks all the events to be processed.
>  
> I see multiple JIRA referring to this behavior
> https://issues.apache.org/jira/browse/SPARK-2647
> https://issues.apache.org/jira/browse/SPARK-4961
>  
> Similarly in my cluster some jobs partition calculation is time consuming 
> (Similar to stack at SPARK-2647) hence it slows down the spark 
> DAGSchedulerEventProcessLoop which results in user jobs to slowdown, even if 
> its tasks are finished within seconds, as TaskCompletion Events are processed 
> at a slower rate due to blockage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Reopened] (SPARK-23626) Spark DAGScheduler scheduling performance hindered on JobSubmitted Event

2019-05-21 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-23626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S reopened SPARK-23626:
-

Resolution in progress

> Spark DAGScheduler scheduling performance hindered on JobSubmitted Event
> 
>
> Key: SPARK-23626
> URL: https://issues.apache.org/jira/browse/SPARK-23626
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: 2.2.1
>Reporter: Ajith S
>Priority: Major
>  Labels: bulk-closed
>
> DAGScheduler becomes a bottleneck in cluster when multiple JobSubmitted 
> events has to be processed as DAGSchedulerEventProcessLoop is single threaded 
> and it will block other tasks in queue like TaskCompletion.
> The JobSubmitted event is time consuming depending on the nature of the job 
> (Example: calculating parent stage dependencies, shuffle dependencies, 
> partitions) and thus it blocks all the events to be processed.
>  
> I see multiple JIRA referring to this behavior
> https://issues.apache.org/jira/browse/SPARK-2647
> https://issues.apache.org/jira/browse/SPARK-4961
>  
> Similarly in my cluster some jobs partition calculation is time consuming 
> (Similar to stack at SPARK-2647) hence it slows down the spark 
> DAGSchedulerEventProcessLoop which results in user jobs to slowdown, even if 
> its tasks are finished within seconds, as TaskCompletion Events are processed 
> at a slower rate due to blockage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27264) spark sql released all executor but the job is not done

2019-03-24 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800323#comment-16800323
 ] 

Ajith S commented on SPARK-27264:
-

Can you add a snippet to reproduce this?

> spark sql released all executor but the job is not done
> ---
>
> Key: SPARK-27264
> URL: https://issues.apache.org/jira/browse/SPARK-27264
> Project: Spark
>  Issue Type: Question
>  Components: SQL
>Affects Versions: 2.4.0
> Environment: Azure HDinsight spark 2.4 on Azure storage SQL: Read and 
> Join some data and finally write result to a Hive metastore; query executed 
> on jupyterhub; while the pre-migration cluster is a jupyter (non-hub)
>Reporter: Mike Chan
>Priority: Major
>
> I have a Spark SQL job that used to execute in under 10 minutes but now runs for 
> 3 hours after a cluster migration, and I need to deep dive into what it is 
> actually doing. I'm new to Spark, so please don't mind if I'm asking something 
> unrelated.
> I increased spark.executor.memory but had no luck. Env: Azure HDInsight Spark 2.4 
> on Azure storage. SQL: read and join some data and finally write the result to a 
> Hive metastore.
> The Spark SQL ends with below code: 
> .write.mode("overwrite").saveAsTable("default.mikemiketable")
> Application behavior: within the first 15 mins, it loads and completes most 
> tasks (199/200); only 1 executor process is left alive, continually doing shuffle 
> read / write. Because only 1 executor is left, we need to wait 3 hours until 
> this application finishes.
> [!https://i.stack.imgur.com/6hqvh.png!|https://i.stack.imgur.com/6hqvh.png]
> Left only 1 executor alive 
> [!https://i.stack.imgur.com/55162.png!|https://i.stack.imgur.com/55162.png]
> Not sure what the executor is doing: 
> [!https://i.stack.imgur.com/TwhuX.png!|https://i.stack.imgur.com/TwhuX.png]
> From time to time, we can tell the shuffle read increased: 
> [!https://i.stack.imgur.com/WhF9A.png!|https://i.stack.imgur.com/WhF9A.png]
> Therefore I increased the spark.executor.memory to 20g, but nothing changed. 
> From Ambari and YARN I can tell the cluster has many resources left. 
> [!https://i.stack.imgur.com/pngQA.png!|https://i.stack.imgur.com/pngQA.png]
> Release of almost all executors 
> [!https://i.stack.imgur.com/pA134.png!|https://i.stack.imgur.com/pA134.png]
> Any guidance is greatly appreciated.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27219) Misleading exceptions in transport code's SASL fallback path

2019-03-20 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797435#comment-16797435
 ] 

Ajith S commented on SPARK-27219:
-

So do we just log a simple one-line WARN message and print the stack trace at a 
finer (DEBUG, TRACE) log level?
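
A sketch of that logging pattern (illustrative only; the logger name is taken from the snippet quoted below, and the exact wiring into the transport code is assumed):

{code:scala}
// Sketch: one-line WARN for the expected SASL fallback, full stack only at DEBUG.
import org.slf4j.LoggerFactory

object AuthFallbackLogging {
  private val log =
    LoggerFactory.getLogger("org.apache.spark.network.crypto.AuthClientBootstrap")

  def logFallback(e: Exception): Unit = {
    log.warn(s"New auth protocol failed (${e.getMessage}), trying SASL.")
    log.debug("New auth protocol failure details", e)
  }
}
{code}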

> Misleading exceptions in transport code's SASL fallback path
> 
>
> Key: SPARK-27219
> URL: https://issues.apache.org/jira/browse/SPARK-27219
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: Marcelo Vanzin
>Priority: Minor
>
> There are a couple of code paths in the SASL fallback handling that result in 
> misleading exceptions printed to logs. One of them is if a timeout occurs 
> during authentication; for example:
> {noformat}
> 19/03/15 11:21:37 WARN crypto.AuthClientBootstrap: New auth protocol failed, 
> trying SASL.
> java.lang.RuntimeException: java.util.concurrent.TimeoutException: Timeout 
> waiting for task.
> at 
> org.spark_project.guava.base.Throwables.propagate(Throwables.java:160)
> at 
> org.apache.spark.network.client.TransportClient.sendRpcSync(TransportClient.java:258)
> at 
> org.apache.spark.network.crypto.AuthClientBootstrap.doSparkAuth(AuthClientBootstrap.java:105)
> at 
> org.apache.spark.network.crypto.AuthClientBootstrap.doBootstrap(AuthClientBootstrap.java:79)
> at 
> org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:262)
> at 
> org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:192)
> at 
> org.apache.spark.network.shuffle.ExternalShuffleClient.lambda$fetchBlocks$0(ExternalShuffleClient.java:100)
> at 
> org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:141)
> ...
> Caused by: java.util.concurrent.TimeoutException: Timeout waiting for task.
> at 
> org.spark_project.guava.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:276)
> at 
> org.spark_project.guava.util.concurrent.AbstractFuture.get(AbstractFuture.java:96)
> at 
> org.apache.spark.network.client.TransportClient.sendRpcSync(TransportClient.java:254)
> ... 38 more
> 19/03/15 11:21:38 WARN server.TransportChannelHandler: Exception in 
> connection from vc1033.halxg.cloudera.com/10.17.216.43:7337
> java.lang.IllegalArgumentException: Frame length should be positive: 
> -3702202170875367528
> at 
> org.spark_project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
> {noformat}
> The IllegalArgumentException shouldn't happen, it only happens because the 
> code is ignoring the time out and retrying, at which point the remote side is 
> in a different state and thus doesn't expect the message.
> The same line that prints that exception can result in a noisy log message 
> when the remote side (e.g. an old shuffle service) does not understand the 
> new auth protocol. Since it's a warning it seems like something is wrong, 
> when it's just doing what's expected.






[jira] [Commented] (SPARK-27220) Remove Yarn specific leftover from CoarseGrainedSchedulerBackend

2019-03-20 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797428#comment-16797428
 ] 

Ajith S commented on SPARK-27220:
-

# About making the *currentExecutorIdCounter* datatype consistent: yes, 
*currentExecutorIdCounter* starts as an int in *CoarseGrainedSchedulerBackend*, 
but the *RegisterExecutor* message it handles carries the executorId as a String, 
which makes it confusing. Also, *CoarseGrainedExecutorBackend* fires 
*RegisterExecutor* with the executorId as a String in the YARN and Mesos cases 
(see the sketch below).
 # About moving *currentExecutorIdCounter* out of *CoarseGrainedSchedulerBackend*: 
I am unsure about this, as *CoarseGrainedSchedulerBackend* just offers a mechanism 
to maintain executor ids and YARN merely reuses it (but I see Mesos ignores it 
completely and uses mesosTaskId instead, so moving *currentExecutorIdCounter* out 
to YARN does make sense).

cc [~srowen] [~dongjoon] [~hyukjin.kwon] any thoughts?
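
As a rough illustration of the datatype concern in point 1 (a sketch with made-up 
names, not the actual Spark code), any update of the counter has to guard the 
String-to-Int assumption:
{code:scala}
object ExecutorIdCounterSketch {
  @volatile private var currentExecutorIdCounter = 0

  // Sketch only: executorId is a String and (e.g. on Mesos) need not be numeric,
  // so parse defensively instead of calling toInt blindly.
  def noteRegisteredExecutor(executorId: String): Unit = synchronized {
    scala.util.Try(executorId.toInt).foreach { id =>
      if (currentExecutorIdCounter < id) currentExecutorIdCounter = id
    }
  }
}
{code}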

> Remove Yarn specific leftover from CoarseGrainedSchedulerBackend
> 
>
> Key: SPARK-27220
> URL: https://issues.apache.org/jira/browse/SPARK-27220
> Project: Spark
>  Issue Type: Task
>  Components: Spark Core, YARN
>Affects Versions: 2.0.2, 2.1.3, 2.2.3, 2.3.3, 2.4.0
>Reporter: Jacek Lewandowski
>Priority: Minor
>
> {{CoarseGrainedSchedulerBackend}} has the following field:
> {code:scala}
>   // The num of current max ExecutorId used to re-register appMaster
>   @volatile protected var currentExecutorIdCounter = 0
> {code}
> which is then updated:
> {code:scala}
>   case RegisterExecutor(executorId, executorRef, hostname, cores, 
> logUrls) =>
> ...
>   // This must be synchronized because variables mutated
>   // in this block are read when requesting executors
>   CoarseGrainedSchedulerBackend.this.synchronized {
> executorDataMap.put(executorId, data)
> if (currentExecutorIdCounter < executorId.toInt) {
>   currentExecutorIdCounter = executorId.toInt
> }
> ...
> {code}
> However it is never really used in {{CoarseGrainedSchedulerBackend}}. Its 
> only usage is in Yarn-specific code. It should be moved to Yarn then because 
> {{executorId}} is a {{String}} and there are really no guarantees that it is 
> always an integer. It was introduced in SPARK-12864






[jira] [Commented] (SPARK-27194) Job failures when task attempts do not clean up spark-staging parquet files

2019-03-20 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797241#comment-16797241
 ] 

Ajith S commented on SPARK-27194:
-

Hi [~dongjoon], I have some analysis here: 
[https://github.com/apache/spark/pull/24142#issuecomment-474866759] Please let 
me know your views.

> Job failures when task attempts do not clean up spark-staging parquet files
> ---
>
> Key: SPARK-27194
> URL: https://issues.apache.org/jira/browse/SPARK-27194
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, SQL
>Affects Versions: 2.3.1, 2.3.2
>Reporter: Reza Safi
>Priority: Major
>
> When a container fails for some reason (for example when killed by yarn for 
> exceeding memory limits), the subsequent task attempts for the tasks that 
> were running on that container all fail with a FileAlreadyExistsException. 
> The original task attempt does not seem to successfully call abortTask (or at 
> least its "best effort" delete is unsuccessful) and clean up the parquet file 
> it was writing to, so when later task attempts try to write to the same 
> spark-staging directory using the same file name, the job fails.
> Here is what transpires in the logs:
> The container where task 200.0 is running is killed and the task is lost:
> {code}
> 19/02/20 09:33:25 ERROR cluster.YarnClusterScheduler: Lost executor y on 
> t.y.z.com: Container killed by YARN for exceeding memory limits. 8.1 GB of 8 
> GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
>  19/02/20 09:33:25 WARN scheduler.TaskSetManager: Lost task 200.0 in stage 
> 0.0 (TID xxx, t.y.z.com, executor 93): ExecutorLostFailure (executor 93 
> exited caused by one of the running tasks) Reason: Container killed by YARN 
> for exceeding memory limits. 8.1 GB of 8 GB physical memory used. Consider 
> boosting spark.yarn.executor.memoryOverhead.
> {code}
> The task is re-attempted on a different executor and fails because the 
> part-00200-blah-blah.c000.snappy.parquet file from the first task attempt 
> already exists:
> {code}
> 19/02/20 09:35:01 WARN scheduler.TaskSetManager: Lost task 200.1 in stage 0.0 
> (TID 594, tn.y.z.com, executor 70): org.apache.spark.SparkException: Task 
> failed while writing rows.
>  at 
> org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:285)
>  at 
> org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:197)
>  at 
> org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:196)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>  at org.apache.spark.scheduler.Task.run(Task.scala:109)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
>  Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: 
> /user/hive/warehouse/tmp_supply_feb1/.spark-staging-blah-blah-blah/dt=2019-02-17/part-00200-blah-blah.c000.snappy.parquet
>  for client a.b.c.d already exists
> {code}
> The job fails when the configured task attempts (spark.task.maxFailures) 
> have failed with the same error:
> {code}
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 200 
> in stage 0.0 failed 20 times, most recent failure: Lost task 284.19 in stage 
> 0.0 (TID yyy, tm.y.z.com, executor 16): org.apache.spark.SparkException: Task 
> failed while writing rows.
>  at 
> org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:285)
>  ...
>  Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: 
> /user/hive/warehouse/tmp_supply_feb1/.spark-staging-blah-blah-blah/dt=2019-02-17/part-00200-blah-blah.c000.snappy.parquet
>  for client i.p.a.d already exists
> {code}
> SPARK-26682 wasn't the root cause here, since there wasn't any stage 
> reattempt.
> This issue seems to happen when 
> spark.sql.sources.partitionOverwriteMode=dynamic. 
>  






[jira] [Comment Edited] (SPARK-27194) Job failures when task attempts do not clean up spark-staging parquet files

2019-03-20 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797216#comment-16797216
 ] 

Ajith S edited comment on SPARK-27194 at 3/20/19 2:38 PM:
--

[~dongjoon] Yes, I tried with Spark 2.3.3, and the issue persists. Here is the 
operation I performed:
{code:java}
spark.sql.sources.partitionOverwriteMode=DYNAMIC{code}
{code:java}
create table t1 (i int, part1 int, part2 int) using parquet partitioned by 
(part1, part2)
insert into t1 partition(part1=1, part2=1) select 1
insert overwrite table t1 partition(part1=1, part2=1) select 2
insert overwrite table t1 partition(part1=2, part2) select 2, 2   // here the 
exec is killed and task respawns{code}
and here is the full stacktrace as per 2.3.3
{code:java}
2019-03-20 19:58:06 WARN TaskSetManager:66 - Lost task 0.1 in stage 2.0 (TID 3, 
QWERTY, executor 2): org.apache.spark.SparkException: Task failed while writing 
rows.
at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:285)
at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:197)
at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:196)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: 
/user/hive/warehouse/t2/.spark-staging-1f1efbfd-7e20-4e0f-a49c-a7fa3eae4cb1/part1=2/part2=2/part-0-1f1efbfd-7e20-4e0f-a49c-a7fa3eae4cb1.c000.snappy.parquet
 for client 127.0.0.1 already exists
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2578)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2465)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2349)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:624)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:398)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2213)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at 
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at 
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1653)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1689)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1624)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at 
org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:236)
at 
org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:342)
at 
org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:302)
at 
org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetOutputWriter.scala:37)
at 
org

[jira] [Commented] (SPARK-27194) Job failures when task attempts do not clean up spark-staging parquet files

2019-03-20 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797216#comment-16797216
 ] 

Ajith S commented on SPARK-27194:
-

[~dongjoon] Yes, I tried with Spark 2.3.3, and the issue persists. Here is the 
operation I performed:
{code:java}
create table t1 (i int, part1 int, part2 int) using parquet partitioned by 
(part1, part2)
insert into t1 partition(part1=1, part2=1) select 1
insert overwrite table t1 partition(part1=1, part2=1) select 2
insert overwrite table t1 partition(part1=2, part2) select 2, 2   // here the 
exec is killed and task respawns{code}
and here is the full stacktrace as per 2.3.3
{code:java}
2019-03-20 19:58:06 WARN TaskSetManager:66 - Lost task 0.1 in stage 2.0 (TID 3, 
QWERTY, executor 2): org.apache.spark.SparkException: Task failed while writing 
rows.
at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:285)
at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:197)
at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:196)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: 
/user/hive/warehouse/t2/.spark-staging-1f1efbfd-7e20-4e0f-a49c-a7fa3eae4cb1/part1=2/part2=2/part-0-1f1efbfd-7e20-4e0f-a49c-a7fa3eae4cb1.c000.snappy.parquet
 for client 127.0.0.1 already exists
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2578)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2465)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2349)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:624)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:398)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2213)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at 
org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at 
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1653)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1689)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1624)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at 
org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:236)
at 
org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:342)
at 
org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:302)
at 
org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetOutputWriter.scala:37)
at 
org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anon$1.newInstance(ParquetFileFormat.scala:151)
a

[jira] [Created] (SPARK-27200) History Environment tab must sort Configurations/Properties by default

2019-03-19 Thread Ajith S (JIRA)
Ajith S created SPARK-27200:
---

 Summary: History Environment tab must sort 
Configurations/Properties by default
 Key: SPARK-27200
 URL: https://issues.apache.org/jira/browse/SPARK-27200
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 3.0.0
Reporter: Ajith S


The Environment page in the Spark UI has all the configuration sorted by key, but 
this is not the case for the History Server. To keep the UX the same, we can have 
it sorted in the History Server too.
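
The change itself is small; a sketch of the intended behaviour (illustrative only, 
not the actual History Server rendering code) is simply to sort the key/value 
pairs before they are rendered:
{code:scala}
// Sketch: sort configuration entries by key before rendering, matching the live UI.
def sortedProps(props: Seq[(String, String)]): Seq[(String, String)] = props.sortBy(_._1)
{code}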






[jira] [Commented] (SPARK-27198) Heartbeat interval mismatch in driver and executor

2019-03-18 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795666#comment-16795666
 ] 

Ajith S commented on SPARK-27198:
-

I will be working on this.

> Heartbeat interval mismatch in driver and executor
> --
>
> Key: SPARK-27198
> URL: https://issues.apache.org/jira/browse/SPARK-27198
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.3.3, 2.4.0
>Reporter: Ajith S
>Priority: Major
>
> When the heartbeat interval is configured via *spark.executor.heartbeatInterval* 
> without specifying units, the value is interpreted inconsistently: the driver 
> treats it as seconds while the executor treats it as milliseconds.
>  
> [https://github.com/apache/spark/blob/v2.4.1-rc8/core/src/main/scala/org/apache/spark/SparkConf.scala#L613]
> vs
> [https://github.com/apache/spark/blob/v2.4.1-rc8/core/src/main/scala/org/apache/spark/executor/Executor.scala#L858]
>  
>  






[jira] [Created] (SPARK-27198) Heartbeat interval mismatch in driver and executor

2019-03-18 Thread Ajith S (JIRA)
Ajith S created SPARK-27198:
---

 Summary: Heartbeat interval mismatch in driver and executor
 Key: SPARK-27198
 URL: https://issues.apache.org/jira/browse/SPARK-27198
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 2.4.0, 2.3.3
Reporter: Ajith S


When the heartbeat interval is configured via *spark.executor.heartbeatInterval* 
without specifying units, the value is interpreted inconsistently: the driver 
treats it as seconds while the executor treats it as milliseconds.

 
[https://github.com/apache/spark/blob/v2.4.1-rc8/core/src/main/scala/org/apache/spark/SparkConf.scala#L613]

vs

[https://github.com/apache/spark/blob/v2.4.1-rc8/core/src/main/scala/org/apache/spark/executor/Executor.scala#L858]
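
Until the two sides agree on a default unit, the mismatch can be avoided by always 
giving the interval an explicit unit (a minimal sketch, assuming the value is set 
programmatically):
{code:scala}
import org.apache.spark.SparkConf

// An explicit time unit ("10s") sidesteps the seconds-vs-milliseconds ambiguity;
// a bare "10" is what triggers the mismatch described above.
val conf = new SparkConf()
  .set("spark.executor.heartbeatInterval", "10s")
{code}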

 

 






[jira] [Commented] (SPARK-27194) Job failures when task attempts do not clean up spark-staging parquet files

2019-03-18 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795575#comment-16795575
 ] 

Ajith S commented on SPARK-27194:
-

From the logs, it currently looks like task 200.0 and its re-attempt 200.1 both 
expect the same file name, part-00200-blah-blah.c000.snappy.parquet (refer to 
org.apache.spark.internal.io.HadoopMapReduceCommitProtocol#getFilename).

Maybe we should include taskId_attemptId in the part file name so that re-run 
tasks do not conflict with files left behind by older failed attempts.

cc [~srowen] [~cloud_fan] [~dongjoon] any thoughts?
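
A rough sketch of that idea (illustrative only; the exact naming scheme is an 
assumption, not the actual HadoopMapReduceCommitProtocol#getFilename 
implementation): fold the attempt number into the generated file name so a 
re-attempt never collides with a file left behind by a failed attempt.
{code:scala}
import org.apache.hadoop.mapreduce.TaskAttemptContext

// Sketch: include the attempt number in the part-file name; the format string is illustrative.
def getFilenameWithAttempt(taskContext: TaskAttemptContext, jobId: String, ext: String): String = {
  val taskId = taskContext.getTaskAttemptID.getTaskID.getId   // e.g. 200
  val attemptId = taskContext.getTaskAttemptID.getId          // 0 for the first attempt, 1 for the retry, ...
  f"part-$taskId%05d-$attemptId-$jobId$ext"
}
{code}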

> Job failures when task attempts do not clean up spark-staging parquet files
> ---
>
> Key: SPARK-27194
> URL: https://issues.apache.org/jira/browse/SPARK-27194
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, SQL
>Affects Versions: 2.3.1, 2.3.2
>Reporter: Reza Safi
>Priority: Major
>
> When a container fails for some reason (for example when killed by yarn for 
> exceeding memory limits), the subsequent task attempts for the tasks that 
> were running on that container all fail with a FileAlreadyExistsException. 
> The original task attempt does not seem to successfully call abortTask (or at 
> least its "best effort" delete is unsuccessful) and clean up the parquet file 
> it was writing to, so when later task attempts try to write to the same 
> spark-staging directory using the same file name, the job fails.
> Here is what transpires in the logs:
> The container where task 200.0 is running is killed and the task is lost:
> 19/02/20 09:33:25 ERROR cluster.YarnClusterScheduler: Lost executor y on 
> t.y.z.com: Container killed by YARN for exceeding memory limits. 8.1 GB of 8 
> GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
>  19/02/20 09:33:25 WARN scheduler.TaskSetManager: Lost task 200.0 in stage 
> 0.0 (TID xxx, t.y.z.com, executor 93): ExecutorLostFailure (executor 93 
> exited caused by one of the running tasks) Reason: Container killed by YARN 
> for exceeding memory limits. 8.1 GB of 8 GB physical memory used. Consider 
> boosting spark.yarn.executor.memoryOverhead.
> The task is re-attempted on a different executor and fails because the 
> part-00200-blah-blah.c000.snappy.parquet file from the first task attempt 
> already exists:
> 19/02/20 09:35:01 WARN scheduler.TaskSetManager: Lost task 200.1 in stage 0.0 
> (TID 594, tn.y.z.com, executor 70): org.apache.spark.SparkException: Task 
> failed while writing rows.
>  at 
> org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:285)
>  at 
> org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:197)
>  at 
> org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:196)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>  at org.apache.spark.scheduler.Task.run(Task.scala:109)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
>  Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: 
> /user/hive/warehouse/tmp_supply_feb1/.spark-staging-blah-blah-blah/dt=2019-02-17/part-00200-blah-blah.c000.snappy.parquet
>  for client 17.161.235.91 already exists
> The job fails when the configured task attempts (spark.task.maxFailures) 
> have failed with the same error:
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 200 
> in stage 0.0 failed 20 times, most recent failure: Lost task 284.19 in stage 
> 0.0 (TID yyy, tm.y.z.com, executor 16): org.apache.spark.SparkException: Task 
> failed while writing rows.
>  at 
> org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:285)
>  ...
>  Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: 
> /user/hive/warehouse/tmp_supply_feb1/.spark-staging-blah-blah-blah/dt=2019-02-17/part-00200-blah-blah.c000.snappy.parquet
>  for client i.p.a.d already exists
>  
> SPARK-26682 wasn't the root cause here, since there wasn't any stage 
> reattempt.
> This issue seems to happen when 
> spark.sql.sources.partitionOverwriteMode=dynamic. 
>  




[jira] [Commented] (SPARK-26961) Found Java-level deadlock in Spark Driver

2019-03-17 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794551#comment-16794551
 ] 

Ajith S commented on SPARK-26961:
-

[~srowen] OK, I will raise a PR for this. Thanks

> Found Java-level deadlock in Spark Driver
> -
>
> Key: SPARK-26961
> URL: https://issues.apache.org/jira/browse/SPARK-26961
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.0
>Reporter: Rong Jialei
>Priority: Major
> Attachments: image-2019-03-13-19-53-52-390.png
>
>
> Our spark job usually will finish in minutes, however, we recently found it 
> take days to run, and we can only kill it when this happened.
> An investigation show all worker container could not connect drive after 
> start, and driver is hanging, using jstack, we found a Java-level deadlock.
>  
> *Jstack output for deadlock part is showing below:*
>  
> Found one Java-level deadlock:
> =
> "SparkUI-907":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> "ForkJoinPool-1-worker-57":
>  waiting to lock monitor 0x7f3860574298 (object 0x0005b7991168, a 
> org.apache.spark.util.MutableURLClassLoader),
>  which is held by "ForkJoinPool-1-worker-7"
> "ForkJoinPool-1-worker-7":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> Java stack information for the threads listed above:
> ===
> "SparkUI-907":
>  at org.apache.hadoop.conf.Configuration.getOverlay(Configuration.java:1328)
>  - waiting to lock <0x0005c0c1e5e0> (a 
> org.apache.hadoop.conf.Configuration)
>  at 
> org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:684)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1088)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145)
>  at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363)
>  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840)
>  at 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74)
>  at java.net.URL.getURLStreamHandler(URL.java:1142)
>  at java.net.URL.(URL.java:599)
>  at java.net.URL.(URL.java:490)
>  at java.net.URL.(URL.java:439)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doRequest(JettyUtils.scala:176)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doGet(JettyUtils.scala:161)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>  at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>  at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:171)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>  at 
> org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
>  at 
> org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>  at 
> org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>  at org.spark_project.jetty.server.Server.handle(Server.java:534)
>  at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>  at 
> org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>  at 
> org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>  at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108)
>  at 
> org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>  at java.lang.Thread.run(Thread.java:748)
> "ForkJoinPool-1-worker-57":
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:404)
>  - waiting to l

[jira] [Commented] (SPARK-27142) Provide REST API for SQL level information

2019-03-17 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794519#comment-16794519
 ] 

Ajith S commented on SPARK-27142:
-

[~srowen] any inputs on the proposal?

> Provide REST API for SQL level information
> --
>
> Key: SPARK-27142
> URL: https://issues.apache.org/jira/browse/SPARK-27142
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Minor
> Attachments: image-2019-03-13-19-29-26-896.png
>
>
> Currently, for monitoring a Spark application, SQL information is not available 
> from REST but only via the UI. REST provides only 
> applications, jobs, stages, and environment. This Jira targets providing a REST 
> API so that SQL-level information can be retrieved.
>  
> Details: 
> https://issues.apache.org/jira/browse/SPARK-27142?focusedCommentId=16791728&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16791728






[jira] [Commented] (SPARK-27164) RDD.countApprox on empty RDDs schedules jobs which never complete

2019-03-14 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793222#comment-16793222
 ] 

Ajith S commented on SPARK-27164:
-

I will be working on this.
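
For reference, a caller-side guard (just a sketch of a possible workaround, not 
the eventual fix) is to skip the approximate count when the RDD has no partitions, 
since such an RDD can only ever count to zero:
{code:scala}
import org.apache.spark.rdd.RDD
import org.apache.spark.partial.{BoundedDouble, PartialResult}

// Sketch of a workaround: an RDD with zero partitions schedules a job that never
// completes, so return a final zero result without scheduling anything.
def safeCountApprox(rdd: RDD[_], timeoutMillis: Long): PartialResult[BoundedDouble] = {
  if (rdd.partitions.isEmpty) {
    new PartialResult(new BoundedDouble(0.0, 1.0, 0.0, 0.0), true)  // isFinal = true
  } else {
    rdd.countApprox(timeoutMillis)
  }
}
{code}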

> RDD.countApprox on empty RDDs schedules jobs which never complete 
> --
>
> Key: SPARK-27164
> URL: https://issues.apache.org/jira/browse/SPARK-27164
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.3, 2.4.0
> Environment: macOS, Spark-2.4.0 with Hadoop 2.7 running on Java 11.0.1
> Also observed on:
> macOS, Spark-2.2.3 with Hadoop 2.7 running on Java 1.8.0_151
>Reporter: Ryan Moore
>Priority: Major
> Attachments: Screen Shot 2019-03-14 at 1.49.19 PM.png
>
>
> When calling `countApprox` on an RDD which has no partitions (such as those 
> created by `sparkContext.emptyRDD`) a job is scheduled with 0 stages and 0 
> tasks. That job appears under the "Active Jobs" in the Spark UI until it is 
> either killed or the Spark context is shut down.
>  
> {code:java}
> Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 11.0.1)
> Type in expressions to have them evaluated.
> Type :help for more information.
> scala> val ints = sc.makeRDD(Seq(1))
> ints: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at makeRDD at 
> :24
> scala> ints.countApprox(1000)
> res0: 
> org.apache.spark.partial.PartialResult[org.apache.spark.partial.BoundedDouble]
>  = (final: [1.000, 1.000])
> // PartialResult is returned, Scheduled job completed
> scala> ints.filter(_ => false).countApprox(1000)
> res1: 
> org.apache.spark.partial.PartialResult[org.apache.spark.partial.BoundedDouble]
>  = (final: [0.000, 0.000])
> // PartialResult is returned, Scheduled job completed
> scala> sc.emptyRDD[Int].countApprox(1000)
> res5: 
> org.apache.spark.partial.PartialResult[org.apache.spark.partial.BoundedDouble]
>  = (final: [0.000, 0.000])
> // PartialResult is returned, Scheduled job is ACTIVE but never completes
> scala> sc.union(Nil : Seq[org.apache.spark.rdd.RDD[Int]]).countApprox(1000)
> res16: 
> org.apache.spark.partial.PartialResult[org.apache.spark.partial.BoundedDouble]
>  = (final: [0.000, 0.000])
> // PartialResult is returned, Scheduled job is ACTIVE but never completes
> {code}






[jira] [Commented] (SPARK-27122) YARN test failures in Java 9+

2019-03-13 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16792355#comment-16792355
 ] 

Ajith S commented on SPARK-27122:
-

ping [~srowen] [~dongjoon] [~Gengliang.Wang]

> YARN test failures in Java 9+
> -
>
> Key: SPARK-27122
> URL: https://issues.apache.org/jira/browse/SPARK-27122
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Sean Owen
>Priority: Major
> Attachments: image-2019-03-14-09-34-20-592.png, 
> image-2019-03-14-09-35-23-046.png
>
>
> Currently on Java 11:
> {code}
> YarnSchedulerBackendSuite:
> - RequestExecutors reflects node blacklist and is serializable
> - Respect user filters when adding AM IP filter *** FAILED ***
>   java.lang.ClassCastException: 
> org.spark_project.jetty.servlet.ServletContextHandler cannot be cast to 
> org.eclipse.jetty.servlet.ServletContextHandler
>   at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
>   at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>   at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>   at scala.collection.TraversableLike.map(TraversableLike.scala:237)
>   at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:108)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2(YarnSchedulerBackend.scala:183)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2$adapted(YarnSchedulerBackend.scala:174)
>   at scala.Option.foreach(Option.scala:274)
>   ...
> {code}
> This looks like a classpath issue, probably ultimately related to the same 
> classloader issues in https://issues.apache.org/jira/browse/SPARK-26839 






[jira] [Comment Edited] (SPARK-27122) YARN test failures in Java 9+

2019-03-13 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16792318#comment-16792318
 ] 

Ajith S edited comment on SPARK-27122 at 3/14/19 4:15 AM:
--

The problem seems to be the shading of the jetty package.

When we run the test, the classpath seems to be built from the classes folder 
(resource-managers/yarn/target/scala-2.12/classes) instead of the jar 
(resource-managers/yarn/target/spark-yarn_2.12-3.0.0-SNAPSHOT.jar). Check the 
attachment.

Here in org.apache.spark.scheduler.cluster.YarnSchedulerBackend, as it is not 
shaded, it expects 
{color:#FF}org.eclipse.jetty.servlet.ServletContextHandler{color}
{code:java}
ui.getHandlers.map(_.getServletHandler()).foreach { h =>
  val holder = new FilterHolder(){code}
ui.getHandlers is in spark-core and is loaded from spark-core.jar, which is 
shaded and hence refers to 
{color:#FF}org.spark_project.jetty.servlet.ServletContextHandler{color}

And here is the javap output which shows the difference between the 
org.apache.spark.scheduler.cluster.YarnSchedulerBackend present in the jar and 
the one in the classes folder:

!image-2019-03-14-09-35-23-046.png!


was (Author: ajithshetty):
The problem seems to be shading of jetty package.

When we run test, the class path seems to be made from the classes 
folder(resource-managers/yarn/target/scala-2.12/classes) instead of jar 
(resource-managers/yarn/target/spark-yarn_2.12-3.0.0-SNAPSHOT.jar) . Check 
attachment.

Here in  org.apache.spark.scheduler.cluster.YarnSchedulerBackend, as its not 
shaded, it expects org.eclipse.jetty.servlet.ServletContextHandler
{code:java}
ui.getHandlers.map(_.getServletHandler()).foreach { h =>
  val holder = new FilterHolder(){code}
ui.getHandlers is in spark-core and its loaded from spark-core.jar which is 
shaded and hence refers to org.spark_project.jetty.servlet.ServletContextHandler

 And here is the javap command which shows the difference between 
org.apache.spark.scheduler.cluster.YarnSchedulerBackend present in jar folder 
and classes folder

!image-2019-03-14-09-35-23-046.png!

> YARN test failures in Java 9+
> -
>
> Key: SPARK-27122
> URL: https://issues.apache.org/jira/browse/SPARK-27122
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Sean Owen
>Priority: Major
> Attachments: image-2019-03-14-09-34-20-592.png, 
> image-2019-03-14-09-35-23-046.png
>
>
> Currently on Java 11:
> {code}
> YarnSchedulerBackendSuite:
> - RequestExecutors reflects node blacklist and is serializable
> - Respect user filters when adding AM IP filter *** FAILED ***
>   java.lang.ClassCastException: 
> org.spark_project.jetty.servlet.ServletContextHandler cannot be cast to 
> org.eclipse.jetty.servlet.ServletContextHandler
>   at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
>   at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>   at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>   at scala.collection.TraversableLike.map(TraversableLike.scala:237)
>   at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:108)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2(YarnSchedulerBackend.scala:183)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2$adapted(YarnSchedulerBackend.scala:174)
>   at scala.Option.foreach(Option.scala:274)
>   ...
> {code}
> This looks like a classpath issue, probably ultimately related to the same 
> classloader issues in https://issues.apache.org/jira/browse/SPARK-26839 






[jira] [Comment Edited] (SPARK-27122) YARN test failures in Java 9+

2019-03-13 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16792318#comment-16792318
 ] 

Ajith S edited comment on SPARK-27122 at 3/14/19 4:14 AM:
--

The problem seems to be shading of jetty package.

When we run test, the class path seems to be made from the classes 
folder(resource-managers/yarn/target/scala-2.12/classes) instead of jar 
(resource-managers/yarn/target/spark-yarn_2.12-3.0.0-SNAPSHOT.jar) . Check 
attachment.

Here in  org.apache.spark.scheduler.cluster.YarnSchedulerBackend, as its not 
shaded, it expects org.eclipse.jetty.servlet.ServletContextHandler
{code:java}
ui.getHandlers.map(_.getServletHandler()).foreach { h =>
  val holder = new FilterHolder(){code}
ui.getHandlers is in spark-core and its loaded from spark-core.jar which is 
shaded and hence refers to org.spark_project.jetty.servlet.ServletContextHandler

 And here is the javap command which shows the difference between 
org.apache.spark.scheduler.cluster.YarnSchedulerBackend present in jar folder 
and classes folder

!image-2019-03-14-09-35-23-046.png!


was (Author: ajithshetty):
The problem seems to be shading of jetty package.

When we run test, the class path seems to be made from the classes 
folder(resource-managers/yarn/target/scala-2.12/classes) instead of jar 
(resource-managers/yarn/target/spark-yarn_2.12-3.0.0-SNAPSHOT.jar) . Check 
attachment.

 

And here is the javap command which shows the difference between 
org.apache.spark.scheduler.cluster.YarnSchedulerBackend present in jar folder 
and classes folder

!image-2019-03-14-09-35-23-046.png!

> YARN test failures in Java 9+
> -
>
> Key: SPARK-27122
> URL: https://issues.apache.org/jira/browse/SPARK-27122
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Sean Owen
>Priority: Major
> Attachments: image-2019-03-14-09-34-20-592.png, 
> image-2019-03-14-09-35-23-046.png
>
>
> Currently on Java 11:
> {code}
> YarnSchedulerBackendSuite:
> - RequestExecutors reflects node blacklist and is serializable
> - Respect user filters when adding AM IP filter *** FAILED ***
>   java.lang.ClassCastException: 
> org.spark_project.jetty.servlet.ServletContextHandler cannot be cast to 
> org.eclipse.jetty.servlet.ServletContextHandler
>   at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
>   at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>   at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>   at scala.collection.TraversableLike.map(TraversableLike.scala:237)
>   at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:108)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2(YarnSchedulerBackend.scala:183)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2$adapted(YarnSchedulerBackend.scala:174)
>   at scala.Option.foreach(Option.scala:274)
>   ...
> {code}
> This looks like a classpath issue, probably ultimately related to the same 
> classloader issues in https://issues.apache.org/jira/browse/SPARK-26839 






[jira] [Comment Edited] (SPARK-27122) YARN test failures in Java 9+

2019-03-13 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16792318#comment-16792318
 ] 

Ajith S edited comment on SPARK-27122 at 3/14/19 4:07 AM:
--

The problem seems to be shading of jetty package.

When we run test, the class path seems to be made from the classes 
folder(resource-managers/yarn/target/scala-2.12/classes) instead of jar 
(resource-managers/yarn/target/spark-yarn_2.12-3.0.0-SNAPSHOT.jar) . Check 
attachment.

 

And here is the javap command which shows the difference between 
org.apache.spark.scheduler.cluster.YarnSchedulerBackend present in jar folder 
and classes folder

!image-2019-03-14-09-35-23-046.png!


was (Author: ajithshetty):
The problem seems to be shading of jetty package.

When we run test, the class path seems to be made from the classes 
folder(resource-managers/yarn/target/scala-2.12/classes) instead of jar 
(resource-managers/yarn/target/spark-yarn_2.12-3.0.0-SNAPSHOT.jar) 

 

And here is the javap command which shows the difference between 
org.apache.spark.scheduler.cluster.YarnSchedulerBackend present in jar folder 
and classes folder

!image-2019-03-14-09-35-23-046.png!

> YARN test failures in Java 9+
> -
>
> Key: SPARK-27122
> URL: https://issues.apache.org/jira/browse/SPARK-27122
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Sean Owen
>Priority: Major
> Attachments: image-2019-03-14-09-34-20-592.png, 
> image-2019-03-14-09-35-23-046.png
>
>
> Currently on Java 11:
> {code}
> YarnSchedulerBackendSuite:
> - RequestExecutors reflects node blacklist and is serializable
> - Respect user filters when adding AM IP filter *** FAILED ***
>   java.lang.ClassCastException: 
> org.spark_project.jetty.servlet.ServletContextHandler cannot be cast to 
> org.eclipse.jetty.servlet.ServletContextHandler
>   at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
>   at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>   at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>   at scala.collection.TraversableLike.map(TraversableLike.scala:237)
>   at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:108)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2(YarnSchedulerBackend.scala:183)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2$adapted(YarnSchedulerBackend.scala:174)
>   at scala.Option.foreach(Option.scala:274)
>   ...
> {code}
> This looks like a classpath issue, probably ultimately related to the same 
> classloader issues in https://issues.apache.org/jira/browse/SPARK-26839 






[jira] [Comment Edited] (SPARK-27122) YARN test failures in Java 9+

2019-03-13 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16792318#comment-16792318
 ] 

Ajith S edited comment on SPARK-27122 at 3/14/19 4:06 AM:
--

The problem seems to be shading of jetty package.

When we run test, the class path seems to be made from the classes 
folder(resource-managers/yarn/target/scala-2.12/classes) instead of jar 
(resource-managers/yarn/target/spark-yarn_2.12-3.0.0-SNAPSHOT.jar) 

 

And here is the javap command which shows the difference between 
org.apache.spark.scheduler.cluster.YarnSchedulerBackend present in jar folder 
and classes folder

!image-2019-03-14-09-35-23-046.png!


was (Author: ajithshetty):
The problem seems to be shading of jetty package.

When we run test, the class path seems to be made from the classes 
folder(resource-managers/yarn/target/scala-2.12/classes) instead of jar 
(resource-managers/yarn/target/spark-yarn_2.12-3.0.0-SNAPSHOT.jar) 

Here is test classpath info:

!image-2019-03-14-09-34-20-592.png!

 

And here is the javap command which shows the difference between 
org.apache.spark.scheduler.cluster.YarnSchedulerBackend present in jar folder 
and classes folder

!image-2019-03-14-09-35-23-046.png!

> YARN test failures in Java 9+
> -
>
> Key: SPARK-27122
> URL: https://issues.apache.org/jira/browse/SPARK-27122
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Sean Owen
>Priority: Major
> Attachments: image-2019-03-14-09-34-20-592.png, 
> image-2019-03-14-09-35-23-046.png
>
>
> Currently on Java 11:
> {code}
> YarnSchedulerBackendSuite:
> - RequestExecutors reflects node blacklist and is serializable
> - Respect user filters when adding AM IP filter *** FAILED ***
>   java.lang.ClassCastException: 
> org.spark_project.jetty.servlet.ServletContextHandler cannot be cast to 
> org.eclipse.jetty.servlet.ServletContextHandler
>   at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
>   at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>   at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>   at scala.collection.TraversableLike.map(TraversableLike.scala:237)
>   at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:108)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2(YarnSchedulerBackend.scala:183)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2$adapted(YarnSchedulerBackend.scala:174)
>   at scala.Option.foreach(Option.scala:274)
>   ...
> {code}
> This looks like a classpath issue, probably ultimately related to the same 
> classloader issues in https://issues.apache.org/jira/browse/SPARK-26839 






[jira] [Updated] (SPARK-27122) YARN test failures in Java 9+

2019-03-13 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-27122:

Attachment: image-2019-03-14-09-35-23-046.png

> YARN test failures in Java 9+
> -
>
> Key: SPARK-27122
> URL: https://issues.apache.org/jira/browse/SPARK-27122
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Sean Owen
>Priority: Major
> Attachments: image-2019-03-14-09-34-20-592.png, 
> image-2019-03-14-09-35-23-046.png
>
>
> Currently on Java 11:
> {code}
> YarnSchedulerBackendSuite:
> - RequestExecutors reflects node blacklist and is serializable
> - Respect user filters when adding AM IP filter *** FAILED ***
>   java.lang.ClassCastException: 
> org.spark_project.jetty.servlet.ServletContextHandler cannot be cast to 
> org.eclipse.jetty.servlet.ServletContextHandler
>   at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
>   at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>   at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>   at scala.collection.TraversableLike.map(TraversableLike.scala:237)
>   at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:108)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2(YarnSchedulerBackend.scala:183)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2$adapted(YarnSchedulerBackend.scala:174)
>   at scala.Option.foreach(Option.scala:274)
>   ...
> {code}
> This looks like a classpath issue, probably ultimately related to the same 
> classloader issues in https://issues.apache.org/jira/browse/SPARK-26839 






[jira] [Commented] (SPARK-27122) YARN test failures in Java 9+

2019-03-13 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16792318#comment-16792318
 ] 

Ajith S commented on SPARK-27122:
-

The problem seems to be the shading of the jetty package.

When we run the test, the classpath seems to be built from the classes folder 
(resource-managers/yarn/target/scala-2.12/classes) instead of the jar 
(resource-managers/yarn/target/spark-yarn_2.12-3.0.0-SNAPSHOT.jar).

Here is the test classpath info:

!image-2019-03-14-09-34-20-592.png!

And here is the javap output which shows the difference between the 
org.apache.spark.scheduler.cluster.YarnSchedulerBackend present in the jar and 
the one in the classes folder:

!image-2019-03-14-09-35-23-046.png!

> YARN test failures in Java 9+
> -
>
> Key: SPARK-27122
> URL: https://issues.apache.org/jira/browse/SPARK-27122
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Sean Owen
>Priority: Major
> Attachments: image-2019-03-14-09-34-20-592.png, 
> image-2019-03-14-09-35-23-046.png
>
>
> Currently on Java 11:
> {code}
> YarnSchedulerBackendSuite:
> - RequestExecutors reflects node blacklist and is serializable
> - Respect user filters when adding AM IP filter *** FAILED ***
>   java.lang.ClassCastException: 
> org.spark_project.jetty.servlet.ServletContextHandler cannot be cast to 
> org.eclipse.jetty.servlet.ServletContextHandler
>   at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
>   at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>   at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>   at scala.collection.TraversableLike.map(TraversableLike.scala:237)
>   at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:108)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2(YarnSchedulerBackend.scala:183)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2$adapted(YarnSchedulerBackend.scala:174)
>   at scala.Option.foreach(Option.scala:274)
>   ...
> {code}
> This looks like a classpath issue, probably ultimately related to the same 
> classloader issues in https://issues.apache.org/jira/browse/SPARK-26839 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27122) YARN test failures in Java 9+

2019-03-13 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-27122:

Attachment: image-2019-03-14-09-34-20-592.png

> YARN test failures in Java 9+
> -
>
> Key: SPARK-27122
> URL: https://issues.apache.org/jira/browse/SPARK-27122
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Sean Owen
>Priority: Major
> Attachments: image-2019-03-14-09-34-20-592.png
>
>
> Currently on Java 11:
> {code}
> YarnSchedulerBackendSuite:
> - RequestExecutors reflects node blacklist and is serializable
> - Respect user filters when adding AM IP filter *** FAILED ***
>   java.lang.ClassCastException: 
> org.spark_project.jetty.servlet.ServletContextHandler cannot be cast to 
> org.eclipse.jetty.servlet.ServletContextHandler
>   at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
>   at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>   at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>   at scala.collection.TraversableLike.map(TraversableLike.scala:237)
>   at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:108)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2(YarnSchedulerBackend.scala:183)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2$adapted(YarnSchedulerBackend.scala:174)
>   at scala.Option.foreach(Option.scala:274)
>   ...
> {code}
> This looks like a classpath issue, probably ultimately related to the same 
> classloader issues in https://issues.apache.org/jira/browse/SPARK-26839 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27122) YARN test failures in Java 9+

2019-03-13 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16792299#comment-16792299
 ] 

Ajith S commented on SPARK-27122:
-

I can reproduce this issue even on Java 8. I would like to work on this.

 

> YARN test failures in Java 9+
> -
>
> Key: SPARK-27122
> URL: https://issues.apache.org/jira/browse/SPARK-27122
> Project: Spark
>  Issue Type: Sub-task
>  Components: YARN
>Affects Versions: 3.0.0
>Reporter: Sean Owen
>Priority: Major
>
> Currently on Java 11:
> {code}
> YarnSchedulerBackendSuite:
> - RequestExecutors reflects node blacklist and is serializable
> - Respect user filters when adding AM IP filter *** FAILED ***
>   java.lang.ClassCastException: 
> org.spark_project.jetty.servlet.ServletContextHandler cannot be cast to 
> org.eclipse.jetty.servlet.ServletContextHandler
>   at 
> scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
>   at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>   at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>   at scala.collection.TraversableLike.map(TraversableLike.scala:237)
>   at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:108)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2(YarnSchedulerBackend.scala:183)
>   at 
> org.apache.spark.scheduler.cluster.YarnSchedulerBackend.$anonfun$addWebUIFilter$2$adapted(YarnSchedulerBackend.scala:174)
>   at scala.Option.foreach(Option.scala:274)
>   ...
> {code}
> This looks like a classpath issue, probably ultimately related to the same 
> classloader issues in https://issues.apache.org/jira/browse/SPARK-26839 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-26961) Found Java-level deadlock in Spark Driver

2019-03-13 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791751#comment-16791751
 ] 

Ajith S edited comment on SPARK-26961 at 3/13/19 2:37 PM:
--

1) Yes, registerAsParallelCapable will return true, but if you inspect the 
classloader instance, parallelLockMap is still null because it was already 
initialized by the super class constructor, so *it has no effect on an already 
created instance*.

!image-2019-03-13-19-53-52-390.png!

2) URLClassLoader is parallel capable because it does the registration in a static block, 
which runs before the parent (ClassLoader) constructor is called. Also, as per the javadoc

[https://docs.oracle.com/javase/8/docs/api/java/lang/ClassLoader.html]
{code:java}
Note that the ClassLoader class is registered as parallel capable by default. 
However, its subclasses still need to register themselves if they are parallel 
capable. {code}
Hence MutableURLClassLoader loses its parallel capability by failing to register, 
unlike URLClassLoader.
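
For anyone who wants to verify this without a debugger, here is a minimal Java 8 
sketch (it relies on reflective access to the private parallelLockMap field, which 
works on Java 8 but needs --add-opens on newer JDKs):
{code:java}
import java.lang.reflect.Field;
import java.net.URL;
import java.net.URLClassLoader;

public class ParallelCapableCheck {
  // Prints whether the given loader instance has a per-class lock map,
  // i.e. whether it was constructed as parallel capable.
  static void printLockMap(ClassLoader cl) throws Exception {
    Field f = ClassLoader.class.getDeclaredField("parallelLockMap");
    f.setAccessible(true);
    System.out.println(cl.getClass().getName() + " parallelLockMap = " + f.get(cl));
  }

  public static void main(String[] args) throws Exception {
    // URLClassLoader registers itself in a static block, so this prints a non-null map;
    // a subclass that only calls registerAsParallelCapable() after construction
    // would still print null for its already-created instances.
    printLockMap(new URLClassLoader(new URL[0]));
  }
}
{code}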

 


was (Author: ajithshetty):
1) Yes, registerAsParallelCapable will return true, but if you inspect the 
classloader instance, parallelLockMap is still null because it was already 
initialized by the super class constructor, so it has no effect

!image-2019-03-13-19-53-52-390.png!

2) URLClassLoader is parallel capable because it does the registration in a static block, 
which runs before the parent (ClassLoader) constructor is called. Also, as per the javadoc

[https://docs.oracle.com/javase/8/docs/api/java/lang/ClassLoader.html]
{code:java}
Note that the ClassLoader class is registered as parallel capable by default. 
However, its subclasses still need to register themselves if they are parallel 
capable. {code}
Hence MutableURLClassLoader loses its parallel capability by failing to register, 
unlike URLClassLoader.

 

> Found Java-level deadlock in Spark Driver
> -
>
> Key: SPARK-26961
> URL: https://issues.apache.org/jira/browse/SPARK-26961
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.0
>Reporter: Rong Jialei
>Priority: Major
> Attachments: image-2019-03-13-19-53-52-390.png
>
>
> Our spark job usually will finish in minutes, however, we recently found it 
> take days to run, and we can only kill it when this happened.
> An investigation show all worker container could not connect drive after 
> start, and driver is hanging, using jstack, we found a Java-level deadlock.
>  
> *Jstack output for deadlock part is showing below:*
>  
> Found one Java-level deadlock:
> =
> "SparkUI-907":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> "ForkJoinPool-1-worker-57":
>  waiting to lock monitor 0x7f3860574298 (object 0x0005b7991168, a 
> org.apache.spark.util.MutableURLClassLoader),
>  which is held by "ForkJoinPool-1-worker-7"
> "ForkJoinPool-1-worker-7":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> Java stack information for the threads listed above:
> ===
> "SparkUI-907":
>  at org.apache.hadoop.conf.Configuration.getOverlay(Configuration.java:1328)
>  - waiting to lock <0x0005c0c1e5e0> (a 
> org.apache.hadoop.conf.Configuration)
>  at 
> org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:684)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1088)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145)
>  at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363)
>  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840)
>  at 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74)
>  at java.net.URL.getURLStreamHandler(URL.java:1142)
>  at java.net.URL.(URL.java:599)
>  at java.net.URL.(URL.java:490)
>  at java.net.URL.(URL.java:439)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doRequest(JettyUtils.scala:176)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doGet(JettyUtils.scala:161)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>  at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>  at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:171)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>  at 
> org.spark_pr

[jira] [Updated] (SPARK-27142) Provide REST API for SQL level information

2019-03-13 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-27142:

Description: 
Currently, SQL-level information for monitoring a Spark application is not available 
from REST but only via the UI. REST provides only 
applications, jobs, stages and environment. This Jira is targeted at providing a REST 
API so that SQL-level information can be retrieved.

 

Details: 
https://issues.apache.org/jira/browse/SPARK-27142?focusedCommentId=16791728&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16791728

  was:Currently for Monitoring Spark application SQL information is not 
available from REST but only via UI. REST provides only 
applications,jobs,stages,environment. This Jira is targeted to provide a REST 
API so that SQL level information can be found


> Provide REST API for SQL level information
> --
>
> Key: SPARK-27142
> URL: https://issues.apache.org/jira/browse/SPARK-27142
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Minor
> Attachments: image-2019-03-13-19-29-26-896.png
>
>
> Currently for Monitoring Spark application SQL information is not available 
> from REST but only via UI. REST provides only 
> applications,jobs,stages,environment. This Jira is targeted to provide a REST 
> API so that SQL level information can be found
>  
> Details: 
> https://issues.apache.org/jira/browse/SPARK-27142?focusedCommentId=16791728&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16791728



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-26961) Found Java-level deadlock in Spark Driver

2019-03-13 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791751#comment-16791751
 ] 

Ajith S edited comment on SPARK-26961 at 3/13/19 2:32 PM:
--

1) Yes, registerAsParallelCapable will return true, but if you inspect the 
classloader instance, parallelLockMap is still null because it was already 
initialized by the super class constructor, so it has no effect

!image-2019-03-13-19-53-52-390.png!

2) URLClassLoader is parallel capable because it does the registration in a static block, 
which runs before the parent (ClassLoader) constructor is called. Also, as per the javadoc

[https://docs.oracle.com/javase/8/docs/api/java/lang/ClassLoader.html]
{code:java}
Note that the ClassLoader class is registered as parallel capable by default. 
However, its subclasses still need to register themselves if they are parallel 
capable. {code}
Hence MutableURLClassLoader loses its parallel capability by failing to register, 
unlike URLClassLoader.

 


was (Author: ajithshetty):
Yes, registerAsParallelCapable will return true, but if you inspect the 
classloader instance, parallelLockMap is still null because it was already 
initialized by the super class constructor, so it has no effect

!image-2019-03-13-19-53-52-390.png!

URLClassLoader is parallel capable because it does the registration in a static block, 
which runs before the parent (ClassLoader) constructor is called

> Found Java-level deadlock in Spark Driver
> -
>
> Key: SPARK-26961
> URL: https://issues.apache.org/jira/browse/SPARK-26961
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.0
>Reporter: Rong Jialei
>Priority: Major
> Attachments: image-2019-03-13-19-53-52-390.png
>
>
> Our spark job usually will finish in minutes, however, we recently found it 
> take days to run, and we can only kill it when this happened.
> An investigation show all worker container could not connect drive after 
> start, and driver is hanging, using jstack, we found a Java-level deadlock.
>  
> *Jstack output for deadlock part is showing below:*
>  
> Found one Java-level deadlock:
> =
> "SparkUI-907":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> "ForkJoinPool-1-worker-57":
>  waiting to lock monitor 0x7f3860574298 (object 0x0005b7991168, a 
> org.apache.spark.util.MutableURLClassLoader),
>  which is held by "ForkJoinPool-1-worker-7"
> "ForkJoinPool-1-worker-7":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> Java stack information for the threads listed above:
> ===
> "SparkUI-907":
>  at org.apache.hadoop.conf.Configuration.getOverlay(Configuration.java:1328)
>  - waiting to lock <0x0005c0c1e5e0> (a 
> org.apache.hadoop.conf.Configuration)
>  at 
> org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:684)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1088)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145)
>  at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363)
>  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840)
>  at 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74)
>  at java.net.URL.getURLStreamHandler(URL.java:1142)
>  at java.net.URL.(URL.java:599)
>  at java.net.URL.(URL.java:490)
>  at java.net.URL.(URL.java:439)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doRequest(JettyUtils.scala:176)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doGet(JettyUtils.scala:161)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>  at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>  at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:171)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(Sc

[jira] [Commented] (SPARK-26961) Found Java-level deadlock in Spark Driver

2019-03-13 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791751#comment-16791751
 ] 

Ajith S commented on SPARK-26961:
-

Yes, registerAsParallelCapable will return true, but if you inspect the 
classloader instance, parallelLockMap is still null because it was already 
initialized by the super class constructor, so it has no effect

!image-2019-03-13-19-53-52-390.png!

URLClassLoader is parallel capable because it does the registration in a static block, 
which runs before the parent (ClassLoader) constructor is called

> Found Java-level deadlock in Spark Driver
> -
>
> Key: SPARK-26961
> URL: https://issues.apache.org/jira/browse/SPARK-26961
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.0
>Reporter: Rong Jialei
>Priority: Major
> Attachments: image-2019-03-13-19-53-52-390.png
>
>
> Our spark job usually will finish in minutes, however, we recently found it 
> take days to run, and we can only kill it when this happened.
> An investigation show all worker container could not connect drive after 
> start, and driver is hanging, using jstack, we found a Java-level deadlock.
>  
> *Jstack output for deadlock part is showing below:*
>  
> Found one Java-level deadlock:
> =
> "SparkUI-907":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> "ForkJoinPool-1-worker-57":
>  waiting to lock monitor 0x7f3860574298 (object 0x0005b7991168, a 
> org.apache.spark.util.MutableURLClassLoader),
>  which is held by "ForkJoinPool-1-worker-7"
> "ForkJoinPool-1-worker-7":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> Java stack information for the threads listed above:
> ===
> "SparkUI-907":
>  at org.apache.hadoop.conf.Configuration.getOverlay(Configuration.java:1328)
>  - waiting to lock <0x0005c0c1e5e0> (a 
> org.apache.hadoop.conf.Configuration)
>  at 
> org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:684)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1088)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145)
>  at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363)
>  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840)
>  at 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74)
>  at java.net.URL.getURLStreamHandler(URL.java:1142)
>  at java.net.URL.(URL.java:599)
>  at java.net.URL.(URL.java:490)
>  at java.net.URL.(URL.java:439)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doRequest(JettyUtils.scala:176)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doGet(JettyUtils.scala:161)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>  at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>  at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:171)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>  at 
> org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
>  at 
> org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>  at 
> org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>  at org.spark_project.jetty.server.Server.handle(Server.java:534)
>  at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>  at 
> org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>  at 
> org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>  at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108)
>  at 
> org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>  at 
>

[jira] [Updated] (SPARK-26961) Found Java-level deadlock in Spark Driver

2019-03-13 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-26961:

Attachment: image-2019-03-13-19-53-52-390.png

> Found Java-level deadlock in Spark Driver
> -
>
> Key: SPARK-26961
> URL: https://issues.apache.org/jira/browse/SPARK-26961
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.0
>Reporter: Rong Jialei
>Priority: Major
> Attachments: image-2019-03-13-19-53-52-390.png
>
>
> Our spark job usually will finish in minutes, however, we recently found it 
> take days to run, and we can only kill it when this happened.
> An investigation show all worker container could not connect drive after 
> start, and driver is hanging, using jstack, we found a Java-level deadlock.
>  
> *Jstack output for deadlock part is showing below:*
>  
> Found one Java-level deadlock:
> =
> "SparkUI-907":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> "ForkJoinPool-1-worker-57":
>  waiting to lock monitor 0x7f3860574298 (object 0x0005b7991168, a 
> org.apache.spark.util.MutableURLClassLoader),
>  which is held by "ForkJoinPool-1-worker-7"
> "ForkJoinPool-1-worker-7":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> Java stack information for the threads listed above:
> ===
> "SparkUI-907":
>  at org.apache.hadoop.conf.Configuration.getOverlay(Configuration.java:1328)
>  - waiting to lock <0x0005c0c1e5e0> (a 
> org.apache.hadoop.conf.Configuration)
>  at 
> org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:684)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1088)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145)
>  at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363)
>  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840)
>  at 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74)
>  at java.net.URL.getURLStreamHandler(URL.java:1142)
>  at java.net.URL.(URL.java:599)
>  at java.net.URL.(URL.java:490)
>  at java.net.URL.(URL.java:439)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doRequest(JettyUtils.scala:176)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doGet(JettyUtils.scala:161)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>  at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>  at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:171)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>  at 
> org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
>  at 
> org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>  at 
> org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>  at org.spark_project.jetty.server.Server.handle(Server.java:534)
>  at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>  at 
> org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>  at 
> org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>  at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108)
>  at 
> org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>  at java.lang.Thread.run(Thread.java:748)
> "ForkJoinPool-1-worker-57":
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:404)
>  - waiting to lock <0x0005b7991168> (a 
> org.apache.spark.util.Mu

[jira] [Updated] (SPARK-26961) Found Java-level deadlock in Spark Driver

2019-03-13 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-26961:

Attachment: (was: image-2019-03-13-19-51-38-708.png)

> Found Java-level deadlock in Spark Driver
> -
>
> Key: SPARK-26961
> URL: https://issues.apache.org/jira/browse/SPARK-26961
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.0
>Reporter: Rong Jialei
>Priority: Major
>
> Our spark job usually will finish in minutes, however, we recently found it 
> take days to run, and we can only kill it when this happened.
> An investigation show all worker container could not connect drive after 
> start, and driver is hanging, using jstack, we found a Java-level deadlock.
>  
> *Jstack output for deadlock part is showing below:*
>  
> Found one Java-level deadlock:
> =
> "SparkUI-907":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> "ForkJoinPool-1-worker-57":
>  waiting to lock monitor 0x7f3860574298 (object 0x0005b7991168, a 
> org.apache.spark.util.MutableURLClassLoader),
>  which is held by "ForkJoinPool-1-worker-7"
> "ForkJoinPool-1-worker-7":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> Java stack information for the threads listed above:
> ===
> "SparkUI-907":
>  at org.apache.hadoop.conf.Configuration.getOverlay(Configuration.java:1328)
>  - waiting to lock <0x0005c0c1e5e0> (a 
> org.apache.hadoop.conf.Configuration)
>  at 
> org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:684)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1088)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145)
>  at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363)
>  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840)
>  at 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74)
>  at java.net.URL.getURLStreamHandler(URL.java:1142)
>  at java.net.URL.(URL.java:599)
>  at java.net.URL.(URL.java:490)
>  at java.net.URL.(URL.java:439)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doRequest(JettyUtils.scala:176)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doGet(JettyUtils.scala:161)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>  at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>  at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:171)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>  at 
> org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
>  at 
> org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>  at 
> org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>  at org.spark_project.jetty.server.Server.handle(Server.java:534)
>  at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>  at 
> org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>  at 
> org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>  at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108)
>  at 
> org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>  at java.lang.Thread.run(Thread.java:748)
> "ForkJoinPool-1-worker-57":
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:404)
>  - waiting to lock <0x0005b7991168> (a 
> org.apache.spark.util.MutableURLClassLoader)
>  at java.lang.ClassLoader

[jira] [Updated] (SPARK-26961) Found Java-level deadlock in Spark Driver

2019-03-13 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-26961:

Attachment: image-2019-03-13-19-51-38-708.png

> Found Java-level deadlock in Spark Driver
> -
>
> Key: SPARK-26961
> URL: https://issues.apache.org/jira/browse/SPARK-26961
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.0
>Reporter: Rong Jialei
>Priority: Major
> Attachments: image-2019-03-13-19-51-38-708.png
>
>
> Our spark job usually will finish in minutes, however, we recently found it 
> take days to run, and we can only kill it when this happened.
> An investigation show all worker container could not connect drive after 
> start, and driver is hanging, using jstack, we found a Java-level deadlock.
>  
> *Jstack output for deadlock part is showing below:*
>  
> Found one Java-level deadlock:
> =
> "SparkUI-907":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> "ForkJoinPool-1-worker-57":
>  waiting to lock monitor 0x7f3860574298 (object 0x0005b7991168, a 
> org.apache.spark.util.MutableURLClassLoader),
>  which is held by "ForkJoinPool-1-worker-7"
> "ForkJoinPool-1-worker-7":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> Java stack information for the threads listed above:
> ===
> "SparkUI-907":
>  at org.apache.hadoop.conf.Configuration.getOverlay(Configuration.java:1328)
>  - waiting to lock <0x0005c0c1e5e0> (a 
> org.apache.hadoop.conf.Configuration)
>  at 
> org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:684)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1088)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145)
>  at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363)
>  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840)
>  at 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74)
>  at java.net.URL.getURLStreamHandler(URL.java:1142)
>  at java.net.URL.(URL.java:599)
>  at java.net.URL.(URL.java:490)
>  at java.net.URL.(URL.java:439)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doRequest(JettyUtils.scala:176)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doGet(JettyUtils.scala:161)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>  at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>  at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:171)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>  at 
> org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
>  at 
> org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>  at 
> org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>  at org.spark_project.jetty.server.Server.handle(Server.java:534)
>  at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>  at 
> org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>  at 
> org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>  at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108)
>  at 
> org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>  at java.lang.Thread.run(Thread.java:748)
> "ForkJoinPool-1-worker-57":
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:404)
>  - waiting to lock <0x0005b7991168> (a 
> org.apache.spark.util.Mu

[jira] [Updated] (SPARK-27142) Provide REST API for SQL level information

2019-03-13 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-27142:

Attachment: image-2019-03-13-19-29-26-896.png

> Provide REST API for SQL level information
> --
>
> Key: SPARK-27142
> URL: https://issues.apache.org/jira/browse/SPARK-27142
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Minor
> Attachments: image-2019-03-13-19-29-26-896.png
>
>
> Currently for Monitoring Spark application SQL information is not available 
> from REST but only via UI. REST provides only 
> applications,jobs,stages,environment. This Jira is targeted to provide a REST 
> API so that SQL level information can be found



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27142) Provide REST API for SQL level information

2019-03-13 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791728#comment-16791728
 ] 

Ajith S commented on SPARK-27142:
-

Ok, apologies for being abstract about this requirement. Let me explain. A 
single SQL query can result in multiple jobs, so for an end user running 
STS or spark-sql, the natural highest-level unit to probe is the SQL statement that was 
executed. This information can be seen in the SQL tab. Attaching a sample.

!image-2019-03-13-19-29-26-896.png!

But the same information cannot be accessed through the REST API exposed by Spark, so 
users always have to fall back on the jobs API, which may be difficult to use for this purpose. 
I therefore intend to expose the information shown in the UI's SQL tab via a REST API.

Mainly (a rough sketch of the response shape follows this list):
 # executionId : long
 # status : string - possible values COMPLETED/RUNNING/FAILED
 # description : string - executed SQL string
 # submissionTime : formatted time of SQL submission
 # duration : string - total run time
 # runningJobIds : Seq[Int] - sequence of running job ids
 # failedJobIds : Seq[Int] - sequence of failed job ids
 # successJobIds : Seq[Int] - sequence of successful job ids
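
As a rough illustration only (not the actual Spark API class), the per-execution 
payload could carry the fields above like this:
{code:java}
// Illustrative sketch of the per-execution response shape; field names mirror the
// list above. The real implementation would live alongside Spark's existing REST API classes.
public class SqlExecutionSummary {
  public long executionId;
  public String status;           // COMPLETED / RUNNING / FAILED
  public String description;      // the executed SQL string
  public String submissionTime;   // formatted time of SQL submission
  public String duration;         // total run time
  public int[] runningJobIds;
  public int[] failedJobIds;
  public int[] successJobIds;
}
{code}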

 

> Provide REST API for SQL level information
> --
>
> Key: SPARK-27142
> URL: https://issues.apache.org/jira/browse/SPARK-27142
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Minor
> Attachments: image-2019-03-13-19-29-26-896.png
>
>
> Currently for Monitoring Spark application SQL information is not available 
> from REST but only via UI. REST provides only 
> applications,jobs,stages,environment. This Jira is targeted to provide a REST 
> API so that SQL level information can be found



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27142) Provide REST API for SQL level information

2019-03-13 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-27142:

Attachment: (was: image-2019-03-13-19-19-27-831.png)

> Provide REST API for SQL level information
> --
>
> Key: SPARK-27142
> URL: https://issues.apache.org/jira/browse/SPARK-27142
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Minor
>
> Currently for Monitoring Spark application SQL information is not available 
> from REST but only via UI. REST provides only 
> applications,jobs,stages,environment. This Jira is targeted to provide a REST 
> API so that SQL level information can be found



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27142) Provide REST API for SQL level information

2019-03-13 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-27142:

Attachment: (was: image-2019-03-13-19-19-24-951.png)

> Provide REST API for SQL level information
> --
>
> Key: SPARK-27142
> URL: https://issues.apache.org/jira/browse/SPARK-27142
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Minor
>
> Currently for Monitoring Spark application SQL information is not available 
> from REST but only via UI. REST provides only 
> applications,jobs,stages,environment. This Jira is targeted to provide a REST 
> API so that SQL level information can be found



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27142) Provide REST API for SQL level information

2019-03-13 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-27142:

Attachment: image-2019-03-13-19-19-24-951.png

> Provide REST API for SQL level information
> --
>
> Key: SPARK-27142
> URL: https://issues.apache.org/jira/browse/SPARK-27142
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Minor
> Attachments: image-2019-03-13-19-19-24-951.png, 
> image-2019-03-13-19-19-27-831.png
>
>
> Currently for Monitoring Spark application SQL information is not available 
> from REST but only via UI. REST provides only 
> applications,jobs,stages,environment. This Jira is targeted to provide a REST 
> API so that SQL level information can be found



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-27142) Provide REST API for SQL level information

2019-03-13 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-27142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-27142:

Attachment: image-2019-03-13-19-19-27-831.png

> Provide REST API for SQL level information
> --
>
> Key: SPARK-27142
> URL: https://issues.apache.org/jira/browse/SPARK-27142
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Minor
> Attachments: image-2019-03-13-19-19-24-951.png, 
> image-2019-03-13-19-19-27-831.png
>
>
> Currently for Monitoring Spark application SQL information is not available 
> from REST but only via UI. REST provides only 
> applications,jobs,stages,environment. This Jira is targeted to provide a REST 
> API so that SQL level information can be found



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26961) Found Java-level deadlock in Spark Driver

2019-03-13 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791712#comment-16791712
 ] 

Ajith S commented on SPARK-26961:
-

[~srowen] That too will not work.

Here is my custom classloader:
{code:java}
import java.net.{URL, URLClassLoader}

class MYClassLoader(urls: Array[URL], parent: ClassLoader)
  extends URLClassLoader(urls, parent) {

  // Runs as part of the constructor body, i.e. only after the super constructor.
  ClassLoader.registerAsParallelCapable()

  override def loadClass(name: String): Class[_] = {
    super.loadClass(name)
  }
}
{code}
If we look at the instance initialization flow, we see that the super constructor is called 
before the ClassLoader.registerAsParallelCapable() line is hit, hence it doesn't 
take effect:
{code:java}
<init>:280, ClassLoader (java.lang)
<init>:316, ClassLoader (java.lang)
<init>:76, SecureClassLoader (java.security)
<init>:100, URLClassLoader (java.net)
<init>:23, MYClassLoader (org.apache.spark.util.ajith)
{code}
As per [https://github.com/scala/bug/issues/11429], Scala 2.x does not have pure 
static support yet, so moving the classloader to a Java-based implementation may be 
the only option we have.
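
For reference, a minimal sketch of what such a Java-based loader could look like 
(illustrative names only, not the actual Spark change): the registration happens in a 
static initializer, which runs before any constructor, so the ClassLoader super 
constructor already sees the class as parallel capable:
{code:java}
import java.net.URL;
import java.net.URLClassLoader;

// Illustrative sketch only. Registering in a static initializer happens at class
// initialization time, before the ClassLoader super constructor decides whether to
// create the per-class lock map, so every instance comes up parallel capable.
public class ParallelMutableURLClassLoader extends URLClassLoader {

  static {
    registerAsParallelCapable();
  }

  public ParallelMutableURLClassLoader(URL[] urls, ClassLoader parent) {
    super(urls, parent);
  }

  // Exposed so jars can be added at runtime, like Spark's MutableURLClassLoader does.
  @Override
  public void addURL(URL url) {
    super.addURL(url);
  }
}
{code}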

> Found Java-level deadlock in Spark Driver
> -
>
> Key: SPARK-26961
> URL: https://issues.apache.org/jira/browse/SPARK-26961
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.0
>Reporter: Rong Jialei
>Priority: Major
>
> Our spark job usually will finish in minutes, however, we recently found it 
> take days to run, and we can only kill it when this happened.
> An investigation show all worker container could not connect drive after 
> start, and driver is hanging, using jstack, we found a Java-level deadlock.
>  
> *Jstack output for deadlock part is showing below:*
>  
> Found one Java-level deadlock:
> =
> "SparkUI-907":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> "ForkJoinPool-1-worker-57":
>  waiting to lock monitor 0x7f3860574298 (object 0x0005b7991168, a 
> org.apache.spark.util.MutableURLClassLoader),
>  which is held by "ForkJoinPool-1-worker-7"
> "ForkJoinPool-1-worker-7":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> Java stack information for the threads listed above:
> ===
> "SparkUI-907":
>  at org.apache.hadoop.conf.Configuration.getOverlay(Configuration.java:1328)
>  - waiting to lock <0x0005c0c1e5e0> (a 
> org.apache.hadoop.conf.Configuration)
>  at 
> org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:684)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1088)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145)
>  at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363)
>  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840)
>  at 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74)
>  at java.net.URL.getURLStreamHandler(URL.java:1142)
>  at java.net.URL.(URL.java:599)
>  at java.net.URL.(URL.java:490)
>  at java.net.URL.(URL.java:439)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doRequest(JettyUtils.scala:176)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doGet(JettyUtils.scala:161)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>  at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>  at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:171)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>  at 
> org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
>  at 
> org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>  at 
> org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>  at org.spark_project.jetty.server.Server.handle(Server.java:534)
>  at org.spark_project.jetty.serve

[jira] [Commented] (SPARK-27143) Provide REST API for JDBC/ODBC level information

2019-03-12 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791272#comment-16791272
 ] 

Ajith S commented on SPARK-27143:
-

ping [~srowen] [~cloud_fan] [~dongjoon] 

Please suggest if this sounds reasonable

> Provide REST API for JDBC/ODBC level information
> 
>
> Key: SPARK-27143
> URL: https://issues.apache.org/jira/browse/SPARK-27143
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Minor
>
> Currently for Monitoring Spark application JDBC/ODBC information is not 
> available from REST but only via UI. REST provides only 
> applications,jobs,stages,environment. This Jira is targeted to provide a REST 
> API so that JDBC/ODBC level information like session statistics, sql 
> staistics can be provided



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27143) Provide REST API for JDBC/ODBC level information

2019-03-12 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791270#comment-16791270
 ] 

Ajith S commented on SPARK-27143:
-

I will be working on this

> Provide REST API for JDBC/ODBC level information
> 
>
> Key: SPARK-27143
> URL: https://issues.apache.org/jira/browse/SPARK-27143
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Minor
>
> Currently for Monitoring Spark application JDBC/ODBC information is not 
> available from REST but only via UI. REST provides only 
> applications,jobs,stages,environment. This Jira is targeted to provide a REST 
> API so that JDBC/ODBC level information like session statistics, sql 
> staistics can be provided



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27143) Provide REST API for JDBC/ODBC level information

2019-03-12 Thread Ajith S (JIRA)
Ajith S created SPARK-27143:
---

 Summary: Provide REST API for JDBC/ODBC level information
 Key: SPARK-27143
 URL: https://issues.apache.org/jira/browse/SPARK-27143
 Project: Spark
  Issue Type: New Feature
  Components: Spark Core
Affects Versions: 3.0.0
Reporter: Ajith S


Currently, JDBC/ODBC information for monitoring a Spark application is not 
available from REST but only via the UI. REST provides only 
applications, jobs, stages and environment. This Jira is targeted at providing a REST 
API so that JDBC/ODBC level information such as session statistics and SQL statistics 
can be provided.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-27142) Provide REST API for SQL level information

2019-03-12 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791269#comment-16791269
 ] 

Ajith S commented on SPARK-27142:
-

I will be working on this

> Provide REST API for SQL level information
> --
>
> Key: SPARK-27142
> URL: https://issues.apache.org/jira/browse/SPARK-27142
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Minor
>
> Currently for Monitoring Spark application SQL information is not available 
> from REST but only via UI. REST provides only 
> applications,jobs,stages,environment. This Jira is targeted to provide a REST 
> API so that SQL level information can be found



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-27142) Provide REST API for SQL level information

2019-03-12 Thread Ajith S (JIRA)
Ajith S created SPARK-27142:
---

 Summary: Provide REST API for SQL level information
 Key: SPARK-27142
 URL: https://issues.apache.org/jira/browse/SPARK-27142
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Affects Versions: 3.0.0
Reporter: Ajith S


Currently, SQL information for monitoring a Spark application is not available 
from REST but only via the UI. REST provides only 
applications, jobs, stages and environment. This Jira is targeted at providing a REST 
API so that SQL-level information can be found.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-26961) Found Java-level deadlock in Spark Driver

2019-03-11 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790208#comment-16790208
 ] 

Ajith S edited comment on SPARK-26961 at 3/12/19 6:45 AM:
--

[~srowen] Yes, I am of the same opinion about fixing it via 
registerAsParallelCapable. But it is not possible to do via a companion object; I 
tried and found an issue. Refer [https://github.com/scala/bug/issues/11429]

Maybe we need to move these classloaders from Scala to a Java implementation to achieve this.

[~xsapphire] I think these classloaders are child classloaders of 
LaunchAppClassLoader, which already has the classes for the jars on the classpath, so 
the overhead may not be of a high magnitude.


was (Author: ajithshetty):
[~srowen] Yes, I am of the same opinion about fixing it via 
registerAsParallelCapable. But it is not possible to do via a companion object; I 
tried and found an issue. Refer https://github.com/scala/bug/issues/11429

[~xsapphire] I think these classloaders are child classloaders of 
LaunchAppClassLoader, which already has the classes for the jars on the classpath, so 
the overhead may not be of a high magnitude.

> Found Java-level deadlock in Spark Driver
> -
>
> Key: SPARK-26961
> URL: https://issues.apache.org/jira/browse/SPARK-26961
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.0
>Reporter: Rong Jialei
>Priority: Major
>
> Our spark job usually will finish in minutes, however, we recently found it 
> take days to run, and we can only kill it when this happened.
> An investigation show all worker container could not connect drive after 
> start, and driver is hanging, using jstack, we found a Java-level deadlock.
>  
> *Jstack output for deadlock part is showing below:*
>  
> Found one Java-level deadlock:
> =
> "SparkUI-907":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> "ForkJoinPool-1-worker-57":
>  waiting to lock monitor 0x7f3860574298 (object 0x0005b7991168, a 
> org.apache.spark.util.MutableURLClassLoader),
>  which is held by "ForkJoinPool-1-worker-7"
> "ForkJoinPool-1-worker-7":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> Java stack information for the threads listed above:
> ===
> "SparkUI-907":
>  at org.apache.hadoop.conf.Configuration.getOverlay(Configuration.java:1328)
>  - waiting to lock <0x0005c0c1e5e0> (a 
> org.apache.hadoop.conf.Configuration)
>  at 
> org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:684)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1088)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145)
>  at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363)
>  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840)
>  at 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74)
>  at java.net.URL.getURLStreamHandler(URL.java:1142)
>  at java.net.URL.(URL.java:599)
>  at java.net.URL.(URL.java:490)
>  at java.net.URL.(URL.java:439)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doRequest(JettyUtils.scala:176)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doGet(JettyUtils.scala:161)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>  at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>  at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:171)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>  at 
> org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
>  at 
> org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>  at 
> org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>  at org.spark_proj

[jira] [Comment Edited] (SPARK-26961) Found Java-level deadlock in Spark Driver

2019-03-11 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790208#comment-16790208
 ] 

Ajith S edited comment on SPARK-26961 at 3/12/19 6:43 AM:
--

[~srowen] Yes, I share the opinion that this should be fixed via 
registerAsParallelCapable, but it is not possible to do so from a companion 
object. I tried it and hit an issue; refer to 
https://github.com/scala/bug/issues/11429

[~xsapphire] I think these class loaders are children of LaunchAppClassLoader, 
which already has the classes for the jars on the classpath, so the overhead 
should not be significant.


was (Author: ajithshetty):
[~srowen] Yes, I share the opinion that this should be fixed via 
registerAsParallelCapable. Will raise a PR for this.

[~xsapphire] I think these class loaders are children of LaunchAppClassLoader, 
which already has the classes for the jars on the classpath, so the overhead 
should not be significant.

> Found Java-level deadlock in Spark Driver
> -
>
> Key: SPARK-26961
> URL: https://issues.apache.org/jira/browse/SPARK-26961
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.0
>Reporter: Rong Jialei
>Priority: Major
>
> Our spark job usually will finish in minutes, however, we recently found it 
> take days to run, and we can only kill it when this happened.
> An investigation show all worker container could not connect drive after 
> start, and driver is hanging, using jstack, we found a Java-level deadlock.
>  
> *Jstack output for deadlock part is showing below:*
>  
> Found one Java-level deadlock:
> =
> "SparkUI-907":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> "ForkJoinPool-1-worker-57":
>  waiting to lock monitor 0x7f3860574298 (object 0x0005b7991168, a 
> org.apache.spark.util.MutableURLClassLoader),
>  which is held by "ForkJoinPool-1-worker-7"
> "ForkJoinPool-1-worker-7":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> Java stack information for the threads listed above:
> ===
> "SparkUI-907":
>  at org.apache.hadoop.conf.Configuration.getOverlay(Configuration.java:1328)
>  - waiting to lock <0x0005c0c1e5e0> (a 
> org.apache.hadoop.conf.Configuration)
>  at 
> org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:684)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1088)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145)
>  at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363)
>  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840)
>  at 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74)
>  at java.net.URL.getURLStreamHandler(URL.java:1142)
>  at java.net.URL.(URL.java:599)
>  at java.net.URL.(URL.java:490)
>  at java.net.URL.(URL.java:439)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doRequest(JettyUtils.scala:176)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doGet(JettyUtils.scala:161)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>  at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>  at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:171)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>  at 
> org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
>  at 
> org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>  at 
> org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>  at org.spark_project.jetty.server.Server.handle(Server.java:534)
>  at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>  at 
> org.spark_project.jetty.server.HttpConnection.o

[jira] [Commented] (SPARK-27011) reset command fails after cache table

2019-03-11 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790220#comment-16790220
 ] 

Ajith S commented on SPARK-27011:
-

[~cloud_fan] As [https://github.com/apache/spark/pull/23918] is merged, can we 
close this?

> reset command fails after cache table
> -
>
> Key: SPARK-27011
> URL: https://issues.apache.org/jira/browse/SPARK-27011
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, SQL
>Affects Versions: 2.3.3, 2.4.0, 3.0.0
>Reporter: Ajith S
>Priority: Minor
>
>  
> h3. Commands to reproduce 
> spark-sql> create table abcde ( a int);
> spark-sql> reset; // can work success
> spark-sql> cache table abcde;
> spark-sql> reset; //fails with exception
> h3. Below is the stack
> {{org.apache.spark.sql.catalyst.errors.package$TreeNodeException: makeCopy, 
> tree:}}
> {{ResetCommand$}}{{at 
> org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56)}}
> {{ at 
> org.apache.spark.sql.catalyst.trees.TreeNode.makeCopy(TreeNode.scala:379)}}
> {{ at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:216)}}
> {{ at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:211)}}
> {{ at 
> org.apache.spark.sql.catalyst.plans.QueryPlan.sameResult(QueryPlan.scala:259)}}
> {{ at 
> org.apache.spark.sql.execution.CacheManager.$anonfun$lookupCachedData$3(CacheManager.scala:236)}}
> {{ at 
> org.apache.spark.sql.execution.CacheManager.$anonfun$lookupCachedData$3$adapted(CacheManager.scala:236)}}
> {{ at scala.collection.Iterator.find(Iterator.scala:993)}}
> {{ at scala.collection.Iterator.find$(Iterator.scala:990)}}
> {{ at scala.collection.AbstractIterator.find(Iterator.scala:1429)}}
> {{ at scala.collection.IterableLike.find(IterableLike.scala:81)}}
> {{ at scala.collection.IterableLike.find$(IterableLike.scala:80)}}
> {{ at scala.collection.AbstractIterable.find(Iterable.scala:56)}}
> {{ at 
> org.apache.spark.sql.execution.CacheManager.$anonfun$lookupCachedData$2(CacheManager.scala:236)}}
> {{ at 
> org.apache.spark.sql.execution.CacheManager.readLock(CacheManager.scala:59)}}
> {{ at 
> org.apache.spark.sql.execution.CacheManager.lookupCachedData(CacheManager.scala:236)}}
> {{ at 
> org.apache.spark.sql.execution.CacheManager$$anonfun$1.applyOrElse(CacheManager.scala:250)}}
> {{ at 
> org.apache.spark.sql.execution.CacheManager$$anonfun$1.applyOrElse(CacheManager.scala:241)}}
> {{ at 
> org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDown$1(TreeNode.scala:258)}}
> {{ at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:72)}}
> {{ at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:258)}}
> {{ at 
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29)}}
> {{ at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown(AnalysisHelper.scala:149)}}
> {{ at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDown$(AnalysisHelper.scala:147)}}
> {{ at 
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)}}
> {{ at 
> org.apache.spark.sql.execution.CacheManager.useCachedData(CacheManager.scala:241)}}
> {{ at 
> org.apache.spark.sql.execution.QueryExecution.withCachedData$lzycompute(QueryExecution.scala:68)}}
> {{ at 
> org.apache.spark.sql.execution.QueryExecution.withCachedData(QueryExecution.scala:65)}}
> {{ at 
> org.apache.spark.sql.execution.QueryExecution.$anonfun$optimizedPlan$1(QueryExecution.scala:72)}}
> {{ at 
> org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)}}
> {{ at 
> org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:72)}}
> {{ at 
> org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:71)}}
> {{ at 
> org.apache.spark.sql.execution.QueryExecution.$anonfun$writePlans$4(QueryExecution.scala:139)}}
> {{ at 
> org.apache.spark.sql.catalyst.plans.QueryPlan$.append(QueryPlan.scala:316)}}
> {{ at 
> org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$writePlans(QueryExecution.scala:139)}}
> {{ at 
> org.apache.spark.sql.execution.QueryExecution.toString(QueryExecution.scala:146)}}
> {{ at 
> org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:82)}}
> {{ at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:147)}}
> {{ at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:74)}}
> {{ at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3346)}}
> {{ at org.apache

[jira] [Commented] (SPARK-26961) Found Java-level deadlock in Spark Driver

2019-03-11 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790208#comment-16790208
 ] 

Ajith S commented on SPARK-26961:
-

[~srowen] Yes, I share the opinion that this should be fixed via 
registerAsParallelCapable. Will raise a PR for this.

[~xsapphire] I think these class loaders are children of LaunchAppClassLoader, 
which already has the classes for the jars on the classpath, so the overhead 
should not be significant.

> Found Java-level deadlock in Spark Driver
> -
>
> Key: SPARK-26961
> URL: https://issues.apache.org/jira/browse/SPARK-26961
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 2.3.0
>Reporter: Rong Jialei
>Priority: Major
>
> Our spark job usually will finish in minutes, however, we recently found it 
> take days to run, and we can only kill it when this happened.
> An investigation show all worker container could not connect drive after 
> start, and driver is hanging, using jstack, we found a Java-level deadlock.
>  
> *Jstack output for deadlock part is showing below:*
>  
> Found one Java-level deadlock:
> =
> "SparkUI-907":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> "ForkJoinPool-1-worker-57":
>  waiting to lock monitor 0x7f3860574298 (object 0x0005b7991168, a 
> org.apache.spark.util.MutableURLClassLoader),
>  which is held by "ForkJoinPool-1-worker-7"
> "ForkJoinPool-1-worker-7":
>  waiting to lock monitor 0x7f387761b398 (object 0x0005c0c1e5e0, a 
> org.apache.hadoop.conf.Configuration),
>  which is held by "ForkJoinPool-1-worker-57"
> Java stack information for the threads listed above:
> ===
> "SparkUI-907":
>  at org.apache.hadoop.conf.Configuration.getOverlay(Configuration.java:1328)
>  - waiting to lock <0x0005c0c1e5e0> (a 
> org.apache.hadoop.conf.Configuration)
>  at 
> org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:684)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1088)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1145)
>  at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2363)
>  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2840)
>  at 
> org.apache.hadoop.fs.FsUrlStreamHandlerFactory.createURLStreamHandler(FsUrlStreamHandlerFactory.java:74)
>  at java.net.URL.getURLStreamHandler(URL.java:1142)
>  at java.net.URL.(URL.java:599)
>  at java.net.URL.(URL.java:490)
>  at java.net.URL.(URL.java:439)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doRequest(JettyUtils.scala:176)
>  at org.apache.spark.ui.JettyUtils$$anon$4.doGet(JettyUtils.scala:161)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
>  at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>  at 
> org.spark_project.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
>  at 
> org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:171)
>  at 
> org.spark_project.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>  at 
> org.spark_project.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>  at 
> org.spark_project.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>  at 
> org.spark_project.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>  at 
> org.spark_project.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)
>  at 
> org.spark_project.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>  at 
> org.spark_project.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>  at org.spark_project.jetty.server.Server.handle(Server.java:534)
>  at org.spark_project.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>  at 
> org.spark_project.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>  at 
> org.spark_project.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>  at org.spark_project.jetty.io.FillInterest.fillable(FillInterest.java:108)
>  at 
> org.spark_project.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>  at 
> org.spark_project.jetty.util.thread.QueuedThreadPoo

[jira] [Commented] (SPARK-27114) SQL Tab shows duplicate executions for some commands

2019-03-11 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-27114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790136#comment-16790136
 ] 

Ajith S commented on SPARK-27114:
-

[~srowen] Since *LocalRelation* is eagerly evaluated, the second evaluation can 
be skipped; I suppose the command is not actually executed the second time, as 
that would throw an exception (in this case, "table already exists"). However, 
it currently uses two execution IDs and fires a duplicate 
*SparkListenerSQLExecutionStart* event. This causes the app status store to 
record a duplicate execution, so it shows up twice in the UI.
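For illustration, a simplified, hypothetical sketch of the double-counting 
mechanism (this is a model of nested execution scopes each posting a start 
event, not Spark's actual SQLExecution API):

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Simplified, hypothetical model: each call to withNewExecutionId allocates a
// fresh execution id and posts a "start" event, so a nested call made during
// eager command evaluation produces two events, and therefore two entries in
// the SQL tab, for a single statement.
public class DuplicateExecutionDemo {
  private static final AtomicLong NEXT_ID = new AtomicLong();
  private static final List<String> POSTED_EVENTS = new ArrayList<>();

  static <T> T withNewExecutionId(String desc, Supplier<T> body) {
    long id = NEXT_ID.incrementAndGet();
    POSTED_EVENTS.add("SparkListenerSQLExecutionStart(id=" + id + ", desc=" + desc + ")");
    return body.get();
  }

  public static void main(String[] args) {
    // Outer scope opened when the driver runs the statement ...
    withNewExecutionId("create table abc (a int)", () ->
        // ... inner scope opened again while the command is eagerly executed
        // during Dataset construction: two start events for one command.
        withNewExecutionId("create table abc (a int)", () -> null));

    POSTED_EVENTS.forEach(System.out::println);
  }
}
{code}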

> SQL Tab shows duplicate executions for some commands
> 
>
> Key: SPARK-27114
> URL: https://issues.apache.org/jira/browse/SPARK-27114
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Ajith S
>Priority: Minor
> Attachments: Screenshot from 2019-03-09 14-04-07.png
>
>
> run simple sql  command
> {{create table abc ( a int );}}
> Open SQL tab in SparkUI, we can see duplicate entries for the execution. 
> Tested behaviour in thriftserver and sparksql
> *check attachment*
> The Problem seems to be due to eager execution of commands @ 
> org.apache.spark.sql.Dataset#logicalPlan
> After analysis for spark-sql, the call stacks for duplicate execution id 
> seems to be
> {code:java}
> $anonfun$withNewExecutionId$1:78, SQLExecution$ 
> (org.apache.spark.sql.execution)
> apply:-1, 2057192703 
> (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
> withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
> withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
> withAction:3346, Dataset (org.apache.spark.sql)
> :203, Dataset (org.apache.spark.sql)
> ofRows:88, Dataset$ (org.apache.spark.sql)
> sql:656, SparkSession (org.apache.spark.sql)
> sql:685, SQLContext (org.apache.spark.sql)
> run:63, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
> processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> processLine:376, CliDriver (org.apache.hadoop.hive.cli)
> main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
> main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
> invoke:62, NativeMethodAccessorImpl (sun.reflect)
> invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
> invoke:498, Method (java.lang.reflect)
> start:52, JavaMainApplication (org.apache.spark.deploy)
> org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
> (org.apache.spark.deploy)
> doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
> submit:185, SparkSubmit (org.apache.spark.deploy)
> doSubmit:87, SparkSubmit (org.apache.spark.deploy)
> doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
> main:943, SparkSubmit$ (org.apache.spark.deploy)
> main:-1, SparkSubmit (org.apache.spark.deploy){code}
> {code:java}
> $anonfun$withNewExecutionId$1:78, SQLExecution$ 
> (org.apache.spark.sql.execution)
> apply:-1, 2057192703 
> (org.apache.spark.sql.execution.SQLExecution$$$Lambda$1036)
> withSQLConfPropagated:147, SQLExecution$ (org.apache.spark.sql.execution)
> withNewExecutionId:74, SQLExecution$ (org.apache.spark.sql.execution)
> run:65, SparkSQLDriver (org.apache.spark.sql.hive.thriftserver)
> processCmd:372, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> processLine:376, CliDriver (org.apache.hadoop.hive.cli)
> main:275, SparkSQLCLIDriver$ (org.apache.spark.sql.hive.thriftserver)
> main:-1, SparkSQLCLIDriver (org.apache.spark.sql.hive.thriftserver)
> invoke0:-1, NativeMethodAccessorImpl (sun.reflect)
> invoke:62, NativeMethodAccessorImpl (sun.reflect)
> invoke:43, DelegatingMethodAccessorImpl (sun.reflect)
> invoke:498, Method (java.lang.reflect)
> start:52, JavaMainApplication (org.apache.spark.deploy)
> org$apache$spark$deploy$SparkSubmit$$runMain:855, SparkSubmit 
> (org.apache.spark.deploy)
> doRunMain$1:162, SparkSubmit (org.apache.spark.deploy)
> submit:185, SparkSubmit (org.apache.spark.deploy)
> doSubmit:87, SparkSubmit (org.apache.spark.deploy)
> doSubmit:934, SparkSubmit$$anon$2 (org.apache.spark.deploy)
> main:943, SparkSubmit$ (org.apache.spark.deploy)
> main:-1, SparkSubmit (org.apache.spark.deploy){code}
>  






[jira] [Updated] (SPARK-26152) Flaky test: BroadcastSuite

2019-03-11 Thread Ajith S (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajith S updated SPARK-26152:

Attachment: Screenshot from 2019-03-11 17-03-40.png

> Flaky test: BroadcastSuite
> --
>
> Key: SPARK-26152
> URL: https://issues.apache.org/jira/browse/SPARK-26152
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Dongjoon Hyun
>Priority: Critical
> Attachments: Screenshot from 2019-03-11 17-03-40.png
>
>
> - 
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5627
>  (2018-11-16)
> {code}
> BroadcastSuite:
> - Using TorrentBroadcast locally
> - Accessing TorrentBroadcast variables from multiple threads
> - Accessing TorrentBroadcast variables in a local cluster (encryption = off)
> java.util.concurrent.RejectedExecutionException: Task 
> scala.concurrent.impl.CallbackRunnable@59428a1 rejected from 
> java.util.concurrent.ThreadPoolExecutor@4096a677[Shutting down, pool size = 
> 1, active threads = 1, queued tasks = 0, completed tasks = 0]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
>   at 
> java.util.concurrent.Executors$DelegatedExecutorService.execute(Executors.java:668)
>   at 
> scala.concurrent.impl.ExecutionContextImpl$$anon$1.execute(ExecutionContextImpl.scala:134)
>   at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:284)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:284)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:284)
>   at scala.concurrent.Promise.complete(Promise.scala:49)
>   at scala.concurrent.Promise.complete$(Promise.scala:48)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:183)
>   at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:29)
>   at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
>   at 
> scala.concurrent.BatchingExecutor$Batch.processBatch$1(BatchingExecutor.scala:63)
>   at 
> scala.concurrent.BatchingExecutor$Batch.$anonfun$run$1(BatchingExecutor.scala:78)
>   at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
>   at 
> scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:81)
>   at 
> scala.concurrent.BatchingExecutor$Batch.run(BatchingExecutor.scala:55)
>   at 
> scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:870)
>   at scala.concurrent.BatchingExecutor.execute(BatchingExecutor.scala:106)
>   at 
> scala.concurrent.BatchingExecutor.execute$(BatchingExecutor.scala:103)
>   at 
> scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:868)
>   at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:284)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:284)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:284)
>   at scala.concurrent.Promise.complete(Promise.scala:49)
>   at scala.concurrent.Promise.complete$(Promise.scala:48)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:183)
>   at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:29)
>   at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> java.util.concurrent.RejectedExecutionException: Task 
> scala.concurrent.impl.CallbackRunnable@40a5bf17 rejected from 
> java.util.concurrent.ThreadPoolExecutor@5a73967[Shutting down, pool size = 1, 
> active threads = 1, queued tasks = 0, completed tasks = 0]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
>   at 
> java.util.concurrent.Executors$DelegatedExecutorService.execute(Executors.java:668)
>   at 
> sc

[jira] [Comment Edited] (SPARK-26152) Flaky test: BroadcastSuite

2019-03-11 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16789462#comment-16789462
 ] 

Ajith S edited comment on SPARK-26152 at 3/11/19 12:07 PM:
---

I encountered this issue on the latest master branch and see that it is a race 
between the *org.apache.spark.deploy.DeployMessages.WorkDirCleanup* event and 
*org.apache.spark.deploy.worker.Worker#onStop*. It is possible that while the 
WorkDirCleanup event is being processed, 
*org.apache.spark.deploy.worker.Worker#cleanupThreadExecutor* has already been 
shut down, so any subsequent submission to the ThreadPoolExecutor results in 
*java.util.concurrent.RejectedExecutionException*.

Attaching a debug snapshot of the same. I would like to work on this; please 
suggest.
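For context, a minimal, hypothetical sketch of the race (the executor and task 
names are illustrative, not the Worker's actual members):

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical minimal reproduction of the race described above: one path shuts
// the cleanup executor down (as onStop would) while another path still tries to
// submit a WorkDirCleanup task, so the submission is rejected.
public class WorkerCleanupRace {
  public static void main(String[] args) throws InterruptedException {
    ExecutorService cleanupThreadExecutor = Executors.newSingleThreadExecutor();

    // Simulate onStop() completing first and shutting the pool down.
    Thread onStop = new Thread(cleanupThreadExecutor::shutdownNow);
    onStop.start();
    onStop.join();

    try {
      // A WorkDirCleanup message processed after shutdown triggers
      // java.util.concurrent.RejectedExecutionException, which is what the
      // flaky test surfaces.
      cleanupThreadExecutor.submit(() -> System.out.println("cleaning work dir"));
    } catch (RejectedExecutionException e) {
      // One possible guard: tolerate rejection (or check isShutdown() before
      // submitting) so late cleanup requests are simply skipped.
      System.out.println("Cleanup skipped, executor already stopped: " + e);
    }
  }
}
{code}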


was (Author: ajithshetty):
I encountered this issue and see that it is a race between the 
*org.apache.spark.deploy.DeployMessages.WorkDirCleanup* event and 
*org.apache.spark.deploy.worker.Worker#onStop*. It is possible that while the 
WorkDirCleanup event is being processed, 
*org.apache.spark.deploy.worker.Worker#cleanupThreadExecutor* has already been 
shut down, so any subsequent submission to the ThreadPoolExecutor results in 
*java.util.concurrent.RejectedExecutionException*.

Attaching a debug snapshot of the same. I would like to work on this; please 
suggest.

> Flaky test: BroadcastSuite
> --
>
> Key: SPARK-26152
> URL: https://issues.apache.org/jira/browse/SPARK-26152
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Dongjoon Hyun
>Priority: Critical
>
> - 
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5627
>  (2018-11-16)
> {code}
> BroadcastSuite:
> - Using TorrentBroadcast locally
> - Accessing TorrentBroadcast variables from multiple threads
> - Accessing TorrentBroadcast variables in a local cluster (encryption = off)
> java.util.concurrent.RejectedExecutionException: Task 
> scala.concurrent.impl.CallbackRunnable@59428a1 rejected from 
> java.util.concurrent.ThreadPoolExecutor@4096a677[Shutting down, pool size = 
> 1, active threads = 1, queued tasks = 0, completed tasks = 0]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
>   at 
> java.util.concurrent.Executors$DelegatedExecutorService.execute(Executors.java:668)
>   at 
> scala.concurrent.impl.ExecutionContextImpl$$anon$1.execute(ExecutionContextImpl.scala:134)
>   at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:284)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:284)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:284)
>   at scala.concurrent.Promise.complete(Promise.scala:49)
>   at scala.concurrent.Promise.complete$(Promise.scala:48)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:183)
>   at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:29)
>   at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
>   at 
> scala.concurrent.BatchingExecutor$Batch.processBatch$1(BatchingExecutor.scala:63)
>   at 
> scala.concurrent.BatchingExecutor$Batch.$anonfun$run$1(BatchingExecutor.scala:78)
>   at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
>   at 
> scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:81)
>   at 
> scala.concurrent.BatchingExecutor$Batch.run(BatchingExecutor.scala:55)
>   at 
> scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:870)
>   at scala.concurrent.BatchingExecutor.execute(BatchingExecutor.scala:106)
>   at 
> scala.concurrent.BatchingExecutor.execute$(BatchingExecutor.scala:103)
>   at 
> scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:868)
>   at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:284)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:284)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:284)
>   at scala.concurrent.Promise.complete(Promise.scala:49)
>   at scala.concurrent.Promise.complete$(Promise.scala:48)
>   at 
> scala.concurrent.impl.Promise$DefaultPromis

[jira] [Comment Edited] (SPARK-26152) Flaky test: BroadcastSuite

2019-03-11 Thread Ajith S (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16789462#comment-16789462
 ] 

Ajith S edited comment on SPARK-26152 at 3/11/19 12:07 PM:
---

I encountered this issue and see that it is a race between the 
*org.apache.spark.deploy.DeployMessages.WorkDirCleanup* event and 
*org.apache.spark.deploy.worker.Worker#onStop*. It is possible that while the 
WorkDirCleanup event is being processed, 
*org.apache.spark.deploy.worker.Worker#cleanupThreadExecutor* has already been 
shut down, so any subsequent submission to the ThreadPoolExecutor results in 
*java.util.concurrent.RejectedExecutionException*.

Attaching a debug snapshot of the same. I would like to work on this; please 
suggest.


was (Author: ajithshetty):
I encountered this issue and see that it is a race between the 
``org.apache.spark.deploy.DeployMessages.WorkDirCleanup`` event and the onStop 
call of org.apache.spark.deploy.worker.Worker#onStop. It is possible that while 
the WorkDirCleanup event is being processed, 
org.apache.spark.deploy.worker.Worker#cleanupThreadExecutor has already been 
shut down.

Attaching a debug snapshot of the same. I would like to work on this; please 
suggest.

> Flaky test: BroadcastSuite
> --
>
> Key: SPARK-26152
> URL: https://issues.apache.org/jira/browse/SPARK-26152
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.0.0
>Reporter: Dongjoon Hyun
>Priority: Critical
>
> - 
> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/5627
>  (2018-11-16)
> {code}
> BroadcastSuite:
> - Using TorrentBroadcast locally
> - Accessing TorrentBroadcast variables from multiple threads
> - Accessing TorrentBroadcast variables in a local cluster (encryption = off)
> java.util.concurrent.RejectedExecutionException: Task 
> scala.concurrent.impl.CallbackRunnable@59428a1 rejected from 
> java.util.concurrent.ThreadPoolExecutor@4096a677[Shutting down, pool size = 
> 1, active threads = 1, queued tasks = 0, completed tasks = 0]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
>   at 
> java.util.concurrent.Executors$DelegatedExecutorService.execute(Executors.java:668)
>   at 
> scala.concurrent.impl.ExecutionContextImpl$$anon$1.execute(ExecutionContextImpl.scala:134)
>   at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:284)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:284)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:284)
>   at scala.concurrent.Promise.complete(Promise.scala:49)
>   at scala.concurrent.Promise.complete$(Promise.scala:48)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:183)
>   at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:29)
>   at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
>   at 
> scala.concurrent.BatchingExecutor$Batch.processBatch$1(BatchingExecutor.scala:63)
>   at 
> scala.concurrent.BatchingExecutor$Batch.$anonfun$run$1(BatchingExecutor.scala:78)
>   at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
>   at 
> scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:81)
>   at 
> scala.concurrent.BatchingExecutor$Batch.run(BatchingExecutor.scala:55)
>   at 
> scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:870)
>   at scala.concurrent.BatchingExecutor.execute(BatchingExecutor.scala:106)
>   at 
> scala.concurrent.BatchingExecutor.execute$(BatchingExecutor.scala:103)
>   at 
> scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:868)
>   at 
> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:284)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:284)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:284)
>   at scala.concurrent.Promise.complete(Promise.scala:49)
>   at scala.concurrent.Promise.complete$(Promise.scala:48)
>   at 
> scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:183)
>   at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:29)
>   at scala
