[jira] [Updated] (SPARK-7233) ClosureCleaner#clean blocks concurrent job submitter threads

2015-04-29 Thread Oleksii Kostyliev (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-7233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oleksii Kostyliev updated SPARK-7233:
-
Attachment: blocked_threads_closurecleaner.png

> ClosureCleaner#clean blocks concurrent job submitter threads
> 
>
> Key: SPARK-7233
> URL: https://issues.apache.org/jira/browse/SPARK-7233
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.3.1, 1.4.0
>Reporter: Oleksii Kostyliev
> Attachments: blocked_threads_closurecleaner.png
>
>
> The {{org.apache.spark.util.ClosureCleaner#clean}} method contains logic to
> determine whether Spark is running in interpreter mode:
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala#L120
> While this check is valuable in certain situations, it also causes concurrent
> job submitter threads to block on a native call to
> {{java.lang.Class#forName0}}, since it appears only one thread at a time can
> make that call.
> This becomes a major issue when multiple threads concurrently submit
> short-lived jobs. This is one of the patterns in which we use Spark in
> production, and the number of parallel requests is expected to be quite high,
> up to a couple of thousand at a time.
> A typical stacktrace of a blocked thread looks like:
> {code}
> http-bio-8091-exec-14 [BLOCKED] [DAEMON]
> java.lang.Class.forName0(String, boolean, ClassLoader, Class) Class.java (native)
> java.lang.Class.forName(String) Class.java:260
> org.apache.spark.util.ClosureCleaner$.clean(Object, boolean) ClosureCleaner.scala:122
> org.apache.spark.SparkContext.clean(Object, boolean) SparkContext.scala:1623
> org.apache.spark.rdd.RDD.reduce(Function2) RDD.scala:883
> org.apache.spark.rdd.RDD.takeOrdered(int, Ordering) RDD.scala:1240
> org.apache.spark.api.java.JavaRDDLike$class.takeOrdered(JavaRDDLike, int, Comparator) JavaRDDLike.scala:586
> org.apache.spark.api.java.AbstractJavaRDDLike.takeOrdered(int, Comparator) JavaRDDLike.scala:46
> ...
> {code}
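Since interpreter mode is a one-time, deterministic property of the running JVM, one way to avoid the contended {{Class#forName0}} call on every submission is to probe once and cache the result. A minimal sketch in Java (the REPL marker class name here is an assumption, not necessarily the class Spark actually probes for):

```java
// Sketch (not Spark's actual implementation): cache the one-time
// interpreter-mode probe so only the first caller pays for Class.forName;
// subsequent callers read a plain static field instead of contending on
// the native forName0 call.
public final class InterpreterCheck {

    // Initialization-on-demand holder: the JVM runs detect() exactly once,
    // under the class-initialization lock, the first time Holder is touched.
    private static final class Holder {
        static final boolean IN_INTERPRETER = detect();
    }

    private static boolean detect() {
        try {
            // Hypothetical REPL marker class; which class to probe for
            // is an assumption in this sketch.
            Class.forName("org.apache.spark.repl.Main");
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static boolean isInInterpreter() {
        return Holder.IN_INTERPRETER;
    }
}
```

After the first call, no class loading happens at all, so concurrent job submitter threads no longer serialize on the native call.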



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-7233) ClosureCleaner#clean blocks concurrent job submitter threads

2015-04-29 Thread Patrick Wendell (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-7233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Wendell updated SPARK-7233:
---
Priority: Critical  (was: Major)




[jira] [Updated] (SPARK-7233) ClosureCleaner#clean blocks concurrent job submitter threads

2015-04-29 Thread Patrick Wendell (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-7233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Wendell updated SPARK-7233:
---
Target Version/s: 1.4.0




[jira] [Updated] (SPARK-7233) ClosureCleaner#clean blocks concurrent job submitter threads

2015-05-15 Thread Andrew Or (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-7233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Or updated SPARK-7233:
-
Fix Version/s: 1.4.0
