[jira] [Commented] (SPARK-4514) SparkContext localProperties does not inherit property updates across thread reuse

2015-11-22 Thread Richard W. Eggert II (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021185#comment-15021185
 ] 

Richard W. Eggert II commented on SPARK-4514:
-

The unit test attached to this issue fails in master, but passes in 
https://github.com/apache/spark/pull/9264

> SparkContext localProperties does not inherit property updates across thread reuse
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-4514
>                 URL: https://issues.apache.org/jira/browse/SPARK-4514
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.1.0, 1.1.1, 1.2.0
>            Reporter: Erik Erlandson
>            Assignee: Josh Rosen
>            Priority: Critical
>
> The current job group id of a Spark context is stored in the 
> {{localProperties}} member value.   This data structure is designed to be 
> thread local, and its settings are not preserved when {{ComplexFutureAction}} 
> instantiates a new {{Future}}.  
> One consequence of this is that {{takeAsync()}} does not behave in the same 
> way as other async actions, e.g. {{countAsync()}}.  For example, this test 
> (if copied into StatusTrackerSuite.scala), will fail, because 
> {{"my-job-group2"}} is not propagated to the Future which actually 
> instantiates the job:
> {code:java}
>   test("getJobIdsForGroup() with takeAsync()") {
>     sc = new SparkContext("local", "test", new SparkConf(false))
>     sc.setJobGroup("my-job-group2", "description")
>     sc.statusTracker.getJobIdsForGroup("my-job-group2") should be (Seq.empty)
>     val firstJobFuture = sc.parallelize(1 to 1000, 1).takeAsync(1)
>     val firstJobId = eventually(timeout(10 seconds)) {
>       firstJobFuture.jobIds.head
>     }
>     eventually(timeout(10 seconds)) {
>       sc.statusTracker.getJobIdsForGroup("my-job-group2") should be (Seq(firstJobId))
>     }
>   }
> {code}
> It also impacts current PR for SPARK-1021, which involves additional uses of 
> {{ComplexFutureAction}}.
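
For readers unfamiliar with the failure mode described above, here is a minimal sketch of it outside Spark. It assumes the properties live in an inheritable thread-local, which is what the inheritance discussion in this thread suggests; {{ThreadReuseDemo}}, {{jobGroup}} and {{observe}} are hypothetical names, not Spark APIs. A pooled worker inherits the submitter's value only when the worker thread is created, so a later update on the submitting thread is not seen once the worker is reused.

```scala
import java.util.concurrent.{Callable, Executors}

object ThreadReuseDemo {
  // Hypothetical stand-in for the thread-local job properties:
  // a single inheritable "job group" string.
  val jobGroup: InheritableThreadLocal[String] = new InheritableThreadLocal[String] {
    override protected def initialValue(): String = "none"
  }

  // Returns the job group seen by a pooled worker before and after the
  // submitting thread changes its own value.
  def observe(): (String, String) = {
    val pool = Executors.newFixedThreadPool(1)
    try {
      val read = new Callable[String] { def call(): String = jobGroup.get() }
      jobGroup.set("group-1")
      // The single worker thread is created by this submit and inherits "group-1".
      val first = pool.submit(read).get()
      jobGroup.set("group-2")
      // The same worker is reused; inheritance does not happen again.
      val second = pool.submit(read).get()
      (first, second)
    } finally {
      pool.shutdown()
      jobGroup.remove()
    }
  }
}
```

Under these assumptions, {{ThreadReuseDemo.observe()}} yields the stale value for the second task rather than "group-2".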



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-4514) SparkContext localProperties does not inherit property updates across thread reuse

2015-11-22 Thread Richard W. Eggert II (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021187#comment-15021187
 ] 

Richard W. Eggert II commented on SPARK-4514:
-

This test, however, still fails:

{code}
  test("getJobIdsForGroup() with takeAsync() across multiple partitions") {
    sc = new SparkContext("local", "test", new SparkConf(false))
    sc.setJobGroup("my-job-group2", "description")
    sc.statusTracker.getJobIdsForGroup("my-job-group2") shouldBe empty
    val firstJobFuture = sc.parallelize(1 to 1000, 2).takeAsync(999)
    val firstJobId = eventually(timeout(10 seconds)) {
      firstJobFuture.jobIds.head
    }
    eventually(timeout(10 seconds)) {
      sc.statusTracker.getJobIdsForGroup("my-job-group2") should have size 2
    }
  }
{code}




[jira] [Commented] (SPARK-4514) SparkContext localProperties does not inherit property updates across thread reuse

2015-11-22 Thread Richard W. Eggert II (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021194#comment-15021194
 ] 

Richard W. Eggert II commented on SPARK-4514:
-

I implemented a two-line fix that causes this test to now pass in that PR.




[jira] [Commented] (SPARK-4514) SparkContext localProperties does not inherit property updates across thread reuse

2015-11-22 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021199#comment-15021199
 ] 

Apache Spark commented on SPARK-4514:
-

User 'reggert' has created a pull request for this issue:
https://github.com/apache/spark/pull/9264




[jira] [Commented] (SPARK-4514) SparkContext localProperties does not inherit property updates across thread reuse

2015-04-24 Thread Ilya Ganelin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14511537#comment-14511537
 ] 

Ilya Ganelin commented on SPARK-4514:
-

[~joshrosen], given your work on SPARK-6629, is this still relevant? I saw a 
comment there stating that that issue may not be a problem. I can knock this 
one out if it's still necessary.




[jira] [Commented] (SPARK-4514) SparkContext localProperties does not inherit property updates across thread reuse

2015-03-31 Thread Josh Rosen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388095#comment-14388095
 ] 

Josh Rosen commented on SPARK-4514:
---

I don't know that there's a good way to fix this for all of the arbitrary ways 
in which users might create or re-use threads.  This inheritance behavior is 
slightly more understandable in cases where users explicitly create child 
threads.  Although our documentation doesn't seem to explicitly promise that 
properties will be inherited, I think that users might have come to rely on 
this behavior so I don't think that we can remove it at this point.  We can 
certainly fix it for the AsyncRDDActions case, though, because we can manually 
thread the properties in the constructor.
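
As a rough illustration of what manually threading the properties through a constructor could look like, here is a sketch outside Spark. It is not the actual fix: {{CapturedPropsDemo}}, {{WithCapturedGroup}} and {{jobGroup}} are hypothetical names, and a single string stands in for the properties. The wrapper reads the caller's value when it is constructed on the submitting thread and installs it on whichever thread later runs the body.

```scala
import java.util.concurrent.{Callable, Executors}

object CapturedPropsDemo {
  // Hypothetical stand-in for the thread-local job properties.
  val jobGroup: InheritableThreadLocal[String] = new InheritableThreadLocal[String] {
    override protected def initialValue(): String = "none"
  }

  // Captures the caller's value at construction time and installs it on
  // the executing thread for the duration of the body only.
  final class WithCapturedGroup[T](body: () => T) extends Callable[T] {
    private val captured: String = jobGroup.get()
    override def call(): T = {
      val previous = jobGroup.get()
      jobGroup.set(captured)
      try body() finally jobGroup.set(previous)
    }
  }

  // Returns the job group seen by a reused pooled worker for two tasks
  // submitted before and after the caller updates its own value.
  def observe(): (String, String) = {
    val pool = Executors.newFixedThreadPool(1)
    try {
      jobGroup.set("group-1")
      val first = pool.submit(new WithCapturedGroup(() => jobGroup.get())).get()
      jobGroup.set("group-2")
      val second = pool.submit(new WithCapturedGroup(() => jobGroup.get())).get()
      (first, second)
    } finally {
      pool.shutdown()
      jobGroup.remove()
    }
  }
}
```

In this sketch the second task sees "group-2" even though the worker thread is reused, because the value travels with the task rather than with the thread.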

This pain could probably have been avoided if the original design used 
something like Scala's {{DynamicVariable}} where you're forced to explicitly 
consider the scope / lifecycle of the thread-local property.
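
To make the scoping point concrete, here is a small sketch using {{scala.util.DynamicVariable}}; {{ScopedGroupDemo}} and {{jobGroup}} are hypothetical names, not Spark code. The value is only visible inside the {{withValue}} block and reverts afterwards, so the lifetime of a setting is explicit at the call site. Note that {{DynamicVariable}} is itself backed by an inheritable thread-local, so on its own it would not carry values into reused pool threads.

```scala
import scala.util.DynamicVariable

object ScopedGroupDemo {
  // Hypothetical scoped job group; "none" is the default outside any scope.
  val jobGroup = new DynamicVariable[String]("none")

  def currentGroup(): String = jobGroup.value

  // Reads the value before, inside, and after an explicit scope.
  def observe(): (String, String, String) = {
    val before = currentGroup()
    val inside = jobGroup.withValue("group-1") { currentGroup() }
    val after = currentGroup()
    (before, inside, after)
  }
}
```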
 
I'm going to try to fix this for the AsyncRDDActions case and will try to 
improve the documentation to warn about this pitfall for the more general cases 
involving arbitrary user code.  Let me know if you can spot another solution 
which won't break existing user code that relies on property inheritance in the 
non-thread-reuse cases.




[jira] [Commented] (SPARK-4514) SparkContext localProperties does not inherit property updates across thread reuse

2015-03-31 Thread Josh Rosen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388231#comment-14388231
 ] 

Josh Rosen commented on SPARK-4514:
---

I've filed SPARK-6629 to fix a related issue where inherited job groups did not 
play nicely with cancellation.
