[jira] [Commented] (SPARK-23894) Flaky Test: BucketedWriteWithoutHiveSupportSuite

2018-05-08 Thread Imran Rashid (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467570#comment-16467570
 ] 

Imran Rashid commented on SPARK-23894:
--

After discussion in related PRs, SPARK-22938 should cover the main problem, and 
the PR for that will include the appropriate defensive checks preventing this 
in the future.

> Flaky Test:  BucketedWriteWithoutHiveSupportSuite
> -
>
>              Key: SPARK-23894
>              URL: https://issues.apache.org/jira/browse/SPARK-23894
>          Project: Spark
>       Issue Type: Improvement
>       Components: SQL
> Affects Versions: 2.4.0
>         Reporter: Imran Rashid
>         Priority: Minor
>      Attachments: unit-tests.log
>
>
> Flaky test observed here: 
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/88991/
> I'll attach a snippet of the unit-tests log for this suite and the 
> preceding one.  Here's a snippet of the exception.
> {noformat}
> 08:36:34.694 Executor task launch worker for task 436 ERROR Executor: Exception in task 0.0 in stage 402.0 (TID 436)
> java.lang.IllegalStateException: LiveListenerBus is stopped.
>     at org.apache.spark.scheduler.LiveListenerBus.addToQueue(LiveListenerBus.scala:97)
>     at org.apache.spark.scheduler.LiveListenerBus.addToStatusQueue(LiveListenerBus.scala:80)
>     at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:93)
>     at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:117)
>     at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:117)
>     at scala.Option.getOrElse(Option.scala:121)
>     at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:117)
>     at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:116)
>     at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:286)
>     at org.apache.spark.sql.test.TestSparkSession.sessionState$lzycompute(TestSQLContext.scala:42)
>     at org.apache.spark.sql.test.TestSparkSession.sessionState(TestSQLContext.scala:41)
>     at org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$1.apply(SparkSession.scala:92)
>     at org.apache.spark.sql.SparkSession$$anonfun$1$$anonfun$apply$1.apply(SparkSession.scala:92)
>     at scala.Option.map(Option.scala:146)
>     at org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:92)
>     at org.apache.spark.sql.SparkSession$$anonfun$1.apply(SparkSession.scala:91)
>     at org.apache.spark.sql.internal.SQLConf$.get(SQLConf.scala:110)
>     at org.apache.spark.sql.types.DataType.sameType(DataType.scala:84)
>     at org.apache.spark.sql.catalyst.analysis.TypeCoercion$$anonfun$1.apply(TypeCoercion.scala:105)
>     at org.apache.spark.sql.catalyst.analysis.TypeCoercion$$anonfun$1.apply(TypeCoercion.scala:86)
> {noformat}
> I doubt this is actually because of BucketedWriteWithoutHiveSupportSuite.  I 
> think it has more to do with {{SparkSession}}'s lazy evaluation of 
> {{SharedState}} interacting badly with the way we set up the test Spark 
> context, etc., though I don't really understand it yet.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23894) Flaky Test: BucketedWriteWithoutHiveSupportSuite

2018-04-27 Thread Imran Rashid (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457003#comment-16457003
 ] 

Imran Rashid commented on SPARK-23894:
--

I believe this issue has existed since SPARK-10810 
(https://github.com/apache/spark/commit/3390b400d04e40f767d8a51f1078fcccb4e64abd), 
though originally it was the SQLContext that was stored in the 
InheritableThreadLocal.




[jira] [Commented] (SPARK-23894) Flaky Test: BucketedWriteWithoutHiveSupportSuite

2018-04-27 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456984#comment-16456984
 ] 

Apache Spark commented on SPARK-23894:
--

User 'squito' has created a pull request for this issue:
https://github.com/apache/spark/pull/21185




[jira] [Commented] (SPARK-23894) Flaky Test: BucketedWriteWithoutHiveSupportSuite

2018-04-27 Thread Imran Rashid (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456969#comment-16456969
 ] 

Imran Rashid commented on SPARK-23894:
--

I think I understand what is happening here, but I don't know how to fix it.

Normally, there is no active Spark session for the executor threads.  I added 
some debugging code at the point where an executor might call {{SQLConf.get}} 
to show the active session, and in my test runs there isn't one:

{noformat}
12:49:35.801 dispatcher-event-loop-0 INFO Executor: Creating task runner thread with activeSession = None
...
getting conf, activeSession = None in Executor task launch worker for task 24
java.lang.Exception: getting conf in thread Executor task launch worker for task 23
    at org.apache.spark.sql.catalyst.plans.QueryPlan.conf(QueryPlan.scala:35)
    at org.apache.spark.sql.execution.columnar.InMemoryTableScanExec.org$apache$spark$sql$execution$columnar$InMemoryTableScanExec$$createAndDecompressColumn(InMemoryTableScanExec.scala:84)
...
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
{noformat}

So why is it sometimes defined?  Note that activeSession is an *Inheritable* 
thread local.  Normally the executor threads are created before activeSession 
is defined, so they don't inherit anything.  But a thread pool is free to 
create more threads at any time.  When it does, the new executor threads 
inherit the active session from their parent, a thread in the driver that has 
activeSession defined.

I'll submit a PR to defensively always clear the active session in the executor 
thread.
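
The inheritance described above can be reproduced outside Spark.  What follows 
is a minimal sketch in plain Java, not Spark's actual code: 
{{ACTIVE_SESSION}} is a hypothetical stand-in for the active-session holder, 
the two-thread pool stands in for the executor's task thread pool, and the 
class and method names are made up for illustration.  The first worker thread 
is created before the value is set; the second is created afterwards by the 
submitting thread.  Passing {{true}} makes each task clear the thread local 
before reading it, which is the idea behind the defensive fix.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustration only: all names here are hypothetical, none come from Spark.
class InheritedSessionDemo {
    // Stand-in for the InheritableThreadLocal that holds the active session.
    static final InheritableThreadLocal<String> ACTIVE_SESSION =
        new InheritableThreadLocal<>();

    // Returns what two pool threads see: one created before the session is
    // set, one created after. If clearInWorker is true, each task clears the
    // thread local before reading it.
    static String[] run(boolean clearInWorker) {
        ACTIVE_SESSION.remove();
        // Fixed pool of 2: worker threads are created lazily, one per
        // submission, by the submitting thread, until the pool holds 2.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<String> readSession = () -> {
                if (clearInWorker) {
                    ACTIVE_SESSION.remove();
                }
                String session = ACTIVE_SESSION.get();
                return session == null ? "None" : session;
            };
            // Worker 1 is created here, before any session exists.
            String early = pool.submit(readSession).get();
            ACTIVE_SESSION.set("driver-session");
            // Worker 2 is created here and copies the submitter's value.
            String late = pool.submit(readSession).get();
            return new String[] {early, late};
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
            ACTIVE_SESSION.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", run(false)));
        System.out.println(String.join(", ", run(true)));
    }
}
```

With the standard {{ThreadPoolExecutor}} behaviour, {{run(false)}} should 
report None for the first worker and driver-session for the second, while 
{{run(true)}} should report None for both.  Clearing inside the task rather 
than at pool creation matters because the pool, not the caller, decides when a 
new thread is created.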


[jira] [Commented] (SPARK-23894) Flaky Test: BucketedWriteWithoutHiveSupportSuite

2018-04-27 Thread Imran Rashid (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456919#comment-16456919
 ] 

Imran Rashid commented on SPARK-23894:
--

One thing I've noticed from looking at more instances of this: normally we 
don't see any log lines from {{SharedState}} on the executor threads.  
A typical run logs this:

{noformat}
09:37:38.203 pool-1-thread-1-ScalaTest-running-ParquetQuerySuite INFO SharedState: Warehouse path is 'file:/Users/irashid/github/pub/spark/sql/core/spark-warehouse/'.
{noformat}

but in failures, we see

{noformat}
23:37:56.728 Executor task launch worker for task 48 INFO SharedState: Warehouse path is 'file:/home/jenkins/workspace/spark-branch-2.3-test-sbt-hadoop-2.6/sql/core/spark-warehouse'.
{noformat}

(note the thread name).  I don't understand why this happens yet, nor can I 
reproduce it locally.
