[ 
https://issues.apache.org/jira/browse/TOREE-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990490#comment-15990490
 ] 

Ryan Blue commented on TOREE-390:
---------------------------------

[~luciano resende], we aren't having the weird issues with requests any more, 
since the commit you linked to and 
[8f1775d|https://github.com/apache/incubator-toree/commit/8f1775d9dcdc56dd4f3a7c44abf0c36ae0417444].

The problem this fixes is that the initial wait combines startup time for both 
Toree and Spark. This change separates the two waits, making it clear which one 
is due to Spark starting up and getting resources from YARN (in our deployment). 
This is better for the user experience because people no longer think Toree is 
slow; they correctly attribute the wait to Spark. The secondary benefit is that 
users can customize their Spark session before it starts by creating one in the 
first cell, and Toree works just fine with that.
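
As a rough illustration of the idea (not Toree's actual implementation), lazy 
startup can be sketched as a holder that builds the session on first use and 
lets the user install a customized one beforehand. The class name LazySession 
and its methods are hypothetical, and the factory stands in for whatever call 
would create the real Spark session.

```java
import java.util.function.Supplier;

/** Hypothetical holder: builds its value on first access, not at kernel startup. */
final class LazySession<T> {
    private final Supplier<T> factory; // assumed stand-in for the default session builder
    private T session;                 // stays null until first use or an explicit set

    LazySession(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the session, invoking the factory on the first call only. */
    synchronized T get() {
        if (session == null) {
            session = factory.get();
        }
        return session;
    }

    /** Installs a user-built session if none exists yet; returns false otherwise. */
    synchronized boolean setIfAbsent(T custom) {
        if (session != null) {
            return false;
        }
        session = custom;
        return true;
    }

    /** Reports whether a session has been created or installed. */
    synchronized boolean isStarted() {
        return session != null;
    }
}
```

With this shape, constructing the holder costs nothing, so the slow step happens 
only when a cell first asks for the session, and a session installed from the 
first cell takes precedence over the default.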

> Lazily start Spark sessions
> ---------------------------
>
>                 Key: TOREE-390
>                 URL: https://issues.apache.org/jira/browse/TOREE-390
>             Project: TOREE
>          Issue Type: Improvement
>            Reporter: Ryan Blue
>
> In our deployment, more than half of the startup time for a Toree notebook is 
> taken by starting a Spark session and waiting for containers. Lazily starting 
> Spark sessions helps the notebook environment feel faster, even if the user 
> still has to wait for Spark to start up, because that wait is clearly 
> attributable to Spark, not Toree, and is initiated by the user.
> Also, lazily starting a Spark session allows users to change settings that 
> can't be changed in a Spark context. It also enables the same startup code 
> that would be used in a spark-submit application:
> {code:lang=java}
> SparkSession.builder
>     .config(...)
>     .appName(...)
>     .getOrCreate()
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
