[ 
https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203786#comment-14203786
 ] 

Nicholas Chammas commented on SPARK-3821:
-----------------------------------------

Thanks for the feedback [~shivaram].

{quote}
1. My preference would be to just have a single AMI across Spark versions for a 
couple of reasons. 
{quote}

I agree. Maintaining images for specific versions of Spark is worth it only if 
you're really crazy about getting the lowest cluster launch times possible. 
Well, that was my [original motivation | 
http://apache-spark-developers-list.1001551.n3.nabble.com/EC2-clusters-ready-in-launch-time-30-seconds-td7262.html]
 for doing this work, but ultimately I agree the complexity is not worth it at 
the moment. I'll take this out unless someone wants to advocate for leaving it 
in.

{quote}
2. Could you clarify whether Hadoop is pre-installed in the new AMIs or is 
still installed on startup?
{quote}

Currently, I have it set to install Hadoop 2 on the AMIs with Spark 
pre-installed. Again, the goal was the lowest possible launch time, but if 
we'd like to do away with the Spark-pre-installed AMIs then this is not an 
issue.

{quote}
Are the init scripts run during AMI creation or during startup ?
{quote}

For the AMIs with Spark pre-installed, they are run during AMI creation. That's 
why the [init runtimes in the second benchmark | 
https://github.com/nchammas/spark-ec2/blob/214d5e4cac392a0eac21f949fe25c0075044411f/packer/proposal.md#new-amis---latest-os-updates-and-spark-110-pre-installed-single-run]
 are all 0 ms; each init script sees that its component is already installed 
and just exits.
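
To illustrate the "already installed, so just exit" behavior, here is a minimal sketch in Python. It is not the actual spark-ec2 init script; the names {{init_component}} and {{fake_install}} and the directory-existence check are assumptions made up for this example.

```python
import os
import tempfile
import time


def init_component(install_dir, install_fn):
    """Run install_fn only if install_dir does not already exist.

    Hypothetical helper: returns (status, elapsed_ms).
    """
    start = time.monotonic()
    if os.path.isdir(install_dir):
        # Component was baked into the image; nothing to do.
        status = "skipped"
    else:
        install_fn(install_dir)
        status = "installed"
    elapsed_ms = int((time.monotonic() - start) * 1000)
    return status, elapsed_ms


def fake_install(install_dir):
    # Stand-in for downloading and unpacking a real package.
    os.makedirs(install_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "spark")
        print(init_component(target, fake_install))  # first run installs
        print(init_component(target, fake_install))  # second run skips
```

The first call stands in for AMI creation and the second for cluster startup, where the check short-circuits and the measured init time is close to zero.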

{quote}
3. Do you have some benchmarks for the new AMI without Spark 1.1.0 
pre-installed ?
{quote}

Nope, but I can run one and get back to you on Monday or Tuesday with those 
numbers.

> Develop an automated way of creating Spark images (AMI, Docker, and others)
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-3821
>                 URL: https://issues.apache.org/jira/browse/SPARK-3821
>             Project: Spark
>          Issue Type: Improvement
>          Components: Build, EC2
>            Reporter: Nicholas Chammas
>            Assignee: Nicholas Chammas
>         Attachments: packer-proposal.html
>
>
> Right now the creation of Spark AMIs or Docker containers is done manually. 
> With tools like [Packer|http://www.packer.io/], we should be able to automate 
> this work, and do so in such a way that multiple types of machine images can 
> be created from a single template.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
