[ https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14203786#comment-14203786 ]
Nicholas Chammas commented on SPARK-3821:
-----------------------------------------

Thanks for the feedback [~shivaram].

{quote}
1. My preference would be to just have a single AMI across Spark versions for a couple of reasons.
{quote}

I agree. Maintaining images for specific versions of Spark is worth it only if you're really crazy about getting the lowest cluster launch times possible. That was my [original motivation | http://apache-spark-developers-list.1001551.n3.nabble.com/EC2-clusters-ready-in-launch-time-30-seconds-td7262.html] for doing this work, but ultimately I agree the complexity is not worth it at the moment. I'll take this out unless someone wants to advocate for leaving it in.

{quote}
2. Could you clarify if Hadoop is pre-installed in new AMIs or is it still installed on startup?
{quote}

Currently, I have it set to install Hadoop 2 on the AMIs with Spark pre-installed. Again, this was done with the intention of aiming for the lowest launch time possible, but if we'd like to do away with the Spark-pre-installed AMIs then this is not an issue.

{quote}
Are the init scripts run during AMI creation or during startup?
{quote}

For the AMIs with Spark pre-installed, they are run during AMI creation. That's why the [init runtimes in the second benchmark | https://github.com/nchammas/spark-ec2/blob/214d5e4cac392a0eac21f949fe25c0075044411f/packer/proposal.md#new-amis---latest-os-updates-and-spark-110-pre-installed-single-run] are all 0 ms; at startup, each init script sees that its component is already installed and just exits.

{quote}
3. Do you have some benchmarks for the new AMI without Spark 1.1.0 pre-installed?
{quote}

Nope, but I can run one and get back to you on Monday or Tuesday with those numbers.
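
To illustrate the "already installed, so just exit" behaviour described above, here is a minimal sketch in Python. It is not the actual spark-ec2 init code; the function name `init_module`, the directory check, and the use of `os.makedirs` as a stand-in installer are all assumptions made for illustration.

```python
import os
import tempfile


def init_module(install_dir, install):
    """Run `install` only when `install_dir` is missing.

    Hypothetical helper: returns True if the install step ran,
    False if the component was already present and was skipped.
    """
    if os.path.isdir(install_dir):
        # Component was baked into the image; nothing to do at startup.
        return False
    install(install_dir)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "spark")
        # First call plays the role of AMI creation: the install runs.
        print(init_module(target, os.makedirs))
        # Second call plays the role of cluster startup: it is skipped.
        print(init_module(target, os.makedirs))
```

With a check like this, running the init step at image-creation time leaves almost nothing to do at cluster startup, which is consistent with the near-zero init runtimes in the benchmark.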
> Develop an automated way of creating Spark images (AMI, Docker, and others)
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-3821
>                 URL: https://issues.apache.org/jira/browse/SPARK-3821
>             Project: Spark
>          Issue Type: Improvement
>          Components: Build, EC2
>            Reporter: Nicholas Chammas
>            Assignee: Nicholas Chammas
>         Attachments: packer-proposal.html
>
>
> Right now the creation of Spark AMIs or Docker containers is done manually.
> With tools like [Packer|http://www.packer.io/], we should be able to automate
> this work, and do so in such a way that multiple types of machine images can
> be created from a single template.