@Daniel, there are at least 3 things that EMR cannot solve yet:
- HA support
- AWS provides an auto scaling feature, but scaling EMR up or down needs manual
operations
- security concerns in a public VPC
EMR is basically designed for short-term use cases with some
pre-defined bootstrap actions
EMR does cost more than vanilla EC2. Using spark-ec2 can result in savings
with large clusters, though that is not everybody's cup of tea.
Regards
Sab
On 19-Feb-2016 7:55 pm, "Daniel Siegmann"
wrote:
> With EMR supporting Spark, I don't see much reason to use the
The docs mention spark-ec2 because it is part of the Spark project. There
are many, many alternatives to spark-ec2 out there like EMR, but it's
probably not the place of the official docs to promote any one of those
third-party solutions.
On Fri, Feb 19, 2016 at 11:05 AM James Hammerton
Hi,
Having looked at how easy it is to use EMR, I reckon you may be right,
especially if using Java 8 is no more difficult with that than with
spark-ec2 (where I had to install it on the master and slaves and edit the
spark-env.sh).
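For reference, the spark-env.sh edit mentioned above might look roughly like the sketch below. The JDK path is a hypothetical example, not something stated in this thread; it assumes Java 8 has already been installed at that same location on the master and every slave.

```shell
# conf/spark-env.sh (sourced by Spark's launch scripts on each node)
# Hypothetical install location: adjust to wherever Java 8 lives on your nodes.
export JAVA_HOME=/usr/lib/jvm/java-1.8.0
# Put the Java 8 binaries ahead of any older JVM already on the PATH.
export PATH="$JAVA_HOME/bin:$PATH"
```

The same file would need to be present on all nodes, so that the workers pick up the same JVM as the master.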
I'm now curious as to why the Spark documentation (
With EMR supporting Spark, I don't see much reason to use the spark-ec2
script unless it is important for you to be able to launch clusters using
the bleeding edge version of Spark. EMR does seem to do a pretty decent job
of keeping up to date - the latest version (4.3.0) supports the latest
Spark
I have now... So far I think the issues I've had are not related to this,
but I wanted to be sure in case it should be something that needs to be
patched. I've had some jobs run successfully but this warning appears in
the logs.
Regards,
James
On 18 February 2016 at 12:23, Ted Yu
I'm fairly new to Spark.
The documentation suggests using the spark-ec2 script to launch clusters in
AWS, hence I used it.
Would EMR offer any advantage?
Regards,
James
On 18 February 2016 at 14:04, Gourav Sengupta
wrote:
> Hi,
>
> Just out of sheer curiosity why
Hi Ted/ Teng,
Just read the content in the email which is very different from what the
facts are:
Just want to add another point: spark-ec2 is nice to keep and improve
because it allows users to run any version of Spark (a nightly build, for
example). EMR does not allow you to do that without manual
Hi Teng,
Are you using a VPC in EMR? It seems quite curious that you can lock down
traffic at the gateway, subnet, and security group (using a private setting
with NAT) and still feel insecure. I would be really interested to know what
your feelings are based on. I bet Amazon guys will also find it very
Please see the last 3 posts on this thread:
http://search-hadoop.com/m/q3RTtTorTf2o3UGK1=Re+spark+ec2+vs+EMR
FYI
On Thu, Feb 18, 2016 at 6:25 AM, Teng Qiu wrote:
> EMR is great, but I'm curious how you are dealing with security settings
> with EMR, only whitelisting some
EMR is great, but I'm curious how you are dealing with security settings
with EMR. Only whitelisting some IP range with a security group setting is
really too weak.
Are there really many production systems using EMR? For me, I feel using
EMR means everyone in my IP range (for some ISP it may
Hi,
Just out of sheer curiosity, why are you not using EMR to start your Spark
cluster?
Regards,
Gourav
On Thu, Feb 18, 2016 at 12:23 PM, Ted Yu wrote:
> Have you seen this?
>
> HADOOP-10988
>
> Cheers
>
> On Thu, Feb 18, 2016 at 3:39 AM, James Hammerton
Have you seen this?
HADOOP-10988
Cheers
On Thu, Feb 18, 2016 at 3:39 AM, James Hammerton wrote:
> Hi,
>
> I am seeing warnings like this in the logs when I run Spark jobs:
>
> OpenJDK 64-Bit Server VM warning: You have loaded library
>
Hi,
I am seeing warnings like this in the logs when I run Spark jobs:
OpenJDK 64-Bit Server VM warning: You have loaded library
/root/ephemeral-hdfs/lib/native/libhadoop.so.1.0.0 which might have
disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you