Thanks Steve - we are already taking the safe route, putting the NN and datanodes on the central mesos-masters, which are on-demand instances. Later (much later!) we _may_ put some datanodes on spot instances (and use several spot instance types, as the price spikes seem to affect only one type - worst case we can rebuild the data as well). OTOH this would mainly be beneficial only if Spark/Mesos understood the data locality, which is probably some time off (we don't need this ability now).

Indeed, the error we are seeing is orthogonal to the setup - however, my understanding of HA HDFS is that the nameservice should be resolved via the hdfs-site.xml file and doesn't use DNS at all (and indeed, it _does_ work - but only after we initialise the driver with a bad HDFS URL). I therefore think there's some (missing) HDFS initialisation when running Spark on Mesos - my suspicion is on the Spark side (or in my Spark config).

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html#Configuration_details
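For reference, a minimal sketch of the client-side HA settings from that page, as they'd appear in hdfs-site.xml (the NameNode hostnames are placeholders; "nameservice1" matches the nameservice we use):

```xml
<!-- Client-side HA configuration sketch; hostnames are placeholders -->
<property>
  <name>dfs.nameservices</name>
  <value>nameservice1</value>
</property>
<property>
  <name>dfs.ha.namenodes.nameservice1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.nameservice1.nn1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.nameservice1.nn2</name>
  <value>namenode2.example.com:8020</value>
</property>
<!-- How HDFS clients pick the active NN for "nameservice1" -->
<property>
  <name>dfs.client.failover.proxy.provider.nameservice1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

With this in place, clients can resolve hdfs://nameservice1/... purely from configuration, without any DNS entry for "nameservice1" - which is why a DNS explanation doesn't quite fit here.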

On 15/09/2015 10:24, Steve Loughran wrote:
On 15 Sep 2015, at 08:55, Adrian Bridgett <adr...@opensignal.com> wrote:

Hi Sam, in short, no, it's a traditional install as we plan to use spot 
instances and didn't want price spikes to kill off HDFS.

We're actually doing a bit of a hybrid, using spot instances for the mesos
slaves and on-demand for the mesos masters.  So for the time being we're putting HDFS on
the masters (we'll probably move to multiple slave instance types to avoid
losing too many when the spot price spikes, but for now this is acceptable).
Masters are running CDH5.
It's incredibly dangerous to run HDFS NNs on spot VMs; a significant enough
spike will lose all of them in one go, and there goes your entire filesystem.
Have a static VM, maybe even backed by EBS.

If you look at Hadoop architectures from Hortonworks, Cloudera and Amazon
themselves, the usual stance is HDFS on static nodes, with spot instances for
compute only.

Using hdfs://current-hdfs-master:8020 works fine; however, using
hdfs://nameservice1 fails in the rather odd way described (well, more that the
workaround actually works!).  I think there's some underlying bug here that's
being exposed.

this sounds like an issue orthogonal to spot instances. Maybe related to how JVMs
cache DNS entries forever?
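If DNS caching were the culprit, the JVM-wide positive-lookup cache TTL can be capped via the standard "networkaddress.cache.ttl" security property; a minimal sketch (the 60-second value is illustrative, not a recommendation):

```java
import java.security.Security;

public class DnsCacheTtl {
    public static void main(String[] args) {
        // The JVM may cache successful DNS lookups indefinitely (notably
        // when a security manager is set). Capping the TTL lets clients
        // pick up re-resolved addresses, e.g. after a spot node is replaced.
        // Must run before the first lookup to take effect.
        Security.setProperty("networkaddress.cache.ttl", "60");
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
    }
}
```

This would only matter for hostnames that are actually looked up, though - which is why the hdfs-site.xml-driven nameservice resolution above doesn't obviously fit a DNS-caching explanation.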

--
*Adrian Bridgett* | Sysadmin Engineer, OpenSignal <http://www.opensignal.com>
_____________________________________________________
Office: First Floor, Scriptor Court, 155-157 Farringdon Road, Clerkenwell, London, EC1R 3AD
Phone #: +44 777-377-8251
Skype: abridgett |@adrianbridgett <http://twitter.com/adrianbridgett>| LinkedIn link <https://uk.linkedin.com/in/abridgett>
_____________________________________________________
