Re: Avoiding using hostname for YARN nodemanagers

2017-12-05 Thread Alvaro Brandon
Thanks for your answer Vinay:

The thing is that I'm using Marathon and not the Docker engine per se. I
don't want to set a -h parameter to each instance that is launched, since
this is the responsibility of the container orchestrator platform. That's
why I need an option like the HDFS one.

Alvaro

2017-12-05 17:03 GMT+01:00 Vinayakumar B:

> Hi Alvaro,
>
> I think you can configure a custom hostname for Docker containers as
> well.
> The hostname should be provided during launch of the containers using the
> -h parameter.
>
> And with a user-created Docker network, DNS resolution of these hostnames
> among the containers is possible. Provide the --network-alias parameter to
> add a hostname for DNS.
>
> Check if that works for you.
> -Vinay
>
>
> On 5 Dec 2017 9:20 pm, "Alvaro Brandon" wrote:
>
> Hello:
>
> I'm using Docker images to build a YARN cluster. I have a problem when the
> node managers register with the resource manager.
>
> Since they are containers, they use the hash that the Docker engine assigns
> to them as the hostname.
>
> *17/12/05 14:56:16 INFO nodemanager.NodeStatusUpdaterImpl: Registered with
> ResourceManager as ba3aeecd656a:45989 with total resource of  vCores:1>*
>
> This, of course, is a problem when the resourcemanager tries to contact
> the node.
>
> *17/12/05 15:42:04 ERROR scheduler.SchedulerApplicationAttempt: Error
> trying to assign container token and NM token to an allocated container
> container_1512485507238_0003_01_01*
> *java.lang.IllegalArgumentException: java.net.UnknownHostException:
> ba3aeecd656a*
>
> With HDFS I had no problems since you can always set the
> *dfs.datanode.use.datanode.hostname* and similar configuration options to
> avoid the problem. However, I cannot find a similar option in YARN node
> managers.
>
> Is there an option to not use the hostname when registering with the
> resource manager?
>
>
>


Avoiding using hostname for YARN nodemanagers

2017-12-05 Thread Alvaro Brandon
Hello:

I'm using Docker images to build a YARN cluster. I have a problem when the
node managers register with the resource manager.

Since they are containers, they use the hash that the Docker engine assigns
to them as the hostname.

*17/12/05 14:56:16 INFO nodemanager.NodeStatusUpdaterImpl: Registered with
ResourceManager as ba3aeecd656a:45989 with total resource of *

This, of course, is a problem when the resourcemanager tries to contact the
node.

*17/12/05 15:42:04 ERROR scheduler.SchedulerApplicationAttempt: Error
trying to assign container token and NM token to an allocated container
container_1512485507238_0003_01_01*
*java.lang.IllegalArgumentException: java.net.UnknownHostException:
ba3aeecd656a*

With HDFS I had no problems since you can always set the
*dfs.datanode.use.datanode.hostname* and similar configuration options to
avoid the problem. However, I cannot find a similar option in YARN node
managers.

Is there an option to not use the hostname when registering with the
resource manager?


Parameter repeated twice in hdfs-site.xml

2017-11-30 Thread Alvaro Brandon
What will happen if I have a repeated parameter in the configuration file
for HDFS? Below is an example of a file where the parameters in
bold are repeated with contradictory values: true and false.
I need to know because I'm using a Docker image that builds the
configuration file this way, through environment variables, and I want to
know if it will create any conflicts. A related question: how can I see
the parameters with which a datanode was launched, in order to check these
values?



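<property><name>*dfs.datanode.use.datanode.hostname*</name><value>false</value></property>
<property><name>dfs.datanode.use.datanode.ip.hostname</name><value>false</value></property>
<property><name>dfs.namenode.datanode.registration.ip-hostname-check</name><value>false</value></property>
<property><name>dfs.datanode.data.dir</name><value>file:///hadoop/dfs/data</value></property>
<property><name>*dfs.client.use.datanode.hostname*</name><value>false</value></property>
<property><name>dfs.namenode.rpc-bind-host</name><value>0.0.0.0</value></property>
<property><name>dfs.namenode.servicerpc-bind-host</name><value>0.0.0.0</value></property>
<property><name>dfs.namenode.http-bind-host</name><value>0.0.0.0</value></property>
<property><name>dfs.namenode.https-bind-host</name><value>0.0.0.0</value></property>
<property><name>*dfs.client.use.datanode.hostname*</name><value>true</value></property>
<property><name>*dfs.datanode.use.datanode.hostname*</name><value>true</value></property>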
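
One way to check a generated file for this situation is to list every property name that is defined more than once, together with its values in file order. The sketch below is only an illustration: the function name find_repeated_properties and the SAMPLE string are hypothetical, and it assumes the usual configuration/property/name/value layout of hdfs-site.xml. As far as I know, Hadoop's configuration loader applies properties in order, so a later definition normally replaces an earlier one unless the earlier one is marked final, but that is an assumption to verify on your Hadoop version. Hadoop daemons also usually expose their effective configuration under /conf on their web UI port, which may help with the second question.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_repeated_properties(xml_text):
    """Return {name: [values in file order]} for names defined more than once."""
    root = ET.fromstring(xml_text)
    values = defaultdict(list)
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is None:
            continue  # skip malformed <property> elements without a <name>
        values[name.strip()].append((prop.findtext("value") or "").strip())
    return {name: vals for name, vals in values.items() if len(vals) > 1}


# Hypothetical sample modelled on the file in this message.
SAMPLE = """<configuration>
  <property><name>dfs.datanode.use.datanode.hostname</name><value>false</value></property>
  <property><name>dfs.datanode.data.dir</name><value>file:///hadoop/dfs/data</value></property>
  <property><name>dfs.client.use.datanode.hostname</name><value>false</value></property>
  <property><name>dfs.client.use.datanode.hostname</name><value>true</value></property>
  <property><name>dfs.datanode.use.datanode.hostname</name><value>true</value></property>
</configuration>"""

if __name__ == "__main__":
    for name, vals in find_repeated_properties(SAMPLE).items():
        print(name, vals)
```

Running the same function on the real hdfs-site.xml (read with open(path).read()) shows which parameters the Docker image wrote twice and in which order.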



Choosing a subset of machines to launch Spark Application

2017-02-07 Thread Alvaro Brandon
Hello all:

I have the following scenario.
- I have a cluster of 50 machines with Hadoop and Spark installed on them.
- I want to launch one Spark application through spark-submit. However, I
want this application to run on only a subset of these machines (e.g. 10
machines), disregarding data locality.

Is this possible? Is there any option in YARN that allows such a thing?


Restart number of vcores in YARN

2016-07-15 Thread Alvaro Brandon
Hello everyone:

I've changed yarn.nodemanager.resource.cpu-vcores in my yarn-site.xml
configuration file and restarted all the YARN and HDFS services. However,
the nodes don't reflect this change in the number of available virtual
cores, at least when I query the resource manager API. How can I refresh
the yarn.nodemanager.resource.cpu-vcores value in the cluster?

Thanks in advance
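
For reference, this is roughly the kind of check I mean when I say I query the resource manager API. It is a minimal sketch, not a definitive tool: the helper names fetch_nodes and vcores_per_node are made up, and it assumes the /ws/v1/cluster/nodes endpoint returns a "nodes" object with a "node" list whose entries carry "id", "usedVirtualCores" and "availableVirtualCores", which should be checked against the REST documentation of the Hadoop version in use.

```python
import json
from urllib.request import Request, urlopen


def fetch_nodes(rm_url):
    """Fetch the node list from the ResourceManager REST API (assumed endpoint)."""
    req = Request(rm_url.rstrip("/") + "/ws/v1/cluster/nodes",
                  headers={"Accept": "application/json"})
    with urlopen(req) as resp:
        return json.load(resp)


def vcores_per_node(payload):
    """Return {node id: used + available vcores} from a nodes payload."""
    nodes = (payload.get("nodes") or {}).get("node") or []
    return {
        n["id"]: n.get("usedVirtualCores", 0) + n.get("availableVirtualCores", 0)
        for n in nodes
    }
```

Calling vcores_per_node(fetch_nodes("http://resourcemanager:8088")), where the URL is a placeholder for the real resource manager address, gives one total per node that can be compared with the value set in yarn-site.xml.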


YARN application start event

2016-07-07 Thread Alvaro Brandon
Hello everyone:

I was wondering if there is any way to capture the event of an application
starting in YARN. The idea is to implement a listener that, every time a
YARN application starts, queries the REST API to get the memory and cores
currently available in the cluster. Any ideas on this?

Thanks in advance,

Alvaro
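
In case it helps to make the idea concrete, here is a rough sketch of a fallback based on polling rather than a true start event: it compares the set of running application IDs between two polls and, when a new one appears, reads the cluster metrics. The function names are hypothetical, and the sketch assumes the resource manager endpoints /ws/v1/cluster/apps?states=RUNNING and /ws/v1/cluster/metrics with the fields "apps"/"app"/"id" and "clusterMetrics"/"availableMB"/"availableVirtualCores".

```python
import json
from urllib.request import Request, urlopen


def get_json(url):
    """GET a URL and decode the JSON body."""
    req = Request(url, headers={"Accept": "application/json"})
    with urlopen(req) as resp:
        return json.load(resp)


def running_app_ids(apps_payload):
    """Extract the set of application IDs from an apps payload."""
    apps = (apps_payload.get("apps") or {}).get("app") or []
    return {app["id"] for app in apps}


def available_resources(metrics_payload):
    """Return (available memory in MB, available vcores) from a metrics payload."""
    metrics = metrics_payload["clusterMetrics"]
    return metrics["availableMB"], metrics["availableVirtualCores"]


def poll_once(rm_url, seen_ids):
    """Return (current ids, newly started ids, resource snapshot or None)."""
    base = rm_url.rstrip("/")
    current = running_app_ids(get_json(base + "/ws/v1/cluster/apps?states=RUNNING"))
    started = current - seen_ids
    snapshot = None
    if started:
        # Only query the metrics when at least one new application appeared.
        snapshot = available_resources(get_json(base + "/ws/v1/cluster/metrics"))
    return current, started, snapshot
```

A driver loop would call poll_once every few seconds, passing the previously returned set of IDs back in as seen_ids. Applications that start and finish between two polls would be missed, which is why a real listener would still be preferable.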