Re: YARN - How is a node for a container determined?

2017-08-29 Thread Grant Overby
Most of the applications are Twill apps and are somewhat long running, but
not perpetual: a few hours to a day. Many of the apps (say about half) have
a lot of idle time. These apps come from across the enterprise; I don't know
why they're idle. There are also a few MR, Tez, and Spark apps in the mix.

If we don't overcommit (or only modestly overcommit) vCores and memory, then
stacking the apps on fewer boxes has little impact compared to spreading
them across more nodes.

The number of applications running at a given time is very fluid. We have
an elastic infrastructure that can add and remove YARN VMs as needed. It
needs a bit of scripting to be "YARN aware," but that part isn't rocket
surgery.

Removing those VMs after the workload drops is hampered when the
containers are spread out. Consider a node that is running only 1 container
and is 10% utilized: I wouldn't want a new container landing on this node,
as it's ripe for removal.

I also prefer to place containers on nodes that are also HDFS data nodes so
that they can benefit from the locality. It's not desirable for a new
container to land on an overflow VM if there are resources on a proper node.

As a stretch goal, I also may want to prioritize some jobs to the proper
nodes and others to the VMs in the future.

I don't want to preempt containers and make them restart on another node.

There's a preferred node feature which I think I can beat into submission,
but I'd much rather adjust a value or plug in a proper, new algo.
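
As an illustration of the placement preference described above, here is a
minimal sketch of a node-ranking rule. It is not YARN scheduler code and does
not use any YARN API; the `Node` fields, `pick_node`, and the ranking order
are hypothetical, chosen only to show "prefer HDFS data nodes, then the
busiest node that still fits" so that lightly used overflow VMs can drain.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    total_mb: int       # memory the node offers to YARN
    used_mb: int        # memory already allocated to containers
    is_datanode: bool   # True for a "proper" node that also hosts HDFS data


def pick_node(nodes: List[Node], request_mb: int) -> Optional[Node]:
    """Return the node to place a container on, or None if nothing fits.

    Hypothetical ranking: HDFS data nodes first, then the most utilized
    node, so that lightly loaded overflow VMs are left alone.
    """
    candidates = [n for n in nodes if n.total_mb - n.used_mb >= request_mb]
    if not candidates:
        return None
    return max(candidates, key=lambda n: (n.is_datanode, n.used_mb / n.total_mb))


# Made-up cluster state for illustration only.
nodes = [
    Node("dn1", 65536, 49152, True),   # 75% used data node
    Node("dn2", 65536, 16384, True),   # 25% used data node
    Node("vm1", 65536, 6554, False),   # ~10% used overflow VM
]
chosen = pick_node(nodes, 8192)
print(chosen.name if chosen else "no node fits")
```

With this made-up state, an 8 GB request goes to the busiest data node that
still has room, and the overflow VM is used only when no data node fits.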

PS: Sorry Philippe. I hit reply instead of reply all, so you'll
unfortunately get spammed with two copies of this.

On Tue, Aug 29, 2017 at 6:12 AM, Philippe Kernévez 
wrote:

> " densely pack containers on fewer nodes" : quite surprising, +1 with
> Daemon
>
> You have YARN labels that can be used for that.
> A classic example is the need for specific hardware for some processing.
> https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/NodeLabel.html
>
> Regards,
> Philippe
>
> On Tue, Aug 29, 2017 at 12:53 AM, daemeon reiydelle 
> wrote:
>
>> Perhaps you can go into a bit more detail? Especially for e.g. a map job
>> (or reduce in mapR), this seems like a major antipattern.
>>
>>
>>
>>
>> *Daemeon C.M. Reiydelle*
>> San Francisco 1.415.501.0198
>> London 44 020 8144 9872
>>
>>
>> On Mon, Aug 28, 2017 at 3:37 PM, Grant Overby 
>> wrote:
>>
>>> When YARN receives a request for a container, which can be met by many
>>> nodes, what is the algorithm for determining which node is given the
>>> container?
>>>
>>> Is this tunable? I'd like to densely pack containers on fewer nodes.
>>>
>>> A pointer to source code would be nice.
>>>
>>>
>>
>
>
> --
> Philippe Kernévez
>
>
>
> Technical Director (Switzerland),
> pkerne...@octo.com
> +41 79 888 33 32
>
> Find OCTO on OCTO Talk: http://blog.octo.com
> OCTO Technology http://www.octo.ch
>


Re: unsubscribe

2017-08-29 Thread Ravi Prakash
Hi Corne!

Please send an email to user-unsubscr...@hadoop.apache.org as mentioned on
https://hadoop.apache.org/mailing_lists.html

Thanks

On Sun, Aug 27, 2017 at 10:25 PM, Corne Van Rensburg 
wrote:

> unsubscribe
>
>
>
> *Corne Van Rensburg, Managing Director, Softsure*
> Tel: 044 805 3746
> Email: co...@softsure.co.za
> *Softsure (Pty) Ltd | Registration No. 2004/008528/07 | 127A York Street,
> George, 6530 *
>
>
>


Re:

2017-08-29 Thread Ravi Prakash
Hi Dominique,

Please send an email to user-unsubscr...@hadoop.apache.org as mentioned on
https://hadoop.apache.org/mailing_lists.html

Thanks
Ravi

2017-08-26 10:49 GMT-07:00 Dominique Rozenberg :

> unsubscribe
>
>
>
>
>
> *Dominique Rozenberg*, Project Manager
>
> *Mobile*: 052-7722006 | *Office*: 08-6343595 | *Fax*: 08-9202801
>
> *d...@datacube.co.il *
>
> *www.datacube.co.il *
>
>
>
>
>


Re: Recommendation for Resourcemanager GC configuration

2017-08-29 Thread Ravuri, Venkata Puneet
Hi Naga, Ravi,

We have lots of small applications running on the cluster. We use Java 8 and
Hadoop version 2.7.3.
The ResourceManager runs on a 40 GB heap with NewRatio set to 3. We keep
100,000 completed apps in memory (max-completed-apps).
Tenured space occupies ~28 GB after a full GC. Is this footprint expected for
100,000 apps?

We did try CMS before with a 70% occupancy fraction, but there were 'promotion
failures' that ended in stop-the-world pauses.
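
For reference, the generation sizes implied by these settings can be worked
out directly. The sketch below is only arithmetic on the figures quoted in
this message. It assumes "GB" means GiB and, as an upper bound only, that the
whole tenured footprint is attributable to the retained completed apps;
neither assumption is confirmed in this thread, and the function name is
illustrative.

```python
def generation_sizes_gb(heap_gb: float, new_ratio: int) -> tuple:
    """Split a heap according to -XX:NewRatio (old:young = new_ratio:1)."""
    young = heap_gb / (new_ratio + 1)
    old = heap_gb - young
    return young, old


HEAP_GB = 40            # ResourceManager heap from the message
NEW_RATIO = 3           # NewRatio from the message
TENURED_LIVE_GB = 28    # tenured occupancy after a full GC
COMPLETED_APPS = 100_000

young_gb, old_gb = generation_sizes_gb(HEAP_GB, NEW_RATIO)
old_occupancy = TENURED_LIVE_GB / old_gb

# Upper bound on retained memory per completed app, in KiB (GiB -> KiB).
kib_per_app = TENURED_LIVE_GB * 1024 * 1024 / COMPLETED_APPS

print(f"young = {young_gb:.0f} GB, old = {old_gb:.0f} GB")
print(f"old generation occupancy after full GC = {old_occupancy:.0%}")
print(f"at most about {kib_per_app:.0f} KiB retained per completed app")
```

Under these assumptions the old generation is 30 GB, so ~28 GB live after a
full GC leaves it roughly 93% full.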

Regards,
Puneet

From: Naganarasimha Garla 
Date: Wednesday, August 23, 2017 at 5:23 PM
To: Ravi Prakash 
Cc: "Ravuri, Venkata Puneet" , "common-u...@hadoop.apache.org" 

Subject: Re: Recommendation for Resourcemanager GC configuration

Hi Puneet,

Along with the heap dump details, I would also like to know the Hadoop YARN
version being used, the size of the cluster, all memory configurations, and the
JRE version.
Also, if possible, can you share the rationale behind choosing the Parallel GC
collector over others (CMS or G1)?

Regards,
+ Naga

On Thu, Aug 24, 2017 at 2:54 AM, Ravi Prakash wrote:
Hi Puneet

Can you take a heap dump and see where most of the churn is? Is it lots of
small applications, a few really large applications with small containers, etc.?
Cheers
Ravi

On Wed, Aug 23, 2017 at 9:23 AM, Ravuri, Venkata Puneet wrote:
Hello,

I wanted to know if there is any recommendation for ResourceManager GC settings.
Full GC (with Parallel GC, 8 threads) sometimes takes more than 30 seconds, which
causes state store sessions to ZooKeeper to time out, resulting in FATAL errors.
The YARN cluster is heavily used, with thousands of applications launched per hour.

Could you please share any documentation on best practices for tuning
ResourceManager GC?

Thanks,
Puneet




Re: Recommendation for Resourcemanager GC configuration

2017-08-29 Thread Ravuri, Venkata Puneet
Hi Vinod,

The heap size is 40GB and NewRatio is set to 3. We have max completed 
applications set to 10.

Regards,
Puneet

From: Vinod Kumar Vavilapalli 
Date: Wednesday, August 23, 2017 at 5:47 PM
To: "Ravuri, Venkata Puneet" 
Cc: "common-u...@hadoop.apache.org" 
Subject: Re: Recommendation for Resourcemanager GC configuration

What is the ResourceManager JVM’s heap size? What is the value for the 
configuration yarn.resourcemanager.max-completed-applications?

+Vinod

On Aug 23, 2017, at 9:23 AM, Ravuri, Venkata Puneet wrote:

Hello,

I wanted to know if there is any recommendation for ResourceManager GC settings.
Full GC (with Parallel GC, 8 threads) sometimes takes more than 30 seconds, which
causes state store sessions to ZooKeeper to time out, resulting in FATAL errors.
The YARN cluster is heavily used, with thousands of applications launched per hour.

Could you please share any documentation on best practices for tuning
ResourceManager GC?

Thanks,
Puneet



Re: File copy from local to hdfs error

2017-08-29 Thread Atul Rajan
Hello István,

Thanks for the help, it finally worked.
There was a firewall issue; fixing that let HDFS work and accept files from
the local file system.
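
For anyone hitting the same symptom, a quick reachability check from the
client can help rule a firewall in or out. This is a minimal sketch, not an
HDFS tool: the host names are hypothetical, and port 50010 is the Hadoop 2.x
default for dfs.datanode.address, which is an assumption because the Hadoop
version is not stated in this thread.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical host names; replace with the real DataNode hosts.
DATANODES = ["datanode1.example.com", "datanode2.example.com"]
# Assumed Hadoop 2.x default data-transfer port (dfs.datanode.address).
DATA_TRANSFER_PORT = 50010


def blocked_datanodes(hosts, port=DATA_TRANSFER_PORT, timeout=3.0):
    """List the hosts whose data-transfer port cannot be reached."""
    return [h for h in hosts if not port_open(h, port, timeout)]


# Example: print(blocked_datanodes(DATANODES))
```

A host that shows up in the result is unreachable on that port from the
machine running the check, which is consistent with (but does not prove) a
firewall rule dropping the traffic.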

Thanks and Regards
Atul Rajan

-Sent from my iPhone

On 28-Aug-2017, at 11:20 PM, István Fajth  wrote:

Hi Atul,

as suggested before, set the BlockManager log level to DEBUG and check the logs
for the reason. Alternatively, you can set the whole NameNode log to DEBUG level
and look for the messages logged by the BlockManager. Near the INFO-level message
in the NameNode log that resembles the one you now see in the client, you will
find the reason why the BlockManager thinks the two DataNodes are not suitable to
accept the block the client wishes to allocate.

Cheers,
Istvan

> On Aug 28, 2017 19:24, "Atul Rajan"  wrote:
> The DataNodes were having issues earlier. I added the required ports to
> iptables; after that the data logs are running, but HDFS is not able to
> distribute the file and create blocks, and any file copied to the cluster
> throws this error.
> 
>> On 28 August 2017 at 21:46, István Fajth  wrote:
>> Hi Atul,
>> 
>> you can check the NameNode logs to see whether the DataNodes were in service
>> or had issues. You can also check the BlockManager's debug-level logs for
>> more exact reasons if you can reproduce the issue at will.
>> 
>> Istvan
>> 
>> On Aug 28, 2017 17:56, "Atul Rajan"  wrote:
>> Hello Sir,
>> 
>> When I copy data from local to HDFS I get the error below.
>> Can you please guide me on how to proceed?
>> 
>> copying could only be replicated to 0 nodes instead of minReplication (=1).
>> There are 2 datanodes running and 2 nodes are excluded in this operation
>> 
>> 
>> Thanks and Regards
>> Atul Rajan
>> 
>> -Sent from my iPhone
>> 
> 
> 
> 
> -- 
> Best Regards
> Atul Rajan


Re: YARN - How is a node for a container determined?

2017-08-29 Thread Philippe Kernévez
" densely pack containers on fewer nodes" : quite surprising, +1 with Daemon

You have YARN labels that can be used for that.
A classic example is the need for specific hardware for some processing.
https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/NodeLabel.html

Regards,
Philippe

On Tue, Aug 29, 2017 at 12:53 AM, daemeon reiydelle 
wrote:

> Perhaps you can go into a bit more detail? Especially for e.g. a map job
> (or reduce in mapR), this seems like a major antipattern.
>
>
>
>
> *Daemeon C.M. Reiydelle*
> San Francisco 1.415.501.0198
> London 44 020 8144 9872
>
>
> On Mon, Aug 28, 2017 at 3:37 PM, Grant Overby 
> wrote:
>
>> When YARN receives a request for a container, which can be met by many
>> nodes, what is the algorithm for determining which node is given the
>> container?
>>
>> Is this tunable? I'd like to densely pack containers on fewer nodes.
>>
>> A pointer to source code would be nice.
>>
>>
>


-- 
Philippe Kernévez



Technical Director (Switzerland),
pkerne...@octo.com
+41 79 888 33 32

Find OCTO on OCTO Talk: http://blog.octo.com
OCTO Technology http://www.octo.ch