It is still not clear to me.

Suppose the block size of my HDFS is 128 MB, so every mapper will process only 128 MB of data. Then what is the meaning of setting the property mapreduce.map.memory.mb? The amount of data per mapper is already known from the block size, so why is this property needed?
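
One way to see the distinction (a hedged illustration, not from this thread): the block/split size fixes how much input a map task reads, while mapreduce.map.memory.mb caps how much RAM the task's whole container may use. The two are independent; a mapper that buffers state in memory, like the sketch below, can need far more RAM than its 128 MB split.

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Illustrative mapper: its in-memory map grows with the number of
// distinct tokens in the split, not with the split's 128 MB size.
public class DistinctTokenCountMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    private final Map<String, Integer> counts = new HashMap<>();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Accumulate counts in memory instead of emitting per record
        // (in-mapper combining), so the heap footprint is driven by
        // data cardinality, which mapreduce.map.memory.mb must cover.
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
    }

    @Override
    protected void cleanup(Context context)
            throws IOException, InterruptedException {
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            context.write(new Text(e.getKey()), new IntWritable(e.getValue()));
        }
    }
}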


On Wednesday 15 October 2014 07:06 PM, Shahab Yunus wrote:
Explanation here:

http://stackoverflow.com/questions/24070557/what-is-the-relation-between-mapreduce-map-memory-mb-and-mapred-map-child-jav
https://support.pivotal.io/hc/en-us/articles/201462036-Mapreduce-YARN-Memory-Parameters
http://hadoop.apache.org/docs/r2.5.1/hadoop-project-dist/hadoop-common/ClusterSetup.html (scroll towards the end.)
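
The short version of those links (a hedged sketch; the property values below are illustrative assumptions, not requirements): mapreduce.map.memory.mb and mapreduce.reduce.memory.mb are the total memory sizes YARN reserves for a task's container, and the JVM heap set via the *.java.opts properties must fit inside that container with some headroom.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class MemorySettingsSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Total container memory requested from YARN per task
        // (illustrative values):
        conf.set("mapreduce.map.memory.mb", "2048");
        conf.set("mapreduce.reduce.memory.mb", "4096");
        // JVM heap inside each container; kept at roughly 80% of the
        // container so non-heap memory (stacks, code cache) still fits:
        conf.set("mapreduce.map.java.opts", "-Xmx1638m");
        conf.set("mapreduce.reduce.java.opts", "-Xmx3276m");

        Job job = Job.getInstance(conf, "memory-settings-sketch");
        // ... set mapper, reducer, input and output paths as usual ...
    }
}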

Regards,
Shahab

On Wed, Oct 15, 2014 at 9:24 AM, SACHINGUPTA <sac...@datametica.com> wrote:

    I have one more doubt. I was reading this:

    http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap1-11.html

    It lists these properties:

    mapreduce.map.memory.mb     = 2 * 1024 MB = 2048 MB
    mapreduce.reduce.memory.mb  = 2 * (2 * 1024) MB = 4096 MB


    What are these properties, mapreduce.map.memory.mb and
    mapreduce.reduce.memory.mb?

    On Wednesday 15 October 2014 06:17 PM, Shahab Yunus wrote:
    It cannot run more mappers (tasks) in parallel than there are
    underlying cores available, just as it cannot run multiple
    mappers in parallel if each mapper's (task's) memory requirement
    is greater than the allocated and available container size
    configured on each node.
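
    As a back-of-the-envelope sketch (all numbers below are assumed
    for illustration): YARN bounds per-node concurrency by both memory
    and vcores, and the effective limit is the smaller of the two.

    public class ContainerMath {
        public static void main(String[] args) {
            // Assumed node resources (yarn.nodemanager.resource.*):
            int nodeMemoryMb = 8 * 1024; // memory-mb
            int nodeVcores   = 4;        // cpu-vcores
            // Assumed per-map-task container request:
            int mapMemoryMb  = 2 * 1024; // mapreduce.map.memory.mb
            int mapVcores    = 1;        // mapreduce.map.cpu.vcores

            int limitByMemory = nodeMemoryMb / mapMemoryMb; // 4
            int limitByCores  = nodeVcores / mapVcores;     // 4
            System.out.println("Concurrent map containers on this node: "
                    + Math.min(limitByMemory, limitByCores));
        }
    }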

    From the links I provided earlier, see the following section:
    Section: "Configuring YARN"

    Also see this:
    http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/
    Section "1. YARN Concurrency (aka “What Happened to Slots?”)"

    This should help put things in perspective regarding how the
    resource allocation for each task, the containers, and the
    resources available on the node relate to each other.

    Regards,
    Shahab

    On Wed, Oct 15, 2014 at 8:18 AM, SACHINGUPTA <sac...@datametica.com> wrote:

        But Shahab, if I have only a 4-core machine, then how can
        YARN run more than 4 mappers in parallel?

        On Wednesday 15 October 2014 05:45 PM, Shahab Yunus wrote:
        It depends on the memory settings as well, i.e., on how many
        resources you want to assign to each container. YARN will
        then run as many mappers in parallel as possible.

        See this:
        http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/
        http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_installing_manually_book/content/rpm-chap1-11.html

        Regards,
        Shahab

        On Wed, Oct 15, 2014 at 8:09 AM, SACHINGUPTA <sac...@datametica.com> wrote:

            Hi guys,

            I have a situation in which I have a machine with 4
            processors and 5 containers. Does that mean I can have
            only 4 mappers running in parallel at a time?

            And if the number of mappers is not dependent on the
            number of containers on a machine, then what is the use
            of the container concept?

            Sorry if I have asked anything obvious.

            --
            Thanks
            Sachin Gupta



        --
        Thanks
        Sachin Gupta



    --
    Thanks
    Sachin Gupta



--
Thanks
Sachin Gupta
