Heap Space Error

2015-09-09 Thread Pedro Magalhaes
I am trying to execute a job and I am getting a Heap Space Error. After running some tests, I think the problem may be the size of the custom key (the mapper output key / reducer input key). First I tested: 1. A custom key holding only one Long. This does not cause heap space errors. 2. A custom key holding 3

Re: yarn.nodemanager.resource.cpu-vcores vs yarn.scheduler.maximum-allocation-vcores

2015-08-23 Thread Pedro Magalhaes
hope this answers your question. Regards, Varun Saxena. On Sun, Aug 23, 2015 at 9:40 PM, Pedro Magalhaes pedror...@gmail.com wrote: I was looking at the default parameters for: yarn.nodemanager.resource.cpu-vcores = 8 yarn.scheduler.maximum-allocation-vcores = 32 For me these two parameters

yarn.nodemanager.resource.cpu-vcores vs yarn.scheduler.maximum-allocation-vcores

2015-08-23 Thread Pedro Magalhaes
I was looking at the default parameters for: yarn.nodemanager.resource.cpu-vcores = 8 yarn.scheduler.maximum-allocation-vcores = 32 To me, these two defaults don't make any sense. The first one gives the number of CPU cores that can be allocated for containers. (I imagine that means vcores)
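To make the mismatch in the question concrete, here is a minimal sketch in plain Python. It is not the YARN API: check_request is a hypothetical helper, the two constants are simply the values quoted in the post, and it assumes every node advertises the same vcore count. How a real ResourceManager treats a request above the scheduler maximum depends on the Hadoop version.

```python
# Plain-Python model of the two settings quoted in the post; not the YARN API.
NODE_VCORES = 8              # yarn.nodemanager.resource.cpu-vcores
MAX_ALLOCATION_VCORES = 32   # yarn.scheduler.maximum-allocation-vcores


def check_request(requested_vcores,
                  node_vcores=NODE_VCORES,
                  max_allocation=MAX_ALLOCATION_VCORES):
    """Classify a single-container vcore request (hypothetical helper)."""
    # The scheduler-wide limit is checked first.
    if requested_vcores > max_allocation:
        return "rejected: above scheduler maximum"
    # A container runs on one node, so it cannot use more vcores
    # than that node offers to containers.
    if requested_vcores > node_vcores:
        return "accepted by limit, but no node of this size can host it"
    return "fits on one node"


if __name__ == "__main__":
    for vcores in (4, 16, 64):
        print(vcores, "->", check_request(vcores))
```

With the quoted values, any request between 9 and 32 vcores passes the scheduler limit yet cannot be placed on an 8-vcore node, which is the inconsistency the question points at.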

Re: yarn.nodemanager.resource.cpu-vcores vs yarn.scheduler.maximum-allocation-vcores

2015-08-23 Thread Pedro Magalhaes
. Regards, Varun Saxena. On Sun, Aug 23, 2015 at 10:39 PM, Pedro Magalhaes pedror...@gmail.com wrote: Varun, Thanks for the reply. I understand the yarn.scheduler.maximum-allocation-vcores parameter. I am just asking why the default is yarn.scheduler.maximum-allocation-vcores=32

Re: yarn.nodemanager.resource.cpu-vcores vs yarn.scheduler.maximum-allocation-vcores

2015-08-23 Thread Pedro Magalhaes
not be available AFAIK. However, this XML (yarn-default.xml) can be checked online in the git repository. The associated JIRA which fixes this is https://issues.apache.org/jira/browse/YARN-3823 Regards, Varun Saxena. On Mon, Aug 24, 2015 at 12:53 AM, Pedro Magalhaes pedror...@gmail.com wrote: Thanks

Re: yarn.nodemanager.resource.cpu-vcores vs yarn.scheduler.maximum-allocation-vcores

2015-08-23 Thread Pedro Magalhaes
by NodeManager, on whichever node it's running. If it is not configured, the default value will be taken. Regards, Varun Saxena. On Mon, Aug 24, 2015 at 1:21 AM, Pedro Magalhaes pedror...@gmail.com wrote: Thanks Varun! As we say in Brazil: you are the guy! (Você é o cara!) I have another question

MultithreadedMapper - Sharing Data Structure

2015-08-22 Thread Pedro Magalhaes
I am developing a job that has 30B records in the input path (File A). I need to filter these records using another file that can have 30K to 180M records (File B). So for each record in File A, I will do a lookup in File B. I am using the distributed cache to share File B. The problem is
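The lookup described here can be sketched in plain Python as follows. This is only an illustration, not Hadoop code: load_lookup, filter_chunk and filter_records are hypothetical names, records are assumed to be plain strings compared for exact equality, and File B is assumed to fit in memory. The point is that the lookup set is built once and shared read-only by all worker threads rather than copied per thread.

```python
from concurrent.futures import ThreadPoolExecutor


def load_lookup(lines):
    """Build an immutable set of keys from File B (hypothetical helper)."""
    return frozenset(line.strip() for line in lines if line.strip())


def filter_chunk(records, lookup):
    """Keep only the File A records whose key appears in File B."""
    return [record for record in records if record in lookup]


def filter_records(chunks, lookup, threads=4):
    """Filter several chunks of File A concurrently against one shared set."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Every task receives a reference to the same frozenset, so the
        # structure exists once in memory and is never mutated.
        parts = pool.map(lambda chunk: filter_chunk(chunk, lookup), chunks)
        return [record for part in parts for record in part]


if __name__ == "__main__":
    file_b = load_lookup(["a\n", "c\n", "\n"])
    print(filter_records([["a", "b"], ["c", "d"]], file_b))
```

Because the set is immutable after loading, the threads need no locking to read from it.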

Multiple Input

2014-10-13 Thread Pedro Magalhaes
Can anyone help me? http://stackoverflow.com/questions/26341913/hadoop-multipleinputs

Re: map-side and reduce-side join implementations

2014-08-30 Thread Pedro Magalhaes
Keren, The map-side join can be implemented using CompositeInputFormat or DistributedCache. If you google these two terms you can find some implementations. Hope that helps. On Friday, August 29, 2014, Keren Ouaknine ker...@gmail.com wrote: Hi, I am looking for
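As a rough illustration of the DistributedCache-style approach only (not CompositeInputFormat), a map-side join can be modelled in plain Python like this. It is a sketch, not Hadoop code: build_small_table and map_side_join are hypothetical names, and it assumes the smaller dataset fits in the memory of each map task.

```python
def build_small_table(rows):
    """Index the small dataset by join key (done once per map task)."""
    table = {}
    for key, value in rows:
        table.setdefault(key, []).append(value)
    return table


def map_side_join(large_records, small_table):
    """Stream the large dataset and emit joined tuples with no reduce phase."""
    for key, value in large_records:
        # Records without a match in the small table are dropped (inner join).
        for match in small_table.get(key, ()):
            yield key, value, match


if __name__ == "__main__":
    small = build_small_table([(1, "x"), (2, "y"), (1, "z")])
    large = [(1, "a"), (3, "b"), (2, "c")]
    for row in map_side_join(large, small):
        print(row)
```

The join happens entirely while reading the large input, which is why no shuffle or reducer is needed for it.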

Re: CompositeInputFormat

2014-08-09 Thread Pedro Magalhaes
I forgot to mention that the quote is from Hadoop: The Definitive Guide. On Thu, Aug 7, 2014 at 6:04 PM, Pedro Magalhaes pedror...@gmail.com wrote: Thanks for the reply. What I am really trying to do is implement a map-side join. As I see it, the files must not be splittable, so each map