I am trying to execute a job and I am getting a Heap Space Error.
After making some tests, I realized that the problem may be the size of the
custom key (the mapper output key / reducer input key).
First I tested:
1. Custom key holding only one Long. This does not cause heap space
problems.
2. Custom key holding 3
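A composite key like the one described would normally implement Hadoop's WritableComparable. As a minimal sketch (the interface is stubbed in here so the snippet compiles standalone, and the field names are illustrative, not from the original job):

```java
import java.io.*;

// Stand-in for Hadoop's Writable contract, so this sketch compiles standalone.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical composite key holding three longs.
class TripleLongKey implements Writable, Comparable<TripleLongKey> {
    long a, b, c;

    TripleLongKey() {}                         // required no-arg constructor
    TripleLongKey(long a, long b, long c) { this.a = a; this.b = b; this.c = c; }

    public void write(DataOutput out) throws IOException {
        out.writeLong(a); out.writeLong(b); out.writeLong(c);
    }
    public void readFields(DataInput in) throws IOException {
        a = in.readLong(); b = in.readLong(); c = in.readLong();
    }
    // Sort by a, then b, then c.
    public int compareTo(TripleLongKey o) {
        int r = Long.compare(a, o.a);
        if (r != 0) return r;
        r = Long.compare(b, o.b);
        return r != 0 ? r : Long.compare(c, o.c);
    }
}
```

Three longs serialize to only 24 bytes per key, so the key payload itself is rarely the whole story; object overhead per key instance during sort/spill is usually what grows the heap.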
hope this answers your question.
Regards,
Varun Saxena.
On Sun, Aug 23, 2015 at 9:40 PM, Pedro Magalhaes pedror...@gmail.com
wrote:
I was looking at the default parameters:
yarn.nodemanager.resource.cpu-vcores = 8
yarn.scheduler.maximum-allocation-vcores = 32
To me, these two defaults don't make any sense.
The first one says the number of CPU cores that can be allocated for
containers. (I imagine that means vcores.)
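For reference, these are the properties in question as they would be overridden in yarn-site.xml (values here are illustrative; sensible values depend on the node's hardware):

```xml
<!-- yarn-site.xml: illustrative overrides -->
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>8</value> <!-- vcores this NodeManager may hand out to containers -->
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>8</value> <!-- cap on vcores a single container may request -->
</property>
```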
Regards,
Varun Saxena.
On Sun, Aug 23, 2015 at 10:39 PM, Pedro Magalhaes pedror...@gmail.com
wrote:
Varun,
Thanks for the reply. I understand the yarn.scheduler.maximum-
allocation-vcores parameter. I am just asking why the default is
yarn.scheduler.maximum-allocation-vcores=32
not be available AFAIK.
However, this XML (yarn-default.xml) can be checked online in the git
repository.
The associated JIRA which fixes this is
https://issues.apache.org/jira/browse/YARN-3823
Regards,
Varun Saxena.
On Mon, Aug 24, 2015 at 12:53 AM, Pedro Magalhaes pedror...@gmail.com
wrote:
Thanks
by the NodeManager, on whichever node it's
running.
If it is not configured, the default value will be taken.
Regards,
Varun Saxena.
On Mon, Aug 24, 2015 at 1:21 AM, Pedro Magalhaes pedror...@gmail.com
wrote:
Thanks Varun! Like we say in Brazil: you are the guy! (Você é o cara!)
I have another question.
I am developing a job that has 30B records in the input path (File A).
I need to filter these records using another file that can have 30K to 180M
records (File B).
So for each record in File A, I will make a lookup in File B.
I am using the distributed cache to share File B. The problem is
Can anyone help me?
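The usual pattern for this kind of distributed-cache filter, sketched as plain Java so it runs standalone (in a real job the lookup set would be built from the cached File B in Mapper.setup(), and keep() would be the test inside map(); the class and method names here are illustrative):

```java
import java.util.*;

// Sketch of the map-side filter: File B's keys are loaded into memory once,
// then every File A record is checked against the set.
class MapSideFilter {
    private final Set<String> lookup;   // built from File B in setup()

    MapSideFilter(Collection<String> fileBKeys) {
        this.lookup = new HashSet<>(fileBKeys);
    }

    // Emulates the per-record test done inside map() for each File A record.
    boolean keep(String recordKey) {
        return lookup.contains(recordKey);
    }
}
```

Note that at the upper end (180M keys) the in-memory set itself can become a significant heap consumer in each mapper, which is worth keeping in mind given the heap space errors discussed earlier in the thread.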
http://stackoverflow.com/questions/26341913/hadoop-multipleinputs
Keren,
The map-side join can be implemented using CompositeInputFormat or
DistributedCache.
If you google these two terms you can find some implementations.
Hope that it helps.
On Friday, August 29, 2014, Keren Ouaknine ker...@gmail.com
wrote:
Hi,
I am looking for
I forgot to say: the quote is from Hadoop: The Definitive Guide.
On Thu, Aug 7, 2014 at 6:04 PM, Pedro Magalhaes pedror...@gmail.com wrote:
Thanks for reply..
Actually, what I am doing is trying to implement a map-side join. In my
mind, the files must not be splittable, so each map