That depends on your cluster configuration. What is the maximum number of
mappers you can run concurrently on each node?
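
For reference, on a Hadoop 1.x cluster the number of map slots per node is
usually controlled by mapred.tasktracker.map.tasks.maximum in mapred-site.xml
(the value below is only illustrative):

  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>

With 16GB of RAM per node, e.g. 8 concurrent mappers at -Xmx6144m would try
to reserve ~48GB of heap, so the slot count and the per-task heap have to fit
together. Note also that the opts you posted contain both -Xmx6144m and
-mx1024m; -mx is an old alias for -Xmx, and the last flag on the line
typically wins, so each task JVM may effectively be capped at 1024MB.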


On Fri, Mar 7, 2014 at 4:42 PM, Suijian Zhou <suijian.z...@gmail.com> wrote:

> The current setting is:
>   <name>mapred.child.java.opts</name>
>   <value>-Xmx6144m -XX:+UseParallelGC -mx1024m -XX:MaxHeapFreeRatio=10
> -XX:MinHeapFreeRatio=10</value>
>
> Is 6144MB enough (for each task tracker)? That is, I have 39 nodes to process
> the 8*2GB input files.
>
>   Best Regards,
>   Suijian
>
>
>
> 2014-03-07 9:21 GMT-06:00 Claudio Martella <claudio.marte...@gmail.com>:
>
> This setting won't be used by Giraph (or by any MapReduce application),
>> but by the Hadoop infrastructure itself.
>> You should use mapred.child.java.opts instead.
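>>
>> For example, in mapred-site.xml (the heap value here is only illustrative
>> and should be tuned to your nodes):
>>
>>   <property>
>>     <name>mapred.child.java.opts</name>
>>     <value>-Xmx2048m</value>
>>   </property>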
>>
>>
>> On Fri, Mar 7, 2014 at 4:19 PM, Suijian Zhou <suijian.z...@gmail.com> wrote:
>>
>>> Hi, Claudio,
>>>   I have set the following when running the program:
>>> export HADOOP_DATANODE_OPTS="-Xmx10g"
>>> and
>>> export HADOOP_HEAPSIZE=30000
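>>> # (these set the DataNode JVM options and the Hadoop daemons' heap size
>>> # in MB, not the heap of the map/reduce child task JVMs)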
>>>
>>> in hadoop-env.sh and restarted hadoop.
>>>
>>>   Best Regards,
>>>   Suijian
>>>
>>>
>>>
>>> 2014-03-06 17:29 GMT-06:00 Claudio Martella <claudio.marte...@gmail.com>:
>>>
Did you actually increase the heap?
>>>>
>>>>
>>>> On Thu, Mar 6, 2014 at 11:43 PM, Suijian Zhou <suijian.z...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>   I tried to process only 2 of the input files, i.e., 2GB + 2GB of input,
>>>>> and the program finished successfully in 6 minutes. But since I have 39
>>>>> nodes, shouldn't they be enough to load and process the 8*2GB = 16GB graph?
>>>>> Can somebody give some hints? (Do all the nodes participate in loading the
>>>>> graph from HDFS, or does only the master node load it?) Thanks!
>>>>>
>>>>>   Best Regards,
>>>>>   Suijian
>>>>>
>>>>>
>>>>>
>>>>> 2014-03-06 16:24 GMT-06:00 Suijian Zhou <suijian.z...@gmail.com>:
>>>>>
>>>>> Hi, Experts,
>>>>>>   I'm trying to run PageRank on a graph in Giraph, but the program
>>>>>> always gets stuck. There are 8 input files, each ~2GB in size, all
>>>>>> copied onto HDFS. I use 39 nodes, and each node has 16GB of memory and
>>>>>> 8 cores. After 2 hours it keeps printing the same info (as follows) on
>>>>>> the screen, with no progress at all. What are the possible reasons?
>>>>>> Tests with small example files run without problems. Thanks!
>>>>>>
>>>>>> 14/03/06 16:17:42 INFO job.JobProgressTracker: Data from 39 workers - Compute superstep 0: 5854829 out of 49200000 vertices computed; 181 out of 1521 partitions computed
>>>>>> 14/03/06 16:17:47 INFO job.JobProgressTracker: Data from 39 workers - Compute superstep 0: 5854829 out of 49200000 vertices computed; 181 out of 1521 partitions computed
>>>>>>
>>>>>>   Best Regards,
>>>>>>   Suijian
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>    Claudio Martella
>>>>
>>>>
>>>
>>>
>>
>>
>> --
>>    Claudio Martella
>>
>>
>
>


-- 
   Claudio Martella
