Re: Out of core execution has no effect on GC crash

2013-09-09 Thread David Boyd

Alexander:
You might try turning off the GC overhead limit (-XX:-UseGCOverheadLimit).
You could also turn on verbose GC logging (-verbose:gc
-Xloggc:/tmp/@taskid@.gc) to see what is happening.

Because the OOC mode still has to create and destroy objects, I suspect
that the heap is just getting really fragmented.

There are also options you can set on the JVM to change the type of
garbage collection used and how it is scheduled.

You might also bump the heap size up slightly - what is the default heap
size on your cluster?
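
For illustration, a minimal sketch of wiring those JVM flags into a Giraph
job through mapred.child.java.opts (the -Xmx value and job name below are
placeholders, and the package names assume current trunk):

// Hedged sketch: pass the suggested GC flags to the mapper JVMs.
import org.apache.giraph.conf.GiraphConfiguration;
import org.apache.giraph.job.GiraphJob;

public class GcDiagnosticsExample {
  public static void main(String[] args) throws Exception {
    GiraphConfiguration conf = new GiraphConfiguration();
    // Disable the GC overhead limit and log GC activity per task attempt.
    conf.set("mapred.child.java.opts",
        "-Xmx2g -XX:-UseGCOverheadLimit -verbose:gc -Xloggc:/tmp/@taskid@.gc");
    GiraphJob job = new GiraphJob(conf, "gc-diagnostics-example");
    // ... set the computation class, input/output formats and workers, then:
    // job.run(true);
  }
}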


On 9/9/2013 8:33 PM, Alexander Asplund wrote:

A small note: I'm not seeing any partitions directory being formed
under _bsp, which is where I understand they should appear.

On 9/10/13, Alexander Asplund  wrote:

Really appreciate the swift responses! Thanks again.

I have not tried both increasing the mapper task heap and decreasing the
max number of partitions at the same time. I first did tests with
increased Mapper heap available, but reset the setting after it
apparently caused other, large volume, non-Giraph jobs to crash nodes
when reducers also were running.

I'm curious why increasing mapper heap is a requirement. Shouldn't the
OOC mode be able to work with the amount of heap that is available? Is
there any guidance on the minimum amount of heap OOC needs to succeed,
to help choose the Mapper heap size?

Either way, I will try increasing the mapper heap again as much as
possible, and hopefully the job will run.

On 9/9/13, Claudio Martella  wrote:

did you extend the heap available to the mapper tasks? e.g. through
mapred.child.java.opts.


On Tue, Sep 10, 2013 at 12:50 AM, Alexander Asplund
wrote:


Thanks for the reply.

I tried setting giraph.maxPartitionsInMemory to 1, but I'm still
getting OOM: GC limit exceeded.

Are there any particular cases the OOC will not be able to handle, or
is it supposed to work in all cases? If the latter, it might be that I
have made some configuration error.

I do have one concern that might indicate I have done something wrong:
to allow OOC to activate without crashing I had to modify the trunk
code. This was because Giraph relied on guava-12, and
DiskBackedPartitionStore used hasInt() - a method which does not exist
in guava-11, which hadoop 2 depends on. At runtime, guava-11 was being
used.

I suppose this problem might indicate I'm submitting the job
using the wrong binary. Currently I am including the giraph
dependencies with the jar, and running it using hadoop jar.

On 9/7/13, Claudio Martella  wrote:

OOC is also used during the input superstep. Try decreasing the number of
partitions kept in memory.


On Sat, Sep 7, 2013 at 1:37 AM, Alexander Asplund
wrote:


Hi,

I'm trying to process a graph that is about 3 times the size of
available memory. On the other hand, there is plenty of disk space. I
have enabled the giraph.useOutOfCoreGraph property, but it still
crashes with OutOfMemoryError: GC overhead limit exceeded when I try
running my job.

I'm wondering if the spilling is supposed to work during the input
step. If so, are there any additional steps that must be taken to
ensure it functions?

Regards,
Alexander Asplund




--
Claudio Martella
claudio.marte...@gmail.com



--
Alexander Asplund




--
Claudio Martella
claudio.marte...@gmail.com



--
Alexander Asplund






--
db...@data-tactics.com
David W. Boyd
Director, Engineering
7901 Jones Branch, Suite 700
Mclean, VA 22102
office:   +1-571-279-2122
fax: +1-703-506-6703
cell: +1-703-402-7908
http://www.data-tactics.com.com/
First Robotic Mentor - FRC, FTC - www.iliterobotics.org
President - USSTEM Foundation - www.usstem.org




Re: Out of core execution has no effect on GC crash

2013-09-09 Thread Alexander Asplund
A small note: I'm not seeing any partitions directory being formed
under _bsp, which is where I understand they should appear.

On 9/10/13, Alexander Asplund  wrote:
> Really appreciate the swift responses! Thanks again.
>
> I have not tried both increasing the mapper task heap and decreasing the
> max number of partitions at the same time. I first did tests with
> increased Mapper heap available, but reset the setting after it
> apparently caused other, large volume, non-Giraph jobs to crash nodes
> when reducers also were running.
>
> I'm curious why increasing mapper heap is a requirement. Shouldn't the
> OOC mode be able to work with the amount of heap that is available? Is
> there any guidance on the minimum amount of heap OOC needs to succeed,
> to help choose the Mapper heap size?
>
> Either way, I will try increasing the mapper heap again as much as
> possible, and hopefully the job will run.
>
> On 9/9/13, Claudio Martella  wrote:
>> did you extend the heap available to the mapper tasks? e.g. through
>> mapred.child.java.opts.
>>
>>
>> On Tue, Sep 10, 2013 at 12:50 AM, Alexander Asplund
>> wrote:
>>
>>> Thanks for the reply.
>>>
>>> I tried setting giraph.maxPartitionsInMemory to 1, but I'm still
>>> getting OOM: GC limit exceeded.
>>>
>>> Are there any particular cases the OOC will not be able to handle, or
>>> is it supposed to work in all cases? If the latter, it might be that I
>>> have made some configuration error.
>>>
>>> I do have one concern that might indicate I have done something wrong:
>>> to allow OOC to activate without crashing I had to modify the trunk
>>> code. This was because Giraph relied on guava-12, and
>>> DiskBackedPartitionStore used hasInt() - a method which does not exist
>>> in guava-11, which hadoop 2 depends on. At runtime, guava-11 was being
>>> used.
>>>
>>> I suppose this problem might indicate I'm submitting the job
>>> using the wrong binary. Currently I am including the giraph
>>> dependencies with the jar, and running it using hadoop jar.
>>>
>>> On 9/7/13, Claudio Martella  wrote:
>>> > OOC is also used during the input superstep. Try decreasing the number of
>>> > partitions kept in memory.
>>> >
>>> >
>>> > On Sat, Sep 7, 2013 at 1:37 AM, Alexander Asplund
>>> > wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> I'm trying to process a graph that is about 3 times the size of
>>> >> available memory. On the other hand, there is plenty of disk space. I
>>> >> have enabled the giraph.useOutOfCoreGraph property, but it still
>>> >> crashes with OutOfMemoryError: GC overhead limit exceeded when I try
>>> >> running my job.
>>> >>
>>> >> I'm wondering if the spilling is supposed to work during the input
>>> >> step. If so, are there any additional steps that must be taken to
>>> >> ensure it functions?
>>> >>
>>> >> Regards,
>>> >> Alexander Asplund
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> >Claudio Martella
>>> >claudio.marte...@gmail.com
>>> >
>>>
>>>
>>> --
>>> Alexander Asplund
>>>
>>
>>
>>
>> --
>>Claudio Martella
>>claudio.marte...@gmail.com
>>
>
>
> --
> Alexander Asplund
>


-- 
Alexander Asplund


Re: Out of core execution has no effect on GC crash

2013-09-09 Thread Alexander Asplund
Really appreciate the swift responses! Thanks again.

I have not tried both increasing the mapper task heap and decreasing the
max number of partitions at the same time. I first did tests with
increased Mapper heap available, but reset the setting after it
apparently caused other, large volume, non-Giraph jobs to crash nodes
when reducers also were running.

I'm curious why increasing mapper heap is a requirement. Shouldn't the
OOC mode be able to work with the amount of heap that is available? Is
there any guidance on the minimum amount of heap OOC needs to succeed,
to help choose the Mapper heap size?

Either way, I will try increasing the mapper heap again as much as
possible, and hopefully the job will run.

On 9/9/13, Claudio Martella  wrote:
> did you extend the heap available to the mapper tasks? e.g. through
> mapred.child.java.opts.
>
>
> On Tue, Sep 10, 2013 at 12:50 AM, Alexander Asplund
> wrote:
>
>> Thanks for the reply.
>>
>> I tried setting giraph.maxPartitionsInMemory to 1, but I'm still
>> getting OOM: GC limit exceeded.
>>
>> Are there any particular cases the OOC will not be able to handle, or
>> is it supposed to work in all cases? If the latter, it might be that I
>> have made some configuration error.
>>
>> I do have one concern that might indicate I have done something wrong:
>> to allow OOC to activate without crashing I had to modify the trunk
>> code. This was because Giraph relied on guava-12, and
>> DiskBackedPartitionStore used hasInt() - a method which does not exist
>> in guava-11, which hadoop 2 depends on. At runtime, guava-11 was being
>> used.
>>
>> I suppose this problem might indicate I'm submitting the job
>> using the wrong binary. Currently I am including the giraph
>> dependencies with the jar, and running it using hadoop jar.
>>
>> On 9/7/13, Claudio Martella  wrote:
>> > OOC is also used during the input superstep. Try decreasing the number of
>> > partitions kept in memory.
>> >
>> >
>> > On Sat, Sep 7, 2013 at 1:37 AM, Alexander Asplund
>> > wrote:
>> >
>> >> Hi,
>> >>
>> >> I'm trying to process a graph that is about 3 times the size of
>> >> available memory. On the other hand, there is plenty of disk space. I
>> >> have enabled the giraph.useOutOfCoreGraph property, but it still
>> >> crashes with OutOfMemoryError: GC overhead limit exceeded when I try
>> >> running my job.
>> >>
>> >> I'm wondering if the spilling is supposed to work during the input
>> >> step. If so, are there any additional steps that must be taken to
>> >> ensure it functions?
>> >>
>> >> Regards,
>> >> Alexander Asplund
>> >>
>> >
>> >
>> >
>> > --
>> >Claudio Martella
>> >claudio.marte...@gmail.com
>> >
>>
>>
>> --
>> Alexander Asplund
>>
>
>
>
> --
>Claudio Martella
>claudio.marte...@gmail.com
>


-- 
Alexander Asplund


Re: Out of core execution has no effect on GC crash

2013-09-09 Thread Claudio Martella
did you extend the heap available to the mapper tasks? e.g. through
mapred.child.java.opts.


On Tue, Sep 10, 2013 at 12:50 AM, Alexander Asplund
wrote:

> Thanks for the reply.
>
> I tried setting giraph.maxPartitionsInMemory to 1, but I'm still
> getting OOM: GC limit exceeded.
>
> Are there any particular cases the OOC will not be able to handle, or
> is it supposed to work in all cases? If the latter, it might be that I
> have made some configuration error.
>
> I do have one concern that might indicate I have done something wrong:
> to allow OOC to activate without crashing I had to modify the trunk
> code. This was because Giraph relied on guava-12, and
> DiskBackedPartitionStore used hasInt() - a method which does not exist
> in guava-11, which hadoop 2 depends on. At runtime, guava-11 was being
> used.
>
> I suppose this problem might indicate I'm submitting the job
> using the wrong binary. Currently I am including the giraph
> dependencies with the jar, and running it using hadoop jar.
>
> On 9/7/13, Claudio Martella  wrote:
> > OOC is also used during the input superstep. Try decreasing the number of
> > partitions kept in memory.
> >
> >
> > On Sat, Sep 7, 2013 at 1:37 AM, Alexander Asplund
> > wrote:
> >
> >> Hi,
> >>
> >> I'm trying to process a graph that is about 3 times the size of
> >> available memory. On the other hand, there is plenty of disk space. I
> >> have enabled the giraph.useOutOfCoreGraph property, but it still
> >> crashes with OutOfMemoryError: GC overhead limit exceeded when I try
> >> running my job.
> >>
> >> I'm wondering if the spilling is supposed to work during the input
> >> step. If so, are there any additional steps that must be taken to
> >> ensure it functions?
> >>
> >> Regards,
> >> Alexander Asplund
> >>
> >
> >
> >
> > --
> >Claudio Martella
> >claudio.marte...@gmail.com
> >
>
>
> --
> Alexander Asplund
>



-- 
   Claudio Martella
   claudio.marte...@gmail.com


Re: Out of core execution has no effect on GC crash

2013-09-09 Thread Alexander Asplund
Thanks for the reply.

I tried setting giraph.maxPartitionsInMemory to 1, but I'm still
getting OOM: GC limit exceeded.

Are there any particular cases the OOC will not be able to handle, or
is it supposed to work in all cases? If the latter, it might be that I
have made some configuration error.
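
For reference, a minimal sketch of how I understand these options are meant
to be set when building the job configuration (the option names are the ones
from this thread; the value of 1 is just what I tried):

// Sketch only: enable out-of-core graph storage and cap in-memory partitions.
import org.apache.giraph.conf.GiraphConfiguration;

public class OutOfCoreConfigSketch {
  public static GiraphConfiguration configure() {
    GiraphConfiguration conf = new GiraphConfiguration();
    conf.setBoolean("giraph.useOutOfCoreGraph", true); // spill partitions to disk
    conf.setInt("giraph.maxPartitionsInMemory", 1);    // keep at most one partition in memory
    return conf;
  }
}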

I do have one concern that might indicate I have done something wrong:
to allow OOC to activate without crashing I had to modify the trunk
code. This was because Giraph relied on guava-12, and
DiskBackedPartitionStore used hasInt() - a method which does not exist
in guava-11, which hadoop 2 depends on. At runtime, guava-11 was being
used.

I suppose this problem might indicate I'm submitting the job
using the wrong binary. Currently I am including the giraph
dependencies with the jar, and running it using hadoop jar.
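
One quick way to confirm which Guava actually wins at runtime is to log
where the class was loaded from; a rough sketch (the class name is just a
made-up helper):

// Rough sketch: print the jar the Guava classes were actually loaded from,
// to see whether guava-11 (from Hadoop) or guava-12 (bundled with the job) wins.
import com.google.common.collect.Iterators;

public class GuavaVersionCheck {
  public static void main(String[] args) {
    System.out.println("Guava loaded from: "
        + Iterators.class.getProtectionDomain().getCodeSource().getLocation());
  }
}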

On 9/7/13, Claudio Martella  wrote:
> OOC is also used during the input superstep. Try decreasing the number of
> partitions kept in memory.
>
>
> On Sat, Sep 7, 2013 at 1:37 AM, Alexander Asplund
> wrote:
>
>> Hi,
>>
>> I'm trying to process a graph that is about 3 times the size of
>> available memory. On the other hand, there is plenty of disk space. I
>> have enabled the giraph.useOutOfCoreGraph property, but it still
>> crashes with OutOfMemoryError: GC overhead limit exceeded when I try
>> running my job.
>>
>> I'm wondering if the spilling is supposed to work during the input
>> step. If so, are there any additional steps that must be taken to
>> ensure it functions?
>>
>> Regards,
>> Alexander Asplund
>>
>
>
>
> --
>Claudio Martella
>claudio.marte...@gmail.com
>


-- 
Alexander Asplund


Re: Counter limit

2013-09-09 Thread André Kelpe
On older versions of hadoop, you cannot set the counter limit to a higher
value; that was only introduced later. I had this issue on CDH3 (~1.5
years ago), and my solution was to disable all counters for the giraph
job to make it work. If you use a more modern version of hadoop, it
should be possible to increase the limit, though.

- André

2013/9/9 Avery Ching :
> If you are running out of counters, you can turn off the superstep counters
>
>   /** Use superstep counters? (boolean) */
>   BooleanConfOption USE_SUPERSTEP_COUNTERS =
>   new BooleanConfOption("giraph.useSuperstepCounters", true,
>   "Use superstep counters? (boolean)");
>
>
> On 9/9/13 6:43 AM, Claudio Martella wrote:
>
> No, I used a different counters limit on that hadoop version. Setting
> mapreduce.job.counters.limit to a higher number and restarting JT and TT
> worked for me. Maybe 64000 might be too high? Try setting it to 512. Does
> not look like the case, but who knows.
>
>
> On Mon, Sep 9, 2013 at 2:57 PM, Christian Krause  wrote:
>>
>> Sorry, it still doesn't work (I ran into a different problem before I
>> reached the limit).
>>
>> I am using Hadoop 0.20.203.0. Is the limit of 120 counters maybe
>> hardcoded?
>>
>> Cheers
>> Christian
>>
>> On 09.09.2013 08:29, "Christian Krause" wrote:
>>
>>> I changed the property name to mapred.job.counters.limit and restarted it
>>> again. Now it works.
>>>
>>> Thanks,
>>> Christian
>>>
>>>
>>> 2013/9/7 Claudio Martella 

 did you restart TT and JT?


 On Sat, Sep 7, 2013 at 7:09 AM, Christian Krause  wrote:
>
> Hi,
> I've increased the counter limit in mapred-site.xml, but I still get
> the error: Exceeded counter limits - Counters=121 Limit=120. Groups=6
> Limit=50.
>
> This is my config:
>
>  cat conf/mapred-site.xml
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
> ...
> <property>
> <name>mapreduce.job.counters.limit</name>
> <value>64000</value>
> </property>
> <property>
> <name>mapred.task.timeout</name>
> <value>240</value>
> </property>
> ...
> </configuration>
>
> Any ideas?
>
> Cheers,
> Christian
>



 --
Claudio Martella
claudio.marte...@gmail.com
>>>
>>>
>
>
>
> --
>Claudio Martella
>claudio.marte...@gmail.com
>
>


Re: Counter limit

2013-09-09 Thread Avery Ching

If you are running out of counters, you can turn off the superstep counters

  /** Use superstep counters? (boolean) */
  BooleanConfOption USE_SUPERSTEP_COUNTERS =
  new BooleanConfOption("giraph.useSuperstepCounters", true,
  "Use superstep counters? (boolean)");

On 9/9/13 6:43 AM, Claudio Martella wrote:
No, I used a different counters limit on that hadoop version. Setting 
mapreduce.job.counters.limit to a higher number and restarting JT and 
TT worked for me. Maybe 64000 might be too high? Try setting it to 
512. Does not look like the case, but who knows.



On Mon, Sep 9, 2013 at 2:57 PM, Christian Krause <m...@ckrause.org> wrote:


Sorry, it still doesn't work (I ran into a different problem
before I reached the limit).

I am using Hadoop 0.20.203.0. Is the limit of 120
counters maybe hardcoded?

Cheers
Christian

On 09.09.2013 08:29, "Christian Krause" <m...@ckrause.org> wrote:

I changed the property name to mapred.job.counters.limit and
restarted it again. Now it works.

Thanks,
Christian


2013/9/7 Claudio Martella <claudio.marte...@gmail.com>

did you restart TT and JT?


On Sat, Sep 7, 2013 at 7:09 AM, Christian Krause <m...@ckrause.org> wrote:

Hi,
I've increased the counter limit in mapred-site.xml,
but I still get the error: Exceeded counter limits -
Counters=121 Limit=120. Groups=6 Limit=50.

This is my config:

 cat conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
...
<property>
<name>mapreduce.job.counters.limit</name>
<value>64000</value>
</property>
<property>
<name>mapred.task.timeout</name>
<value>240</value>
</property>
...
</configuration>

Any ideas?

Cheers,
Christian




-- 
   Claudio Martella

claudio.marte...@gmail.com






--
   Claudio Martella
claudio.marte...@gmail.com 




Re: Counter limit

2013-09-09 Thread Claudio Martella
No, I used a different counters limit on that hadoop version. Setting
mapreduce.job.counters.limit to a higher number and restarting JT and TT
worked for me. Maybe 64000 might be too high? Try setting it to 512. Does
not look like the case, but who knows.


On Mon, Sep 9, 2013 at 2:57 PM, Christian Krause  wrote:

> Sorry, it still doesn't work (I ran into a different problem before I
> reached the limit).
>
> I am using Hadoop 0.20.203.0. Is the limit of 120 counters maybe
> hardcoded?
>
> Cheers
> Christian
> On 09.09.2013 08:29, "Christian Krause" wrote:
>
>> I changed the property name to mapred.job.counters.limit and restarted it
>> again. Now it works.
>>
>> Thanks,
>> Christian
>>
>>
>> 2013/9/7 Claudio Martella 
>>
>>> did you restart TT and JT?
>>>
>>>
>>> On Sat, Sep 7, 2013 at 7:09 AM, Christian Krause  wrote:
>>>
 Hi,
 I've increased the counter limit in mapred-site.xml, but I still get
 the error: Exceeded counter limits - Counters=121 Limit=120. Groups=6
 Limit=50.

 This is my config:

  cat conf/mapred-site.xml
 <?xml version="1.0"?>
 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 <!-- Put site-specific property overrides in this file. -->

 <configuration>
 ...
 <property>
 <name>mapreduce.job.counters.limit</name>
 <value>64000</value>
 </property>
 <property>
 <name>mapred.task.timeout</name>
 <value>240</value>
 </property>
 ...
 </configuration>

 Any ideas?

 Cheers,
 Christian


>>>
>>>
>>> --
>>>Claudio Martella
>>>claudio.marte...@gmail.com
>>>
>>
>>


-- 
   Claudio Martella
   claudio.marte...@gmail.com


Re: Counter limit

2013-09-09 Thread Christian Krause
Sorry, it still doesn't work (I ran into a different problem before I
reached the limit).

I am using Hadoop 0.20.203.0. Is the limit of 120 counters maybe hardcoded?

Cheers
Christian
On 09.09.2013 08:29, "Christian Krause" wrote:

> I changed the property name to mapred.job.counters.limit and restarted it
> again. Now it works.
>
> Thanks,
> Christian
>
>
> 2013/9/7 Claudio Martella 
>
>> did you restart TT and JT?
>>
>>
>> On Sat, Sep 7, 2013 at 7:09 AM, Christian Krause  wrote:
>>
>>> Hi,
>>> I've increased the counter limit in mapred-site.xml, but I still get the
>>> error: Exceeded counter limits - Counters=121 Limit=120. Groups=6
>>> Limit=50.
>>>
>>> This is my config:
>>>
>>>  cat conf/mapred-site.xml
>>> <?xml version="1.0"?>
>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>
>>> <!-- Put site-specific property overrides in this file. -->
>>>
>>> <configuration>
>>> ...
>>> <property>
>>> <name>mapreduce.job.counters.limit</name>
>>> <value>64000</value>
>>> </property>
>>> <property>
>>> <name>mapred.task.timeout</name>
>>> <value>240</value>
>>> </property>
>>> ...
>>> </configuration>
>>>
>>> Any ideas?
>>>
>>> Cheers,
>>> Christian
>>>
>>>
>>
>>
>> --
>>Claudio Martella
>>claudio.marte...@gmail.com
>>
>
>