Please see my answers inline ...

On Mon, Apr 4, 2011 at 8:45 PM, Stack <[email protected]> wrote:
> On Mon, Apr 4, 2011 at 2:30 AM, Bogdan Ghidireac <[email protected]> wrote:
>> Is it possible to add a timeout and then force a System.exit()?
>>
>
> Yes. Of course.  Sounds bad.  How do you think this scenario came about?

My M/R job reads from a table and creates a lot of data that is
inserted into a second table. Because this new table is empty and I
did not split the keys in advance, the region server where the first
region was created is hit really hard (60-100K ops/sec).

The OOM exception happens during this time, only for one or maybe two
servers. The exception triggers a server shutdown...
Once the initial region splits and the traffic is distributed, the
problem does not happen any more.
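
For reference, pre-splitting the target table would spread the initial
load across several region servers. Below is a minimal sketch of computing
evenly spaced split keys. It assumes the row keys are roughly uniformly
distributed over their first byte (for example, hashed keys), which may not
hold for this job. The class and method names are hypothetical. The
resulting array is the kind of value that could be handed to a
table-creation call that accepts split keys, such as
HBaseAdmin.createTable(desc, splitKeys).

```java
// Hypothetical helper: computes single-byte split points for pre-splitting.
final class SplitKeys {

    private SplitKeys() {}

    /**
     * Returns numRegions - 1 split keys, each one byte long, evenly spaced
     * over the unsigned range 0x00..0xFF. Assumes row keys are spread
     * uniformly over their first byte.
     */
    static byte[][] evenSplits(int numRegions) {
        if (numRegions < 2 || numRegions > 256) {
            throw new IllegalArgumentException("numRegions must be in [2, 256]");
        }
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            // Boundary i of numRegions, as an unsigned byte value.
            int boundary = (i * 256) / numRegions;
            splits[i - 1] = new byte[] { (byte) boundary };
        }
        return splits;
    }
}
```

With numRegions = 4 this yields the boundaries 0x40, 0x80 and 0xC0, so the
table starts with four regions instead of one.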


> Is the zk ensemble up and running still?

The ZK ensemble is running fine. I have 3 zk servers running ZK 3.3.2.


> What's the last thing in this regionserver log?

This is the RS log:
http://pastebin.com/Cvx8zS54


> Anything in the .out file?

This is the System.out/err output:
http://pastebin.com/gNNVUzvZ


> I've not seen this
> before but, hey, the world is a wide and wonderful place.  We could
> run the zk close inside a thread and interrupt if it goes on too long
> (Let me ask the zk boys if they've seen this before too).
>

I am subscribed to the ZK list too and I have seen your email. I am using
ZK 3.3.2 ...
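
To make the "close inside a thread and interrupt if it goes on too long"
idea concrete, here is a minimal sketch using only java.util.concurrent.
It is not the actual HBase change. The class name, method name, and
timeout values are hypothetical, and the close action is passed in as a
Callable so the sketch does not depend on any ZooKeeper class.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: bounds how long a blocking close may take.
final class BoundedClose {

    private BoundedClose() {}

    /**
     * Runs closeAction on a daemon thread and waits at most timeoutMs.
     * Returns true if the action finished (normally or with an exception),
     * false if it timed out or the caller was interrupted while waiting.
     */
    static boolean closeWithTimeout(Callable<?> closeAction, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-close");
            t.setDaemon(true); // a stuck close must not keep the JVM alive
            return t;
        });
        Future<?> future = executor.submit(closeAction);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker thread
            return false;
        } catch (ExecutionException e) {
            return true; // the close itself failed, but it did return
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A caller could wrap the ZooKeeper close as
closeWithTimeout(() -> { zk.close(); return null; }, 30000), where zk is
an assumed ZooKeeper handle, and call System.exit() when the method
returns false. Interrupting only helps if the blocked call responds to
interruption, which is why the worker is also a daemon thread.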

> St.Ack
>

Thank you,
Bogdan
