Issues running a large MapReduce job over a complete HBase table

2010-12-06 Thread Gabriel Reid
Hi, We're currently running into issues with running a MapReduce job over a complete HBase table - we can't seem to find a balance between having dfs.datanode.max.xcievers set too low (and getting "xceiverCount X exceeds the limit of concurrent xcievers" errors) and setting it high enough that we get OutOfMemoryErrors on the datanodes.
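
The xceiver limit being discussed is a DataNode setting in hdfs-site.xml (the property name really is spelled "xcievers"). A minimal sketch of raising it, with 4096 as an illustrative value rather than one taken from this thread; DataNodes must be restarted for it to take effect:

    <!-- hdfs-site.xml: per-DataNode cap on concurrent block
         readers/writers (threads serving HBase and MapReduce) -->
    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>4096</value>
    </property>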

Re: Issues running a large MapReduce job over a complete HBase table

2010-12-06 Thread Lars George
Hi Gabriel, What max heap do you give the various daemons? It is really odd that you see OOMEs - I would like to know what has consumed the memory. Are you saying the Hadoop DataNodes actually crash with the OOME? Lars

Re: Issues running a large MapReduce job over a complete HBase table

2010-12-06 Thread Gabriel Reid
Hi Lars, All of the max heap sizes are left at their default values (i.e. 1000MB). The OOMEs that I encountered on the datanodes occurred only when I set dfs.datanode.max.xcievers unrealistically high (8192) in an effort to escape the "xceiverCount X exceeds the limit of concurrent xcievers" errors.
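
The two failure modes are linked: each xceiver is a DataNode thread, and every thread reserves its own stack outside the Java heap. Assuming the 64-bit HotSpot default of roughly 1 MB per thread stack (an assumption, not a figure from this thread), the arithmetic at that setting would be:

    8192 xceiver threads x ~1 MB of stack each ≈ 8 GB of native memory
    for thread stacks alone, independent of the 1000 MB Java heap

which is consistent with OOMEs of the "unable to create new native thread" variety rather than heap exhaustion.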

Re: Issues running a large MapReduce job over a complete HBase table

2010-12-06 Thread Stack
Tell us more about your cluster, Gabriel. Can you take 1M from hbase and give it to HDFS? Does that make a difference? What kind of OOME is it? What's the message? You might tune the thread stack size, and that might give you the headroom you need. How many nodes in your cluster, and how much RAM do they have?
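
The thread-stack suggestion translates to a JVM flag in hadoop-env.sh. A sketch, assuming the HADOOP_DATANODE_OPTS hook that hadoop-env.sh provides; 256k is an illustrative value, not one from this thread:

    # hadoop-env.sh: shrink each DataNode thread's stack so more
    # xceiver threads fit in the same amount of native memory
    export HADOOP_DATANODE_OPTS="-Xss256k $HADOOP_DATANODE_OPTS"

Too small a stack will cause StackOverflowErrors, so this is a trade-off to test rather than a fixed recipe.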

Re: Issues running a large MapReduce job over a complete HBase table

2010-12-06 Thread Gabriel Reid
Hi St.Ack, The cluster is a set of 5 machines, each with 3GB of RAM and 1TB of storage. One machine is doing duty as Namenode, HBase Master, HBase Regionserver, Datanode, Job Tracker, and Task Tracker, while the other four are all Datanodes, Regionservers, and Task Trackers. I have a similar setup...

Re: Issues running a large MapReduce job over a complete HBase table

2010-12-07 Thread Stack
You might be leaking scanners, but that should have no effect on the number of open store files. On deploy of a region, we open its store files and hold them open, and do not open others unless compacting or splitting. Hope this helps, St.Ack

Re: Issues running a large MapReduce job over a complete HBase table

2010-12-07 Thread Gabriel Reid
Hi St.Ack, > You might be leaking scanners but that should have no effect on the number of open store files. Yes, huge help, thank you very much for...
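
On the scanner-leak point: the usual culprit is a ResultScanner that is never closed. A minimal sketch against the HBase client API of that era (0.90-style); the table name and the processing body are placeholders, not details from this thread:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;

    public class ScanAll {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable"); // placeholder table name
        ResultScanner scanner = table.getScanner(new Scan());
        try {
          for (Result row : scanner) {
            // process each row here
          }
        } finally {
          // an unclosed scanner leaks its lease on the regionserver
          scanner.close();
        }
        table.close();
      }
    }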

Re: Issues running a large MapReduce job over a complete HBase table

2010-12-07 Thread Stack
Please be our guest. You'll need to make yourself an account on the wiki, but don't let that intimidate you. Thanks Gabriel, St.Ack