Thanks for the quick response, Ted.  It does look like HBASE-4508 would
resolve the issue I'm seeing, so I appreciate the pointer.  We're back up
and running OK for now with a custom TableOutputFormat (and custom
TableInputFormat) to work around this issue, but when we're ready to move
to a later version of HBase we'll give that new mechanism a try.
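
For anyone finding this thread later, here is a toy model of the suspected
leak. It does not use HBase at all: FakeConf, IdentityKeyedCache and
ValueKeyedCache are hypothetical names made up for illustration, and the idea
that the connection cache is keyed on the Configuration instance is an
assumption based on the behaviour described in the original message, not
something checked against the HBase source. Under that assumption, cloning
the configuration for every job adds one cached connection per job, while
reusing one instance (or keying the cache on the configuration's contents)
keeps a single connection.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public class ConnectionCacheSketch {

    /** Hypothetical stand-in for a Hadoop Configuration: just a bag of properties. */
    static final class FakeConf {
        final Map<String, String> props = new HashMap<>();

        FakeConf() {}

        /** Copy constructor, playing the role of HBaseConfiguration.create(otherConf). */
        FakeConf(FakeConf other) {
            props.putAll(other.props);
        }
    }

    /** Assumed behaviour: one cached "connection" per configuration instance. */
    static final class IdentityKeyedCache {
        private final Map<FakeConf, Object> connections = new IdentityHashMap<>();

        Object get(FakeConf conf) {
            return connections.computeIfAbsent(conf, c -> new Object());
        }

        int size() {
            return connections.size();
        }
    }

    /** Alternative: one cached "connection" per distinct set of property values. */
    static final class ValueKeyedCache {
        private final Map<Map<String, String>, Object> connections = new HashMap<>();

        Object get(FakeConf conf) {
            // Key on a snapshot of the properties so equal settings share an entry.
            return connections.computeIfAbsent(new HashMap<>(conf.props), k -> new Object());
        }

        int size() {
            return connections.size();
        }
    }

    /** Runs the given number of jobs and returns the three cache sizes. */
    static int[] simulate(int jobs) {
        FakeConf jobConf = new FakeConf();
        // Host names are placeholders.
        jobConf.props.put("hbase.zookeeper.quorum", "zk1,zk2,zk3");

        IdentityKeyedCache clonedPerJob = new IdentityKeyedCache();
        IdentityKeyedCache reusedConf = new IdentityKeyedCache();
        ValueKeyedCache byValue = new ValueKeyedCache();

        for (int i = 0; i < jobs; i++) {
            FakeConf copy = new FakeConf(jobConf); // what setConf() does on each run
            clonedPerJob.get(copy);                // new instance, so a new entry
            reusedConf.get(jobConf);               // same instance, so the entry is reused
            byValue.get(copy);                     // same contents, so the entry is reused
        }
        return new int[] { clonedPerJob.size(), reusedConf.size(), byValue.size() };
    }

    public static void main(String[] args) {
        int[] sizes = simulate(5);
        System.out.println("identity-keyed, cloned conf per job: " + sizes[0]);
        System.out.println("identity-keyed, one shared conf:     " + sizes[1]);
        System.out.println("value-keyed, cloned conf per job:    " + sizes[2]);
    }
}
```

The middle case is roughly what a custom output format that holds on to one
configuration instance would achieve in this model. Whether HBASE-4508 takes
the value-keyed route is not something this sketch claims.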

Thanks!

      -Shawn

On Tue, Jan 24, 2012 at 2:39 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Thanks for reporting this, Shawn.
>
> Do you want to try out HBASE-4508, which is in HBase 0.90.5?
>
> On Tue, Jan 24, 2012 at 9:15 AM, Shawn Quinn <squ...@moxiegroup.com>
> wrote:
>
> > Hello,
> >
> > Our application runs Map/Reduce tasks fairly frequently against HBase
> > (Cloudera distribution 0.90.4), and we're making use of the default
> > org.apache.hadoop.hbase.mapreduce.TableOutputFormat class for the reduce
> > step, which TableMapReduceUtil.initTableReducerJob() sets up.  We invoke
> > the Map/Reduce tasks via the standard Hadoop Job API, but they're all
> > triggered from the same virtual machine that stays running (so we aren't
> > shutting down the virtual machine after each job runs).  We've been
> > noticing that we run out of ZooKeeper connections in this
> > configuration, and believe we've tracked the "leak" down to the
> > TableOutputFormat class.  Specifically, that class does the following:
> >
> >  public void setConf(Configuration otherConf) {
> >    this.conf = HBaseConfiguration.create(otherConf);
> >    String tableName = this.conf.get(OUTPUT_TABLE);
> >    String address = this.conf.get(QUORUM_ADDRESS);
> >    String serverClass = this.conf.get(REGION_SERVER_CLASS);
> >    String serverImpl = this.conf.get(REGION_SERVER_IMPL);
> >    try {
> >      if (address != null) {
> >        ZKUtil.applyClusterKeyToConf(this.conf, address);
> >      }
> >      if (serverClass != null) {
> >        this.conf.set(HConstants.REGION_SERVER_CLASS, serverClass);
> >        this.conf.set(HConstants.REGION_SERVER_IMPL, serverImpl);
> >      }
> >      this.table = new HTable(this.conf, tableName);
> >      this.table.setAutoFlush(false);
> >      LOG.info("Created table instance for "  + tableName);
> >    } catch(IOException e) {
> >      LOG.error(e);
> >    }
> >  }
> >
> > I believe in previous releases of HBase this was different, but at some
> > point the code to clone the configuration object (first line of that
> > method) was added.  Then, in that same method, when that code creates the
> > HTable instance, internally the HTable gets a new connection to ZooKeeper
> > every time (since the configuration instance is different).
> >
> > I believe I can get around this in my application by creating a custom
> > TableOutputFormat.  However, can anyone confirm if this is indeed a
> > problem, or if there is some other way to work around the default
> > TableOutputFormat class creating a new connection to ZooKeeper every time
> > it runs?
> >
> > Thanks,
> >
> >     -Shawn
> >
>
