[ 
https://issues.apache.org/jira/browse/HBASE-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898508#action_12898508
 ] 

stack commented on HBASE-1849:
------------------------------

@Benoît: Bring it on!

> HTable doesn't work well at the core of a multi-threaded server; e.g. 
> webserver
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-1849
>                 URL: https://issues.apache.org/jira/browse/HBASE-1849
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Benoit Sigoure
>
> HTable must do the following:
> + Sit in a shell or simple client -- e.g. Map or Reduce task -- and feed and 
> read from HBase single-threadedly.  It does this job OK.
> + Sit at core of a multithreaded server (100s of threads) -- a webserver or 
> thrift gateway -- and keep the throughput high. It's currently not good at 
> this job.
> In the way of our achieving the second in the list above are the following:
> + HTable must seek out and cache region locations.  It keeps the cache down 
> in HConnectionManager.  One cache is shared by all HTable instances that 
> were made with the same HBaseConfiguration instance.  Region lookup happens 
> inside a synchronized block; if the wanted region is in the cache, the lock 
> is held only a short time.  Otherwise, callers must wait until the trip to 
> the server completes (may require retries).  Meanwhile all other work is 
> blocked, even if we're using HTablePool.
> + Regardless of the identity of the HBaseConfiguration, Hadoop RPC has ONE 
> Connection open to a server at a time; requests and responses are multiplexed 
> over this single connection.
> Broken stuff:
> + Puts are synchronized to protect the write buffer so only one thread at a 
> time appends, but flushCommits is open for any thread to call.  Once the 
> write buffer is full, all Puts block until it's freed again.  This looks like 
> a hang if there are hundreds of threads, each write is to a random region in 
> a big table, and each write has to have its region looked up.  (There may be 
> some other brokenness in here because this bottleneck seems to last longer 
> than it should, even with hundreds of threads.)
> Ideas:
> + Querying the cache should not block all access to the cache.  We only block 
> access if the wanted region is being looked up, so other reads and writes to 
> regions whose location we already know can go ahead.
> + nio'd client and server
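
The first idea above (block only the callers waiting on the region that is being looked up) could be sketched roughly as follows. This is a minimal illustration, not HBase code: the class name RegionLocationCache, the String keys and values, and the lookup function standing in for the trip to the server are all assumptions made for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical per-region blocking cache; not the actual HConnectionManager code. */
class RegionLocationCache {
    // One future per region key. Threads asking for different keys never
    // wait on each other; threads asking for the same key share one lookup.
    private final ConcurrentMap<String, CompletableFuture<String>> cache =
            new ConcurrentHashMap<>();

    // Assumed stand-in for the (possibly slow, retried) trip to the server.
    private final Function<String, String> lookup;

    RegionLocationCache(Function<String, String> lookup) {
        this.lookup = lookup;
    }

    String locate(String regionKey) {
        CompletableFuture<String> future = cache.get(regionKey);
        if (future == null) {
            CompletableFuture<String> mine = new CompletableFuture<>();
            future = cache.putIfAbsent(regionKey, mine);
            if (future == null) {
                // This thread won the race, so it performs the lookup itself.
                future = mine;
                try {
                    mine.complete(lookup.apply(regionKey));
                } catch (Throwable t) {
                    // Drop the failed entry so a later call can retry.
                    cache.remove(regionKey, mine);
                    mine.completeExceptionally(t);
                }
            }
        }
        // Cached entries return at once; only callers waiting on an
        // in-flight lookup of this same key block here.
        return future.join();
    }
}
```

With something like this, a call such as cache.locate("row-a") that misses would stall only the other callers asking for "row-a", while lookups of already-known regions proceed without taking a shared lock. Invalidating stale locations (for example after a region moves) is left out of the sketch.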

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
