I recently spent some time profiling hbase writes. I haven't yet spent
time on reads. The patch in HADOOP-1498 has fixes that came out of the
write profiling sessions. After applying the patch and running the
http://wiki.apache.org/lucene-hadoop/Hbase/PerformanceEvaluation write
tests, we seem to be spending > 90% of allocations and CPU time in RPC,
reading and writing instances of the deprecated UTF8 class (I believe
HADOOP-414 is the issue that covers removing UTF8 from the RPC; if you
are seeing the same phenomenon, you might add a vote so the issue gets
pegged at a higher priority). Does the HADOOP-1498 patch help with the
performance profile you are seeing in hbase?
More below...
James Kennedy wrote:
...
Observations:
Within the combined RPC$Server.call() of all Handler threads,
HRegionServer.next() took about 90% in both the 10K and 100K cases.
This is as expected since it's called 10K/100K times. The biggest
bottleneck below that is HStoreKey.extractFamily(), which accounts for
33%/39% of HRegionServer.next() and is more expensive than all the IO
operations in HStoreScanner.getNext() (31%/31%). Couldn't there be a
cheaper way to extract the family from a column key?
The HADOOP-1498 patch redoes extractFamily so it uses Text#find and then
a byte-array copy to produce the column family. It used to do a
toString followed by an indexOf, a substring, and then a new Text made
from the produced substring.
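To make the difference concrete, here's a sketch of the before and after
(an illustration of the technique only, not the exact patch code; it
assumes the column key is a Text laid out as 'family:qualifier'):

  import org.apache.hadoop.io.Text;

  public class ExtractFamilyExample {

    // Old approach: decode the whole key into a String, then re-encode
    // the substring; two byte[] <-> String conversions per call.
    static Text extractFamilyOld(Text col) {
      String s = col.toString();
      return new Text(s.substring(0, s.indexOf(':')));
    }

    // HADOOP-1498-style approach: find the delimiter in the raw bytes
    // and copy just the family bytes, with no String round trip.
    static Text extractFamilyNew(Text col) {
      int offset = col.find(":");  // byte offset of the ':' delimiter
      byte[] family = new byte[offset];
      System.arraycopy(col.getBytes(), 0, family, 0, offset);
      Text result = new Text();
      result.set(family);
      return result;
    }
  }

The byte-level version never materializes a String, so a million calls
during a scan cost a find plus a small array copy each, instead of two
full charset conversions.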
Outside the RPC$Server.call(),
Server$Connection.readAndProcess() takes about 17%/22% of the time
that HRegionServer.next() takes, suggesting a heavy overhead in RPC IO
and unmarshalling. In the other direction,
ObjectWritable.write() + DataOutputStream.flush() take 63%/57% of the
time that HRegionServer.next() takes. The ratio of input to output makes
sense since more data is returned than sent, but this seems like a lot
of RPC overhead.
Yes. See my comment above on what I'm seeing on the write side.
...
Right now, if I assume that YourKit is counting block time and I
ignore those numbers, I come to the following conclusions:
It looks like HStoreKey.extractFamily() is a bottleneck. For 100k
rows and 10 cols, it will be called over 1,000,000 times during a full
table scan (even if just retrieving one col), and each call requires
multiple byte[] <-> String conversions.
If you have suggestions for how to improve on the version in
HADOOP-1498, I would love to hear them.
RPC overhead can be quite high during a scan. Could batched/read-ahead
scanning be useful to minimize RPC calls?
Going by some primitive tests I ran locally, moving from UTF8 to Text
would help some with the time spent in RPC.
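By primitive tests I mean something along these lines (a sketch only;
the class name and sample key are made up, and absolute numbers will
vary with the JVM and machine):

  import java.io.ByteArrayOutputStream;
  import java.io.DataOutputStream;
  import java.io.IOException;

  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.io.UTF8;

  public class Utf8VsTextBench {
    public static void main(String[] args) throws IOException {
      final String sample = "row0123456789/family:qualifier";
      final int iterations = 1000000;
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);

      // Deprecated UTF8 re-encodes the String on every single write.
      long start = System.currentTimeMillis();
      for (int i = 0; i < iterations; i++) {
        UTF8.writeString(out, sample);
        bytes.reset();
      }
      System.out.println("UTF8: " + (System.currentTimeMillis() - start) + "ms");

      // Text encodes once up front and then writes its cached byte buffer.
      Text text = new Text(sample);
      start = System.currentTimeMillis();
      for (int i = 0; i < iterations; i++) {
        text.write(out);
        bytes.reset();
      }
      System.out.println("Text: " + (System.currentTimeMillis() - start) + "ms");
    }
  }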
Batching, as per your suggestion in HADOOP-1468, would also help.
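Roughly, a batching client could look like the sketch below. Note that
the batchNext RPC and the RowResult stand-in are hypothetical; there is
no such method in the current region server interface, so this would
need a server-side counterpart:

  import java.util.LinkedList;

  public class BatchingScanner {

    /** Hypothetical RPC: one call returns up to batchSize rows. */
    interface BatchedScannerInterface {
      RowResult[] batchNext(long scannerId, int batchSize);
    }

    /** Stand-in for whatever a scan row carries (key plus column values). */
    static class RowResult { }

    private final BatchedScannerInterface server;
    private final long scannerId;
    private final int batchSize;
    private final LinkedList<RowResult> buffer = new LinkedList<RowResult>();
    private boolean exhausted = false;

    BatchingScanner(BatchedScannerInterface server, long scannerId,
        int batchSize) {
      this.server = server;
      this.scannerId = scannerId;
      this.batchSize = batchSize;
    }

    /** Same contract as a plain next(), but one RPC per batchSize rows. */
    RowResult next() {
      if (buffer.isEmpty() && !exhausted) {
        RowResult[] rows = server.batchNext(scannerId, batchSize);
        if (rows.length < batchSize) {
          exhausted = true;  // a short batch means the scan is done
        }
        for (RowResult row : rows) {
          buffer.addLast(row);
        }
      }
      return buffer.isEmpty() ? null : buffer.removeFirst();
    }
  }

With a batch size of 100, the 100k-row scan above would drop from
100,000 next() RPCs to 1,000.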
St.Ack
------------------------------------------------------------------------
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.master</name>
    <value>172.30.20.15:60000</value>
    <description>The host and port that the HBase master runs at.
      TODO: Support 'local' (all running in a single context).
    </description>
  </property>
  <property>
    <name>hbase.regionserver</name>
    <value>172.30.20.15:60010</value>
    <description>The host and port an HBase region server runs at.
    </description>
  </property>
  <property>
    <name>hbase.regiondir</name>
    <value>hbase</value>
    <description>The directory shared by region servers.
    </description>
  </property>
  <property>
    <name>hbase.hregion.max.filesize</name>
    <value>134217728</value>
    <description>
      Maximum desired file size for an HRegion. If the file size exceeds
      value + (value / 2), the HRegion is split in two. Default: 128M
      (134217728).
    </description>
  </property>
  <property>
    <name>hbase.hregion.compactionThreshold</name>
    <value>10</value>
    <description>
      By default, we compact the region if an HStore has more than 10 map
      files.
    </description>
  </property>
  <property>
    <name>hbase.regionserver.thread.splitcompactcheckfrequency</name>
    <value>40000</value>
    <description>
      How often the region server runs the split/compaction check, in
      milliseconds. Default: 60 * 1000 (60000).
    </description>
  </property>
  <property>
    <name>hbase.client.pause</name>
    <value>2000</value>
  </property>
  <property>
    <name>hbase.client.timeout.number</name>
    <value>30000</value>
    <description>Try this many timeouts before giving up.
    </description>
  </property>
  <property>
    <name>hbase.client.retries.number</name>
    <value>15</value>
    <description>Count of maximum retries fetching the root region from the
      root region server.
    </description>
  </property>
  <property>
    <name>hbase.master.meta.thread.rescanfrequency</name>
    <value>60000</value>
    <description>How long the HMaster sleeps (in milliseconds) between scans
      of the root and meta tables.
    </description>
  </property>
  <property>
    <name>hbase.master.lease.period</name>
    <value>30000</value>
    <description>HMaster server lease period in milliseconds. Default is
      30 seconds.</description>
  </property>
  <property>
    <name>hbase.server.thread.wakefrequency</name>
    <value>5000</value>
    <description>Time to sleep in between searches for work, in milliseconds.
      Used as the sleep interval by service threads such as the META scanner
      and the log roller.
    </description>
  </property>
  <property>
    <name>hbase.regionserver.lease.period</name>
    <value>30000</value>
    <description>HRegion server lease period in milliseconds. Default is
      30 seconds.</description>
  </property>
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>10</value>
    <description>Count of RPC Server instances spun up on RegionServers.
      The same property is used by the HMaster for the count of master
      handlers. Default is 10.
    </description>
  </property>
  <property>
    <name>hbase.regionserver.msginterval</name>
    <value>500</value>
    <description>Interval between messages from the RegionServer to HMaster,
      in milliseconds. Default is 15 sec. Set this value low if you want unit
      tests to be responsive.
    </description>
  </property>
</configuration>
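For reference, a minimal sketch of how these settings get picked up,
assuming the standard Hadoop Configuration resource loading (the site
file overrides the defaults when both are on the classpath); the class
below is illustration only:

  import org.apache.hadoop.conf.Configuration;

  public class PrintHBaseConf {
    public static void main(String[] args) {
      Configuration conf = new Configuration();
      conf.addResource("hbase-default.xml");
      conf.addResource("hbase-site.xml");  // site settings win on conflicts

      System.out.println("master   = " + conf.get("hbase.master"));
      System.out.println("handlers = "
          + conf.getInt("hbase.regionserver.handler.count", 10));
      System.out.println("lease    = "
          + conf.getLong("hbase.regionserver.lease.period", 30000L) + "ms");
    }
  }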