hi, I know regions will be reassigned when the hbase cluster restarts. My
regionserver and my datanode sit on the same physical node. So in my tests, after
I restart the hbase cluster, the performance numbers drop; I guess this is due to
a data locality problem. But in a further experiment, I increased the
replication
hell.
>
> Also, you might want to increase hbase.hregion.memstore.flush.size to keep
> the number of HFile generations smaller.
>
> Thanks,
>
> --
> Tatsuya Kawano (Mr.)
> Tokyo, Japan
>
>
> On Jan 18, 2011, at 11:20 AM, Tao Xie wrote:
>
> > For example, I have t
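For reference, the flush-size tip above maps to a property in hbase-site.xml. A
minimal sketch; the 128 MB value is only an illustration, not a number
recommended in this thread:

  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <!-- flush a memstore to a new HFile once it reaches this many bytes;
         a larger value means fewer, bigger HFiles per flush (128 MB here) -->
    <value>134217728</value>
  </property>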
For example, I have some total amount of data and I can tune
hbase.hregion.max.filesize to increase/decrease the total region number, right?
I want to know if the region number has a performance impact on random read
tests. I observed that in my ycsb tests, with a larger hfile size, I got
better throughput and smaller latency.
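Since the thread keeps coming back to this knob, here is a minimal
hbase-site.xml sketch; the 1 GB value is only an example (the 0.20-era default
is 256 MB):

  <property>
    <name>hbase.hregion.max.filesize</name>
    <!-- split a region once its largest store file exceeds this many bytes;
         raising it yields fewer, larger regions for the same dataset -->
    <value>1073741824</value>
  </property>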
from disk is the most dominant element, until you are
> fully cached, in which case other factors inside the regionserver
> become dominant. At this point copying memory, gc, algorithmic
> complexity, etc. become important.
>
> On Wed, Jan 12, 2011 at 10:54 PM, Tao Xie
> wrote
e includes loading up of the file index and metadata.
> In our experience, this overhead has been small. It's currently not
> accounted for in our general memory-counting. We should for sure add
> it.
>
> St.Ack
>
> On Wed, Jan 12, 2011 at 7:51 PM, Tao Xie wrote:
> >
hi, I know that generally a regionserver manages HRegions, and in the HDFS layer
the data in an HRegion is stored in HFile format. I want to know whether HFiles
are all kept open and things like the block index are all loaded up front to
improve lookup performance. If so, what will happen if that exceeds the memory limit?
Thanks.
I see there is a block cache percentage configuration in hbase-site.xml. I
wonder if there is also a row cache that stores key/value pairs.
Thanks.
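The "block cache percentage" mentioned above is hfile.block.cache.size; as far
as I know there is no separate row cache, caching happens at the block level
only. A minimal sketch with the 0.20-era default:

  <property>
    <name>hfile.block.cache.size</name>
    <!-- fraction of the regionserver heap given to the HFile block cache -->
    <value>0.2</value>
  </property>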
turn from DEBUG into ERRORs and the
> client fails. If load on the cluster is intensive, it can take a while
> for regions to re-online. There could be another issue in the way of
> the region re-onlining. Grepping around in the logs as per above
> should give a clue.
>
> St.Ack
>
hi, all
I met this exception when doing intensive insertions using YCSB. Can anybody
give me some clues on this? I use hbase 0.20.6.
com.yahoo.ycsb.DBException:
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
region server -- nothing found, no 'location' returned,
tabl
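Not a diagnosis, but while regions are offline the client just retries until it
gives up; the retry behaviour is controlled by two client-side settings. A
sketch with what I believe are the 0.20-era defaults:

  <property>
    <name>hbase.client.retries.number</name>
    <!-- how many times the client retries locating and contacting a region -->
    <value>10</value>
  </property>
  <property>
    <name>hbase.client.pause</name>
    <!-- milliseconds to sleep between retries -->
    <value>1000</value>
  </property>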
I read the code and my understanding is that when a RS starts, the StoreFiles of each
Region
will be instantiated. Then HFile.Reader.loadFileInfo() will read the
index and file info.
So each StoreFile is opened only once and the block index is cached. The cache
misses are
for blocks. I mean for random Get eac
hi, all
I see many RTEs saying "Cached an already cached block"; does anybody know why this
happens? I googled this problem and found
https://issues.apache.org/jira/browse/HBASE-2455
which describes it but says it was fixed in 0.20.4 or 0.20.5. However, my hbase
version is 0.20.6.
Here is one of the regionserver logs:
http
I once had the same problem. Finally I found the RSs were not started.
2010/10/26 Bradford Stephens
> Hey datamigos,
>
> I'm having trouble getting a finicky .20.6 cluster to behave.
>
> The Master, Zookeeper, and RegionServers all seem to be happy --
> except the Master doesn't see any RSs. Doing a "stat
BTW, Ryan, can you share some configuration tips for running YCSB to get
better random read performance?
Or can you provide some YCSB test results? In my experiments, I get 40~50k/s
insert throughput but only ~2k/s read throughput.
I wonder if there is something wrong with my configuration.
Thanks
I also got similar results with YCSB. I disabled the block cache (set it to 0) and
got better throughput than with the default.
In my case my dataset is 160M records and the block cache hit ratio is very low,
so frequent cache eviction causes long pauses.
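For anyone reproducing this, "set to 0" refers to the heap fraction below; a
minimal sketch:

  <property>
    <name>hfile.block.cache.size</name>
    <!-- 0 disables the block cache entirely, which avoids eviction churn
         when the working set is far larger than the cache could ever be -->
    <value>0</value>
  </property>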
2010/10/21 Ryan Rawson
> Our own systems show much bett
I applied the patch for HBASE-2939. (The patch is for 0.89 but my code is
0.20.6; I checked the patch and found it only changes the single connection thread at
the client side to a pool strategy.)
But when I rebuilt the source and started the hbase cluster, the master cannot
recognize the regionservers though they are running.
hi, all
I set the hdfs replica count to 1 when running hbase. And a DN and RS co-exist on each
slave node. So the data in the regions managed by a RS will be stored on its
local data node, right?
But when I restart hbase and the hbase client does gets on the RS, the datanode will
read data from remote data nodes. Does that mea
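For reference, the knob in play here is an HDFS setting (hdfs-site.xml), not an
HBase one; with a single replica a reassigned region has no chance of finding a
local copy. A sketch:

  <property>
    <name>dfs.replication</name>
    <!-- with 3 replicas a reassigned region is more likely to find a copy
         of its HFiles on the local datanode after a restart -->
    <value>3</value>
  </property>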
I use hadoop CDH3 and hbase 0.89. When doing heavy puts, I see the
following thread dump in one RS.
And the put throughput drops to zero for a long time. Is this a bug? Has anybody
seen this before?
Thanks for any response.
"IPC Server handler 73 on 60020" tid=118 daemon=true priority=5
state=BLOCKED
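No idea whether this particular dump is a bug, but handlers blocking during
heavy puts is often the write-pressure safety valves kicking in. The 0.89-era
knobs, with what I believe are the defaults (treat the names and values as
assumptions to verify against your version):

  <property>
    <name>hbase.regionserver.global.memstore.upperLimit</name>
    <!-- puts block once all memstores together exceed this heap fraction -->
    <value>0.4</value>
  </property>
  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <!-- block updates to a region whose memstore reaches
         multiplier * hbase.hregion.memstore.flush.size -->
    <value>2</value>
  </property>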
Resolved. A stupid error I made. Sorry for this.
2010/9/28 Tao Xie
> Maybe a stupid question. I have set export HBASE_MANAGES_ZK=true and
> provide one ZK in hbase-site.xml. In my example, I only set the server sr114
> as zk. But I still find zookeeper will check other quorum servers.
Maybe a stupid question. I have set export HBASE_MANAGES_ZK=true and provided
one ZK in hbase-site.xml. In my example, I only set the server sr114 as zk.
But I still find zookeeper will check other quorum servers. I wonder where
it reads the server list from. I'm confused about this. Can anybody give me a h
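For what it's worth, when HBASE_MANAGES_ZK=true the quorum HBase starts and
checks comes from hbase.zookeeper.quorum in hbase-site.xml, which defaults to
localhost; a minimal sketch for a single-server setup like the one described:

  <property>
    <name>hbase.zookeeper.quorum</name>
    <!-- comma-separated list of quorum hosts; only sr114 here -->
    <value>sr114</value>
  </property>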
> Also don't forget hardware... make sure your HW is the same as the paper's.
>
> Good luck!
> -ryan
>
> On Sat, Sep 25, 2010 at 8:19 AM, Tao Xie wrote:
> > I want to reproduce the results in the ycsb paper. I run hbase 0.20.6
> and
> > hadoop 0.20.2. My cluster is
I want to reproduce the results in the ycsb paper. I run hbase 0.20.6 and
hadoop 0.20.2. My cluster is like this:
1 Node as HMaster + ZK
6 Nodes as DN, RS
1 Node as Hbase client.
I think this environment is something like the one used by the paper.
When I run tests like workloadb with 100 thr
Here is the gc log: http://pastebin.com/1bGZvMri
2010/9/19 Ryan Rawson
> I'd love to see a GC log, and yes it can be possible for ParNew to
> take a long long time.
>
> Thanks,
> -ryan
>
> On Sun, Sep 19, 2010 at 1:20 AM, Tao Xie wrote:
> > At first wh
/20 Todd Lipcon
> On Sun, Sep 19, 2010 at 1:05 AM, Tao Xie wrote:
>
> > Now my scenario is running ycsb doing heavy read. I compared the results
> of
> > setting hfile.block.cache.size to 0.2 with 0. I found with the factor 0
> > the
> > hbase metric 'get_avg
w to take
as long as 500 ms. I will attach a snippet of that if necessary.
Thanks.
2010/9/19 Ryan Rawson
> What does your GC situation look like?
>
>
>
> On Sun, Sep 19, 2010 at 1:05 AM, Tao Xie wrote:
> > Now my scenario is running ycsb doing heavy read. I compared th
Now my scenario is running ycsb doing heavy reads. I compared the results of
setting hfile.block.cache.size to 0.2 with 0. I found that with the factor 0 the
hbase metric 'get_avg_time' is even smaller. Maybe I should turn off the block
cache in such a scenario. I wonder if there are performance tests that show th
I see the following recommendation in
http://hbase.apache.org/docs/r0.20.6/api/overview-summary.html#requirements
"It is recommended to run a ZooKeeper quorum of 3, 5 or 7 machines, and give
each ZooKeeper server around 1GB of RAM, and if possible, its own dedicated
disk. For very heavily loaded c
being reached. It's kinda rough and could probably be
> "smoother", but do you really have a use case that requires it or just
> poking?
>
> J-D
>
> On Thu, Sep 9, 2010 at 7:32 PM, Tao Xie wrote:
> > hi, all
> > I use YCSB to measure the insert/r
hi, all
I use YCSB to measure the insert/read latency of hbase.
I found there can be 0 records inserted for up to 10 seconds during the
insertion procedure.
See the following result at the 1514th second. I want to know why this occurs. Is
this due to compaction?
And I also want to know why the ops/sec va
> > > 10.1.0.124: /dev/sdc1 280G 15G 252G 6% /mnt/DP_disk2
> > >
> > > 10.1.0.124: /dev/sdd1 280G 14G 253G 6% /mnt/DP_disk3
> > >
> > > 10.1.0.123: /dev/sdb1 280G 66G 200G 25% /mnt/DP_disk1
> > >
> > > 10
/dev/sdb1 280G 14G 252G 6% /mnt/DP_disk1
10.1.0.126: /dev/sdc1 280G 14G 252G 6% /mnt/DP_disk2
10.1.0.126: /dev/sdd1 280G 13G 253G 5% /mnt/DP_disk3
2010/9/7 Tao Xie
> I had a look at the following method in 0.89. Is the following li
I had a look at the following method in 0.89. Is the following line
correct?
nRegions *= e.getValue().size();
private int regionsToGiveOtherServers(final int numUnassignedRegions,
    final HServerLoad thisServersLoad) {
  SortedMap<HServerLoad, Set<String>> lightServers =
      new TreeMap<HServerLoad, Set<String>>();
  this.master.g
hi, all
I'm reading the code of RegionManager, and I find that in the following method
there is a situation where nRegionsToAssign <= nregions: the code only assigns 1
region.
Is this correct? Hbase version 0.20.4.
private void assignRegionsToMultipleServers(final HServerLoad
    thisServersLoad,
  final S