Hi,
I have a question on how the splits work on hbase.
I have one master which also acts as a region server, along with 3 other
region servers.
I have set the following parameters on all the region servers:
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>1048576</value>
  <description>Maximum HStoreFile size. If any one</description>
</property>
Scans are in serial.
To use DB parlance, consider a Scan + filter the moral equivalent of a
"SELECT * FROM <> WHERE col='val'" with no index, and a full table
scan is engaged.
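To make that concrete, here is a minimal sketch of the Scan + filter pattern against the 0.90-era Java client (the table, family, and column names are made up); the filter runs server-side, but every row still gets read:

import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class FilterScan {
  public static void main(String[] args) throws IOException {
    // Moral equivalent of: SELECT * FROM mytable WHERE col = 'val' (no index).
    HTable table = new HTable(HBaseConfiguration.create(), "mytable");
    Scan scan = new Scan();
    scan.setFilter(new SingleColumnValueFilter(
        Bytes.toBytes("cf"), Bytes.toBytes("col"),
        CompareOp.EQUAL, Bytes.toBytes("val")));
    ResultScanner scanner = table.getScanner(scan);
    for (Result r : scanner) {
      // Every region is scanned; the filter only trims what comes back.
      System.out.println(r);
    }
    scanner.close();
    table.close();
  }
}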
The typical ways to address this kind of performance issue are:
- arrange your data using the primary key so you can scan the
By naming rows from the timestamp, the row ids are all going to be sequential
when inserting, so all new inserts will be going into the same region. When
checking the last 30 days you will also be reading from the same region
where all the writing is happening, i.e. the one that is already busy writing.
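A common mitigation, sketched here (the bucket count and key layout are assumptions, not from this thread), is to prefix the key with a small hash-derived bucket so inserts spread across regions, at the cost of one scan per bucket on read:

import org.apache.hadoop.hbase.util.Bytes;

public class SaltedKey {
  private static final int BUCKETS = 8; // made-up value; size it to your region count

  // Prefix a one-byte bucket derived from the non-sequential tail of the key,
  // so consecutive timestamps land in different regions.
  public static byte[] rowKey(long timestamp, byte[] tail) {
    byte bucket = (byte) ((Bytes.hashCode(tail) & 0x7fffffff) % BUCKETS);
    return Bytes.add(new byte[] { bucket }, Bytes.toBytes(timestamp), tail);
  }
}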
Hi,
We have a table split across multiple regions (approx 50-60 regions for a 64 MB
split size) with a rowid schema of
[ReverseTimestamp/itemtimestamp/customerid/itemid]. This stores the
activities for an item for a customer. We have lots of data for lots of items
for a customer in this table.
When we try
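For reference, the reverse timestamp in a rowid schema like the one above is typically Long.MAX_VALUE minus the event time, so the newest rows sort first; a sketch with the field encoding assumed and simplified from the description:

import org.apache.hadoop.hbase.util.Bytes;

public class ActivityRowId {
  // [ReverseTimestamp/itemtimestamp/customerid/itemid], newest first:
  // a larger itemTimestamp yields a smaller leading component.
  public static byte[] rowId(long itemTimestamp, String customerId, String itemId) {
    return Bytes.add(
        Bytes.toBytes(Long.MAX_VALUE - itemTimestamp),
        Bytes.toBytes(itemTimestamp),
        Bytes.toBytes(customerId + "/" + itemId));
  }
}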
We're planning out our first HBase cluster, and we'd like to get some feedback
on our proposed hardware configuration. We're intending to use this cluster
purely for HBase; it will not generally be running MapReduce jobs, nor will we
be using HDFS for other storage tasks. In addition, our projec
HBase will always need to store the column name in each cell that uses it.
The only way to reduce the size taken by storing repeated column names
(besides using compression) is to instead store a small pointer to a lookup
table that holds the column name. Check out OpenTSDB, which does something
similar.
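A minimal sketch of that lookup-table idea (the names and the 2-byte encoding are made up; OpenTSDB's actual scheme is more involved):

import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.hbase.util.Bytes;

public class QualifierDictionary {
  // In practice the dictionary itself lives in an HBase table so all clients
  // agree on the ids; a local map keeps this sketch short.
  private final Map<String, byte[]> nameToId = new HashMap<String, byte[]>();
  private short nextId = 0;

  // Return a 2-byte id to use as the cell qualifier instead of the full
  // column name, saving (nameLength - 2) bytes per cell.
  public synchronized byte[] idFor(String columnName) {
    byte[] id = nameToId.get(columnName);
    if (id == null) {
      id = Bytes.toBytes(nextId++);
      nameToId.put(columnName, id);
    }
    return id;
  }
}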
I like how I can have X columns in one row that vary from another row. I am
wondering if there is a way to have HBase support "static" column names (for lack
of a better term), where the column names don't take up space for each row I add
to my database. It just would be nice to have a significantly
The master of my HBase instance (0.90.x) crashes each time it is restarted,
with the exceptions shown below. Can you let me know what this is usually due
to? (I also saw these exceptions in a JIRA, but that was about an uncaught EOF
exception.) Only the master dies, while the region servers wait for
Thanks for the comments,
Going to work on it tomorrow - I'll keep you updated.
Ophir
On Wed, May 11, 2011 at 8:01 PM, Stack wrote:
> On Wed, May 11, 2011 at 6:14 AM, Ophir Cohen wrote:
> > My results from today's research:
> >
> > I tried to delete the region as Stack suggested:
> >
> > 1. *cl
I installed HBase on my Mac OS 10.6 machine and when I try to run hbase master
start I get the following error.
My error is similar to, if not the same as, the one in the following thread:
http://article.gmane.org/gmane.comp.java.hadoop.hbase.user/17432/match=got+user+level+keeperexception+processing+sessionid
And another question: should I use HBase 0.20.6 if I use the append branch
of Hadoop?
On 2011-5-11 at 12:51 AM, "Jean-Daniel Cryans" wrote:
> Data cannot be corrupted at all, since the files in HDFS are immutable
> and CRC'ed (unless you are able to lose all 3 copies of every block).
>
> Corruption would h
On Wed, May 11, 2011 at 2:05 AM, Iulia Zidaru wrote:
> Hi,
> I'll try to rephrase the problem...
> We have a table where we add an empty value. (The same thing happens also if
> we have a value.)
> Afterwards we put a value inside. (Same put, just a different value.) When scanning
> for empty values (first
On Wed, May 11, 2011 at 6:14 AM, Ophir Cohen wrote:
> My results from today's research:
>
> I tried to delete the region as Stack suggested:
>
> 1. *close_region*
> 2. Remove files from file system.
> 3. *assign* the region again.
>
Try inserting something into that region and then getting it
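Something like this, say (table, family, and row key are hypothetical; pick a row key that sorts inside the emptied region):

import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class RegionProbe {
  public static void main(String[] args) throws IOException {
    HTable table = new HTable(HBaseConfiguration.create(), "mytable");
    byte[] row = Bytes.toBytes("row-inside-that-region");
    Put put = new Put(row);
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("probe"));
    table.put(put);
    // If the reassigned region is healthy, the probe row comes straight back.
    System.out.println(table.get(new Get(row)));
    table.close();
  }
}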
I have not seen this before. You are failing because of
java.lang.ArrayIndexOutOfBoundsException in
org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:83).
Tell us more about your context. Are you using compression? What
kind of hardware, operating system? (I'm trying to figure out what is
Dear all,
I just checked our logs today and found the following entries:
2011-05-11 16:46:06,258 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block
blk_7212216405058183301_3974453 src: /10.0.2.39:60393 dest: /10.0.2.39:50010
2011-05-11 16:46:14,716 INFO
org.apache.hadoop.hdfs.serv
Running 2 ZooKeepers isn't a good idea, as it doesn't handle any server
failure. ZooKeeper needs a majority of nodes in the ensemble to be available
to handle failures, so 1, 3, or 5 are better choices.
See:
http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A7
http://zookeeper.apache.org/doc/r3.3.3/zookeepe
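For example, a 3-node ensemble would be listed in hbase-site.xml along these lines (hostnames made up):

<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>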
See the performance section of the HBase book.
http://hbase.apache.org/book.html#performance
-Original Message-
From: Ferdy Galema [mailto:ferdy.gal...@kalooga.com]
Sent: Wednesday, May 11, 2011 10:25 AM
To: user@hbase.apache.org
Cc: byambajargal; cdh-u...@cloudera.org
Subject: Re: What
Dear all,
We are using Hadoop 0.20.2 with a couple of patches, and HBase 0.20.6. When
we run a MapReduce job which makes a lot of random accesses to an HBase
table, we see many log entries like the following, at the same time, on the
region server and data node:
For RegionServer:
"INFO org.ap
A rowcounter is a scan job, so you should use
hbase.client.scanner.caching for better scan performance. (Depending on
your value sizes, set it to 1000 or something like that.)
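That is, something along these lines on the client side (1000 is just the example value from above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class CachingConf {
  public static Configuration make() {
    Configuration conf = HBaseConfiguration.create();
    // Fetch 1000 rows per scanner RPC instead of the default of 1.
    conf.setInt("hbase.client.scanner.caching", 1000);
    return conf;
  }
}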
For us, 1 ZooKeeper is able to manage our 15-node cluster perfectly fine.
On 05/11/2011 02:40 PM, byambajargal wrote:
Hel
See '[ANN]: HBaseWD: Distribute Sequential Writes in HBase' thread.
https://github.com/sematext/HBaseWD
On Wed, May 11, 2011 at 2:21 AM, Felix Sprick wrote:
> Hi guys,
>
> I am using rowkeys with a pattern like [minute]_[timestamp] because my
> main use case is to read time ranges over a couple
Dear community,
We are doing a test on a 5-node cluster with a table of about 50 million
rows (writes and reads). At some point we end up getting the following
exception on 2 of the region servers:
2011-05-11 14:18:28,660 INFO org.apache.hadoop.hbase.regionserver.Store:
Started compaction o
My results from today's research:
I tried to delete the region as Stack suggested:
1. *close_region*
2. Remove files from file system.
3. *assign* the region again.
It looks like it works!
The region still exists but it's empty.
Looks good, but it's definitely not the end of the road.
In order t
Hello everybody
I have run a cluster with 11 nodes (HBase CDH3u0) and I have 2 ZooKeeper
servers in my cluster.
It seems very slow when I run the rowcounter example.
My question is: what is the recommended number of ZooKeeper servers
for an 11-node cluster?
cheers
Byambajargal
On 10/05/11 11:34, Kobla Gbenyo wrote:
Hello,
I am new to this list and I have started testing HBase. I downloaded and
installed HBase successfully and now I am looking for a framework which
can help me perform CRUD operations (create, read, update and
delete). Through my research, I found JDO but
Hi Andrew,
You're right. I'll try to upgrade to the latest version.
Frank
- Original Message -
From: Andrew Purtell
To: user@hbase.apache.org
Date: 11.05.2011 11:10
Subject: Re: Lost hbase table after restart
> Hi,
>
> HBase 0.20.4 is very much out of date. It was released o
Hi guys,
I am using rowkeys with a pattern like [minute]_[timestamp] because my
main use case is to read time ranges over a couple of hours and I want
to read in parallel from as many nodes in the cluster as possible,
thus, distributing the data in minute buckets across the cluster.
Problem now i
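For reference, a sketch of the key scheme described above (the exact encoding is assumed):

import org.apache.hadoop.hbase.util.Bytes;

public class MinuteBucketKey {
  // [minute]_[timestamp]: the minute-of-hour prefix spreads rows across up to
  // 60 buckets, so a time-range read becomes one scan per bucket, in parallel.
  public static byte[] rowKey(long timestampMillis) {
    long minute = (timestampMillis / 60000L) % 60;
    return Bytes.toBytes(String.format("%02d_%d", minute, timestampMillis));
  }
}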
Furthermore, be sure to read about what HBase 0.90.x requires:
http://hbase.apache.org/notsoquick.html#requirements
Best regards,
- Andy
--- On Wed, 5/11/11, Andrew Purtell wrote:
> From: Andrew Purtell
> Subject: Re: Lost hbase table after restart
> To: user@hbase.apache.org
> Date: Wed
Hi,
HBase 0.20.4 is very much out of date. It was released on 10 May 2010.
The current release is 0.90.2, released on 11 April 2011.
Why are you using such an out-of-date version?
For many, many reasons you should be using the latest 0.90.x version.
Best regards,
- Andy
--- On Wed, 5/11/1
Hadoop: 0.20.2
Hbase: 0.20.4, r941076
Hi,
I've been running an HBase table on 4 region servers, with 4 datanodes colocated
on the same machines. After restarting HBase and Hadoop I've lost the one and
only HBase table. The hbase-master log says:
2011-04-16 15:45:03,411 INFO org.apache.hadoop.hbase.mas
Hi,
I'll try to rephrase the problem...
We have a table where we add an empty value. (The same thing happens also
if we have a value.)
Afterwards we put a value inside. (Same put, just a different value.) When
scanning for empty values (the first values inserted), the result is wrong
because the filter gets
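For context, the scan for empty values might look like this (a sketch; the actual filter in use isn't shown in the thread):

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.ValueFilter;

public class EmptyValueScan {
  public static Scan make() {
    Scan scan = new Scan();
    // Keep only cells whose value is the empty byte array.
    scan.setFilter(new ValueFilter(CompareOp.EQUAL,
        new BinaryComparator(new byte[0])));
    return scan;
  }
}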
Last week I loaded ~1TB on a 100-node cluster in about 6 hours. In this case
the dataset was made of rows each with 50 columns of about 12 bytes each
(12-byte qualifier, empty value).
This was not using the bulk load API, which in my experience is at least 10x
faster than using the normal API. The
Sorry, here is some more information:
YCSB doesn't share the HTables.
Each thread has its own HTable instance.
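(HTable isn't safe for concurrent use, so one instance per thread is the right pattern. A sketch of a common way to wire that up; the table name is hypothetical:)

import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class PerThreadTable {
  // HTable is not thread-safe; give each worker thread its own instance.
  private static final ThreadLocal<HTable> TABLE = new ThreadLocal<HTable>() {
    @Override
    protected HTable initialValue() {
      try {
        return new HTable(HBaseConfiguration.create(), "usertable");
      } catch (IOException e) {
        throw new RuntimeException(e);
      }
    }
  };

  public static HTable get() {
    return TABLE.get();
  }
}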
-Original Message-
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: May 11, 2011 10:31
To: user@hbase.apache.org
Subject: Re: A question about client
I think the second explanation is plausible.
From
http://dow