Hello,
I have a custom Filter like this:

    class FilterA {
        private int count;
        ...
    }

and use it as:

    Scan scan = new Scan(...);
    scan.setFilter(new FilterA());
    ...
    while (someCondition) {
        ResultScanner scanner = htable.getScanner(scan);
        ...
    }
the value o
Hi All,
I'd like to know if
HBase.Client.Result.getValue(...) and
Result.getColumn(...) fetch the actual value from the table every time,
or if it is already available in the Result/ResultScanner.
--
Alok
Hi!
I have noticed some strange behavior in HBase when reading and writing
simultaneously.
About my cluster:
Master and 16 regionservers, quorum of 3 zookeepers, gigabit ethernet,
all nodes are in same subnet. Data is read from a table containing ~100
millions binary records (images). I use Cloudera-c
Did you check your GC logs from around the time you are seeing the delay?
On Fri, Jan 20, 2012 at 12:54 AM, Pavel Dvorin wrote:
> Hi!
>
> I notice the strange behavior of HBase during reading and writing
> simultaneously.
>
> About my cluster:
> Master and 16 regionservers, quorum of 3 zookeepers,
Also, if you know you only want 2 rows, for example, make sure the caching
is set to 2 so that it only reads that many on the RegionServer.
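The effect of caching on round trips can be sketched without a cluster; this toy simulation (plain Java, no HBase classes — the batching loop is mine, not the real client's) counts the "RPCs" needed to read a fixed number of rows:

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: simulates how scanner caching batches rows per RPC.
// The "server" and row data are made up; the real knob is Scan's caching.
public class CachingSketch {
    static int rpcCount = 0;

    // One simulated RPC: return up to `caching` rows starting at `from`.
    static List<Integer> nextBatch(int from, int caching, int totalRows) {
        rpcCount++;
        List<Integer> batch = new ArrayList<>();
        for (int r = from; r < Math.min(from + caching, totalRows); r++) {
            batch.add(r);
        }
        return batch;
    }

    // Read `limit` rows from a table of `totalRows`, `caching` rows per RPC.
    // Returns the number of RPCs it took.
    static int scan(int totalRows, int caching, int limit) {
        rpcCount = 0;
        int read = 0, pos = 0;
        while (read < limit && pos < totalRows) {
            List<Integer> batch = nextBatch(pos, caching, totalRows);
            if (batch.isEmpty()) break;
            for (int row : batch) {
                pos = row + 1;
                if (++read >= limit) break;
            }
        }
        return rpcCount;
    }

    public static void main(String[] args) {
        // Wanting only 2 rows: caching=2 costs 1 RPC, caching=1 costs 2 RPCs.
        System.out.println(scan(1000, 2, 2)); // 1
        System.out.println(scan(1000, 1, 2)); // 2
    }
}
```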
On 1/20/12 12:31 AM, "Harsh J" wrote:
>Hi Stuti,
>
>The way the Shell does it is by iterating over the ResultScanner iterator
>for only LIMIT number of
Hi,
We had a nice meetup with high quality presentations last night.
I have uploaded the slides onto:
http://www.meetup.com/hbaseusergroup/files/
Thanks to the presenters for sharing their experiences.
Looking forward to HBaseCon 2012.
Hi,
Each HFile has multiple Data Blocks and each block has multiple K/V pairs.
So, effectively a given HFile has many K/V pairs. When a client searches
for a particular row, is the entire HFile scanned, or is some sort of
index maintained in the HFile? Also, is the data in an HFile sorted?
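(For what it's worth: the data in an HFile is sorted by key, and the file carries a block index holding the first key of each data block, so a lookup is a binary search over that index followed by a scan of a single block, not the whole file. A toy sketch of that search, in plain Java with made-up string keys rather than the real byte[] comparisons:)

```java
import java.util.Arrays;

// Illustration only: how a sorted block index narrows a row lookup to one block.
// Keys here are plain strings; real HFile keys are byte[] compared lexicographically.
public class BlockIndexSketch {
    // firstKeys[i] is the first row key stored in data block i (sorted).
    // Returns the index of the only block that can contain `row`, or -1 if the
    // row sorts before the first block (i.e. it cannot be in this file).
    static int findBlock(String[] firstKeys, String row) {
        int pos = Arrays.binarySearch(firstKeys, row);
        if (pos >= 0) return pos;      // row is exactly some block's first key
        int insertion = -pos - 1;      // index of the first key greater than row
        return insertion - 1;          // candidate block is the one just before it
    }

    public static void main(String[] args) {
        String[] index = {"apple", "kiwi", "mango", "pear"};
        System.out.println(findBlock(index, "lemon")); // 1: between "kiwi" and "mango"
        System.out.println(findBlock(index, "zebra")); // 3: last block
        System.out.println(findBlock(index, "aaa"));   // -1: before the first block
    }
}
```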
T
Thanks Neil for sharing your experience with AWS! Could you tell us what
instance type you are using?
We are using m1.xlarge, which has 4 virtual cores, but I normally see
recommendations for machines with 8 cores like c1.xlarge, m2.4xlarge, etc.
In principle these 8-core machines don't suffer too much
Trying to run my code (a test of Aggregation Protocol and an MR HBase table
loader) on latest build of 0.92.0 (r1232715) I get an 'old server' warning
(I've seen this before and it's always been non-fatal), then an out-of-memory
exception, and then the job hangs:
[sshexec] 12/01/20 16:56:48 WARN zookeepe
Thanks to everyone who came and special thanks to eBay for hosting,
Ted for organizing and Stack for gluing it all together.
Just like releasing new versions, we should have meetups more often!
J-D
On Fri, Jan 20, 2012 at 7:16 AM, Ted Yu wrote:
> Hi,
> We had a nice meetup with high quality pre
Definitely.
I am hoping this event gives positive feedback to eBay's management. In the
future, the hosting process should be more streamlined.
Cheers
On Fri, Jan 20, 2012 at 9:48 AM, Jean-Daniel Cryans wrote:
> Thanks to everyone who came and special thanks to eBay for hosting,
> Ted for organ
I recently bumped up the region size memory configuration on our HBase
cluster and after doing a rolling restart of our 5 nodes I saw the
following in the HBase status page:
Regions in Transition
Region State
32cb0e36cfa326d0a431734ba93a16df
items,869239091/es-LA,1323971864141.32cb0e36cf
Royston:
I guess you have seen HBASE-5204. In particular:
>> when a 0.92 server fails to deserialize a 0.90-style RPC, it attempts to
allocate a large buffer because it doesn't read fields of 0.90-style RPCs
properly.
Was your client code compiled with the same version of HBase as what was
runnin
I'm doing a POC on HBase and wanted to see if someone could verify that my
map/scan performance is reasonable. I have one 170 million row table.
My cluster setup is 1 master node and 4 slave nodes, all with 8 GB RAM, 1 500GB
SATA disk, 1 quad core hyperthreaded CPU.
I'm running a MapReduce job ov
I run c1.xlarge servers and have found them very stable. I see 100 Mbit/s
sustained bi-directional network throughput (200Mbit/s total), sometimes up
to 150 * 2 Mbit/s.
Here's a pretty thorough examination of the underlying hardware:
http://huanliu.wordpress.com/2010/06/14/amazons-physical-hardw
On Fri, Jan 20, 2012 at 11:43 AM, Wayne wrote:
> Does 0.92 support a significant increase in row size over 0.90.x? With
> 0.90.4 we have seen writes start choking at 30 million cols/row and reads
> start choking at 10 million cols/row. Can we assume these numbers will go
> up with .92 and if yes
On Fri, Jan 20, 2012 at 11:36 AM, kfarmer wrote:
> This job completes in about 8 minutes. That's 354K rows/second for the
> cluster, 88K rows/second for the node, and 22K rows/second (or 22
> rows/millisecond) for each map task.
>
>
It's not too bad? What do you need?
> Is this performance reas
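The quoted rates are internally consistent; a quick sanity check (assuming 170M rows over 8 minutes, 4 nodes, and 4 map tasks per node — the task count is my inference from the numbers, not stated in the thread):

```java
// Sanity-check of the quoted throughput figures (integer division, rows/sec).
public class ThroughputCheck {
    static long[] rates(long rows, long seconds, int nodes, int tasksPerNode) {
        long perCluster = rows / seconds;        // cluster-wide rows/sec
        long perNode = perCluster / nodes;       // rows/sec per node
        long perTask = perNode / tasksPerNode;   // rows/sec per map task
        return new long[] {perCluster, perNode, perTask};
    }

    public static void main(String[] args) {
        long[] r = rates(170_000_000L, 8 * 60, 4, 4);
        // ~354K rows/s cluster-wide, ~88K per node, ~22K per map task
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```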
On Fri, Jan 20, 2012 at 10:41 AM, Mark wrote:
> I recently bumped up the region size memory configuration on our HBase
> cluster and after doing a rolling restart of our 5 nodes I saw the
> following in the HBase status page:
>
>
> Regions in Transition
>
> Region State
> 32cb0e36cfa326d0a4317
On Fri, Jan 20, 2012 at 7:56 AM, Praveen Sripati
wrote:
> Hi,
>
> Each HFile has multiple Data Blocks and each block has multiple K/V pairs.
> So, effectively a given HFile has many K/V pairs. When a client searches
> for a particular row, is the entire HFile scanned for data or some sort of
> ind
Hi,
I am trying to figure out if HBase is the right candidate for my use case,
which is as follows:
I have a users table containing millions of users, and for each user I have a
bunch of data points for each day in the past
2 years. Some of these data points are the number of clicks in different parts
o
From the little I have used HBase, it is really good for the use
case you mentioned. HBase takes care of scale, and you can use MapReduce to
do the kind of task you mentioned below.
But please remember that it is super important how you design the schema.
The schema should allow for your
Please also refer to Mikhail's presentation last night
(Optimizing_HBase_scanner_performance.pptx):
http://www.meetup.com/hbaseusergroup/files/
Cheers
On Fri, Jan 20, 2012 at 1:49 PM, Stack wrote:
> On Fri, Jan 20, 2012 at 7:56 AM, Praveen Sripati
> wrote:
>
> > Hi,
> >
> > Each HFile has multi
The values are fetched by the operation that returned the Result object and
subsequently cached in the Result object.
Is that what you were asking?
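To illustrate the point with a toy stand-in (this is not the real org.apache.hadoop.hbase.client.Result, just a sketch of the behavior): the cells arrive with the RPC that produced the Result, and getValue() afterwards is a purely local lookup.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for HBase's Result: the cells are populated once (by the
// operation that produced the Result) and getValue() just reads them locally.
public class ResultSketch {
    private final Map<String, String> cells = new HashMap<>();
    int lookups = 0; // counts local getValue calls, to show no refetch happens

    ResultSketch put(String family, String qualifier, String value) {
        cells.put(family + ":" + qualifier, value);
        return this;
    }

    String getValue(String family, String qualifier) {
        lookups++;
        return cells.get(family + ":" + qualifier); // local lookup, no RPC
    }

    public static void main(String[] args) {
        ResultSketch r = new ResultSketch().put("cf", "q1", "v1");
        System.out.println(r.getValue("cf", "q1")); // "v1", from the held cells
        System.out.println(r.getValue("cf", "q1")); // "v1" again, still local
    }
}
```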
-- Lars
From: Alok Kumar
To: user@hbase.apache.org
Sent: Friday, January 20, 2012 12:48 AM
Subject: Is HBase.
Also, you probably do not want to set your blocksize to 512 MB. The default is
64 KB.
HBase has to load (from either HDFS or cache) and scan this amount of data
for each key/value lookup.
-- Lars
From: Stack
To: user@hbase.apache.org
Sent: Friday, January 20, 2012
(Just to avoid confusion, Lars is talking about HBase Table CF's blocksize,
unrelated to HDFS block sizes, which would be fine if set to 512m.)
On 21-Jan-2012, at 9:49 AM, lars hofhansl wrote:
> Also, you probably do not want to set your blocksize to 512mb. The default is
> 64k.
> HBase has to
Hi, I have Debian 6.03 and a problem with those best friends, HBase and Hadoop.
Step by step, I want a working configuration of HBase (standalone for the
first step) and Hadoop:
wget http://www.sai.msu.su/apache//hbase/hbase-0.90.5/hbase-0.90.5.tar.gz
tar xzfv hbase-0.90.5.tar.gz
sudo mv hbase-0.90.5 /usr/lo
Hi
Check the master-side logs, and in the RS logs check what the RS is doing
w.r.t. the region mentioned in the trace below.
I hope you are trying this out on 0.90.x and not 0.92?
Regards
Ram
From: Harsh J [ha...@cloudera.com]
Sent: Saturday, January 21, 201
Hi,
1) According to this URL (1), HBase performs well for two or three
column families. Why is it so?
2) A dump of an HFile looks like below. The contents of a row stay together,
like in a regular row-oriented database. If the column family has 100
column qualifiers and is dense, then the dat
I am not sure how I can do joins using HBase, which is essentially what I am
trying to do. Based on what I have read, it looks
like HBase is really good for scans or row-key lookups. Please correct me if
I am wrong.
I can have a HBase table for users with {userid + timestamp} as the rowkey.
Using thi
I think you need to design your schema with as many tables as you want
indexes.
For example: tbl1 {user_id_timestamp}
tbl2 {md5(email)} [user_id_timestamp]
Also, you may want to google "design patterns hbase".
There are also some examples in "Oreilly.HBase.The.Definitive.Guide.Aug.2011".
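To make the two-table idea concrete, here is a sketch of building the row keys (plain Java; the zero-padded key layout and helper names are my own convention for illustration, not from the thread):

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Illustration of the two-table pattern: the main table is keyed by
// userId + timestamp, and an index table maps md5(email) -> that row key.
public class RowKeySketch {
    // Main-table row key: userId, then a zero-padded timestamp so that
    // lexicographic order matches chronological order within one user.
    static String mainKey(String userId, long timestamp) {
        return String.format("%s_%013d", userId, timestamp);
    }

    // Index-table row key: md5 of the email, hex-encoded (always 32 chars).
    static String indexKey(String email) throws Exception {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] digest = md5.digest(email.getBytes(StandardCharsets.UTF_8));
        return String.format("%032x", new BigInteger(1, digest));
    }

    public static void main(String[] args) throws Exception {
        String main = mainKey("user42", 1327075200000L);
        System.out.println(main);                // user42_1327075200000
        System.out.println(indexKey("a@b.com")); // 32 hex chars
        // The index table would store `main` as the value under indexKey(...).
    }
}
```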