Hi guys,
Suppose I have a table in HBase. How can I estimate its storage footprint?
Thanks.
regards,
Lin
/client/HTable.html#getRegionLocation%28byte[],%20boolean%29
Because the HBase client talks directly to each RegionServer (RS), it has to know the
region boundaries.
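To illustrate why the client needs the region boundaries, here is a minimal Python sketch (not HBase code) of mapping a row key to a server from a cached, sorted list of region start keys. The region map, the host names, and the `locate_region` helper are all hypothetical.

```python
import bisect

# Hypothetical cached region map: (start_key, server), sorted by start key.
# The first region starts at the empty key, so every row key has an owner.
REGIONS = [
    (b"", "rs1.example.com"),
    (b"g", "rs2.example.com"),
    (b"p", "rs3.example.com"),
]
START_KEYS = [start for start, _ in REGIONS]


def locate_region(row_key: bytes) -> str:
    """Return the server hosting the region whose key range contains row_key."""
    # bisect_right gives the index of the first start key greater than row_key;
    # the region just before that index owns the key.
    idx = bisect.bisect_right(START_KEYS, row_key) - 1
    return REGIONS[idx][1]
```

With this map cached, the client can pick the right server for a row without asking anyone else first, which is the reason it has to track the boundaries.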
From: Lin Ma lin...@gmail.com
Date: Thursday, September 6, 2012 11:54 AM
To: user@hbase.apache.org user@hbase.apache.org, Doug Meil
to use to achieve things.
Cheers
Julian
2012/9/2 Lin Ma lin...@gmail.com
Hello HBase masters,
For the two add methods of the Put class,
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html#add%28org.apache.hadoop.hbase.KeyValue%29
http://hbase.apache.org
6, 2012 at 12:59 AM, Doug Meil doug.m...@explorysmedical.com wrote:
Hi there, if you look in the source code for HTable there is a list of Put
objects. That's the buffer, and it's a client-side buffer.
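As a rough illustration of the idea of a client-side buffer, here is a toy Python sketch (not the HTable source). The `BufferedWriter` class, the `send_batch` callback that stands in for the RPC to the server, and the count-based `flush_threshold` are assumptions made for the example. As far as I know, the real client flushes based on buffered bytes rather than a put count.

```python
class BufferedWriter:
    """Toy client-side write buffer; send_batch stands in for the server RPC."""

    def __init__(self, send_batch, flush_threshold=3):
        self.send_batch = send_batch
        self.flush_threshold = flush_threshold  # simplification: count, not bytes
        self.buffer = []  # pending puts, held on the client only

    def put(self, row, value):
        # Nothing leaves the client until the buffer fills up or flush() is called.
        self.buffer.append((row, value))
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.buffer:
            self.send_batch(list(self.buffer))
            self.buffer.clear()
```

The point of the sketch is only that several puts travel together in one call instead of one call per put.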
On 9/5/12 12:04 PM, Lin Ma lin...@gmail.com wrote:
Thank you Stack for the details
2, 2012 at 2:13 AM, Lin Ma lin...@gmail.com wrote:
Hello guys,
I am reading the book HBase: The Definitive Guide. At the beginning of
chapter 3, it mentions that, to reduce the performance impact of
clients updating the same row (lock contention from atomic
writes), batch
) see
https://github.com/sematext/HBaseHUT
--
Lin Ma wrote on Sun, 2 Sep 2012 11:13 CEST:
Hello guys,
I am reading the book HBase: The Definitive Guide. At the beginning of
chapter 3, it mentions that, to reduce the performance impact of
clients
Hello guys,
I am reading the book HBase: The Definitive Guide. At the beginning of
chapter 3, it mentions that, to reduce the performance impact of
clients updating the same row (lock contention from atomic
writes), batch updates are preferred. My question is: for an MR job, what are
the
Hello HBase masters,
For the two add methods of the Put class,
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html#add%28org.apache.hadoop.hbase.KeyValue%29
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html#add%28byte[],%20byte[],%20long,%20byte[]%29
I
wrong.
regards,
Lin
On Tue, Aug 28, 2012 at 2:41 PM, Harsh J ha...@cloudera.com wrote:
Lin,
On Tue, Aug 28, 2012 at 9:09 AM, Lin Ma lin...@gmail.com wrote:
Thanks Harsh,
Two more comments / thoughts:
1. For the mapper: a mapper normally runs on the same region server which
owns
this value to 500,
for example, will transfer 500 rows at a time to the client to be
processed.
regards,
Lin
On Thu, Aug 23, 2012 at 11:37 PM, Harsh J ha...@cloudera.com wrote:
Hi Lin,
On Thu, Aug 23, 2012 at 7:56 PM, Lin Ma lin...@gmail.com wrote:
Harsh, thanks for the detailed information
. Whatever is accumulated
as the result of the Scan operation (server-side) is gathered in
batches of 500 rows and returned in one Scanner.next() call from the
client.
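To make the effect of a caching value of 500 concrete, here is a toy Python simulation (not the HBase scanner). The `ToyScanner` class and its `round_trips` counter are hypothetical. It only shows that rows arrive in chunks while the caller still reads them one at a time.

```python
class ToyScanner:
    """Simulates a scanner that fetches rows in chunks of `caching`."""

    def __init__(self, rows, caching=500):
        self.rows = rows          # stands in for the data held server-side
        self.caching = caching
        self.pos = 0              # next row to fetch from the "server"
        self.cache = []           # rows already transferred to the client
        self.round_trips = 0      # simulated number of fetch calls

    def _fetch(self):
        self.cache = self.rows[self.pos:self.pos + self.caching]
        self.pos += len(self.cache)
        self.round_trips += 1

    def next(self):
        """Return the next row, or None when the scan is exhausted."""
        if not self.cache:
            if self.pos >= len(self.rows):
                return None
            self._fetch()
        return self.cache.pop(0)
```

In this simulation, scanning 1200 rows with caching set to 500 takes three fetches (500, 500, and 200 rows) instead of 1200.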
Does this clear it up Lin?
On Mon, Aug 27, 2012 at 8:40 PM, Lin Ma lin...@gmail.com wrote:
Hi Harsh,
I read through
Hello HBase masters,
I am wondering whether, in the current implementation, each HBase client
caches all region server information, for example where each region server is
(the physical machine hosting it) and the row-key range
managed by that region server. If so, two more
caches information as needed for its queries and not
necessarily for 'all' region servers.
Abhishek
i Sent from my iPad with iMstakes
On Aug 22, 2012, at 23:31, Lin Ma lin...@gmail.com wrote:
Hello HBase masters,
I am wondering whether, in the current implementation, each HBase client
point me to some more detailed information.
regards,
Lin
On Thu, Aug 23, 2012 at 9:35 PM, Harsh J ha...@cloudera.com wrote:
Hi Lin,
On Thu, Aug 23, 2012 at 4:31 PM, Lin Ma lin...@gmail.com wrote:
Thank you Abhishek,
Two more comments,
-- Client only caches information as needed
On Thu, Aug 23, 2012 at 8:21 PM, Doug Meil doug.m...@explorysmedical.com wrote:
For further information about the catalog tables and region-regionserver
assignment, see this...
http://hbase.apache.org/book.html#arch.catalog
On 8/19/12 7:36 AM, Lin Ma lin...@gmail.com wrote:
Thank you
which META region server to
access.
I am not sure I have understood these points correctly, so please feel free to correct me.
regards,
Lin
On Thu, Aug 23, 2012 at 11:15 PM, Lin Ma lin...@gmail.com wrote:
Doug, very informative document. Thanks a lot!
I read through it and have some thoughts,
- Supposing
utilizing one room for living before having more children. :-)
regards,
Lin
On Fri, Aug 24, 2012 at 12:46 AM, Harsh J ha...@cloudera.com wrote:
Lin,
On Thu, Aug 23, 2012 at 10:10 PM, Lin Ma lin...@gmail.com wrote:
Thanks, Harsh!
- HBase currently keeps a single META region (Doesn't split
Big Table and HBase.
Thanks,
Abhishek
-Original Message-
From: Lin Ma [mailto:lin...@gmail.com]
Sent: Thursday, August 23, 2012 9:41 AM
To: user@hbase.apache.org; ha...@cloudera.com
Cc: doug.m...@explorysmedical.com
Subject: Re: how client location a region/tablet?
Thanks, Harsh
Thanks Zahoor,
I read through the document you referred to, but I am confused about what
leaf-level index, intermediate-level index and root-level index mean. I would
appreciate it if you could give more details on what they are, or point me to the
related documents.
BTW: the document you pointed me to is
will be fetched. But if the Bloom filter is not enabled, we might find one
block whose row range would contain 'x', and HBase
will load that block. So using Bloom filters can avoid this I/O. I hope this is
clear for you now.
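The point above can be sketched in a few lines of Python. This is a toy Bloom filter, not HBase's implementation. The `ToyBloom` class, its size and hash count, and the `needs_block_load` helper are assumptions made for illustration.

```python
import hashlib


class ToyBloom:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = [False] * size

    def _positions(self, key: bytes):
        # Derive several bit positions by salting one hash function.
        for i in range(self.hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: bytes):
        for p in self._positions(key):
            self.bits[p] = True

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[p] for p in self._positions(key))


def needs_block_load(bloom, row_key: bytes) -> bool:
    """Without a filter (bloom is None) the candidate block is always loaded."""
    return bloom is None or bloom.might_contain(row_key)
```

When the filter answers "definitely not present" for row 'x', the block whose range merely covers 'x' is never read, which is the I/O saving described above.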
-Anoop-
From: Lin Ma [lin
look at Guava cache for similar use cases.
Asif Ali
On Mon, Aug 20, 2012 at 9:09 AM, Lin Ma lin...@gmail.com wrote:
Thank you Drew. I like your reply, especially the blocking cache nature
provided by HBase. A quick question: for traditional memcached, all of
the items are in memory and no disk is used, correct?
regards
Thank you Zahoor,
Two more comments,
1. After reading the materials you sent me, I am confused about how a Bloom
filter could save I/O during a random read. Suppose I am not using a Bloom
filter. In order to find whether a row (or row key) exists, we need to scan
the index block which is at the end
Thanks Zahoor,
If there is no bloom... you have to load every block and scan to find if
the row exists..
I could be wrong, but I think the HFile index block (which is located at the end of
the HFile) is a binary search tree containing all row-key values of the HFile.
Searching a
things
with memcached just as you would with any other data store. If you're
looking for a spiffy memcached replacement I'd recommend checking out
Redis.
On Sat, Aug 18, 2012 at 3:12 AM, Lin Ma lin...@gmail.com wrote:
Hello guys,
In your experience, is it practical to use HBase directly
server mapping data. Why do you say "not data" (do you mean the
real content in each region)?
regards,
Lin
On Sun, Aug 19, 2012 at 12:40 PM, Stack st...@duboce.net wrote:
On Sat, Aug 18, 2012 at 2:13 AM, Lin Ma lin...@gmail.com wrote:
Hello guys,
I am referencing the Big Table paper about how a client
Hello guys,
I am referencing the Big Table paper about how a client locates a tablet.
In section 5.1, Tablet Location, it is mentioned that the client will cache all
tablet locations. I think this means the client will cache the root tablet in the
METADATA table, and all other tablets in the METADATA table (which means
the
following link comparing traditional columnar databases against
HBase/BigTable interesting:
http://dbmsmusings.blogspot.com/2010/03/distinguishing-two-major-types-of_29.html
-Jason
On Sun, Aug 5, 2012 at 8:03 PM, Lin Ma lin...@gmail.com wrote:
Thank you for the informative reply, Mohit
:41 PM, Amandeep Khurana ama...@gmail.com wrote:
HDFS also chooses to degrade availability in the face of partitions.
On Thu, Aug 9, 2012 at 11:08 AM, Lin Ma lin...@gmail.com wrote:
Amandeep, thanks for your comments, and I will definitely read the paper
you suggested.
For Hadoop itself
...@us.ibm.com; 914-784-6752
From: Lin Ma lin...@gmail.com
To: user@hbase.apache.org,
Date: 08/07/2012 09:30 PM
Subject: consistency, availability and partition pattern of HBase
Hello guys,
According to the notes by Werner, the CAP theorem
states that of three
think availability is sacrificed in the sense that if a region server
fails, clients will find its data inaccessible until the region comes up on
some other server. This is not to be confused with data loss.
Sent from my iPad
On Aug 7, 2012, at 11:56 PM, Lin Ma lin...@gmail.com wrote:
Thank you Wei!
Two
at 10:32 PM, Lin Ma lin...@gmail.com wrote:
Thank you Lars.
Is the same data stored as duplicate copies across region servers? If so, if
the primary server for a region dies, the client just needs to read from the
secondary server for the same region. Why is there a time when data is
unavailable
to reconcile between. When you read, you always get the same
version of the row you are reading. In other words, HBase is strongly
consistent.
Hope that clears things up a bit.
On Thu, Aug 9, 2012 at 8:02 AM, Lin Ma lin...@gmail.com wrote:
Thank you Lars.
Is the same data stored as duplicate
Hello guys,
According to the notes by Werner, the CAP theorem
states that of three properties of shared-data systems (data consistency,
system availability, and tolerance to network partition), only two can be
achieved at any given time.
Hi guys,
I am wondering whether HBase uses column-based storage or row-based
storage?
- I read some technical documents which mention that an advantage of HBase is
that it uses column-based storage to store similar data together to foster
compression. So it means the same columns of different rows
to store sparse, large number of columns (with NULL
for free). Any comments?
regards,
Lin
On Mon, Aug 6, 2012 at 12:08 AM, Mohit Anchlia mohitanch...@gmail.com wrote:
On Sun, Aug 5, 2012 at 6:04 AM, Lin Ma lin...@gmail.com wrote:
Hi guys,
I am wondering whether HBase is using column based
://hadoop-hbase.blogspot.com/2011/12/introduction-to-hbase.html
-- Lars
- Original Message -
From: Lin Ma lin...@gmail.com
To: user@hbase.apache.org
Cc:
Sent: Sunday, August 5, 2012 6:04 AM
Subject: column based or row based storage for HBase?
Hi guys,
I am wondering whether HBase