teresting. One could
identify clusters of close row keys in the Gets and issue a Scan for each
cluster.
-- Lars
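A rough sketch of the clustering idea (group requested row keys that sit close together, then issue one Scan per group). This is plain Java, not HBase client code. Real row keys are byte arrays; the long keys, the GetClusters and KeyRange names, and the maxGap threshold are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

/** Sketch only: groups requested row keys into ranges that could each become one Scan. */
final class GetClusters {

    /** Inclusive key range plus the number of requested keys that fall inside it. */
    static final class KeyRange {
        final long first;
        final long last;
        final int keys;

        KeyRange(long first, long last, int keys) {
            this.first = first;
            this.last = last;
            this.keys = keys;
        }

        @Override
        public String toString() {
            return "[" + first + ".." + last + "] (" + keys + " keys)";
        }
    }

    /**
     * Sorts the keys and starts a new range whenever the gap to the previous
     * key exceeds maxGap (hypothetical tuning knob). Assumes non-negative
     * keys, so that key - last cannot overflow.
     */
    static List<KeyRange> cluster(Collection<Long> rowKeys, long maxGap) {
        List<KeyRange> ranges = new ArrayList<>();
        TreeSet<Long> sorted = new TreeSet<>(rowKeys); // sorts and de-duplicates
        if (sorted.isEmpty()) {
            return ranges;
        }
        long first = sorted.first();
        long last = first;
        int count = 0;
        for (long key : sorted) {
            if (key - last > maxGap) {
                // Gap too large: close the current range and open a new one.
                ranges.add(new KeyRange(first, last, count));
                first = key;
                count = 0;
            }
            last = key;
            count++;
        }
        ranges.add(new KeyRange(first, last, count));
        return ranges;
    }

    public static void main(String[] args) {
        List<Long> wanted = Arrays.asList(900L, 3L, 1L, 51L, 2L, 50L);
        for (KeyRange r : cluster(wanted, 5)) {
            System.out.println(r);
        }
    }
}
```

Each resulting range would map to one Scan with a start and stop row; a range holding a single key is probably better left as a plain Get.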
From: Nicolas Liochon
To: user
Sent: Tuesday, February 19, 2013 9:28 AM
Subject: Re: Optimizing Multi Gets in hbase
Imho, the easiest thing to do would be to write a filter.
You need to order the rows, then you can use
resent the rows you are looking for in a filter, so that would probably
shift this slightly more towards Gets (just imagine a Filter that has to
encode 100k random row keys to be matched; since Filters are instantiated
st
table has 10bn rows, in that
case it is almost certain that the Gets are faster than a scan.
Now imagine the Gets only cover a small key range. With statistics we could
tell whether it would be beneficial to turn this into a scan.
tore
there is another natural limit there).
As I said below, the crux of the matter is having some histograms of your
data, so that such a decision could be made automatically.
-- Lars
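To make the histogram point a bit more concrete, here is a minimal sketch of how such a decision might be automated. Nothing here is an existing HBase feature: the equi-depth histogram layout, the numeric keys, the ScanOrGets name and the rowsPerGetCost ratio (how many sequentially scanned rows are assumed to cost about as much as one random Get) are all assumptions.

```java
import java.util.Arrays;

/** Sketch only: decides between one Scan and many Gets from a key histogram. */
final class ScanOrGets {

    /**
     * boundaries are the sorted split points of a hypothetical equi-depth
     * histogram: bucket i covers [boundaries[i], boundaries[i + 1]) and holds
     * roughly rowsPerBucket rows. Returns a coarse upper estimate of the rows
     * a scan over [lo, hi] would touch by counting overlapping buckets.
     */
    static long estimateRows(long[] boundaries, long rowsPerBucket, long lo, long hi) {
        long buckets = 0;
        for (int i = 0; i + 1 < boundaries.length; i++) {
            boolean overlaps = boundaries[i] <= hi && lo < boundaries[i + 1];
            if (overlaps) {
                buckets++;
            }
        }
        return buckets * rowsPerBucket;
    }

    /**
     * Prefers a scan when the estimated rows scanned is no more than the
     * number of Gets times rowsPerGetCost (assumed cost ratio, to be measured).
     */
    static boolean preferScan(long[] boundaries, long rowsPerBucket,
                              long[] getKeys, double rowsPerGetCost) {
        if (getKeys.length == 0) {
            return false;
        }
        long lo = Arrays.stream(getKeys).min().getAsLong();
        long hi = Arrays.stream(getKeys).max().getAsLong();
        long scanned = estimateRows(boundaries, rowsPerBucket, lo, hi);
        return scanned <= getKeys.length * rowsPerGetCost;
    }

    public static void main(String[] args) {
        long[] boundaries = {0, 1000, 2000, 3000, 4000};
        System.out.println(preferScan(boundaries, 1000, new long[] {1100, 1200, 1900}, 500.0));
        System.out.println(preferScan(boundaries, 1000, new long[] {100, 3900}, 500.0));
    }
}
```

The estimate is only as good as the histogram is fine-grained: with very wide buckets, even a tight cluster of Gets looks like a huge scan, and the sketch falls back to Gets.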
From: lars hofhansl
To: "user@hbase.apache.org"
Sent: Monday, February 18, 2013 5:48 PM
Subject: Re: Optimizing Multi Gets in hbase
with an appropriate filter (may have to
implement your own filter, though). Maybe we could add a version of RowFilter
that matches against multiple keys.
-- Lars
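For what it's worth, the logic such a multi-key filter would need can be sketched without any HBase classes. Below, a NavigableMap stands in for the sorted table and a sorted set holds the wanted row keys; the MultiKeyMatch name and the String keys are assumptions, and this is not the RowFilter API. The point is that with both sides sorted, one forward pass can jump from wanted key to wanted key instead of checking every row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Sketch only: matches a sorted set of row keys in one forward pass over a sorted table. */
final class MultiKeyMatch {

    static List<String> matchKeys(NavigableMap<String, String> table,
                                  NavigableSet<String> wanted) {
        List<String> hits = new ArrayList<>();
        String target = wanted.isEmpty() ? null : wanted.first();
        while (target != null) {
            // "Seek": jump to the first stored row at or after the wanted key.
            String row = table.ceilingKey(target);
            if (row == null) {
                break; // past the end of the table
            }
            if (row.equals(target)) {
                hits.add(table.get(row));
                target = wanted.higher(target);
            } else {
                // The wanted key is absent; skip every wanted key before row.
                target = wanted.ceiling(row);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("a", "va");
        table.put("c", "vc");
        table.put("d", "vd");
        table.put("f", "vf");
        NavigableSet<String> wanted = new TreeSet<>(Arrays.asList("b", "c", "e", "f", "z"));
        System.out.println(matchKeys(table, wanted));
    }
}
```

The loop always moves the target forward, so it does at most one seek per wanted key plus one per miss.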
From: Varun Sharma
To: user@hbase.apache.org
Sent: Monday, February 18, 2013 1:57 AM
Subject: Optimizing Multi Gets in hbase
Hi,
I am trying to do batched get(s) on a cluster. Here is the code:
List<Get> gets = ...
// Prepare my gets with the rows I need
myHTabl
So you'd have to do a little bit of homework up front.
Suppose you have to pull some data from 30K rows out of 10 million?
If they are in sorted order, you could determine the regions and then think about
doing a couple of scans in parallel.
But that may be more work than just doing the set of get
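A minimal sketch of the "determine the regions, then scan in parallel" idea, using only the JDK. A TreeMap stands in for the table, and the region start keys are passed in as a sorted set; the RegionGrouping name, the long keys and the pool size of 4 are assumptions, and no HBase calls are made.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.NavigableSet;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch only: groups wanted keys by region, then runs one bounded scan per region. */
final class RegionGrouping {

    /** Maps each region start key to the sorted wanted keys that fall in that region. */
    static SortedMap<Long, List<Long>> groupByRegion(NavigableSet<Long> regionStartKeys,
                                                     Collection<Long> rowKeys) {
        SortedMap<Long, List<Long>> groups = new TreeMap<>();
        for (Long key : new TreeSet<>(rowKeys)) {
            Long region = regionStartKeys.floor(key);
            if (region == null) {
                continue; // key sorts before the first region; ignored in this sketch
            }
            groups.computeIfAbsent(region, r -> new ArrayList<>()).add(key);
        }
        return groups;
    }

    /** Scans [first wanted key, last wanted key] of every group in parallel. */
    static List<String> scanInParallel(NavigableMap<Long, String> table,
                                       SortedMap<Long, List<Long>> groups) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (List<Long> keys : groups.values()) {
                Callable<List<String>> task = () -> {
                    Set<Long> wanted = new HashSet<>(keys);
                    List<String> out = new ArrayList<>();
                    Long from = keys.get(0);
                    Long to = keys.get(keys.size() - 1);
                    for (Map.Entry<Long, String> e : table.subMap(from, true, to, true).entrySet()) {
                        if (wanted.contains(e.getKey())) {
                            out.add(e.getValue());
                        }
                    }
                    return out;
                };
                futures.add(pool.submit(task));
            }
            List<String> all = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                all.addAll(f.get()); // collected in region order
            }
            return all;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        NavigableSet<Long> regions = new TreeSet<>(Arrays.asList(0L, 100L, 200L));
        NavigableMap<Long, String> table = new TreeMap<>();
        for (long k : new long[] {5, 6, 7, 120, 130, 150, 250}) {
            table.put(k, "r" + k);
        }
        SortedMap<Long, List<Long>> groups =
                groupByRegion(regions, Arrays.asList(5L, 150L, 120L, 250L, 7L));
        System.out.println(scanInParallel(table, groups));
    }
}
```

Whether this beats a plain batch of Gets depends on how densely the wanted keys fill each scanned range, which is exactly the homework mentioned above.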
If the scan is happening on the same region then going for Scan would be a
better option.
Regards
RAm
On Mon, Feb 18, 2013 at 4:26 PM, Nicolas Liochon wrote:
i) Yes, or at least, often yes.
ii) You're right. It's difficult to guess how much it would improve
performance (there are a lot of caching effects), but using a single scan
could be an interesting optimisation imho.
Nicolas
On Mon, Feb 18, 2013 at 10:57 AM, Varun Sharma wrote:
> Hi,
>
It will instantiate one scan op per Get
-Anoop-
From: Varun Sharma [va...@pinterest.com]
Sent: Monday, February 18, 2013 3:27 PM
To: user@hbase.apache.org
Subject: Optimizing Multi Gets in hbase
Hi,
I am trying to do batched get(s) on a cluster. Here is
Hi Varun,
Are your gets around sequential keys? If so, you might benefit from doing
scans with a start and stop row. If they are not sequential, I don't think there
is a better way, given how you describe the problem.
Besides that, some of the questions that come to mind:
- How many GET(s) are
Hi,
I am trying to do batched get(s) on a cluster. Here is the code:
List<Get> gets = ...
// Prepare my gets with the rows I need
myHTable.get(gets);
I have two questions about the above scenario:
i) Is this the optimal way to do this?
ii) I have a feeling that if there are multiple gets in this c