I have a couple of questions related to MapReduce over HBase:
1. HBase guarantees data locality between store files and the RegionServer only if
the server stays up for a long time. If there have been too many region movements
or the server has been recycled recently, there is a high probability that store
file blocks are not
Ted: Awesome. I can think of several use cases where this would be useful, but I'm
pretty stuck on 0.92 right now.
I tried the null-version trick but must be doing something wrong. How do I
set the version to null on a column? Isn't the version equal to the timestamp
(a primitive long)?
Setting the timestamp to 0 and
Hello,
I'm a newbie and am wondering whether there are any restrictions during
HBase minor/major compactions. I read the online documentation but could not
find any explicit mention of restrictions. What I'm mostly worried
about is whether read/write operations are blocked during compactions.
I didn't mean to set the version to null; I meant to include a revision of
the column whose contents are empty. This empty revision will still be
returned by any Gets on that row, but you can put code into your client
that treats empty values as deleted.
It's a bit of a hack, but it's the best I
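The "empty revision" hack above can be mimicked entirely client-side. Here is a minimal plain-Python sketch (no HBase involved; the dict just stands in for the result of a Get) of treating empty values as deleted:

```python
def visible_columns(row):
    """Return only columns whose latest value is non-empty.

    `row` stands in for the result of a Get: {column: value}.
    An empty value is the application-level 'deleted' marker.
    """
    return {col: val for col, val in row.items() if val != b""}

row = {b"cf:a": b"hello", b"cf:b": b""}  # cf:b was "deleted" via an empty Put
assert visible_columns(row) == {b"cf:a": b"hello"}
```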
Gotcha.
Columns are quite dynamic in my case, but since I need to fetch the rows first
anyway, a KeyOnlyFilter to find them and then overwrite the values will
do just fine.
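The fetch-then-overwrite flow can be sketched in plain Python (in HBase this would be a Scan with a KeyOnlyFilter followed by Puts; the dict below merely stands in for a table):

```python
def key_only_scan(table, row):
    # Stand-in for a Scan with KeyOnlyFilter: keys come back, values stripped.
    return list(table[row].keys())

table = {"r1": {b"cf:a": b"old-a", b"cf:b": b"old-b"}}

# Step 1: discover which columns exist; step 2: overwrite their values
# (here with the empty "deleted" marker from the earlier thread).
for col in key_only_scan(table, "r1"):
    table["r1"][col] = b""

assert table["r1"] == {b"cf:a": b"", b"cf:b": b""}
```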
Cheers,
-Kristoffer
1. HBase guarantees data locality between store files and the RegionServer only if
the server stays up for a long time. If there have been too many region movements
or the server has been recycled recently, there is a high probability that store
file blocks are not local to the region server. But the getSplits command
On Wed, May 23, 2012 at 6:15 AM, Takahiko Kawasaki
takahiko.kawas...@jibemobile.jp wrote:
Hello,
I'm a newbie and am wondering whether there are any restrictions during
HBase minor/major compactions. I read the online documentation but could not
find any explicit mention of restrictions.
We are currently on HBase 0.90 (cdh3u3) and will soon be upgrading to HBase
0.94. Our application is written in Python and we use Thrift to connect
to HBase.
Looking at Thrift2 (hbase.thrift), I noticed that the TScan struct does not
accept filterString as a parameter. This was introduced in HBase
Why don't you log a JIRA?
By the time you reach the next iteration, hopefully this feature will be there -
especially if your team can contribute.
On Wed, May 23, 2012 at 10:06 AM, Jay T jay.pyl...@gmail.com wrote:
We are currently on HBase 0.90 (cdh3u3) and will soon be upgrading to HBase
I have seen the need for such a conversion many times before. Should we add it as
a public method in some utility class (create a JIRA for that)?
Alex Baranau
--
Sematext :: http://blog.sematext.com/
On Mon, May 21, 2012 at 4:26 PM, Jean-Daniel Cryans jdcry...@apache.org wrote:
How exactly are you
Added a JIRA to track this issue.
https://issues.apache.org/jira/browse/HBASE-6073
Thanks,
Jay
On 5/23/12 1:14 PM, Ted Yu wrote:
Why don't you log a JIRA?
By the time you reach the next iteration, hopefully this feature will be there -
especially if your team can contribute.
On Wed, May 23,
Talked to J-D (and the source code). It turns out that
when hbase.regionserver.global.memstore.lowerLimit is reached, flushes are
forced without blocking reads (of course,
if hbase.regionserver.global.memstore.upperLimit is not hit). Makes perfect
sense, though I couldn't figure this out from the settings
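For reference, the two settings in question live in hbase-site.xml. The values below are the defaults from the 0.90/0.92 era (0.35 and 0.4 of the heap); double-check them against your version's hbase-default.xml:

```xml
<!-- Fraction of the heap at which flushes are forced (reads not blocked). -->
<property>
  <name>hbase.regionserver.global.memstore.lowerLimit</name>
  <value>0.35</value>
</property>
<!-- Fraction of the heap at which updates are blocked until memstores drain. -->
<property>
  <name>hbase.regionserver.global.memstore.upperLimit</name>
  <value>0.4</value>
</property>
```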
Talked to Stack. It's not a completely crazy idea. It may be implemented as a tiny
lib, which can be used when row keys are randomized in some way by
application logic. In this case the randomization would take into account how
individual regionservers behave (w.r.t. writing speed).
Would be very interesting
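As a rough illustration of what such a tiny lib could look like - all names and weight numbers here are hypothetical, nothing like this exists in HBase - a key-prefix chooser could weight its random salt by each server's observed write speed:

```python
import bisect
import random

def make_prefix_chooser(weights, rng=None):
    """Pick a salt/bucket prefix with probability proportional to the
    relative write throughput observed for the regionserver owning it.

    `weights` maps prefix -> relative write speed (hypothetical numbers).
    """
    rng = rng or random.Random()
    prefixes = sorted(weights)
    cumulative, total = [], 0.0
    for p in prefixes:
        total += weights[p]
        cumulative.append(total)

    def choose():
        r = rng.random() * total
        return prefixes[bisect.bisect_left(cumulative, r)]

    return choose

# Bucket "01" sits on a server writing ~3x faster, so it receives ~75% of keys.
choose = make_prefix_chooser({"00": 1.0, "01": 3.0}, rng=random.Random(42))
picks = [choose() for _ in range(10000)]
```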
On Wed, May 23, 2012 at 2:33 PM, Alex Baranau alex.barano...@gmail.com wrote:
Talked to J-D (and the source code). It turns out that
when hbase.regionserver.global.memstore.lowerLimit is reached, flushes are
forced without blocking reads (of course,
if hbase.regionserver.global.memstore.upperLimit
It's a facility so that you don't have to do a read+write in order to add
something to a value. With Append the read is done in the region
server before the write; it also solves the problem where you could
have a race when there are multiple appenders.
J-D
On Tue, May 22, 2012 at 8:51 PM, NNever
Thanks Harsh, I'll try it ;)
---
Best regards,
nn
2012/5/24 Harsh J ha...@cloudera.com
NNever,
You can use asynchbase (an asynchronous API for HBase) for that need:
https://github.com/stumbleupon/asynchbase
On Thu, May 24, 2012 at 7:25 AM, NNever
Thanks J-D.
so it means 'Append' takes only a write lock and 'Put' takes both
write and read locks?
And if we use 'Append' instead of 'Put', then the chance that clients have to
wait will be reduced, right?
2012/5/24 Jean-Daniel Cryans jdcry...@apache.org
It's a facility so that you don't have to
On Wed, May 23, 2012 at 8:11 PM, NNever nnever...@gmail.com wrote:
Thanks J-D.
so it means 'Append' takes only a write lock and 'Put' takes both
write and read locks?
Yeah... not at all. First, there's no read lock. Then a Put is just a
Put; it takes a write lock. An Append is a read+write
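The lost-update problem that client-side read+write has - and that a server-side Append avoids, since the read happens in the regionserver next to the write - can be sketched without HBase at all (plain Python; the dict stands in for a single cell):

```python
store = {"row1": b"a"}  # stands in for one cell on the regionserver

# Client-side read+write: both clients read the same snapshot
# before either one writes back. This is the race.
snapshot = store["row1"]
write_1 = snapshot + b"X"
write_2 = snapshot + b"Y"
store["row1"] = write_1
store["row1"] = write_2        # clobbers write_1: the "X" update is lost
assert store["row1"] == b"aY"

def server_append(store, row, suffix):
    # Server-side append: the read happens immediately before the write,
    # so concurrent appends serialize instead of clobbering each other.
    store[row] = store[row] + suffix

store["row1"] = b"a"
server_append(store, "row1", b"X")
server_append(store, "row1", b"Y")
assert store["row1"] == b"aXY"  # both updates survive
```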
I think a similar concept would be a great idea. It would definitely
prevent the type of issue that you mentioned. I think that doing it
in a similar way to how it is handled for Hadoop - where you can specify a
list, but if you don't, you get auto-add - should keep everyone happy.
Mike