On 2012-02-15, at 7:32 AM, Stack wrote:
On Tue, Feb 14, 2012 at 8:14 AM, Stack st...@duboce.net wrote:
2) With that same randomWrite command line above, I would expect a
resulting table with 10 * (1024 * 1024) rows (so 10485760 = roughly 10M
rows). Instead what I'm seeing is that the
On 2012-02-15, at 9:09 AM, Oliver Meyn (GBIF) wrote:
On 2012-02-15, at 7:32 AM, Stack wrote:
On Tue, Feb 14, 2012 at 8:14 AM, Stack st...@duboce.net wrote:
2) With that same randomWrite command line above, I would expect a
resulting table with 10 * (1024 * 1024) rows (so 10485760 = roughly
Oliver:
Thanks for digging.
Please file JIRAs for these issues.
On Feb 15, 2012, at 1:53 AM, Oliver Meyn (GBIF) om...@gbif.org wrote:
On 2012-02-15, at 9:09 AM, Oliver Meyn (GBIF) wrote:
On 2012-02-15, at 7:32 AM, Stack wrote:
On Tue, Feb 14, 2012 at 8:14 AM, Stack st...@duboce.net
Thanks a lot for the help Todd!
On 14 February 2012 22:39, Todd Lipcon t...@cloudera.com wrote:
Yep, definitely bound on seeks - see the 100% util and the r/s of 100.
The bandwidth provided by random IO from a disk is going to be much
smaller than the sequential IO you see from hdparm
-Todd
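To put rough numbers on Todd's point (the ~100 r/s is from the iostat output earlier in the thread; the block size and sequential figure are assumptions):

```python
# Back-of-envelope: random-read bandwidth is bounded by seek rate,
# not by the disk's sequential transfer rate.
seeks_per_sec = 100          # observed r/s from iostat in this thread
block_size_kb = 64           # assumed HBase HFile block size
random_mb_per_sec = seeks_per_sec * block_size_kb / 1024

sequential_mb_per_sec = 100  # a typical hdparm -t figure for a SATA disk (assumption)

print(f"random: {random_mb_per_sec:.2f} MB/s vs sequential: ~{sequential_mb_per_sec} MB/s")
# random reads deliver only a few MB/s even at 100% disk utilization
```

So a fully utilized disk doing random reads moves roughly 6 MB/s, an order of magnitude under what hdparm reports for streaming reads.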
I am new to HBase and I can't get the Hive handler working. I downloaded
the latest Hive (0.8.1), which has a handler for 0.89, and based on the
instructions at
https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration I
recompiled Hive after updating the HBase, ZooKeeper and Guava versions
in
Hi James, I'm new to HBase too.
How about this:
With a range of orderIds, select the first id.
Step 1: set this ID as startRow, then scan for the closest id (fetching
only one row).
Step 2: with that fetched ID, setStartRow(fetchedID-startTimestamp) and
setStopRow(fetchedID-endTimestamp).
Step 3:
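The two-step scan above can be sketched in Python against a hypothetical composite rowkey of the form "<orderId>-<timestamp>" (the zero-padding widths are my assumption; padding makes byte-wise ordering match numeric ordering):

```python
# Hypothetical composite rowkey: zero-padded orderId, then timestamp,
# so that HBase's lexicographic byte order equals numeric order.
def make_key(order_id, ts_millis):
    return f"{order_id:010d}-{ts_millis:013d}".encode()

# Step 1: a scan from the start of the orderId range returns a first
# row whose key yields the concrete orderId (here: 42).
# Step 2: re-scan with start/stop keys built from that orderId plus
# the time window of interest.
start_row = make_key(42, 1329300000000)
stop_row  = make_key(42, 1329310000000)

candidates = sorted([make_key(42, 1329305000000),   # inside the window
                     make_key(42, 1329315000000)])  # past the stop row
in_window = [k for k in candidates if start_row <= k < stop_row]
print(in_window)  # only the key inside the time window survives
```

The same start/stop keys would be handed to an HBase Scan; the list comprehension just demonstrates the range semantics.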
Okie:
10x # of mappers: https://issues.apache.org/jira/browse/HBASE-5401
wrong row count: https://issues.apache.org/jira/browse/HBASE-5402
Oliver
On 2012-02-15, at 11:50 AM, yuzhih...@gmail.com wrote:
Oliver:
Thanks for digging.
Please file JIRAs for these issues.
On Feb 15,
Thank you for your reply Doug.. that is what i wanted to know.
On Tue, Feb 14, 2012 at 9:39 PM, Doug Meil doug.m...@explorysmedical.com wrote:
I say basically because inside a Region there are Stores, and for each
Store there are StoreFiles. For more info see:
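As an illustrative model only (not the HBase API), the hierarchy Doug describes can be pictured like this:

```python
# Illustrative model: a table splits into Regions; each Region holds
# one Store per column family; each Store holds zero or more StoreFiles
# (HFiles on HDFS) plus an in-memory MemStore flushed to new StoreFiles.
table = {
    "region-1": {
        "cf1": ["storefile-a", "storefile-b"],  # Store for family cf1
        "cf2": ["storefile-c"],                 # Store for family cf2
    },
    "region-2": {
        "cf1": [],                  # a Store may have no files yet
        "cf2": ["storefile-d"],
    },
}

total_storefiles = sum(len(files)
                       for region in table.values()
                       for files in region.values())
print(total_storefiles)  # 4
```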
What version of Hadoop are you running? There are many erroneous
instructions for how to get this up and running all over the internet.
You do not need to rebuild hive in order to get it to work. You only
need to do the following:
1. It will only work if HBase is running in distributed or
On Tue, Feb 14, 2012 at 11:18 PM, Ulrich Staudinger
ustaudin...@activequant.com wrote:
Hi St.Ack,
I don't want to be a pain in the neck, but any progress on this?
You are not being a pain.
I'm fumbling the mvn publishing, repeatedly. It's a little
embarrassing, which is why I'm not talking to
On Wed, Feb 15, 2012 at 1:53 AM, Oliver Meyn (GBIF) om...@gbif.org wrote:
So hacking around reveals that key collision is indeed the problem. I
thought the modulo part of the getRandomRow method was suspect but while
removing it improved the behaviour (I got ~8M rows instead of ~6.6M) it
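For what it's worth, the ~6.6M figure is exactly what birthday-style collision math predicts when 10M random keys are drawn from a 10M keyspace:

```python
import math

# Expected number of distinct keys when drawing n keys uniformly at
# random (with replacement) from a space of n possible keys:
#   E[distinct] = n * (1 - (1 - 1/n)^n)  ->  n * (1 - 1/e) for large n
n = 10 * 1024 * 1024   # 10485760 intended rows
expected_distinct = n * (1 - (1 - 1 / n) ** n)
print(f"~{expected_distinct / 1e6:.1f}M distinct rows")  # ~6.6M
```

So ~37% of the writes landing on already-used keys is the expected behaviour for uniform random keys, not a bug in the write path itself.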
Can't you use the -DskipTests option?
On Wed, Feb 15, 2012 at 5:27 PM, Stack st...@duboce.net wrote:
On Tue, Feb 14, 2012 at 11:18 PM, Ulrich Staudinger
ustaudin...@activequant.com wrote:
Hi St.Ack,
I don't want to be a pain in the neck, but any progress on this?
You are not being
On Wed, Feb 15, 2012 at 8:43 AM, N Keywal nkey...@gmail.com wrote:
Can't you use the -DskipTests option?
Not on the release plugin apparently (it's ignored -- I should fix it).
St.Ack
I deployed it pretty easily on our internal repo by checking out the tag
0.92.0 (I assume this is the release) and running mvn deploy -DskipTests=true.
Or you can move tests to a separate module eg hbase-test and add a
dependency to hbase. If all tests in hbase-test pass then you can
release the
You would have to grep for the lease's id; in your first email it was
-7220618182832784549.
About the time it takes to process each row: I meant on the client (Pig)
side, not in the RS.
J-D
On Tue, Feb 14, 2012 at 1:33 PM, Mikael Sitruk mikael.sit...@gmail.com wrote:
Please see answer inline
Thanks
Hey guys, I'm an HBase and Python newbie, and I'm stuck with the mutateRow()
command.
I'm using CentOS 5.5, Python 2.6 and HBase 0.90.4-cdh3u3. This is running in
a VirtualBox; the original image file for the VM is the one provided by
Cloudera.
I've downloaded the hbase-0.90.4-cdh3u3.tar.gz file from
OK, I don't have that log anymore, but since the problem was reproduced in
another log (which I kept), here is the grep:
2012-02-08 14:13:02,970 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer:
org.apache.hadoop.hbase.regionserver.LeaseException: lease
'-6992210222685255354' does not exist
Hi,
Was just reading about SSTables and LevelDB
(http://www.igvita.com/2012/02/06/sstable-and-log-structured-storage-leveldb/),
which has some HBase references. Somebody pointed out in the comments that
Riak supports LevelDB as a storage engine option, which made me wonder
whether pluggable backend
Hi,
I looked into this some more and have a better idea of how it could be
implemented.
As values are looked up by date (and sometimes additionally by source ID),
it would make sense to store each value in a separate row.
The rowkey would be some kind of timeseries key, like:
timestamp_sourceID
However, docs
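A minimal sketch of that "timestamp_sourceID" rowkey (the zero-padding width and separator are my assumptions). One caveat worth noting: a purely time-leading key sends all current writes to a single region.

```python
# Hypothetical key format: epoch millis zero-padded to 13 digits so
# that byte order follows chronological order, with the source ID
# appended to keep rows for the same instant distinct.
def ts_key(epoch_millis, source_id):
    return f"{epoch_millis:013d}_{source_id}".encode()

k1 = ts_key(1329300000000, "sensor-7")
k2 = ts_key(1329300000001, "sensor-3")
assert k1 < k2  # byte order follows time order, regardless of source ID

# Caveat: monotonically increasing keys all land on the last region
# (write hotspotting); prefixing a salt or the source ID avoids this
# at the cost of time-range scans needing multiple scans.
```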
Hmm...
Does something like the below help?
diff --git
a/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
index f9627ed..0cee8e3 100644
--- a/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
+++
Hello,
We are looking at Bloom filters and wondering if they are helpful when
doing a sequential read (multi-row scan) or only when doing a Get for a
single row. It logically makes sense that they would only help (or help to a
greater extent) when getting a single row, since they are a way of determining if
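For intuition on why a Bloom filter pays off on Gets, the standard false-positive estimate shows how cheaply a StoreFile can be ruled out (the sizing numbers below are illustrative, not HBase defaults):

```python
import math

# A Get can skip a whole StoreFile on a negative Bloom answer, while a
# sequential scan reads the files anyway. Standard false-positive
# estimate for a Bloom filter with m bits, k hashes, n inserted keys:
#   p = (1 - e^(-k*n/m))^k
def bloom_fp_rate(m_bits, k_hashes, n_keys):
    return (1 - math.exp(-k_hashes * n_keys / m_bits)) ** k_hashes

# e.g. ~10 bits per key with k=7 keeps false positives under 1%
p = bloom_fp_rate(m_bits=10_000_000, k_hashes=7, n_keys=1_000_000)
print(f"false-positive rate: {p:.4f}")
```

So at the cost of roughly 10 bits per key, over 99% of Gets for rows absent from a StoreFile avoid touching that file at all.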
Thanks Mikael. I will try the first solution.
To answer your question, I am evaluating both RDBMS and NoSQL, trying to
find the best solution.
On Tue, Feb 14, 2012 at 8:03 PM, Mikael Sitruk mikael.sit...@gmail.com wrote:
Why don't you prefix the columns with an execution date (reverse order so
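A sketch of the reverse-order idea (the column format here is hypothetical): subtracting the timestamp from a fixed ceiling makes the newest execution sort first under HBase's ascending byte order.

```python
# Hypothetical reverse-ordered date prefix for column qualifiers:
# newer executions get smaller prefixes, so they sort first.
MAX_TS = 9_999_999_999_999  # fixed-width ceiling for epoch millis

def reverse_date_column(epoch_millis, qualifier):
    return f"{MAX_TS - epoch_millis:013d}:{qualifier}".encode()

newer = reverse_date_column(1329400000000, "result")
older = reverse_date_column(1329300000000, "result")
assert newer < older  # the newest execution sorts first
```

With that ordering, reading the latest execution is just taking the first qualifier returned, rather than scanning to the end.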
Hi, all,
I have two region servers set up, and each machine has around 32G of
memory. I started each region server with a 12G JVM limit. Recently
I have had one map-reduce job which writes a big chunk of data into an HBase
table. The job runs around 10 hours and the final HBase table will
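As a back-of-envelope for that heap under a heavy write load, assuming the default fractions of that era (0.4 global memstore upper limit, 0.2 block cache; both are assumptions about this cluster's config):

```python
# Rough heap budget for a 12 GB region server (assumed defaults:
# hbase.regionserver.global.memstore.upperLimit = 0.4,
# hfile.block.cache.size = 0.2).
heap_gb = 12
memstore_fraction = 0.4
blockcache_fraction = 0.2

memstore_gb = heap_gb * memstore_fraction
blockcache_gb = heap_gb * blockcache_fraction
print(f"memstores may grow to ~{memstore_gb:.1f} GB before forced flushes; "
      f"block cache ~{blockcache_gb:.1f} GB")
```

During a 10-hour bulk write, it is the memstore share of the heap plus compaction activity, not the total 12 GB, that governs flush and GC pressure.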
I am really intrigued to know why you are thinking of NoSQL for this use
case.
Thanks
On Wed, Feb 15, 2012 at 10:39 PM, Raj N objectli...@gmail.com wrote:
Thanks Mikael. I will try the first solution.
To answer your question, I am evaluating both RDBMS and NoSQL, trying to
find the best
Hi Andy,
Not sure what you mean by "Does something like the below help?" The current
code running is pasted below; the line numbers are slightly different from
yours. It seems very close to the first file (revision a) in your extract.
Mikael.S
public Result[] next(final long scannerId, int nbRows)