On Sun, Aug 12, 2012 at 7:04 AM, Gurjeet Singh wrote:
> Am I missing something ? Is there a way to optimize this ?
>
You've checked out the perf section of the refguide?
http://hbase.apache.org/book.html#performance
And have you read the postings by the GBIF lads starting with this one:
http:/
Hi Jason,
This is a very good reference. I read it from beginning to end and learned a
lot. Thanks and have a good weekend.
regards,
Lin
On Tue, Aug 7, 2012 at 2:00 AM, Jason Frantz wrote:
> Lin,
>
> Looks like your questions may already be answered, but you might find the
> following link compa
Richard,
The property disables major compactions from happening automatically.
However, if you choose to do this, you should ensure you have a cron
job that does trigger major_compact on all tables, since compaction is
a necessary thing; you just do not want it to happen at any time
it likes to
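For reference, a minimal sketch of that setup. The property name is the standard HBase one for the time-based major compaction interval; setting it to 0 disables the automatic trigger:

```xml
<!-- hbase-site.xml: a value of 0 disables automatic (time-based)
     major compactions; compactions can then only be run on demand -->
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>0</value>
</property>
```

A cron job can then run the compaction at a quiet hour, for example by piping `echo "major_compact 'mytable'" | hbase shell` (table name hypothetical) once per day per table.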
Hi Anil,
Thank you for your advice.
On Sat, Aug 11, 2012 at 10:12 PM, anil gupta wrote:
> Hi David,
>
> As I understand it, you want to print the Integer values as Strings in the
> HBase shell. There are two ways to do it:
> 1. You can write a ruby script to interpret the value as bytes. This migh
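As a rough illustration of option 1: inside the HBase shell (which is JRuby) you would normally call `org.apache.hadoop.hbase.util.Bytes.toInt` on the raw value, but the decoding itself is just a big-endian unpack, shown here in plain Ruby with a simulated cell value:

```ruby
# HBase's Bytes.toBytes(int) stores integers as 4 bytes, big-endian.
# A raw value string can therefore be decoded with String#unpack.
# ('N' is 32-bit unsigned big-endian, so this sketch assumes
# non-negative values; negative ints would need sign handling.)
def decode_hbase_int(raw)
  raw.unpack('N').first
end

raw = [0, 0, 0, 42].pack('C*')   # simulated raw cell bytes
puts decode_hbase_int(raw)       # prints 42
```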
Thanks for the reply Stack. My comments are inline.
> You've checked out the perf section of the refguide?
>
> http://hbase.apache.org/book.html#performance
Yes. HBase has 8GB RAM both on my cluster as well as my dev machine.
Both configurations are backed by SSDs and HBase options are set to
HB
Hi all,
I am a new user of HBase and I need help with secondary indexes.
For example, I have messages and users. Each user has many messages.
Data structure will be like this:
Message:
- String id
- Long sender_id
- Long recipient_id
- String text
- Timestamp created_at
[...]
User:
- Long id
- Stri
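One common pattern for this kind of secondary index (the replies in this thread are truncated, so treat this as a sketch rather than the answer given) is an index table whose row key is the indexed column followed by the timestamp and the message id, so that one user's messages sort together and scan as a contiguous range. Building such a composite key in plain Java might look like this; the field widths are illustrative:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class IndexKey {
    // Composite index-table row key: recipientId (8 bytes, big-endian)
    // + createdAt (8 bytes) + messageId bytes. Fixed-width numeric
    // prefixes keep lexicographic byte order aligned with numeric order
    // (assuming non-negative values; signed longs would need the sign
    // bit flipped to sort correctly as unsigned bytes).
    public static byte[] indexKey(long recipientId, long createdAt, String messageId) {
        byte[] id = messageId.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(16 + id.length);
        buf.putLong(recipientId).putLong(createdAt).put(id);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] k = indexKey(7L, 1344729600000L, "msg-001");
        System.out.println(k.length); // 16 + 7 = 23
    }
}
```

A Get on the index row (or a prefix Scan over one recipientId) then yields the message ids, which are fetched from the main table in a second step.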
Gurjeet:
Can you tell us which HBase version you are using ?
Thanks
On Sun, Aug 12, 2012 at 5:32 AM, Gurjeet Singh wrote:
> Thanks for the reply Stack. My comments are inline.
>
> > You've checked out the perf section of the refguide?
> >
> > http://hbase.apache.org/book.html#performance
>
> Ye
Hi Ted,
Yes, I am using Cloudera's distribution, CDH3.
Gurjeet
Sent from my iPad
On Aug 12, 2012, at 7:11 AM, Ted Yu wrote:
> Gurjeet:
> Can you tell us which HBase version you are using ?
>
> Thanks
>
> On Sun, Aug 12, 2012 at 5:32 AM, Gurjeet Singh wrote:
>
>> Thanks for the reply Stack.
You're right there - compactions do not merge region boundaries.
They just merge the accumulated (flushed, etc.) storefiles belonging
to each region, for every region that's fragmented over time.
On Fri, Aug 10, 2012 at 2:47 PM, J Mohamed Zahoor wrote:
> No. I just now learnt that compactions do
Bryan,
I believe running with "-Djava.net.preferIPv4Stack=true" should work just fine.
On Thu, Aug 9, 2012 at 1:17 AM, Bryan Beaudreault
wrote:
> Did this fix end up working? I'm hesitant to upgrade to 10.8 if I'm going
> to run into this issue. I'm running the CDH3 jars locally to mirror my
>
Something to consider is that HBase stores and retrieves the row key (8
bytes in your case) + timestamp (8 bytes) + column qualifier (?) for every
single value. The schemaless nature of HBase generally means that this
data has to be stored for each row (certain kinds of newer block level
compressi
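To make the per-cell overhead concrete, here is a back-of-the-envelope calculation following the classic KeyValue on-disk layout as I understand it for this HBase generation (key length int + value length int, then row length short + row + family length byte + family + qualifier + 8-byte timestamp + 1-byte type, then the value); treat the exact constants as an approximation:

```java
public class CellSize {
    // Approximate serialized size of one KeyValue under the classic layout.
    public static int cellSize(int rowLen, int familyLen, int qualifierLen, int valueLen) {
        int keyLen = 2 + rowLen      // row length (short) + row
                   + 1 + familyLen   // family length (byte) + family
                   + qualifierLen    // column qualifier
                   + 8               // timestamp
                   + 1;              // key type
        return 4 + 4 + keyLen + valueLen; // key length (int) + value length (int) + key + value
    }

    public static void main(String[] args) {
        // 8-byte row key, 1-byte family, 10-byte qualifier, 100-byte value:
        System.out.println(cellSize(8, 1, 10, 100)); // 139 bytes, i.e. 39 bytes of overhead
    }
}
```

In other words, small values can easily pay more in key overhead than in payload, which is why short family and qualifier names matter.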
I filed HBASE-6561 for this (Jira is back).
- Original Message -
From: lars hofhansl
To: "d...@hbase.apache.org" ; "user@hbase.apache.org"
Cc:
Sent: Saturday, August 11, 2012 12:42 AM
Subject: Re: Extremely long flush times
A possible solution is to have the MemStoreScanner reseek e
Do you really have to retrieve all 200,000 each time?
Scan.setBatch(...) makes no difference?! (note that batching is different and
separate from caching).
Also note that the scanner contract is to return sorted KVs, so a single scan
cannot be parallelized across RegionServers (well not entirely
Hi Jacques,
I did consider that. So, this increases the on-disk size of my data by
3-4x (=600-800MB). That still does not explain why reading 1row (=~4MB
with overhead) takes 5sec. About serialization/deserialization on the
client side - it happens on a different thread out of a buffer and
most of
Hello experts,
Would it be feasible to create a separate thread for each region? I
mean, we can determine the start and end key of each region and issue a scan for
each region in parallel.
Regards,
Mohammad Tariq
On Mon, Aug 13, 2012 at 3:54 AM, lars hofhansl wrote:
> Do you really hav
Hi Lars,
Yes, I need to retrieve all the values for a row at a time. That said,
I did experiment with different batch sizes and that made no
difference whatsoever. (caching on the other hand did make some
difference ~2-3% faster for larger cache)
I see your point about scanners returning sorted K
Hi Mohammad,
This is a great idea. Is there an API call to determine the start/end
key for each region ?
Thanks,
Gurjeet
On Sun, Aug 12, 2012 at 3:49 PM, Mohammad Tariq wrote:
> Hello experts,
>
> Would it be feasible to create a separate thread for each region? I
> mean we can determine
You can use HTable.{getStartEndKeys|getEndKeys|getStartKeys} to get the current
region demarcations for your table.
If you wanted to group threads by RegionServer (which you should) you get that
information via HTable.getRegionLocation{s}
-- Lars
- Original Message -
From: Gurjeet Sin
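The threading pattern being discussed can be sketched as below, with the actual HBase calls stubbed out so the skeleton stands alone. In real code the boundary pairs would come from `HTable.getStartEndKeys()` and each task would run its own `Scan(start, end)` against its own `HTable` instance, since `HTable` is not thread-safe:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelScan {
    // One task per region boundary pair; the scan body is a stub.
    public static List<Integer> scanAllRegions(byte[][] startKeys, byte[][] endKeys)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(startKeys.length);
        List<Future<Integer>> futures = new ArrayList<Future<Integer>>();
        for (int i = 0; i < startKeys.length; i++) {
            final byte[] start = startKeys[i];
            final byte[] end = endKeys[i];
            futures.add(pool.submit(new Callable<Integer>() {
                public Integer call() {
                    // Real code: new Scan(start, end) on a per-thread HTable,
                    // feeding each Result to the row processor.
                    return start.length + end.length; // stubbed "work"
                }
            }));
        }
        List<Integer> results = new ArrayList<Integer>();
        for (Future<Integer> f : futures) results.add(f.get());
        pool.shutdown();
        return results;
    }

    public static void main(String[] args) throws Exception {
        // Fake boundaries for three regions (empty byte[] = open-ended).
        byte[][] starts = { {}, {0x40}, {(byte) 0x80} };
        byte[][] ends   = { {0x40}, {(byte) 0x80}, {} };
        System.out.println(scanAllRegions(starts, ends).size()); // 3
    }
}
```

Grouping the tasks by RegionServer, as suggested above, would just mean keying the thread pool (or a map of work lists) by `HTable.getRegionLocation` results instead of one task per region.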
Methods getStartKey and getEndKey provided by the HRegionInfo class can be used
for that purpose.
Also, please make sure any HTable instance is not left open once you
are done with reads.
Regards,
Mohammad Tariq
On Mon, Aug 13, 2012 at 4:22 AM, Gurjeet Singh wrote:
> Hi Mohammad,
>
> This
HTable.getRegionLocations()
I didn't realize the KeyValue serialization/deserialization happened on a
separate thread in the hbase client code.
J
On Sun, Aug 12, 2012 at 3:52 PM, Gurjeet Singh wrote:
> Hi Mohammad,
>
> This is a great idea. Is there an API call to determine the start/end
> k
I think the first question is where is the time spent. Does your analysis
show that all the time spent is on the regionservers or is a portion of the
bottleneck on the client side?
Jacques
On Sun, Aug 12, 2012 at 4:00 PM, Mohammad Tariq wrote:
> Methods getStartKey and getEndKey provided by
Also, give it a shot using HTablePools and see if it makes any significant
difference.
Regards,
Mohammad Tariq
On Mon, Aug 13, 2012 at 4:43 AM, Jacques wrote:
> I think the first question is where is the time spent. Does your analysis
> show that all the time spent is on the regionserve
Hi, HBase users.
Now, I am testing coprocessors to create secondary indexes in the background.
The coprocessor framework itself is packaged in HBase 0.92.1, which I am using.
The scenario I want to describe is this one:
the main table is 'blog', which has a field named 'userId'.
From this field I want to create
Can you paste your CP implementation here [prePut/postPut?]
Are you doing a check for the table in the CP hook? You need to handle the
hooks only while they are being called for your table. Remember that your
index table also has these same hooks.
-Anoop-
From: Henry
It seems like the client code just sits idle, waiting for data from
the regionservers.
Gurjeet
On Sun, Aug 12, 2012 at 4:13 PM, Jacques wrote:
> I think the first question is where is the time spent. Does your analysis
> show that all the time spent is on the regionservers or is a portion of th
Hi, I'm new to HBase.
I'm working with Hadoop-1.0.3 and HBase 0.92.1.
I have 8 data nodes which also work as region servers.
And I'm trying to import my data into HBase.
I wrote two programs; one is using the HBase client API (auto flush is off,
WAL is on, multi-threaded) and the other is using HBase MR
Thanks Lars!
One final question: is it advisable to issue multiple threads
against a single HTable instance, like so:
HTable table = ...
for (int i = 0; i < 10; i++) {
new Thread(new ScanThread(table, startRow, endRow, rowProcessor)).start();
}
class ScanThread implements Runnable {
public void run
Hi, Anoop.
this is my implementation using Coprocessors RegionObserver.
…
@Override
public void prePut(ObserverContext e, Put
put, WALEdit edit,
boolean writeToWAL) throws IOException {
String tableName =
e.getEnvironment().getRegion().getRegionInfo().getTableNameAsStrin
Hi
Can you see if this helps..
http://hbase.apache.org/book/performance.html
./zahoor
On Mon, Aug 13, 2012 at 10:28 AM, 某因幡 wrote:
> Hi, I'm new to HBase.
> I'm working with Hadoop-1.0.3 and HBase 0.92.1.
> I have 8 data nodes which also work as region servers.
> And I'm trying to import my d
HBASE-6564.
I will try to take a stab on it this weekend.
./zahoor
On Fri, Aug 10, 2012 at 12:47 PM, J Mohamed Zahoor wrote:
> Hi Lars,
>
> Will file it...
>
> ./Zahoor
>
>
> On Fri, Aug 10, 2012 at 12:00 AM, lars hofhansl wrote:
>
>> Hi zahoor,
>>
>> could you file a jira with what you found?