Re: Error about rs block seek

2013-05-13 Thread Bing Jiang
Hi, all. Before the exception stack, there is an error log: 2013-05-13 00:00:14,491 ERROR org.apache.hadoop.hbase.io.hfile.HFileReaderV2: Current pos = 32651; currKeyLen = 45; currValLen = 80; block limit = 32775; HFile name = 1f96183d55144c058fa2a05fe5c0b814; currBlock currBlockOffset = 33550830

Block size of HBase files

2013-05-13 Thread Praveen Bysani
Hi, I have the dfs.block.size value set to 1 GB in my cluster configuration. I have around 250 GB of data stored in HBase over this cluster. But when I check the number of blocks, it doesn't correspond to the block size value I set. From what I understand I should only have ~250 blocks. But
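The expectation in this question is simple ceiling division of the data size by the block size. A minimal, self-contained sketch of that arithmetic (the class and method names are illustrative, not HBase or HDFS API):

```java
public class BlockCount {
    // Expected number of HDFS blocks for a given amount of data, assuming
    // one contiguous file -- i.e. ceil(totalBytes / blockSizeBytes).
    static long expectedBlocks(long totalBytes, long blockSizeBytes) {
        return (totalBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        // ~250 GB of data with dfs.block.size = 1 GB
        System.out.println(expectedBlocks(250L * gb, gb)); // prints 250
    }
}
```

In practice the count comes out higher because HBase stores many separate HFiles, and every file smaller than the block size still occupies at least one block, so the real block count tracks the number of files rather than just the total bytes.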

Re: Block size of HBase files

2013-05-13 Thread Amandeep Khurana
On Sun, May 12, 2013 at 11:40 PM, Praveen Bysani praveen.ii...@gmail.com wrote: Hi, I have the dfs.block.size value set to 1 GB in my cluster configuration. Just out of curiosity, why do you have it set at 1 GB? I have around 250 GB of data stored in HBase over this cluster. But when I

RE: How to implement this check put and then update something logic?

2013-05-13 Thread Liu, Raymond
Well, this did come from a graph domain. However, I think this could be a common problem whenever you need to update something according to the original value, where a simple checkAndPut on a single value won't work. Another example: if you want to implement something like UPDATE, you want to know
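The read-check-and-update logic described here is a compare-and-swap retry loop. As a self-contained illustration of that shape, using a plain ConcurrentHashMap as a stand-in for an HBase table (the names below are stand-ins, not HBase API; HBase's checkAndPut(row, cf, qualifier, expectedValue, put) plays the role of the conditional replace):

```java
import java.util.concurrent.ConcurrentHashMap;

public class CasUpdate {
    // Stand-in for a table: row key -> current value.
    static final ConcurrentHashMap<String, Long> table = new ConcurrentHashMap<>();

    // Add `delta` to the stored value, retrying when a concurrent writer
    // changed the row between our read and our conditional update.
    static long addAndGet(String row, long delta) {
        while (true) {
            Long current = table.get(row);
            if (current == null) {
                // No existing value: insert only if still absent.
                if (table.putIfAbsent(row, delta) == null) return delta;
            } else {
                long next = current + delta;
                // Conditional update: succeeds only if the value is unchanged.
                if (table.replace(row, current, next)) return next;
            }
            // The check failed: someone else won the race; re-read and retry.
        }
    }

    public static void main(String[] args) {
        addAndGet("v1", 5);
        System.out.println(addAndGet("v1", 3)); // prints 8
    }
}
```

The retry loop is the key point: an unconditional put would silently overwrite a concurrent writer's update, while the conditional replace detects the conflict and forces a re-read.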

Re: Error about rs block seek

2013-05-13 Thread Anoop John
Current pos = 32651; currKeyLen = 45; currValLen = 80; block limit = 32775. This means that after the current position we need at least 45 + 80 + 4 (key length, stored as 4 bytes) + 4 (value length, stored as 4 bytes) more bytes, so the limit should have been at least 32784. If we have memstoreTS also written with this KV some
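Anoop's arithmetic can be checked directly. The constants below mirror his explanation (4-byte key-length and value-length prefixes before each KeyValue); the helper is only a sketch of the consistency check being described, not HFileReaderV2's actual code:

```java
public class BlockSeekCheck {
    static final int KEY_LEN_SIZE = 4;  // key length prefix, 4 bytes
    static final int VAL_LEN_SIZE = 4;  // value length prefix, 4 bytes

    // Minimum block limit needed to read one more KeyValue starting at `pos`.
    static int requiredLimit(int pos, int keyLen, int valLen) {
        return pos + KEY_LEN_SIZE + VAL_LEN_SIZE + keyLen + valLen;
    }

    public static void main(String[] args) {
        // Values from the error log in this thread.
        int required = requiredLimit(32651, 45, 80);
        System.out.println(required);          // prints 32784
        System.out.println(required > 32775);  // prints true: the block is 9 bytes short
    }
}
```

With the reported limit of 32775, the block falls 9 bytes short of holding the next KeyValue, which is exactly the inconsistency the error log flags.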

Re: Error about rs block seek

2013-05-13 Thread Bing Jiang
Hi, Anoop. I do not handle or change the HBase checksum. So I want to know: if I set the block size at the time of creating tables, can that cause trouble? 2013/5/13 Anoop John anoop.hb...@gmail.com Current pos = 32651; currKeyLen = 45; currValLen = 80; block limit = 32775 This

Re: Error about rs block seek

2013-05-13 Thread Anoop John
So I want to know if I set block size at the beginning of creating tables, does something make troubles? It should not. We have tested with different block sizes, from the default 64K down to 8K, for testing purposes, and have not come across issues like this. Does it come up only on this data, or every time you create a new

Re: Error about rs block seek

2013-05-13 Thread ramkrishna vasudevan
Is it possible to reproduce this with a simple test case based on your use case and data? You can share it so that we can really debug the actual problem. Regards, Ram. On Mon, May 13, 2013 at 1:57 PM, Anoop John anoop.hb...@gmail.com wrote: So I want to know if I set block size at the beginning of

Re: Block size of HBase files

2013-05-13 Thread Praveen Bysani
Hi, I wanted to minimize the number of map reduce tasks generated while processing a job, hence configured it to a larger value. I don't think I have configured the HFile size in the cluster. I use Cloudera Manager to manage my cluster, and the only configuration I can relate to is

Re: Block size of HBase files

2013-05-13 Thread Anoop John
Praveen, how many regions are there in your table, and how many CFs? Under /hbase/table-name you will be able to see many files and directories. There will be a .tableinfo file, every region will have a .regioninfo file, and then under each CF the data files (HFiles). Your total data is 250 GB. When your

Re: Block size of HBase files

2013-05-13 Thread Ted Yu
You can change the HFile size through the hbase.hregion.max.filesize parameter. On May 13, 2013, at 2:45 AM, Praveen Bysani praveen.ii...@gmail.com wrote: Hi, I wanted to minimize the number of map reduce tasks generated while processing a job, hence configured it to a larger value. I don't
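For reference, the parameter Ted mentions is normally set in hbase-site.xml; a minimal fragment setting it to 1 GB (the value discussed in this thread) would look like:

```xml
<property>
  <name>hbase.hregion.max.filesize</name>
  <!-- 1 GB in bytes; a region is split once its largest store file exceeds this -->
  <value>1073741824</value>
</property>
```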

Re: Block size of HBase files

2013-05-13 Thread Praveen Bysani
Hi, thanks for the details. No, I haven't run any compaction, and I have no idea if there is one going on in the background. I executed a major_compact on that table and I now have 731 regions (each about ~350 MB!). I checked the configuration in CM, and the value for hbase.hregion.max.filesize is 1

Re: Block size of HBase files

2013-05-13 Thread Anoop John
now have 731 regions (each about ~350 MB!). I checked the configuration in CM, and the value for hbase.hregion.max.filesize is 1 GB too! Did you specify splits at the time of table creation? How did you create the table? -Anoop- On Mon, May 13, 2013 at 5:18 PM, Praveen Bysani

Re: Block size of HBase files

2013-05-13 Thread Anoop John
I mean, when you created the table (using a client, I guess) did you specify anything like splitKeys or [start, end, no#regions]? -Anoop- On Mon, May 13, 2013 at 5:49 PM, Praveen Bysani praveen.ii...@gmail.com wrote: We insert data using the Java HBase client (org.apache.hadoop.hbase.client.*).
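The splitKeys being asked about are the boundary row keys used to pre-split a table at creation time. A self-contained sketch of generating evenly spaced single-byte boundaries (the admin call itself appears only in a comment, since it needs a running cluster, and the class/method names here are illustrative):

```java
public class SplitKeys {
    // Generate numRegions-1 single-byte boundary keys that divide the
    // 0x00..0xFF keyspace into numRegions roughly equal regions.
    static byte[][] evenSplits(int numRegions) {
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            splits[i - 1] = new byte[] { (byte) (i * 256 / numRegions) };
        }
        return splits;
    }

    public static void main(String[] args) {
        // Against a real cluster this array would be passed to something
        // like admin.createTable(tableDescriptor, evenSplits(4)).
        for (byte[] key : evenSplits(4)) {
            System.out.println(key[0] & 0xFF); // prints 64, 128, 192
        }
    }
}
```

Without pre-splitting, a table starts as a single region and only splits as data arrives, which is why the question matters for explaining an unexpected region count.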

Re: Export / Import and table splits

2013-05-13 Thread Jean-Marc Spaggiari
Hi Jeremy, thanks for sharing this. I will take a look at it, and also most probably give the snapshot option a try. JM 2013/5/7 Jeremy Carroll phobos...@gmail.com https://github.com/phobos182/hadoop-hbase-tools/blob/master/hbase/copy_table.rb I wrote a quick script to do it with

Re: Export / Import and table splits

2013-05-13 Thread Matteo Bertozzi
I'd go with the snapshots, since you can avoid all the I/O of the import/export. But the consistency model is different, and you don't have the start/end time option... you should delete the rows outside tstart and tend after the clone. Matteo On Tue, May 14, 2013 at 1:48 AM, Jean-Marc Spaggiari

Re: Export / Import and table splits

2013-05-13 Thread Jean-Marc Spaggiari
The cluster is stopped anyway, so there are no consistency concerns, which means snapshots might be the best option. No need to delete anything after. The goal is really to export the data locally, get the cluster down, get a new cluster, put the data back and reload the table... the two clusters can't be up at

Re: Block size of HBase files

2013-05-13 Thread Praveen Bysani
Hi Anoop, no, we didn't specify anything like that while creating and writing into the table. On 13 May 2013 20:22, Anoop John anoop.hb...@gmail.com wrote: I mean when u created the table (Using client I guess) have u specified any thuing like splitKeys or [start,end, no#regions]? -Anoop- On Mon,