Re: HBase - stable versions

2013-09-09 Thread Kiru Pakkirisamy
BTW, can somebody explain the function/purpose of 0.95.2? Does the community expect 0.95.2 to be used in a prod env, or does it have to be 0.96.0 for that? Also, I have some development hiccups with it (like not being able to find the jar in the Maven repo, etc.); if somebody can provide pointers that would be great.

Re: Tables gets Major Compacted even if they haven't changed

2013-09-09 Thread lars hofhansl
Interesting. I guess we could add a check to avoid major compactions if (1) no TTL is set, or we can show that all data is newer than the TTL; (2) there's only one file; and (3) there are no delete markers. All of these can be checked cheaply with some HFile metadata (we might already have all the data needed).
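
A sketch of that check in Java (hypothetical code, not real HBase internals; StoreFileSummary and its accessors are invented stand-ins for the HFile metadata mentioned above):

// Hypothetical sketch: decide whether a periodic major compaction can be
// skipped because it could not change the store's contents anyway.
boolean majorCompactionIsNoop(StoreFileSummary store, long ttlMs, long now) {
  boolean ttlIrrelevant = ttlMs == Long.MAX_VALUE               // (1) no TTL set,
      || now - store.oldestTimestamp() < ttlMs;                 //     or all data still newer
  boolean singleFile = store.fileCount() == 1;                  // (2) only one file
  boolean noDeleteMarkers = store.deleteMarkerCount() == 0;     // (3) no delete markers
  return ttlIrrelevant && singleFile && noDeleteMarkers;
}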

RE: Tables gets Major Compacted even if they haven't changed

2013-09-09 Thread Vladimir Rodionov
Sure, you can override the standard compaction in a RegionCoprocessor. Best regards, Vladimir Rodionov, Principal Platform Engineer, Carrier IQ, www.carrieriq.com, e-mail: vrodio...@carrieriq.com …
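
For illustration, a minimal RegionObserver sketch against the 0.94-era coprocessor API (the class name is an assumption; verify the exact preCompact signature for your release):

import java.io.IOException;

import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.InternalScanner;
import org.apache.hadoop.hbase.regionserver.Store;

// Hooks into compaction: returning the scanner unchanged keeps the default
// behavior; returning a wrapping InternalScanner lets you filter or rewrite
// cells as they are compacted.
public class CompactionTweaker extends BaseRegionObserver {
  @Override
  public InternalScanner preCompact(ObserverContext<RegionCoprocessorEnvironment> e,
      Store store, InternalScanner scanner) throws IOException {
    return scanner;
  }
}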

Re: Tables gets Major Compacted even if they haven't changed

2013-09-09 Thread anil gupta
Hi Premal, You can set hbase.hregion.majorcompaction=0, i.e. never run major compaction periodically. Then major compaction will never run by itself; either the user has to trigger it manually, or it will be driven by the number of store files. HTH, Anil On Mon, Sep 9, 2013 at 9:28 PM, Premal S…
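
For reference, the setting goes in hbase-site.xml (the property name is from the message above; a region server restart is needed for it to take effect):

<!-- Disable time-based major compactions; trigger them manually instead. -->
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>0</value>
</property>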

Re: Tables gets Major Compacted even if they haven't changed

2013-09-09 Thread Premal Shah
Ah, OK. We don't expire any data, so we have not set any TTLs. Is there a policy we can use to avoid compacting regions that have not changed (i.e. have just 1 store file)? On Mon, Sep 9, 2013 at 9:13 PM, Vladimir Rodionov wrote: > HBase can run major compaction (even if table has not been updated) to…

Re: Dropping a very large table

2013-09-09 Thread 冯宏华
There seems to be no very simple way to do this; not sure if closing/unassigning regions gradually via script before dropping can help a little. The pain derives from the current master assignment design, which relies on ZK to track the assign/split progress/status, and for creating/dropping/restarting tables with v…
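
A sketch of the gradual approach in the shell (region names are placeholders; 0.90 shells expose close_region, later versions also have unassign):

hbase(main):001:0> close_region 'bigtable,startkey1,1378700000000'
hbase(main):002:0> close_region 'bigtable,startkey2,1378700000000'
# ...continue a few regions at a time before the final disable/drop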

RE: Tables gets Major Compacted even if they haven't changed

2013-09-09 Thread Vladimir Rodionov
HBase can run major compaction (even if the table has not been updated) to purge expired data (TTL). Best regards, Vladimir Rodionov, Principal Platform Engineer, Carrier IQ, www.carrieriq.com, e-mail: vrodio...@carrieriq.com …
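
For context, TTL is a per-column-family attribute, given in seconds; a family created as below (table and family names assumed) has its expired cells physically removed when a major compaction runs:

hbase(main):001:0> create 'events', {NAME => 'f', TTL => 604800}   # 7 days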

Tables gets Major Compacted even if they haven't changed

2013-09-09 Thread Premal Shah
Hi, We have a bunch of tables in our HBase cluster. We have a script which makes sure all of them get major compacted once every 2 days. There are 2 things I'm observing: 1) Table X has not been updated in a month. We have not inserted, updated or deleted data. However, it still major compacts every 2 days…

Re: hdfs data into Hbase

2013-09-09 Thread kun yan
Sorry, my explanation may not have been very clear. The situation is: before importing, HDFS disk usage was DFS Used: 54.19 GB; after importing the data from HDFS into HBase, it is DFS Used: 57.16 GB. The source data stored in HDFS is 69 MB, and the HDFS replication factor is 3. 2013/9/9 Shahab Yunus > Some quick thoughts, well your size is b…

Re: Hbase ExportSnapshot with AWS

2013-09-09 Thread Carlos Espinoza
That'd be a great enhancement. On Mon, Sep 9, 2013 at 4:42 PM, Vladimir Rodionov wrote: > Great, thanks. One possible enhancement: support for arbitrary file > systems (not only S3). …

Dropping a very large table

2013-09-09 Thread Michael Webster
Hello, I have a very large HBase table running on 0.90, large meaning >20K regions with a max region size of 1 GB. This table is legacy and can be dropped, but we aren't sure what impact disabling/dropping that large a table will have on our cluster. We are using dropAsync and polling HTable#i…

Re: HBase - stable versions

2013-09-09 Thread Ameya Kanitkar
We (Groupon) will also stick to 0.94 for the near future. Ameya On Mon, Sep 9, 2013 at 4:03 PM, Kiru Pakkirisamy wrote: > When is the 0.96 release being planned? Right now we are testing against > 0.95.2 as this does not seem to have the HBASE-9410 bug. …

RE: Hbase ExportSnapshot with AWS

2013-09-09 Thread Vladimir Rodionov
Great, thanks. One possible enhancement: support for arbitrary file systems (not only S3). Best regards, Vladimir Rodionov, Principal Platform Engineer, Carrier IQ, www.carrieriq.com, e-mail: vrodio...@carrieriq.com …

Re: HBase - stable versions

2013-09-09 Thread Kiru Pakkirisamy
When is the 0.96 release being planned? Right now we are testing against 0.95.2 as this does not seem to have the HBASE-9410 bug. Regards, - kiru …

Re: Hbase ExportSnapshot with AWS

2013-09-09 Thread Carlos Espinoza
You might be interested in a tool I recently wrote. It's a wrapper around the ExportSnapshot tool. It's not heavily tested, but it has worked for us so far. You can use it to create a snapshot and export it to S3. It can also be used to import a snapshot from S3 into a running HBase cluster. You n…

Getting column values in batches for a single row

2013-09-09 Thread Sam William
Hi, I have a table which is wide (with a single family) and the column qualifiers are timestamps. I'd like to do a get on a rowkey, but I don't need to read all of the columns. I want to read the first n values and then read more in batches if need be. Is there a way to do this? I'm on version…

Re: Getting column values in batches for a single row

2013-09-09 Thread Jean-Daniel Cryans
Scan.setBatch does what you are looking for, since with a Get there's no way to iterate over multiple calls: https://github.com/apache/hbase/blob/0.94.2/src/main/java/org/apache/hadoop/hbase/client/Scan.java#L306 Just make sure to make the Scan start at the row you want and stop right after it. J…
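
A self-contained sketch of that pattern against the 0.94 client API (table name, row key, and batch size are placeholders):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchedRowRead {
  public static void main(String[] args) throws Exception {
    byte[] row = Bytes.toBytes("my-row");
    HTable table = new HTable(HBaseConfiguration.create(), "my_table");
    try {
      // Scan exactly one row: start at the row and stop right after it
      // (the stop row is exclusive, so append a zero byte).
      Scan scan = new Scan(row, Bytes.add(row, new byte[] { 0 }));
      scan.setBatch(100); // at most 100 columns per Result
      ResultScanner scanner = table.getScanner(scan);
      try {
        for (Result r : scanner) {
          // Each Result holds up to 100 KeyValues, all from the same row.
          System.out.println(r.size() + " cells");
        }
      } finally {
        scanner.close();
      }
    } finally {
      table.close();
    }
  }
}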

RE: Hbase ExportSnapshot with AWS

2013-09-09 Thread Vladimir Rodionov
There is no option to export a materialized snapshot to AWS yet - only to another HBase cluster. You can export snapshot metadata to S3 using the S3 FileSystem in Hadoop, but this is not what you are looking for. You can export snapshots to table(s) in another cluster and then use distcp to copy the data over…
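
For reference, the stock tool is driven from the command line roughly like this (snapshot name, URIs, and mapper count are placeholders), with distcp handling the hop to S3 afterwards:

hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -snapshot my_snapshot -copy-to hdfs://backup-cluster:8020/hbase -mappers 16

hadoop distcp hdfs://backup-cluster:8020/hbase s3n://my-bucket/hbase-backup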

Re: Getting column values in batches for a single row

2013-09-09 Thread Sam William
JD, Thanks. This works. I considered Scan.setBatch earlier but dismissed it because the docs for ResultScanner.next() said it checks for the next row. Sam On Sep 9, 2013, at 12:43 PM, Jean-Daniel Cryans wrote: > Scan.setBatch does what you are looking for, since with a Get there's no > way t…

Hbase ExportSnapshot with AWS

2013-09-09 Thread Dhanasekaran Anbalagan
Hi Guys, Is it possible to export a snapshot to Amazon S3 or Glacier? Currently we have two HBase clusters running in the same datacenter. We want a backup solution for HBase data - a redundant backup solution for the data store, something like S3 or Glacier. I will use CopyTable and snapshots for HBase d…

Re: java.lang.NegativeArraySizeException: -1 in hbase

2013-09-09 Thread lars hofhansl
The 0.94.5 change (presumably HBASE-3996) is only forward compatible. M/R is a bit special in that the jars are shipped with the job. Here's a comment from Todd Lipcon on that issue: "The jar on the JT doesn't matter. Split computation and interpretation happens only in the user code – i.e. on the cli…

Re: Bulk load from OSGi running client

2013-09-09 Thread Stack
On Mon, Sep 9, 2013 at 12:14 AM, Amit Sela wrote: ... > The main issue still remains: it looks like the Compression.Algorithm > configuration's class loader had a reference to the bundle in revision 0 > (before the jar update) instead of revision 1 (after the jar update). This could be > because of caching (or…

Re: java.lang.NegativeArraySizeException: -1 in hbase

2013-09-09 Thread Anoop John
That sounds correct. Can we mention it somewhere in our docs? Would that be good? -Anoop- On Mon, Sep 9, 2013 at 11:24 PM, lars hofhansl wrote: > The 0.94.5 change (presumably HBASE-3996) is only forward compatible. M/R > is a bit special in that the jars are shipped with the job. > > Here's a comme…

Re: java.lang.NegativeArraySizeException: -1 in hbase

2013-09-09 Thread Jean-Marc Spaggiari
So, after some internal discussions with Anoop, here is a summary of the situation. An hbase-0.94.0 jar file was included in the MR job client file. But also, this MR client file was stored in the Master lib directory, and only on the master and the RS hosted on the same host, not on any of the…

Re: hdfs data into Hbase

2013-09-09 Thread Shahab Yunus
Some quick thoughts: well, your size is bound to increase, because recall that the rowkey is stored in every cell. So if in CSV you have, let us say, 5 columns, and you import them into HBase using the first column as the key, then you will end up with essentially 9 (1 for the rowkey and then 2 ea…
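
A rough worked example of that overhead, from the 0.94 KeyValue on-disk layout (assumed sizes: 10-byte rowkey, 1-byte family name, 10-byte qualifiers, 10-byte values):

per cell: 20 bytes fixed (length fields + timestamp + type)
          + 10 (row) + 1 (family) + 10 (qualifier) + 10 (value) = 51 bytes
per CSV row (5 columns, 1 used as rowkey -> 4 cells): 4 x 51 = 204 bytes
same row as CSV: 5 x 10 + 4 separators = 54 bytes

So roughly a 4x expansion per row, before HDFS replication (x3), WAL copies, and store-file index blocks are counted.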

Re: Set Max Number of Row Versions of a table

2013-09-09 Thread Gaetan Deputier
Works pretty well. Thanks for the examples. On Mon, Sep 9, 2013 at 2:04 AM, Nicolas Liochon wrote: > Here is an example on trunk. IIRC, with 0.94, you may have to disable the > table before updating the definition. > > hbase(main):007:0> create 't2', {NAME => 'f1', VERSIONS => 5} > 0 row(s) in…

hdfs data into Hbase

2013-09-09 Thread kun yan
Hello everyone, I wrote a MapReduce program to import data from HDFS into HBase, but after importing, usage increased a lot: my original data size is 69 MB (HDFS), and after importing into HBase my HDFS usage grew by about 3 GB. Did I do something wrong in the program I wrote? Thanks. public class MRImportHBaseCsv {…
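
The program itself is cut off above; as a point of comparison, a minimal mapper for this kind of import might look like the following (not the poster's code; the family name "f" and the qualifier scheme are assumptions):

import java.io.IOException;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Maps one CSV line to one Put, using the first column as the rowkey.
public class CsvToHBaseMapper
    extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
  @Override
  protected void map(LongWritable offset, Text line, Context ctx)
      throws IOException, InterruptedException {
    String[] cols = line.toString().split(",");
    byte[] row = Bytes.toBytes(cols[0]);
    Put put = new Put(row);
    for (int i = 1; i < cols.length; i++) {
      put.add(Bytes.toBytes("f"), Bytes.toBytes("c" + i), Bytes.toBytes(cols[i]));
    }
    ctx.write(new ImmutableBytesWritable(row), put);
  }
}

The job would then be wired up with TableMapReduceUtil.initTableReducerJob(tableName, null, job), which writes the Puts directly to the table.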

Re: Set Max Number of Row Versions of a table

2013-09-09 Thread Nicolas Liochon
Here is an example on trunk. IIRC, with 0.94, you may have to disable the table before updating the definition.

hbase(main):007:0> create 't2', {NAME => 'f1', VERSIONS => 5}
0 row(s) in 0.2820 seconds
=> Hbase::Table - t2
hbase(main):008:0> describe 't2'
DESCRIPTION ENABLED
't2', {NAME => 'f1'…

Re: Set Max Number of Row Versions of a table

2013-09-09 Thread Gaetan Deputier
I tried on a simple table using the following commands:

create 't', 'f'
alter 't', NAME => 'f', VERSIONS => 5

I get this error:

ERROR: Column family datafVERSIONS5 must have a name

I have tried the syntax from the alter help page, but no success. Any hints? I am running HBase from Clo…
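
One sequence that typically works on 0.94-era shells (per Nicolas's note elsewhere in the thread that 0.94 may require disabling the table first; table and family names as in the message):

hbase(main):001:0> disable 't'
hbase(main):002:0> alter 't', {NAME => 'f', VERSIONS => 5}
hbase(main):003:0> enable 't'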

Re: Set Max Number of Row Versions of a table

2013-09-09 Thread Gaetan Deputier
Exactly what I was looking for. Thank you very much! On Mon, Sep 9, 2013 at 12:48 AM, Nicolas Liochon wrote: > There is a comment in this class that is outdated ("Once set, the > parameters that specify a column cannot be changed without deleting the > column and recreating it. If there is dat…

Re: Set Max Number of Row Versions of a table

2013-09-09 Thread Nicolas Liochon
There is a comment in this class that is outdated ("Once set, the parameters that specify a column cannot be changed without deleting the column and recreating it. If there is data stored in the column, it will be deleted when the column is deleted."). This is from 2007. I will fix this. It's poss…