Re: question on column families

2018-11-13 Thread Antonio Si
our > case. When flushing, the flusher will choose those memstores that satisfy > certain conditions, so it is possible that not every column family (Store) > will flush its memstore. > Best Regards > Allan Yang > > > On Wed, Nov 14, 2018 at 7:34 AM Antonio Si wrote: > > > Hi, > >

question on column families

2018-11-13 Thread Antonio Si
Hi, I would like to confirm my understanding. Let's say I have 13 column families in an HBase table. 11 of those column families have no data, while 2 column families have a large amount of data. My understanding is that the size of the memstore, which is 128M in my env, will be shared across all column f

check if column family has any data

2018-11-13 Thread Antonio Si
Hi, Is there an easy way to check if a column family of an HBase table has any data? I tried something like "scan '', { LIMIT => 10, FILTER=>"FamilyFilter(=, 'binary:')" }" in the hbase shell and it timed out. I guess it's because my table has 15TB of data. So, I am guessing that particular family has no d

questions regarding hbase major compaction

2018-09-10 Thread Antonio Si
Hello, As I understand it, deleted records in HBase files do not get removed until a major compaction is performed. I have a few questions regarding major compaction: 1. If I set a TTL and/or a max number of versions, the records that are older than the TTL or the expired versions will still

Re: question on snapshot and export utility

2018-09-06 Thread Antonio Si
plementation. All you need is a new > Mapper which does required filtering of a HFile before moving data to a > destination. > > -Vlad > > On Wed, Sep 5, 2018 at 10:51 AM Antonio Si wrote: > > > Hi, > > > > When taking a snapshot or running the export utility,

question on snapshot and export utility

2018-09-05 Thread Antonio Si
Hi, When taking a snapshot or running the export utility, is it possible to specify a condition or filter on some columns so that only rows that satisfy the condition will be included in the snapshot or exported? Thanks. Antonio.

Re: a table is neither disable or enable

2018-08-29 Thread Antonio Si
fter the procedure is stopped, see if running hbck can fix the issue (I > haven't worked with 1.3 release in production). > When running hbck, run without -fix parameter first to see what > inconsistencies hbck reports. > > Cheers > > On Wed, Aug 29, 2018 at 3:42 PM Antonio

Re: a table is neither disable or enable

2018-08-29 Thread Antonio Si
Forgot to mention that all regions of the table are offline now. Wondering if the table will eventually get disabled, as it has been running for almost 24 hrs now. Thanks. Antonio. On Wed, Aug 29, 2018 at 3:40 PM Antonio Si wrote: > Thanks Ted. > Now that the table is in neither disa

Re: a table is neither disable or enable

2018-08-29 Thread Antonio Si
: > The 'missing table descriptor' error should have been fixed by running hbck > (with selected parameters). > > FYI > > On Wed, Aug 29, 2018 at 2:46 PM Antonio Si wrote: > > > Thanks Ted. > > > > The log says "java.io.IOException: missing table de

Re: a table is neither disable or enable

2018-08-29 Thread Antonio Si
e has. But > the length should be less related to data amount. > > Which version of hbase are you using ? > > Thanks > > On Wed, Aug 29, 2018 at 2:22 PM Antonio Si wrote: > > > Hi, > > > > We have a table which is stuck in FAILED_OPEN state. So, we plan

a table is neither disable or enable

2018-08-29 Thread Antonio Si
Hi, We have a table which is stuck in FAILED_OPEN state. So, we planned to drop the table and re-clone the table from an old snapshot. We disabled the table, but the disable procedure has been running for more than 20 hrs. I went to hbase shell and found out "is_disabled" and "is_enabled" both re

Re: question on reducing number of versions

2018-08-26 Thread Antonio Si
table to > clean up/delete extra versions. > Btw, 18000 max versions is an unusually high value. > > Are you using hbase on s3 or hbase on hdfs? > > Sent from my iPhone > > > On Aug 26, 2018, at 2:34 PM, Antonio Si wrote: > > > > Hello, > > > > I have

question on reducing number of versions

2018-08-26 Thread Antonio Si
Hello, I have a hbase table whose definition has a max number of versions set to 36000. I have verified that there are rows which have more than 2 versions saved. Now, I change the definition of the table and reduce the max number of versions to 18000. Will I see the size of the table being r

Re: time out when running CellCounter

2018-08-25 Thread Antonio Si
> > You can specify, e.g. scan timerange, scan max versions, start row, stop > row, etc. so that individual run has shorter runtime. > > Cheers > > On Sat, Aug 25, 2018 at 9:35 AM Antonio Si wrote: > > > Hi, > > > > When I run org.apache.hadoop.hbase.mapre

time out when running CellCounter

2018-08-25 Thread Antonio Si
Hi, When I run org.apache.hadoop.hbase.mapreduce.CellCounter, I am getting "Timed out after 600 secs". Is there a way to override the timeout value rather than changing it in hbase-site.xml and restarting HBase? Any suggestions would be helpful. Thank you. Antonio.

Re: how to get rowkey with largest number of versions

2018-08-22 Thread Antonio Si
nd count the number of > > versions of a column returned for each row to calculate the max. (you can > > optimize this with custom coprocessor by returning a single row key > having > > the largest versions of a column through each regionserver and at client > >

how to get rowkey with largest number of versions

2018-08-22 Thread Antonio Si
Hi, I am new to hbase. I am wondering how I could find out which rowkey has the largest number of versions in a column family. Any pointer would be very helpful. Thanks. Antonio.
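
The reply quoted in this thread suggests scanning the table and counting the number of versions of a column returned for each row to find the maximum. A minimal sketch of just that counting step follows. It does not use any HBase client API: it assumes the cells have already been fetched by a scan that requests all versions and are available as (rowkey, column, timestamp) tuples. The tuple layout and the function name row_with_most_versions are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple

# Assumed cell layout (hypothetical): (rowkey, "family:qualifier", timestamp)
Cell = Tuple[str, str, int]


def row_with_most_versions(cells: Iterable[Cell]) -> Optional[Tuple[str, str, int]]:
    """Return (rowkey, column, version_count) for the column with the most
    versions, or None if there are no cells."""
    # Collect distinct timestamps per (rowkey, column); each distinct
    # timestamp is treated as one version.
    versions: Dict[Tuple[str, str], Set[int]] = defaultdict(set)
    for rowkey, column, timestamp in cells:
        versions[(rowkey, column)].add(timestamp)

    if not versions:
        return None

    # Pick the (rowkey, column) pair with the largest number of versions.
    (rowkey, column), stamps = max(versions.items(), key=lambda kv: len(kv[1]))
    return rowkey, column, len(stamps)


if __name__ == "__main__":
    sample = [
        ("row1", "cf:a", 1),
        ("row1", "cf:a", 2),
        ("row2", "cf:a", 1),
        ("row2", "cf:a", 2),
        ("row2", "cf:a", 3),
        ("row2", "cf:b", 1),
    ]
    print(row_with_most_versions(sample))
```

On a table of any real size this counting would have to run where the data lives (for example in a MapReduce job or, as the reply hints, a coprocessor per region server), with only the per-region maximum sent back to the client.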