RE: find size of each table in the cluster
Hello,

We are running a v0.94.9 cluster. I am seeing that 'fs -dus' reports 24TB used and 'fs -df' reports 74TB used. Does anyone know why these do not reconcile? Our replication factor is 2, so that is not a likely explanation. Shown below are results from my cluster (doctored to TB for ease of reading):

bash-4.1$ hadoop fs -dus /hbase
hdfs://host/hbase 24.5TB
bash-4.1$ hadoop fs -df /hbase
Filesystem  Size     Used    Avail   Use%
/hbase      103.8TB  74.2TB  24.3TB  71%
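A quick arithmetic check on the two figures above may help frame the question. As far as I know, 'fs -dus' reports the logical size of the files under the given path, while 'fs -df' reports raw usage for the whole filesystem, replicas included, plus anything stored outside /hbase. The sketch below only divides one number by the other; the function name implied_replication is made up for illustration.

```shell
# Ratio of raw filesystem usage (from 'fs -df') to logical size (from 'fs -dus').
# Both arguments must be in the same unit, e.g. TB.
implied_replication() {
  awk -v raw="$1" -v logical="$2" 'BEGIN { printf "%.2f\n", raw / logical }'
}

implied_replication 74.2 24.5   # prints 3.03
```

The ratio is close to 3, the HDFS default replication factor, rather than 2. That proves nothing by itself, but it may be worth checking the replication actually recorded on the files under /hbase (the second column of 'hadoop fs -ls' for files), since a file keeps the replication factor it was written with unless it is changed later.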
Re: find size of each table in the cluster
:) Output of hadoop fs -du -s -h 'tablename' matches output of "status 'detailed'" when I sum up all storefileSizeMB values. Thanks!

--
View this message in context: http://apache-hbase.679495.n3.nabble.com/find-size-of-each-table-in-the-cluster-tp4078899p4078931.html
Sent from the HBase User mailing list archive at Nabble.com.
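For anyone who wants to repeat the summing step described above, here is a minimal sketch. It assumes the "status 'detailed'" output lists each region as a quoted name ("table,startkey,id") on its own line, followed by a line of key=value attributes that includes storefileSizeMB; check that against your own output before relying on it. The function name sum_storefile_mb is hypothetical.

```shell
# Sum storefileSizeMB per table from "status 'detailed'" output read on stdin.
# Assumed input shape (verify on your cluster):
#     "mytable,,1459400000000.abcdef."
#         numberOfStores=1, ..., storefileSizeMB=300, memstoreSizeMB=1, ...
sum_storefile_mb() {
  awk '
    # Region line: the table name is the text between the opening quote
    # and the first comma.
    /^[ \t]*"/ {
      line = $0
      sub(/^[ \t]*"/, "", line)
      split(line, parts, ",")
      table = parts[1]
      next
    }
    # Attribute line: pick out the digits after "storefileSizeMB="
    # (16 characters long) and add them to the current table.
    match($0, /storefileSizeMB=[0-9]+/) {
      mb = substr($0, RSTART + 16, RLENGTH - 16)
      total[table] += mb
    }
    END { for (t in total) printf "%s\t%d\n", t, total[t] }
  ' | sort
}
```

One way to feed it would be: echo "status 'detailed'" | hbase shell | sum_storefile_mb. The totals are in whole MB per region, so small differences from the 'hadoop fs -du' figure are to be expected.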
Re: find size of each table in the cluster
bq. COMPRESSION => 'LZ4',

The answer is given by the above attribute :-)

On Thu, Mar 31, 2016 at 10:41 AM, marjana wrote:
> Sure, here's describe of one table:
>
> Table RAWHITS_AURORA-COM is ENABLED
> RAWHITS_AURORA-COM
> COLUMN FAMILIES DESCRIPTION
> {NAME => 'f1', DATA_BLOCK_ENCODING => 'FAST_DIFF', BLOOMFILTER => 'ROW',
> REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'LZ4',
> MIN_VERSIONS => '0', TTL => '216 SECONDS (2 5 DAYS)', KEEP_DELETED_CELLS => 'FALSE',
> BLOCKSIZE => '262144', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
Re: find size of each table in the cluster
Sure, here's describe of one table:

Table RAWHITS_AURORA-COM is ENABLED
RAWHITS_AURORA-COM
COLUMN FAMILIES DESCRIPTION
{NAME => 'f1', DATA_BLOCK_ENCODING => 'FAST_DIFF', BLOOMFILTER => 'ROW',
REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'LZ4',
MIN_VERSIONS => '0', TTL => '216 SECONDS (2 5 DAYS)', KEEP_DELETED_CELLS => 'FALSE',
BLOCKSIZE => '262144', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
Re: find size of each table in the cluster
bq. data is distributed on node servers,

Data is on hdfs, i.e. the Data Nodes.

bq. it gets propagated to all data nodes,

If I understand correctly, the -du command queries the namenode.

bq. Is this size compressed or uncompressed?

Can you show us the table description (output of the describe command in hbase shell)?

On Thu, Mar 31, 2016 at 8:38 AM, marjana wrote:
> Thanks all on your replies.
> This is clustered env, with 2 master nodes and 4 data nodes. Master nodes
> have these components installed (as shown in Ambari UI):
> active hbase master
> history server
> name node
> resource manager
> zookeeper server
> metrics monitor
>
> Node server has these components:
> Data Node
> region server
> metrics monitor
> node manager
>
> So I looked on my node server for the hbase.rootdir, and it points to my
> hdfs://hbasmaserserver:8020//apps/hbase/data.
> Now this is confusing to me as I thought data is distributed on node
> servers, where region servers are.
> I sshed to my masterserver and looked under this dir and did see all my
> tables in my default namespace. Example:
> $ hadoop fs -du -s -h /apps/hbase/data/data/default/RAWHITS_AURORA-COM
> 2.0 G /apps/hbase/data/data/default/RAWHITS_AURORA-COM
>
> So when I run this command on hbmaster, it gets propagated to all data
> nodes, correct? Is this size compressed or uncompressed?
>
> Many thanks!
> Marjana
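To get one figure per table instead of running -du -s once per table, the plain 'hadoop fs -du' listing of a namespace directory can be post-processed. A minimal sketch follows, assuming the first column is the size in bytes and the last column is the path (some Hadoop versions print an extra middle column, which this ignores). The function name table_sizes is made up for illustration.

```shell
# Turn 'hadoop fs -du <namespace dir>' output (read on stdin) into
# "table<TAB>size in GB" lines. Assumes column 1 is bytes and the
# last column is the full path of the table directory.
table_sizes() {
  awk 'NF >= 2 {
    n = split($NF, p, "/")                  # last path component is the table name
    printf "%s\t%.1f GB\n", p[n], $1 / (1024 * 1024 * 1024)
  }'
}
```

It could be used as: hadoop fs -du /apps/hbase/data/data/default | table_sizes. Run without -h so the sizes arrive as plain byte counts. The numbers are what the files occupy in HDFS, so for a table with COMPRESSION set they reflect the compressed store files.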
Re: find size of each table in the cluster
Hi,

this is the standard convention for the Hortonworks distribution. The first three numbers are the HBase version; the last two are the HDP version.

Cheers,
Tomasz Bem

On 2016-03-31 06:27, Ted Yu wrote:
> bq. hbase version is 1.1.1.2.3
>
> I don't think there was ever such a release - there should be only 3 dots.
>
> bq. /hbase is the default storage location for tables in hdfs
>
> the root dir is given by hbase.rootdir config parameter. Here is sample listing:
> http://pastebin.com/ekF4tsYn
>
> Under data, you would see:
>
> drwxr-xr-x - hbase hdfs 0 2016-03-22 20:26 /apps/hbase/data/data/default
> drwxr-xr-x - hbase hdfs 0 2016-03-14 19:13 /apps/hbase/data/data/hbase
>
> hbase is system namespace. Under default (or your own namespace), you would get table dir. Here is a sample:
>
> drwxr-xr-x - hbase hdfs 0 2016-03-22 20:26 /apps/hbase/data/data/default/elog_pn_split
Re: find size of each table in the cluster
bq. hbase version is 1.1.1.2.3

I don't think there was ever such a release - there should be only 3 dots.

bq. /hbase is the default storage location for tables in hdfs

The root dir is given by the hbase.rootdir config parameter. Here is a sample listing:
http://pastebin.com/ekF4tsYn

Under data, you would see:

drwxr-xr-x - hbase hdfs 0 2016-03-22 20:26 /apps/hbase/data/data/default
drwxr-xr-x - hbase hdfs 0 2016-03-14 19:13 /apps/hbase/data/data/hbase

hbase is the system namespace. Under default (or your own namespace), you would get the table dir. Here is a sample:

drwxr-xr-x - hbase hdfs 0 2016-03-22 20:26 /apps/hbase/data/data/default/elog_pn_split

On Wed, Mar 30, 2016 at 7:26 PM, Stephen Durfey wrote:
> I believe the easiest way would be to run 'hadoop dfs -du -h /hbase'. I
> believe /hbase is the default storage location for tables in hdfs. The size
> will be either compressed or uncompressed, depending upon if compression is
> enabled.
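The layout described above (root dir, then data, then namespace, then table) can be written down as a small helper, which is handy when scripting size checks. This is only a sketch of that layout as shown in the listing; the function name table_dir is hypothetical, and the root dir has to be taken from your own hbase.rootdir setting.

```shell
# Build the HDFS directory of a table from the layout in the listing above:
#   <hbase.rootdir path>/data/<namespace>/<table>
table_dir() {
  # $1 = path part of hbase.rootdir, $2 = namespace, $3 = table name
  # A trailing slash on the root dir is dropped to avoid "//" in the result.
  printf '%s/data/%s/%s\n' "${1%/}" "$2" "$3"
}

table_dir /apps/hbase/data default elog_pn_split
# prints /apps/hbase/data/data/default/elog_pn_split
```

With that in place, a single table could be sized with: hadoop fs -du -s -h "$(table_dir /apps/hbase/data default elog_pn_split)".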
Re: find size of each table in the cluster
I believe the easiest way would be to run 'hadoop dfs -du -h /hbase'. I believe /hbase is the default storage location for tables in hdfs. The size will be either compressed or uncompressed, depending upon whether compression is enabled.

On Wed, Mar 30, 2016 at 6:32 PM -0700, "marjana" wrote:
> Hello,
> I am new to hBase, so sorry if I am talking nonsense.
>
> I am trying to figure out a way how to find the total size of each table in
> my hBase.
> I have looked into hbase shell commands. There's "status 'detailed'", that
> shows storefileSizeMB. If I were to add all of these grouped by tablename,
> would that be the correct way to show MB used per table?
> Is there any other (easier/cleaner) way?
> hbase version is 1.1.1.2.3, HDFS: 2.7.1
> Thanks
> Marjana