RE: find size of each table in the cluster

2016-03-31 Thread Ted Tuttle
Hello-

We are running a v0.94.9 cluster.

I am seeing that 'fs -dus' reports 24.5TB used and 'fs -df' reports 74.2TB used.

Does anyone know why these do not reconcile? Our replication factor is 2, so
replication alone (about 49TB) does not explain the difference.

Shown below are results from my cluster (doctored to TB for ease of reading):

bash-4.1$ hadoop fs -dus /hbase
hdfs://host/hbase  24.5TB

bash-4.1$ hadoop fs -df /hbase
Filesystem  Size     Used    Avail   Use%
/hbase      103.8TB  74.2TB  24.3TB  71%
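
The size of the gap can be worked out with a little arithmetic. What follows is a minimal sketch, assuming 'fs -dus' reports logical file size before replication and 'fs -df' reports raw space used across all replicas. The function name is illustrative only.

```python
def expected_raw_usage_tb(logical_tb: float, replication: int) -> float:
    """Raw space the logical data should occupy once each block is replicated."""
    return logical_tb * replication


logical_tb = 24.5    # from 'hadoop fs -dus /hbase'
raw_used_tb = 74.2   # 'Used' column from 'hadoop fs -df /hbase'
replication = 2      # replication factor stated in the message

expected_tb = expected_raw_usage_tb(logical_tb, replication)
unexplained_tb = raw_used_tb - expected_tb

print(f"expected raw usage: {expected_tb:.1f}TB")
print(f"not explained by replication: {unexplained_tb:.1f}TB")
```

If I understand correctly, 'fs -df' describes the whole filesystem rather than only the /hbase directory, so anything stored outside /hbase would also count toward the Used column.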


Re: find size of each table in the cluster

2016-03-31 Thread marjana
:)

Output of hadoop fs -du -s -h 'tablename' matches output of "status
'detailed'" when I sum up all storefileSizeMB values.

Thanks!
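
For anyone who wants to repeat the comparison, here is a rough sketch of summing storefileSizeMB per table from saved "status 'detailed'" output. The sample text, its layout, and the numbers in it are assumptions made up for illustration; the real output may be formatted differently depending on the HBase version.

```python
import re
from collections import defaultdict

# Hypothetical excerpt of "status 'detailed'" output. The layout and the
# numbers are assumed for illustration and are not taken from a real cluster.
SAMPLE = '''
    "RAWHITS_AURORA-COM,,1459400000000.0123456789abcdef0123456789abcdef."
        numberOfStores=1, numberOfStorefiles=2, storefileUncompressedSizeMB=2100, storefileSizeMB=1024, memstoreSizeMB=0
    "RAWHITS_AURORA-COM,row500,1459400000001.fedcba9876543210fedcba9876543210."
        numberOfStores=1, numberOfStorefiles=1, storefileUncompressedSizeMB=2000, storefileSizeMB=1000, memstoreSizeMB=3
    "elog_pn_split,,1458600000000.00112233445566778899aabbccddeeff."
        numberOfStores=1, numberOfStorefiles=1, storefileUncompressedSizeMB=10, storefileSizeMB=5, memstoreSizeMB=0
'''

# A region name is quoted and begins with the table name, up to the first comma.
REGION_RE = re.compile(r'^\s*"([^,"]+),')
# Matches storefileSizeMB only, not storefileUncompressedSizeMB.
SIZE_RE = re.compile(r'\bstorefileSizeMB=(\d+)')


def storefile_mb_per_table(status_output):
    """Sum storefileSizeMB over all regions, grouped by table name."""
    totals = defaultdict(int)
    table = None
    for line in status_output.splitlines():
        region = REGION_RE.match(line)
        if region:
            table = region.group(1)
            continue
        size = SIZE_RE.search(line)
        if size and table is not None:
            totals[table] += int(size.group(1))
    return dict(totals)


totals = storefile_mb_per_table(SAMPLE)
for name, mb in sorted(totals.items()):
    print(f"{name}: {mb} MB")
```

The per-table totals can then be compared against 'hadoop fs -du -s -h' on the table directory.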



--
View this message in context: 
http://apache-hbase.679495.n3.nabble.com/find-size-of-each-table-in-the-cluster-tp4078899p4078931.html
Sent from the HBase User mailing list archive at Nabble.com.


Re: find size of each table in the cluster

2016-03-31 Thread Ted Yu
bq. COMPRESSION => 'LZ4',

The answer is given by the attribute above :-) With LZ4 compression enabled,
the size reported for the table is the compressed size.

On Thu, Mar 31, 2016 at 10:41 AM, marjana  wrote:

> Sure, here's describe of one table:
>
> Table RAWHITS_AURORA-COM is ENABLED
> RAWHITS_AURORA-COM
> COLUMN FAMILIES DESCRIPTION
> {NAME => 'f1', DATA_BLOCK_ENCODING => 'FAST_DIFF', BLOOMFILTER => 'ROW',
> REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'LZ4',
> MIN_VERSIONS => '0', TTL => '2160000 SECONDS (25 DAYS)',
> KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '262144', IN_MEMORY => 'false',
> BLOCKCACHE => 'true'}


Re: find size of each table in the cluster

2016-03-31 Thread marjana
Sure, here's the describe output for one table:

Table RAWHITS_AURORA-COM is ENABLED
RAWHITS_AURORA-COM
COLUMN FAMILIES DESCRIPTION
{NAME => 'f1', DATA_BLOCK_ENCODING => 'FAST_DIFF', BLOOMFILTER => 'ROW',
REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'LZ4',
MIN_VERSIONS => '0', TTL => '2160000 SECONDS (25 DAYS)',
KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '262144', IN_MEMORY => 'false',
BLOCKCACHE => 'true'}





Re: find size of each table in the cluster

2016-03-31 Thread Ted Yu
bq. data is distributed on node servers,

Data is on HDFS, i.e. on the DataNodes.

bq. it gets propagated to all data nodes,

If I understand correctly, the -du command queries the NameNode.

bq. Is this size compressed or uncompressed?

Can you show us the table description (output of describe command in hbase
shell) ?

On Thu, Mar 31, 2016 at 8:38 AM, marjana  wrote:

> Thanks all for your replies.
> This is a clustered environment with 2 master nodes and 4 data nodes. Master
> nodes have these components installed (as shown in the Ambari UI):
> active hbase master
> history server
> name node
> resource manager
> zookeeper server
> metrics monitor
>
> Node server has these components:
> Data Node
> region server
> metrics monitor
> node manager
>
> So I looked on my node server for the hbase.rootdir, and it points to my
> hdfs://hbasmaserserver:8020//apps/hbase/data.
> Now this is confusing to me as I thought data is distributed on node
> servers, where region servers are.
> I sshed to my masterserver and looked under this dir and did see all my
> tables in my default namespace. Example:
> $ hadoop fs -du -s -h /apps/hbase/data/data/default/RAWHITS_AURORA-COM
> 2.0 G  /apps/hbase/data/data/default/RAWHITS_AURORA-COM
>
> So when I run this command on hbmaster, it gets propagated to all data
> nodes, correct? Is this size compressed or uncompressed?
>
> Many thanks!
> Marjana


Re: find size of each table in the cluster

2016-03-30 Thread Tomasz Bem

Hi,

This is the standard convention for the Hortonworks distribution. The first
three numbers are the HBase version, the last two are the HDP version.
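
As a small illustration of that convention, here is a sketch that splits such a version string into its two parts. The function name is made up for this example, and it assumes the string always has exactly five dot-separated numbers.

```python
def split_hdp_version(version):
    """Split 'a.b.c.d.e' into (HBase version 'a.b.c', HDP version 'd.e')."""
    parts = version.split(".")
    if len(parts) != 5:
        raise ValueError(f"expected 5 dot-separated numbers, got {version!r}")
    return ".".join(parts[:3]), ".".join(parts[3:])


hbase_version, hdp_version = split_hdp_version("1.1.1.2.3")
print(f"HBase {hbase_version}, HDP {hdp_version}")
```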


Cheers
Tomasz Bem.

On 2016-03-31 06:27, Ted Yu wrote:

bq. hbase version is 1.1.1.2.3

I don't think there was ever such a release - there should be only 3 numbers
(2 dots).


Re: find size of each table in the cluster

2016-03-30 Thread Ted Yu
bq. hbase version is 1.1.1.2.3

I don't think there was ever such a release - there should be only 3 numbers
(2 dots).

bq. /hbase is the default storage location for tables in hdfs

The root dir is given by the hbase.rootdir config parameter.

Here is sample listing:

http://pastebin.com/ekF4tsYn

Under data, you would see:

drwxr-xr-x   - hbase hdfs  0 2016-03-22 20:26 /apps/hbase/data/data/default
drwxr-xr-x   - hbase hdfs  0 2016-03-14 19:13 /apps/hbase/data/data/hbase

hbase is the system namespace.

Under default (or your own namespace), you would see the table dirs. Here is a
sample:

drwxr-xr-x   - hbase hdfs  0 2016-03-22 20:26 /apps/hbase/data/data/default/elog_pn_split
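
Based on the listing above, the table directory can be put together from the root dir, the namespace and the table name. A minimal sketch, assuming the layout is <root dir>/data/<namespace>/<table> as in this listing; the helper names are made up for illustration, and the du flags are the ones used elsewhere in this thread.

```python
import posixpath


def table_dir(root_path, table, namespace="default"):
    """Build the HDFS directory of a table under the path part of hbase.rootdir."""
    return posixpath.join(root_path, "data", namespace, table)


def du_command(path):
    """Argument list for a summarized, human-readable du of one directory."""
    return ["hadoop", "fs", "-du", "-s", "-h", path]


path = table_dir("/apps/hbase/data", "elog_pn_split")
print(" ".join(du_command(path)))
```

The printed command can be run as is, or the list can be handed to subprocess.run on a host where the hadoop client is installed.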


Re: find size of each table in the cluster

2016-03-30 Thread Stephen Durfey
I believe the easiest way would be to run 'hadoop dfs -du -h /hbase'. I believe
/hbase is the default storage location for tables in hdfs. The size will be
either compressed or uncompressed, depending on whether compression is enabled.

On Wed, Mar 30, 2016 at 6:32 PM -0700, "marjana" wrote:

Hello,
I am new to HBase, so sorry if I am talking nonsense.

I am trying to figure out how to find the total size of each table in my
HBase cluster.
I have looked into the hbase shell commands. There's "status 'detailed'", which
shows storefileSizeMB. If I were to add all of these up, grouped by table name,
would that be the correct way to show MB used per table?
Is there any other (easier/cleaner) way?
hbase version is 1.1.1.2.3, HDFS: 2.7.1
Thanks
Marjana


