Answers inline. J-D
On Tue, Sep 1, 2009 at 10:53 AM, Xine Jar <[email protected]> wrote:
> Hello,
> I have a cluster of 6 nodes running Hadoop 0.19.3 and HBase 0.19.1. I have
> managed to write small programs to test the settings, and everything seems
> to be fine.
>
> I wrote a MapReduce program reading a small HBase table (100 rows, one
> column family, 6 columns) and summing some values. In my opinion the job
> is slow: it is taking 19 sec. I would like to look closer at what is going
> on, e.g. whether the table is split into tablets or not. Therefore I would
> appreciate it if someone could answer the following questions:

With that size, that's expected. MapReduce has a startup cost, and 19
seconds isn't much; you would be better off scanning your table directly.

> Q1 - Does the value of "hbase.hregion.max.filesize" in hbase-default.xml
> indicate the maximum size of a tablet in bytes?

It's the maximum size of a family (in a region) in bytes.

> Q2 - How can I know the size of the HBase table I have created? (I guess
> the "describe" command from the shell does not provide it.)

Size as in disk space? You could use the hadoop dfs -du command on your
table's folder.

> Q3 - Is there a way to know the real number of tablets constituting my
> table?

In the Master's web UI, click on the name of your table. If you want to do
that programmatically, you can do it indirectly by calling
HTable.getEndKeys(); the size of that array is the number of regions.

> Q4 - Is there a way to get more information on the tablets handled by
> each region server? (their number, the rows constituting each tablet)

In the Master's web UI, click on the region server you want info for.
Getting the number of rows inside a region can't currently be done
directly; it requires scanning between the start and end keys of the
region and counting the rows you see.

> Thank you for your help,
> CJ
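For the disk-space question, the du suggestion would look something like
the following. The path is an assumption: it takes the default
hbase.rootdir of /hbase and a hypothetical table name 'mytable', so adjust
both for your setup.

```shell
# Sum the HDFS space used by the table's files.
# Assumes hbase.rootdir is /hbase and the table is named 'mytable'.
hadoop dfs -du /hbase/mytable
```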

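The getEndKeys() approach could be sketched roughly as below, against the
0.19-era client API. The table name 'mytable' is a placeholder, and this
assumes a running cluster with hbase-site.xml and the HBase jars on the
classpath; treat it as an illustration, not a tested program.

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class RegionCount {
    public static void main(String[] args) throws Exception {
        // Connects using the hbase-site.xml found on the classpath.
        HTable table = new HTable(new HBaseConfiguration(), "mytable");
        // One end key per region, so the array length is the region count.
        byte[][] endKeys = table.getEndKeys();
        System.out.println("Number of regions: " + endKeys.length);
    }
}
```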