[ 
https://issues.apache.org/jira/browse/HBASE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13695882#comment-13695882
 ] 

Enis Soztutar commented on HBASE-8778:
--------------------------------------

bq. Where does this table's descriptor information get stored?
I don't know of any design plan for this yet. But I imagine we can store them 
in serialized form in some column. Table name becomes the row key, and 
info:tableInfo column contains serialized table info. This can mimic the META 
structure. 
bq. What does the bootstrap look like? 
META's HTD is already hardcoded in the source code. We can do the same for 
TABLE table, just hardcoding the HTD in source code. Make sure that it is 
assigned first before accepting any changes to table metadata. I guess we have 
to assign META first, then TABLE table, then user regions. 
bq. What happens if this table is corrupted somehow? Currently, hbck can 
recreate META entries if they are lost, but is there any other place to recover 
table descriptor information?
We keep regionInfo's serialized in hdfs and in META. HBCK can repair the META 
regionInfos from hdfs. Maybe we can do the same here for tableinfos as well. We 
can keep the current logic in hdfs tableInfos, but serialize them in TABLE 
table as well.  I think having multiple sources of truth hurts us though. 
Ideally, we should not need to keep this data in multiple places. 
                
> Region assigments scan table directory making them slow for huge tables
> -----------------------------------------------------------------------
>
>                 Key: HBASE-8778
>                 URL: https://issues.apache.org/jira/browse/HBASE-8778
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Dave Latham
>         Attachments: HBASE-8778-0.94.5.patch, HBASE-8778-0.94.5-v2.patch
>
>
> On a table with 130k regions it takes about 3 seconds for a region server to 
> open a region once it has been assigned.
> Watching the threads for a region server running 0.94.5 that is opening many 
> such regions shows the thread opening the reigon in code like this:
> {noformat}
> "PRI IPC Server handler 4 on 60020" daemon prio=10 tid=0x00002aaac07e9000 
> nid=0x6566 runnable [0x000000004c46d000]
>    java.lang.Thread.State: RUNNABLE
>         at java.lang.String.indexOf(String.java:1521)
>         at java.net.URI$Parser.scan(URI.java:2912)
>         at java.net.URI$Parser.parse(URI.java:3004)
>         at java.net.URI.<init>(URI.java:736)
>         at org.apache.hadoop.fs.Path.initialize(Path.java:145)
>         at org.apache.hadoop.fs.Path.<init>(Path.java:126)
>         at org.apache.hadoop.fs.Path.<init>(Path.java:50)
>         at 
> org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(HdfsFileStatus.java:215)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem.makeQualified(DistributedFileSystem.java:252)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:311)
>         at 
> org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:159)
>         at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:842)
>         at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:867)
>         at org.apache.hadoop.hbase.util.FSUtils.listStatus(FSUtils.java:1168)
>         at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:269)
>         at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:255)
>         at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoModtime(FSTableDescriptors.java:368)
>         at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:155)
>         at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:126)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2834)
>         at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2807)
>         at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
>         at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> {noformat}
> To open the region, the region server first loads the latest 
> HTableDescriptor.  Since HBASE-4553 HTableDescriptor's are stored in the file 
> system at "/hbase/<tableDir>/.tableinfo.<sequenceNum>".  The file with the 
> largest sequenceNum is the current descriptor.  This is done so that the 
> current descirptor is updated atomically.  However, since the filename is not 
> known in advance FSTableDescriptors it has to do a FileSystem.listStatus 
> operation which has to list all files in the directory to find it.  The 
> directory also contains all the region directories, so in our case it has to 
> load 130k FileStatus objects.  Even using a globStatus matching function 
> still transfers all the objects to the client before performing the pattern 
> matching.  Furthermore HDFS uses a default of transferring 1000 directory 
> entries in each RPC call, so it requires 130 roundtrips to the namenode to 
> fetch all the directory entries.
> Consequently, to reassign all the regions of a table (or a constant fraction 
> thereof) requires time proportional to the square of the number of regions.
> In our case, if a region server fails with 200 such regions, it takes 10+ 
> minutes for them all to be reassigned, after the zk expiration and log 
> splitting.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to