[ https://issues.apache.org/jira/browse/HBASE-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054642#comment-13054642 ]
Gary Helmling commented on HBASE-451:
-------------------------------------

TestTableMapReduce seems to be hanging consistently on a master abort. From the logs, it looks like it's related to these changes, particularly:

{noformat}
2011-06-24 12:30:29,159 INFO [PRI IPC Server handler 9 on 42359] regionserver.HRegionServer(2300): Received request to open 25 region(s)
2011-06-24 12:30:29,159 INFO [PRI IPC Server handler 9 on 42359] regionserver.HRegionServer(2283): Received request to open region: mrtest,,1308943786187.294c624b4dca488152adc33fb47ffba6.
2011-06-24 12:30:29,166 DEBUG [PRI IPC Server handler 9 on 42359] util.FSUtils(943): Exception during readTableDecriptor. Current table name = mrtest
java.io.IOException: Cannot open filename /user/ghelmling/mrtest/.tableinfo
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1527)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1518)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:384)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:178)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:356)
    at org.apache.hadoop.hbase.util.FSUtils.getTableDescriptor(FSUtils.java:953)
    at org.apache.hadoop.hbase.util.FSUtils.getTableDescriptor(FSUtils.java:938)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:130)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:99)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2286)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegions(HRegionServer.java:2301)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:312)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1065)
2011-06-24 12:30:29,169 FATAL [ghelmling-laptop.local,52891,1308943821042-StartupBulkAssigner-0] master.HMaster(1198): Uncaught exception in ghelmling-laptop.local,52891,1308943821042-StartupBulkAssigner-0
java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.TableExistsException: No descriptor for mrtest
    at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:134)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:99)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2286)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegions(HRegionServer.java:2301)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:312)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1065)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1088)
    at org.apache.hadoop.hbase.master.AssignmentManager$SingleServerBulkAssigner.run(AssignmentManager.java:1654)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.TableExistsException: No descriptor for mrtest
    at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:134)
    at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:99)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2286)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegions(HRegionServer.java:2301)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:312)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1065)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:837)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:141)
    at $Proxy10.openRegions(Unknown Source)
    at org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:422)
    at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1075)
    ... 4 more
2011-06-24 12:30:29,170 INFO [ghelmling-laptop.local,52891,1308943821042-StartupBulkAssigner-0] master.HMaster(1323): Aborting
2011-06-24 12:30:29,263 INFO [ghelmling-laptop.local,52891,1308943821042.splitLogManagerTimeoutMonitor] hbase.Chore(79): ghelmling-laptop.local,52891,1308943821042.splitLogManagerTimeoutMonitor exiting
{noformat}

Eventually this causes a region server to spin when the cluster attempts to shut down:

{noformat}
2011-06-24 12:31:09,767 DEBUG [RegionServer:0;ghelmling-laptop.local,42359,1308943821244] regionserver.HRegionServer(1485): No master found; retry
2011-06-24 12:31:10,768 DEBUG [RegionServer:0;ghelmling-laptop.local,42359,1308943821244] regionserver.HRegionServer(1485): No master found; retry
2011-06-24 12:31:11,769 DEBUG [RegionServer:0;ghelmling-laptop.local,42359,1308943821244] regionserver.HRegionServer(1485): No master found; retry
2011-06-24 12:31:12,769 DEBUG [RegionServer:0;ghelmling-laptop.local,42359,1308943821244] regionserver.HRegionServer(1485): No master found; retry
{noformat}

> Remove HTableDescriptor from HRegionInfo
> ----------------------------------------
>
>                 Key: HBASE-451
>                 URL: https://issues.apache.org/jira/browse/HBASE-451
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, regionserver
>    Affects Versions: 0.2.0
>            Reporter: Jim Kellerman
>            Assignee: Subbu M Iyer
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: 451_support_for_removing_HTD_from_HRI_trunk.txt,
> HBASE-451-Fixed_broken_TestAdmin.patch,
> HBASE-451-Fixed_broken_TestAdmin1.patch,
> HBASE-451_-_First_draft_support_for_removing_HTD_from_HRI1.patch,
> HBASE-451_-_Fourth_draft_support_for_removing_HTD_from_HRI.patch,
> HBASE-451_-_Second_draft_-_Remove_HTD_from_HRI.patch, descriptors.txt,
> fixtestadmin.txt, pass_htd_on_region_construction.txt
>
>
> There is an HRegionInfo for every region in HBase. Currently HRegionInfo also
> contains the HTableDescriptor (the schema). That means we store the schema n
> times where n is the number of regions in the table.
> Additionally, for every region of the same table that the region server has
> open, there is a copy of the schema. Thus it is stored in memory once for
> each open region.
> If HRegionInfo merely contained the table name the HTableDescriptor could be
> stored in a separate file and easily found.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
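
The proposal in the issue description (each region record carries only the table name, and the schema is stored once per table and looked up by that name) can be illustrated with a minimal sketch. This is not HBase code: `TableSchema`, `RegionRecord`, `SchemaStore`, and the column family name `contents` are hypothetical stand-ins invented for the example, and the exception type is an arbitrary choice.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; none of these types are HBase classes. */
public class DescriptorLookupSketch {

    /** Hypothetical stand-in for the table schema (HTableDescriptor in the issue). */
    static final class TableSchema {
        final String tableName;
        final String[] families;

        TableSchema(String tableName, String... families) {
            this.tableName = tableName;
            this.families = families;
        }
    }

    /** Hypothetical stand-in for a region record that carries only the table name. */
    static final class RegionRecord {
        final String tableName;
        final String startKey;

        RegionRecord(String tableName, String startKey) {
            this.tableName = tableName;
            this.startKey = startKey;
        }
    }

    /** Hypothetical store that holds exactly one schema per table. */
    static final class SchemaStore {
        private final Map<String, TableSchema> byTable = new HashMap<String, TableSchema>();

        void put(TableSchema schema) {
            byTable.put(schema.tableName, schema);
        }

        TableSchema get(String tableName) {
            TableSchema schema = byTable.get(tableName);
            if (schema == null) {
                // With the schema stored separately, a lookup can now fail.
                throw new IllegalStateException("No descriptor for " + tableName);
            }
            return schema;
        }

        int size() {
            return byTable.size();
        }
    }

    public static void main(String[] args) {
        SchemaStore store = new SchemaStore();
        store.put(new TableSchema("mrtest", "contents"));

        // Two regions of the same table share a single stored schema.
        RegionRecord first = new RegionRecord("mrtest", "");
        RegionRecord second = new RegionRecord("mrtest", "bbb");

        System.out.println(store.get(first.tableName) == store.get(second.tableName));
        System.out.println(store.size());
    }
}
```

The trade-off the sketch makes visible is that opening a region now depends on a separate lookup succeeding. The log in the comment above shows that kind of failure: the `.tableinfo` file could not be opened and the region open was rejected with "No descriptor for mrtest".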