[ 
https://issues.apache.org/jira/browse/HBASE-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Kennedy updated HBASE-3445:
---------------------------------

    Fix Version/s:     (was: 0.90.1)
                   0.90.0

Actually, let me qualify that last statement.  By "swallow" I didn't mean to 
imply that the exceptions should be completely silent. In fact, some WARN output 
in that CatalogTracker exception handling would make sense. 

Something like:

"Unable to connect to .meta region at 192.168.1.2:60020. Waiting for 
RegionServers to update location data."
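
To make the suggestion concrete, here is a rough, standalone sketch of the 
handling I have in mind. It is not the attached patch and not the real 
CatalogTracker code: the class MetaVerifySketch, the MetaConnector interface 
and both method names are hypothetical stand-ins, and java.util.logging is used 
only to keep the example self-contained. The idea is simply that a connect 
timeout is logged at WARN and reported as "location not verified" rather than 
propagating up as a fatal error.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.SocketTimeoutException;
import java.util.logging.Logger;

public class MetaVerifySketch {
    private static final Logger LOG =
        Logger.getLogger(MetaVerifySketch.class.getName());

    /** Hypothetical stand-in for the RPC connection attempt to a region server. */
    interface MetaConnector {
        void connect(String hostAndPort) throws IOException;
    }

    /** Builds the WARN text proposed in this comment. */
    static String warnMessage(String hostAndPort) {
        return "Unable to connect to .meta region at " + hostAndPort
            + ". Waiting for RegionServers to update location data.";
    }

    /**
     * Returns true if the cached location answered, false if the connect
     * timed out. A timeout is logged at WARN instead of being rethrown,
     * so the caller can keep waiting for fresh location data.
     */
    static boolean verifyMetaLocation(MetaConnector connector, String hostAndPort) {
        try {
            connector.connect(hostAndPort);
            return true;
        } catch (SocketTimeoutException e) {
            LOG.warning(warnMessage(hostAndPort));
            return false;
        } catch (IOException e) {
            // Other I/O failures are outside the scope of this sketch.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Simulate a stale address that never answers.
        boolean ok = verifyMetaLocation(
            hostAndPort -> { throw new SocketTimeoutException("connect timed out"); },
            "192.168.1.2:60020");
        System.out.println("location verified: " + ok);
    }
}
```

Whether false should then lead to a retry loop or a reassignment of the region 
is a separate decision that this sketch does not try to answer.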

> Master crashes on data that was moved from different host
> ---------------------------------------------------------
>
>                 Key: HBASE-3445
>                 URL: https://issues.apache.org/jira/browse/HBASE-3445
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.0
>            Reporter: James Kennedy
>            Priority: Critical
>             Fix For: 0.90.0
>
>         Attachments: 3445_0.90.0.patch
>
>
> While testing an upgrade to 0.90.0 RC3 I noticed that if I seeded our test 
> data on one machine and transferred it to another machine, the HMaster on the 
> new machine dies on startup.
> Based on the following stack trace it looks as though it is attempting to 
> find the .meta region with the ip address of the original machine.  Instead 
> of waiting around for RegionServers to register with new location data, 
> HMaster throws its hands up with a FATAL exception.
> Note that deleting the zookeeper dir makes no difference.
> Also note that so far I have only reproduced this in my own environment using 
> the hbase-trx extension of HBase and an ApplicationStarter that starts the 
> Master and RegionServer together in the same JVM.  While the issue seems 
> likely to be independent of those factors, it is far from a vanilla HBase 
> environment.
> I will spend some time trying to reproduce the issue in a proper hbase test.  
> But perhaps someone can beat me to it?  How do I simulate the IP switch? May 
> require a data.tar upload. 
> [14/01/11 10:45:20] 6396   [     Thread-298] ERROR 
> server.quorum.QuorumPeerConfig  - Invalid configuration, only one server 
> specified (ignoring)
> [14/01/11 10:45:21] 7178   [           main] INFO  
> ion.service.HBaseRegionService  - troove> region port:       60010
> [14/01/11 10:45:21] 7180   [           main] INFO  
> ion.service.HBaseRegionService  - troove> region interface:  
> org.apache.hadoop.hbase.ipc.IndexedRegionInterface
> [14/01/11 10:45:21] 7180   [           main] INFO  
> ion.service.HBaseRegionService  - troove> root dir: 
> hdfs://localhost:8701/hbase
> [14/01/11 10:45:21] 7180   [           main] INFO  
> ion.service.HBaseRegionService  - troove> Initializing region server.
> [14/01/11 10:45:21] 7631   [           main] INFO  
> ion.service.HBaseRegionService  - troove> Starting region server thread.
> [14/01/11 10:46:54] 100764 [        HMaster] FATAL 
> he.hadoop.hbase.master.HMaster  - Unhandled exception. Starting shutdown.
> java.net.SocketTimeoutException: 20000 millis timeout while waiting for 
> channel to be ready for connect. ch : 
> java.nio.channels.SocketChannel[connection-pending 
> remote=192.168.1.102/192.168.1.102:60020]
>       at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:213)
>       at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
>       at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:311)
>       at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:865)
>       at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:732)
>       at 
> org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:258)
>       at $Proxy14.getProtocolVersion(Unknown Source)
>       at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
>       at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
>       at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
>       at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
>       at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:954)
>       at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:384)
>       at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:283)
>       at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:478)
>       at 
> org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:435)
>       at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:382)
>       at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:277)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
