[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256210#comment-13256210
 ] 

Lars Hofhansl commented on HBASE-5545:
--------------------------------------

2nd patch looks good. I don't think you needed to the add the FSUtil code, but 
it cannot hurt.
fs.delete does not throw if the file does not exist, right?
Do you want to set recursive to false (just in case somebody changes this 
around ends up pointing to a directory)... This is a super minor nit.

+1 on commit if delete does not throw on non-existant file.
                
> region can't be opened for a long time. Because the creating File failed.
> -------------------------------------------------------------------------
>
>                 Key: HBASE-5545
>                 URL: https://issues.apache.org/jira/browse/HBASE-5545
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.90.6
>            Reporter: gaojinchao
>            Assignee: gaojinchao
>             Fix For: 0.90.7, 0.92.2, 0.94.0
>
>         Attachments: HBASE-5545.patch, HBASE-5545.patch
>
>
> Scenario:
> ------------
> 1. File is created 
> 2. But while writing data, all datanodes might have crashed. So writing data 
> will fail.
> 3. Now even if close is called in finally block, close also will fail and 
> throw the Exception because writing data failed.
> 4. After this if RS try to create the same file again, then 
> AlreadyBeingCreatedException will come.
> Suggestion to handle this scenario.
> ---------------------------
> 1. Check for the existence of the file, if exists delete the file and create 
> new file. 
> Here delete call for the file will not check whether the file is open or 
> closed.
> Overwrite Option:
> ----------------
> 1. Overwrite option will be applicable if you are trying to overwrite a 
> closed file.
> 2. If the file is not closed, then even with overwrite option Same 
> AlreadyBeingCreatedException will be thrown.
> This is the expected behaviour to avoid the Multiple clients writing to same 
> file.
> Region server logs:
> org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to 
> create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo 
> for 
> DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 
> on client 158.1.132.19 because current leaseholder is trying to recreate file.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658)
> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131)
> at org.apache.hadoop.ipc.Client.call(Client.java:961)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245)
> at $Proxy6.create(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at $Proxy6.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3643)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424)
> at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> [2012-03-07 20:51:45,858] [WARN ] 
> [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] 
> [com.huawei.isap.ump.ha.client.RPCRetryAndSwitchInvoker 131] Retrying the 
> method call: public abstract void 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.create(java.lang.String,org.apache.hadoop.fs.permission.FsPermission,java.lang.String,boolean,boolean,short,long)
>  throws java.io.IOException with arguments of length: 7. The exisiting 
> ActiveServerConnection is:
> ActiveServerConnectionInfo:
> Metadata:158-1-131-48/158.1.132.19:9000
> Version:145720623220907
> [2012-03-07 20:51:45,872] [DEBUG] 
> [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-20] 
> [org.apache.hadoop.hbase.zookeeper.ZKAssign 849] 
> regionserver:20020-0x135ec32b39e0002-0x135ec32b39e0002 Successfully 
> transitioned node 91bf3e6f8adb2e4b335f061036353126 from M_ZK_REGION_OFFLINE 
> to RS_ZK_REGION_OPENING
> [2012-03-07 20:51:45,873] [DEBUG] 
> [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-20] 
> [org.apache.hadoop.hbase.regionserver.HRegion 2649] Opening region: REGION => 
> {NAME => 'test1,00088613810,1331112369360.91bf3e6f8adb2e4b335f061036353126.', 
> STARTKEY => '00088613810', ENDKEY => '00088613815', ENCODED => 
> 91bf3e6f8adb2e4b335f061036353126, TABLE => {{NAME => 'test1', FAMILIES => 
> [{NAME => 'value', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS 
> => '3', COMPRESSION => 'GZ', TTL => '86400', BLOCKSIZE => '655360', IN_MEMORY 
> => 'false', BLOCKCACHE => 'true'}]}}
> [2012-03-07 20:51:45,873] [DEBUG] 
> [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-20] 
> [org.apache.hadoop.hbase.regionserver.HRegion 316] Instantiated 
> test1,00088613810,1331112369360.91bf3e6f8adb2e4b335f061036353126.
> [2012-03-07 20:51:45,874] [ERROR] 
> [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-20] [ 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to