[ https://issues.apache.org/jira/browse/HCATALOG-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13466621#comment-13466621 ]

Francis Liu commented on HCATALOG-513:
--------------------------------------

Changed the patch to use mapred commitJob() instead:

{code}
+            for (JobContext context : contextDiscoveredByPath.values()) {
+                new JobConf(context.getConfiguration()).getOutputCommitter().commitJob(context);
+            }
{code}
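For readers following along, here is a minimal self-contained sketch of what that loop does. Assumptions: contextDiscoveredByPath maps each dynamic-partition output path to the JobContext it was written with (the String key type is a guess from the name); everything else is the stock Hadoop mapred API, not HCatalog code.

{code}
import java.io.IOException;
import java.util.Map;

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.OutputCommitter;
import org.apache.hadoop.mapreduce.JobContext;

public final class DynamicPartitionCommitSketch {

    // Commit every discovered partition once, at job level, so no single
    // map task ever runs the cleanup that deletes the shared _temporary
    // directory out from under the others.
    static void commitAll(Map<String, JobContext> contextDiscoveredByPath)
            throws IOException {
        for (JobContext context : contextDiscoveredByPath.values()) {
            // JobConf.getOutputCommitter() returns the old-API
            // (org.apache.hadoop.mapred) committer configured for this
            // context -- FileOutputCommitter unless overridden.
            OutputCommitter committer =
                    new JobConf(context.getConfiguration()).getOutputCommitter();
            committer.commitJob(context);
        }
    }
}
{code}

The detour through JobConf is what picks up the mapred-API committer, so each partition's output is committed once per job instead of in whichever map task happens to finish first.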
                
> Data Store onto HCatalog table fails for dynamic partitioning as the 
> temporary directory gets deleted by the completed map tasks
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HCATALOG-513
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-513
>             Project: HCatalog
>          Issue Type: Bug
>    Affects Versions: 0.4, 0.5
>            Reporter: Arup Malakar
>            Assignee: Arup Malakar
>         Attachments: HCATALOG-513-branch-0.4-1.patch, 
> HCATALOG-513-branch-0.4.patch, HCATALOG-513-trunk-1.patch, 
> HCATALOG-513-trunk.patch
>
>
> When dynamic partitioning is used, the map tasks share the same temporary 
> directory. So with large data sets, the first map task to finish removes its 
> temporary directory in HDFS, making the directory unavailable to the tasks 
> that are still running or about to run (a local-filesystem analogue of this 
> race is sketched below, after the stack trace).
> The following exception is thrown:
> {code}
> 2012-09-26 23:17:57,628 ERROR 
> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException 
> as:malakar cause:org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on 
> /user/hive/warehouse/page_views_2000000000_0/_DYN0.8928740061409304/action=0/_temporary/_attempt_201208301839_0165_m_000001_0/part-m-00001
>  File does not exist. Holder DFSClient_attempt_201208301839_0165_m_000001_0 
> does not have any open files.
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1631)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1622)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:1677)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:1665)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:718)
>       at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
> 2012-09-26 23:17:57,629 WARN org.apache.hadoop.mapred.Child: Error running 
> child
> org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on 
> /user/hive/warehouse/page_views_2000000000_0/_DYN0.8928740061409304/action=0/_temporary/_attempt_201208301839_0165_m_000001_0/part-m-00001
>  File does not exist. Holder DFSClient_attempt_201208301839_0165_m_000001_0 
> does not have any open files.
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1631)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1622)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:1677)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:1665)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:718)
>       at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
>       at org.apache.hadoop.ipc.Client.call(Client.java:1070)
>       at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>       at $Proxy7.complete(Unknown Source)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>       at $Proxy7.complete(Unknown Source)
>       at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3900)
>       at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3815)
>       at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
>       at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
>       at org.apache.hadoop.hive.ql.io.RCFile$Writer.close(RCFile.java:1033)
>       at 
> org.apache.hadoop.hive.ql.io.RCFileOutputFormat$1.close(RCFileOutputFormat.java:92)
>       at 
> org.apache.hcatalog.mapreduce.FileRecordWriterContainer.close(FileRecordWriterContainer.java:141)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.close(PigOutputFormat.java:149)
>       at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>       at org.apache.hadoop.mapred.Child.main(Child.java:249)
> {code}
>  
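The race the report describes is per-task cleanup of a directory that is shared across tasks. As a hypothetical plain-JVM analogue (local filesystem and java.nio only, not HCatalog or HDFS code; the NoSuchFileException here plays the role of the LeaseExpiredException above):

{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.concurrent.CountDownLatch;
import java.util.stream.Stream;

// Two "map tasks" share one _temporary directory; the first to finish
// deletes it, and the slower task can no longer create its output file.
public class SharedTempDirRace {
    public static void main(String[] args) throws Exception {
        Path temporary = Files.createTempDirectory("_temporary");
        CountDownLatch cleanedUp = new CountDownLatch(1);

        Thread fastTask = new Thread(() -> {
            try {
                Files.write(temporary.resolve("part-m-00000"),
                        "done".getBytes(StandardCharsets.UTF_8));
                // Per-task cleanup of the *shared* directory: the bug.
                try (Stream<Path> paths = Files.walk(temporary)) {
                    paths.sorted(Comparator.reverseOrder())
                         .forEach(p -> p.toFile().delete());
                }
            } catch (IOException e) {
                throw new RuntimeException(e);
            } finally {
                cleanedUp.countDown();
            }
        });

        Thread slowTask = new Thread(() -> {
            try {
                cleanedUp.await(); // still "running" when fastTask cleans up
                // The shared directory is gone, so this create fails with
                // NoSuchFileException, the local stand-in for the HDFS error.
                Files.write(temporary.resolve("part-m-00001"),
                        "late".getBytes(StandardCharsets.UTF_8));
            } catch (Exception e) {
                System.err.println("slow task failed: " + e);
            }
        });

        fastTask.start();
        slowTask.start();
        fastTask.join();
        slowTask.join();
    }
}
{code}

Job-level commit, as in the snippet at the top of this comment, avoids exactly this: no task deletes the shared directory while sibling tasks are still writing under it.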
