[
https://issues.apache.org/jira/browse/HIVE-2485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
J. Chris Folsom updated HIVE-2485:
----------------------------------
Attachment: CopyTask.java
A version of CopyTask that closes the source and destination FileSystem instances after opening them.
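For reference, the change described above boils down to the pattern below (a minimal sketch of the described fix under the assumption that the copy goes through FileUtil.copy; it is not the attached patch, and the copyFromTo helper name is illustrative only). Closing in a finally block guarantees the handles are released even when the copy throws.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FileUtil;
    import org.apache.hadoop.fs.Path;

    public class CopyTaskSketch {
      // Copy a single file and close both FileSystem handles when done,
      // even if the copy itself fails.
      public static void copyFromTo(Path src, Path dst, Configuration conf)
          throws IOException {
        FileSystem srcFs = src.getFileSystem(conf);
        FileSystem dstFs = dst.getFileSystem(conf);
        try {
          // FileUtil.copy opens and closes the underlying streams itself.
          if (!FileUtil.copy(srcFs, src, dstFs, dst,
                             false /* deleteSource */, true /* overwrite */, conf)) {
            throw new IOException("Copy failed from " + src + " to " + dst);
          }
        } finally {
          try {
            srcFs.close();
          } finally {
            dstFs.close();
          }
        }
      }
    }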
> LOAD DATA LOCAL Leaves Files Hanging Open
> -----------------------------------------
>
> Key: HIVE-2485
> URL: https://issues.apache.org/jira/browse/HIVE-2485
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Affects Versions: 0.7.1
> Environment: Linux - Hadoop 0.20.204, Hive 0.7.0, Hive 0.7.1, Hive 0.8.0, trunk
> Reporter: J. Chris Folsom
> Attachments: CopyTask.java
>
>
> When loading multiple files using statements such as:
> DROP TABLE partition_test;
> CREATE TABLE partition_test (key INT, value STRING) PARTITIONED BY (ts STRING) STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH 'file:/myfile.txt' OVERWRITE INTO TABLE partition_test PARTITION(ts='1');
> LOAD DATA LOCAL INPATH 'file:/myfile2.txt' OVERWRITE INTO TABLE partition_test PARTITION(ts='2');
> LOAD DATA LOCAL INPATH 'file:/myfile3.txt' OVERWRITE INTO TABLE partition_test PARTITION(ts='3');
> and so on, up to roughly 250 loads, the Hive CLI fails with the error:
> Failed with exception null
> hive.log:
> 2011-10-05 15:07:11,899 WARN hdfs.DFSClient (DFSClient.java:processDatanodeError(2667)) - Error Recovery for block blk_-7990368440974156305_1721 bad datanode[0] nodes == null
> 2011-10-05 15:07:11,899 WARN hdfs.DFSClient (DFSClient.java:processDatanodeError(2695)) - Could not get block locations. Source file "/tmp/hive-cfolsom/hive_2011-10-05_15-06-47_812_4664863850423838867/-ext-10000/kv1.txt" - Aborting...
> 2011-10-05 15:07:11,900 ERROR exec.Task (SessionState.java:printError(365)) - Failed with exception null
> java.io.EOFException
> at java.io.DataInputStream.readShort(DataInputStream.java:298)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:3065)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2988)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2260)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2451)
> 2011-10-05 15:07:11,901 ERROR hdfs.DFSClient (DFSClient.java:close(1125)) - Exception closing file /tmp/hive-cfolsom/hive_2011-10-05_15-06-47_812_4664863850423838867/-ext-10000/kv1.txt : java.io.EOFException
> java.io.EOFException
> at java.io.DataInputStream.readShort(DataInputStream.java:298)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:3065)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2988)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2260)
> at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2451)
> 2011-10-05 15:07:12,002 ERROR ql.Driver (SessionState.java:printError(365)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.CopyTask
> Hadoop datanode log: there are too many open files:
> java.io.FileNotFoundException: /tmp/hadoop-hadoop/dfs/data/current/subdir4/blk_207900366872942737 (Too many open files)
> at java.io.RandomAccessFile.open(Native Method)
> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
> at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockInputStream(FSDataset.java:862)
> at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:166)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:189)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:99)
> at java.lang.Thread.run(Thread.java:619)
> 2011-10-05 15:06:14,532 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(127.0.0.1:50010, storageID=DS-1718046113-127.0.0.1-50010-1317838257821, infoPort=50075, ipcPort=50020):DataXceiver
> java.io.FileNotFoundException: /tmp/hadoop-hadoop/dfs/data/current/subdir4/blk_207900366872942737 (Too many open files)
> at java.io.RandomAccessFile.open(Native Method)
> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
> at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockInputStream(FSDataset.java:862)
> at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:166)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:189)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:99)
> at java.lang.Thread.run(Thread.java:619)
> 2011-10-05 15:06:14,772 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to localhost/127.0.0.1:9000 failed on local exception: java.io.IOException: Too many open files
> at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
> at org.apache.hadoop.ipc.Client.call(Client.java:1033)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
> at $Proxy5.sendHeartbeat(Unknown Source)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:853)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1356)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.IOException: Too many open files
> at sun.nio.ch.EPollArrayWrapper.epollCreate(Native Method)
> at sun.nio.ch.EPollArrayWrapper.<init>(EPollArrayWrapper.java:69)
> at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:52)
> at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:18)
> at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.get(SocketIOWithTimeout.java:407)
> at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:322)
> at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
> at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
> at java.io.FilterInputStream.read(FilterInputStream.java:116)
> at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:343)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
> at java.io.DataInputStream.readInt(DataInputStream.java:370)
> at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:767)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:712)
> 2011-10-05 15:06:17,535 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(127.0.0.1:50010, storageID=DS-1718046113-127.0.0.1-50010-1317838257821, infoPort=50075, ipcPort=50020):Got exception while serving blk_207900366872942737_1719 to /127.0.0.1:
> java.io.FileNotFoundException: /tmp/hadoop-hadoop/dfs/data/current/subdir4/blk_207900366872942737_1719.meta (Too many open files)
> at java.io.FileInputStream.open(Native Method)
> at java.io.FileInputStream.<init>(FileInputStream.java:106)
> at org.apache.hadoop.hdfs.server.datanode.FSDataset.getMetaDataInputStream(FSDataset.java:751)
> at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:97)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:189)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:99)
> at java.lang.Thread.run(Thread.java:619)
> 2011-10-05 15:06:17,535 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(127.0.0.1:50010, storageID=DS-1718046113-127.0.0.1-50010-1317838257821, infoPort=50075, ipcPort=50020):DataXceiver
> java.io.FileNotFoundException: /tmp/hadoop-hadoop/dfs/data/current/subdir4/blk_207900366872942737_1719.meta (Too many open files)
> at java.io.FileInputStream.open(Native Method)
> at java.io.FileInputStream.<init>(FileInputStream.java:106)
> at org.apache.hadoop.hdfs.server.datanode.FSDataset.getMetaDataInputStream(FSDataset.java:751)
> at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:97)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:189)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:99)
> at java.lang.Thread.run(Thread.java:619)
> 2011-10-05 15:06:18,771 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s).
> 2011-10-05 15:06:19,772 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s).
> 2011-10-05 15:06:20,539 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(127.0.0.1:50010, storageID=DS-1718046113-127.0.0.1-50010-1317838257821, infoPort=50075, ipcPort=50020):Got exception while serving blk_207900366872942737_1719 to /127.0.0.1:
> java.io.FileNotFoundException: /tmp/hadoop-hadoop/dfs/data/current/subdir4/blk_207900366872942737_1719.meta (Too many open files)
> at java.io.FileInputStream.open(Native Method)
> at java.io.FileInputStream.<init>(FileInputStream.java:106)
> at org.apache.hadoop.hdfs.server.datanode.FSDataset.getMetaDataInputStream(FSDataset.java:751)
> at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:97)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:189)
> at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:99)
> at java.lang.Thread.run(Thread.java:619)
> A quick lsof reveals that the datanode pid has 1092 open files.
> At first, I suspected that this was because the CopyTask was opening instances of FileSystem that it did not close, but updating CopyTask to make sure that the source and destination FileSystems were closed did not resolve the issue. I suspect that some other task has left files hanging open elsewhere. In any case, this causes more than a bit of a problem, as the datanode is completely unavailable after a number of file loads. The process is still running, but the node is effectively dead.
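> One possible explanation for why closing the FileSystems in CopyTask makes no difference (an assumption on my part, not something verified against this trace): FileSystem.get() and Path.getFileSystem() return a JVM-wide cached instance by default, so a per-task close() either tears down a handle shared with the rest of the session or is simply recreated by the next get(), and the descriptors that actually pile up would then belong to streams opened and never closed somewhere else. A small sketch of that caching behaviour (the path below is hypothetical):
>
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.fs.FSDataOutputStream;
>     import org.apache.hadoop.fs.FileSystem;
>     import org.apache.hadoop.fs.Path;
>
>     public class FsCacheSketch {
>       public static void main(String[] args) throws Exception {
>         Configuration conf = new Configuration();
>
>         // Both calls return the same cached FileSystem object for a given
>         // scheme/authority/user, so closing it affects every caller in the JVM.
>         FileSystem fs1 = FileSystem.get(conf);
>         FileSystem fs2 = FileSystem.get(conf);
>         System.out.println("same instance: " + (fs1 == fs2)); // true by default
>
>         // Leaked descriptors typically belong to streams like this one when
>         // close() is never reached, not to the FileSystem object itself.
>         Path p = new Path("/tmp/hive-2485-example.txt"); // hypothetical path
>         FSDataOutputStream out = fs1.create(p, true /* overwrite */);
>         try {
>           out.writeBytes("example\n");
>         } finally {
>           out.close();
>         }
>       }
>     }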
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira