Re: FileSystem Closed.
Thank you. I've been researching based on your opinions and found the two solutions below. They may be the answer for anyone who has the FileSystem-closed issue like me:

- Do not close the FileSystem in your cleanup method when you have JVM reuse turned on (mapred.job.reuse.jvm.num.tasks).
- Set fs.hdfs.impl.disable.cache to true in the conf, and new instances don't get cached.

Do you think they will work on my problem?

2012/7/12 Aniket Mokashi aniket...@gmail.com:
> Can you share your query and use case?
> ~Aniket
>
> On Tue, Jul 10, 2012 at 9:39 AM, Harsh J ha...@cloudera.com wrote:
>> This appears to be a Hive issue (something probably called FS.close() too early?). Redirecting to the Hive user lists as they can help better with this.
>>
>> On Tue, Jul 10, 2012 at 9:59 PM, 안의건 ahneui...@gmail.com wrote:
>>> Hello. I have a problem with the filesystem closing: the FileSystem was closed while a Hive query was running. It is a 'select' query and the data size is about 1 TB. I'm using hadoop-0.20.2 and hive-0.7.1. The error log says that a tmp file was not deleted, or that a tmp path exception occurred. Is there any hadoop configuration I'm missing?
>>>
>>> Thank you
>>>
>>> [stderr logs: the same "Filesystem closed" stack trace as in the original message below]
>>
>> --
>> Harsh J
>
> --
> ...:::Aniket:::... Quetzalco@tl
FileSystem Closed.
Hello. I have a problem with the filesystem closing: the FileSystem was closed while a Hive query was running. It is a 'select' query and the data size is about 1 TB. I'm using hadoop-0.20.2 and hive-0.7.1. The error log says that a tmp file was not deleted, or that a tmp path exception occurred. Is there any hadoop configuration I'm missing?

Thank you

[stderr logs]

org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Filesystem closed
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:454)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:636)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
        at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:226)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:617)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:453)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:648)
        at org.apache.hadoop.fs.FileSystem.deleteOnExit(FileSystem.java:615)
        at org.apache.hadoop.hive.shims.Hadoop20Shims.fileSystemDeleteOnExit(Hadoop20Shims.java:68)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:451)
        ... 12 more
Re: FileSystem Closed.
This appears to be a Hive issue (something probably called FS.close() too early?). Redirecting to the Hive user lists as they can help better with this.

On Tue, Jul 10, 2012 at 9:59 PM, 안의건 ahneui...@gmail.com wrote:
> Hello. I have a problem with the filesystem closing: the FileSystem was closed while a Hive query was running. It is a 'select' query and the data size is about 1 TB. I'm using hadoop-0.20.2 and hive-0.7.1. The error log says that a tmp file was not deleted, or that a tmp path exception occurred. Is there any hadoop configuration I'm missing?
>
> Thank you
>
> [stderr logs: the same "Filesystem closed" stack trace as in the original message above]

--
Harsh J
Re: FileSystem closed
On 29/09/2011 18:02, Joey Echeverria wrote:
> Do you close your FileSystem instances at all? IIRC, the FileSystem instance you use is a singleton, and if you close it once, it's closed for everybody. My guess is you close it in your cleanup method and you have JVM reuse turned on.

I've hit this in the past. In 0.21+ you can ask for a new instance explicitly. For 0.20.20x, set fs.hdfs.impl.disable.cache to true in the conf, and new instances don't get cached.
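For anyone landing on this thread, the 0.20.20x workaround mentioned above can be sketched as a configuration entry. The property name fs.hdfs.impl.disable.cache comes from the thread itself; putting it in core-site.xml is an assumption about your setup (it can equally be set programmatically on the job's Configuration object):

```xml
<!-- core-site.xml sketch: with caching disabled, each FileSystem.get()
     for an hdfs:// URI returns a fresh, uncached instance, so one task
     closing "its" FileSystem cannot close the instance that other code
     in the same JVM is still using. -->
<property>
  <name>fs.hdfs.impl.disable.cache</name>
  <value>true</value>
</property>
```

The tradeoff is that each uncached instance holds its own resources, so callers become responsible for closing every instance they obtain.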
Re: FileSystem closed
Do you close your FileSystem instances at all? IIRC, the FileSystem instance you use is a singleton and if you close it once, it's closed for everybody. My guess is you close it in your cleanup method and you have JVM reuse turned on.

-Joey

On Thu, Sep 29, 2011 at 12:49 PM, Mark question markq2...@gmail.com wrote:
> Hello, I'm running 100 mappers sequentially on a single machine, where each mapper opens 100 files at the beginning, then reads them one by one sequentially and closes each after it is done. After executing 6 mappers, the 7th gives this error:
>
> java.io.IOException: Filesystem closed
>         at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:297)
>         at org.apache.hadoop.hdfs.DFSInputStream.close(DFSInputStream.java:426)
>         at java.io.FilterInputStream.close(FilterInputStream.java:155)
>         at org.apache.hadoop.io.SequenceFile$Reader.close(SequenceFile.java:1653)
>         at Mapper_Reader20HM4.CleanUp(Mapper_Reader20HM4.java:124)
>         at BFMapper20HM9.close(BFMapper20HM9.java:264)
>         at BFMapRunner20HM9.run(BFMapRunner20HM9.java:95)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:397)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
>         at org.apache.hadoop.mapred.Child.main(Child.java:211)
>
> [the same stack trace is printed three more times, the last truncated]
>
> Can anybody give me a hint of what that could be?
>
> Thank you,
> Mark

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434
Re: FileSystem closed
FileSystem objects are cached in the JVM. When you get an FS object by using FileSystem.get(..) (SequenceFile will use it internally), it will return the same fs object if the scheme and authority are the same for the URI. The fs cache key's equals implementation is below:

    static boolean isEqual(Object a, Object b) {
      return a == b || (a != null && a.equals(b));
    }

    /** {@inheritDoc} */
    public boolean equals(Object obj) {
      if (obj == this) {
        return true;
      }
      if (obj != null && obj instanceof Key) {
        Key that = (Key) obj;
        return isEqual(this.scheme, that.scheme)
            && isEqual(this.authority, that.authority)
            && isEqual(this.ugi, that.ugi)
            && (this.unique == that.unique);
      }
      return false;
    }

I think that here some of your files' URIs have the same scheme and authority, so they got the same fs object. When the first one is closed, the others will definitely get this exception.

Regards,
Uma

----- Original Message -----
From: Joey Echeverria j...@cloudera.com
Date: Thursday, September 29, 2011 10:34 pm
Subject: Re: FileSystem closed
To: common-user@hadoop.apache.org

> Do you close your FileSystem instances at all? IIRC, the FileSystem instance you use is a singleton and if you close it once, it's closed for everybody. My guess is you close it in your cleanup method and you have JVM reuse turned on.
> -Joey
> On Thu, Sep 29, 2011 at 12:49 PM, Mark question markq2...@gmail.com wrote:
>> Hello, I'm running 100 mappers sequentially on a single machine, where each mapper opens 100 files at the beginning, then reads them one by one sequentially and closes each after it is done. After executing 6 mappers, the 7th gives this error:
>>
>> [the same "Filesystem closed" stack traces as in Mark's message above]
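The caching behavior Uma describes can be illustrated with a small, self-contained sketch. These are not the real Hadoop classes; CachedFs and FsCache are made-up stand-ins that only model the relevant behavior: two lookups with the same scheme and authority return one shared object, so closing it through either reference breaks the other.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for FileSystem: once closed, every operation fails.
class CachedFs {
    private boolean closed = false;

    void close() { closed = true; }

    // Mirrors the checkOpen() guard seen at the top of the stack traces.
    void checkOpen() throws IOException {
        if (closed) throw new IOException("Filesystem closed");
    }
}

// Hypothetical stand-in for the FS cache, keyed on scheme + authority
// like the equals() implementation quoted in the message above.
class FsCache {
    private static final Map<String, CachedFs> CACHE = new HashMap<>();

    static synchronized CachedFs get(String scheme, String authority) {
        return CACHE.computeIfAbsent(scheme + "://" + authority,
                                     k -> new CachedFs());
    }
}
```

Two callers asking for hdfs://namenode:8020 receive the same object; after one of them closes it, the other's checkOpen() throws "Filesystem closed", which is exactly the failure mode in the traces in this thread.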
what does it mean -- java.io.IOException: Filesystem closed
Hi,

Running a Hadoop job, from time to time I get this exception (from one of the reducers). The questions are:
1) What does this exception mean for data integrity?
2) Does it mean that the part of the data which the reducer was responsible for (and got the exception on) is lost?
3) What could cause such an exception?

java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:222)
        at org.apache.hadoop.hdfs.DFSClient.access$600(DFSClient.java:66)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:2948)
        at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:150)
        at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
        at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
        at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:112)
        at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.writeObject(TextOutputFormat.java:78)
        at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.write(TextOutputFormat.java:99)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at com.analytics.hbase.internals.MergeMapReduceHDFSInserter$MergeMapReduceHDFSInserterReducer.reduce(Unknown Source)
        at com.analytics.hbase.internals.MergeMapReduceHDFSInserter$MergeMapReduceHDFSInserterReducer.reduce(Unknown Source)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:563)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)

Thanks in advance,
Oleg.
Re: what does it mean -- java.io.IOException: Filesystem closed
Hi Oleg,

I had the same exception here for both hadoop-0.21.0 and 0.21.0. It looks like it is still an open bug in Hadoop: https://issues.apache.org/jira/browse/MAPREDUCE-2060, https://issues.apache.org/jira/browse/HDFS-925. I also want to know how to fix it.

On 11/2/10 11:38 AM, Oleg Ruchovets wrote:
> Hi, running a Hadoop job, from time to time I get this exception (from one of the reducers). The questions are:
> 1) What does this exception mean for data integrity?
> 2) Does it mean that the part of the data which the reducer was responsible for (and got the exception on) is lost?
> 3) What could cause such an exception?
>
> [the same "Filesystem closed" stack trace as in the original message above]
>
> Thanks in advance,
> Oleg.