[ https://issues.apache.org/jira/browse/SQOOP-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Boglarka Egyed updated SQOOP-3243:
----------------------------------
    Attachment: (was: SQOOP-3243.patch)

> Importing BLOB data causes "Stream closed" error on encrypted HDFS
> ------------------------------------------------------------------
>
>                 Key: SQOOP-3243
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3243
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.6
>            Reporter: Boglarka Egyed
>            Assignee: Boglarka Egyed
>         Attachments: SQOOP-3243.patch
>
>
> Importing BLOB data into an encrypted zone causes a "Stream closed" error when:
> * the BLOB data is larger than 16MB -> a LobFile will be used
> * Java 8 is used -> its implementation of FilterOutputStream.close() differs from Java 7's
> Exception and stack trace:
> {noformat}
> 17/10/12 07:16:04 INFO mapreduce.Job: Running job: job_1507777811520_5091
> 17/10/12 07:16:13 INFO mapreduce.Job: Job job_1507777811520_5091 running in uber mode : false
> 17/10/12 07:16:13 INFO mapreduce.Job:  map 0% reduce 0%
> 17/10/12 07:22:37 INFO mapreduce.Job: Task Id : attempt_1507777811520_5091_m_000000_0, Status : FAILED
> Error: java.io.IOException: Stream closed
> 	at org.apache.hadoop.crypto.CryptoOutputStream.checkStream(CryptoOutputStream.java:268)
> 	at org.apache.hadoop.crypto.CryptoOutputStream.flush(CryptoOutputStream.java:255)
> 	at java.io.FilterOutputStream.flush(FilterOutputStream.java:140)
> 	at java.io.DataOutputStream.flush(DataOutputStream.java:123)
> 	at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:141)
> 	at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
> 	at org.apache.commons.io.output.ProxyOutputStream.close(ProxyOutputStream.java:117)
> 	at org.apache.sqoop.io.LobFile$V0Writer.close(LobFile.java:1669)
> 	at org.apache.sqoop.lib.LargeObjectLoader.close(LargeObjectLoader.java:96)
> 	at org.apache.sqoop.mapreduce.AvroImportMapper.cleanup(AvroImportMapper.java:79)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:148)
> 	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {noformat}
> The root cause lies in the LobFile close method, which is invoked from the Mapper's cleanup.
> At line 1669 (per the stack trace) it tries to close the countingOut output stream. However,
> at line 1664 the out stream has already been closed, and since out is just a wrapper around
> countingOut, both ultimately point to the same CryptoOutputStream instance. So by the time
> the call reaches line 1669, the CryptoOutputStream has already been closed by line 1664.
> The error is then raised because java.io.BufferedOutputStream calls flush on the underlying
> stream it wraps (in this case the CryptoOutputStream), reaching line 255 of
> CryptoOutputStream, whose checkStream() throws on the already-closed stream.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
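[Editor's note] The double close described in the issue can be reproduced in isolation. The sketch below is an assumption-laden stand-in, not Sqoop's actual code: StrictStream mimics CryptoOutputStream's checkStream() guard, the wrapping only approximates LobFile's real stream chain, and the second close is replayed as an explicit flush() because Java 8's FilterOutputStream.close() flushes before delegating (whereas newer JDKs made close() idempotent, hiding the bug).

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;

public class DoubleCloseDemo {

    // Simplified stand-in for Hadoop's CryptoOutputStream: flushing after
    // close() throws "Stream closed", like CryptoOutputStream.checkStream().
    static class StrictStream extends ByteArrayOutputStream {
        private boolean closed = false;

        @Override
        public void flush() throws IOException {
            if (closed) {
                throw new IOException("Stream closed");
            }
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Builds a LobFile-like chain of two handles over the same underlying
    // stream, closes the outer one, then repeats the flush that Java 8's
    // FilterOutputStream.close() performs on the second close.
    static String reproduce() {
        StrictStream crypto = new StrictStream();
        // countingOut stands in for the commons-io CountingOutputStream wrapper.
        FilterOutputStream countingOut =
                new FilterOutputStream(new BufferedOutputStream(crypto));
        DataOutputStream out = new DataOutputStream(countingOut);
        try {
            out.close();          // first close: cascades down to crypto
            // Java 8's FilterOutputStream.close() flushes before closing the
            // delegate, so a second close first hits flush() on a dead chain:
            countingOut.flush();  // reaches crypto.flush() -> throws
            return "no error";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // prints: second close flushes -> Stream closed
        System.out.println("second close flushes -> " + reproduce());
    }
}
```

This also illustrates why the failure is Java-8-specific: Java 7's FilterOutputStream.close() swallowed the IOException from flush() before closing the delegate, while Java 8 rewrote close() with try-with-resources semantics so the flush exception propagates to the caller.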