Boglarka Egyed created SQOOP-3243:
-------------------------------------
Summary: Importing LOB data causes "Stream closed" error on encrypted HDFS
Key: SQOOP-3243
URL: https://issues.apache.org/jira/browse/SQOOP-3243
Project: Sqoop
Issue Type: Bug
Affects Versions: 1.4.6
Reporter: Boglarka Egyed
Importing LOB data into an encrypted zone causes a "Stream closed" error:
{noformat}
17/10/12 07:16:04 INFO mapreduce.Job: Running job: job_1507777811520_5091
17/10/12 07:16:13 INFO mapreduce.Job: Job job_1507777811520_5091 running in uber mode : false
17/10/12 07:16:13 INFO mapreduce.Job: map 0% reduce 0%
17/10/12 07:22:37 INFO mapreduce.Job: Task Id : attempt_1507777811520_5091_m_000000_0, Status : FAILED
Error: java.io.IOException: Stream closed
at org.apache.hadoop.crypto.CryptoOutputStream.checkStream(CryptoOutputStream.java:268)
at org.apache.hadoop.crypto.CryptoOutputStream.flush(CryptoOutputStream.java:255)
at java.io.FilterOutputStream.flush(FilterOutputStream.java:140)
at java.io.DataOutputStream.flush(DataOutputStream.java:123)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:141)
at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
at org.apache.commons.io.output.ProxyOutputStream.close(ProxyOutputStream.java:117)
at org.apache.sqoop.io.LobFile$V0Writer.close(LobFile.java:1669)
at org.apache.sqoop.lib.LargeObjectLoader.close(LargeObjectLoader.java:96)
at org.apache.sqoop.mapreduce.AvroImportMapper.cleanup(AvroImportMapper.java:79)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:148)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
{noformat}
The root cause appears to be in the LobFile.close method (LobFile$V0Writer.close
in the stack trace), which is invoked from the mapper's cleanup. At line 1669 it
tries to close the countingOut output stream, but the out output stream was
already closed at line 1664. Because out is just a wrapper around countingOut,
both ultimately point to the same CryptoOutputStream instance, so by the time
the call reaches line 1669 that instance has already been closed by line 1664.
The failure occurs because closing countingOut again makes
java.io.BufferedOutputStream call flush on the underlying stream it wraps (in
this case CryptoOutputStream), which reaches line 255 of CryptoOutputStream,
where checkStream throws the "Stream closed" IOException.
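
The double-close pattern described above can be reproduced in isolation. The following is only a minimal sketch, not Sqoop or Hadoop code: StrictStream is a hypothetical stand-in for CryptoOutputStream (its flush fails once it has been closed), and FlushingWrapper is a hypothetical wrapper that flushes its delegate on every close without an "already closed" guard, which is the behavior the stack trace suggests for the wrappers involved. The variable names out and countingOut are borrowed from the description; everything else is an assumption made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class DoubleCloseDemo {

    /** Hypothetical stand-in for CryptoOutputStream: rejects write/flush after close(). */
    static class StrictStream extends OutputStream {
        private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        private boolean closed;

        @Override public void write(int b) throws IOException { checkStream(); sink.write(b); }
        @Override public void flush() throws IOException { checkStream(); }
        @Override public void close() { closed = true; }

        private void checkStream() throws IOException {
            if (closed) {
                throw new IOException("Stream closed");
            }
        }
    }

    /** Hypothetical wrapper: every close() flushes the delegate first, with no closed guard. */
    static class FlushingWrapper extends OutputStream {
        private final OutputStream delegate;

        FlushingWrapper(OutputStream delegate) { this.delegate = delegate; }

        @Override public void write(int b) throws IOException { delegate.write(b); }
        @Override public void flush() throws IOException { delegate.flush(); }
        @Override public void close() throws IOException {
            try {
                flush();
            } finally {
                delegate.close();
            }
        }
    }

    /** Closes the outer wrapper and then the inner one again, as in the reported sequence. */
    static String closeTwice() {
        OutputStream countingOut = new FlushingWrapper(new StrictStream());
        OutputStream out = new FlushingWrapper(countingOut);
        try {
            out.write(1);
            out.close();         // flushes and closes the whole chain, including the base stream
            countingOut.close(); // flushes again, now on the already-closed base stream
            return "ok";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    /** Closes only the outermost wrapper, so the base stream is flushed and closed once. */
    static String closeOnce() {
        OutputStream countingOut = new FlushingWrapper(new StrictStream());
        OutputStream out = new FlushingWrapper(countingOut);
        try {
            out.write(1);
            out.close();
            return "ok";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("closeTwice: " + closeTwice());
        System.out.println("closeOnce:  " + closeOnce());
    }
}
```

In this sketch closeTwice fails with "Stream closed" while closeOnce succeeds, which suggests that closing only the outermost stream (or guarding against a repeated close) would avoid the error. Whether that is the right change for LobFile is an assumption that would need to be checked against the actual code.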