[jira] [Updated] (HDFS-12612) DFSStripedOutputStream#close will throw if called a second time with a failed streamer

2017-10-17 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-12612:
-
       Resolution: Fixed
    Fix Version/s: 3.0.0
           Status: Resolved  (was: Patch Available)

Thanks a lot for the reviews [~xiaochen] and [~andrew.wang].

Committed to trunk and branch-3.0.

> DFSStripedOutputStream#close will throw if called a second time with a failed 
> streamer
> --
>
>                 Key: HDFS-12612
>                 URL: https://issues.apache.org/jira/browse/HDFS-12612
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding
>    Affects Versions: 3.0.0-beta1
>            Reporter: Andrew Wang
>            Assignee: Lei (Eddy) Xu
>              Labels: hdfs-ec-3.0-must-do
>             Fix For: 3.0.0
>
>         Attachments: HDFS-12612.00.patch, HDFS-12612.01.patch, HDFS-12612.02.patch, HDFS-12612.03.patch
>
>
> Found while testing with Hive. We have a cluster with 2 DNs and the XOR-2-1 
> policy. If you write a file and call close() twice, it throws this exception:
> {noformat}
> 17/10/04 16:02:14 WARN hdfs.DFSOutputStream: Cannot allocate parity block(index=2, policy=XOR-2-1-1024k). Not enough datanodes? Exclude nodes=[]
> ...
> Caused by: java.io.IOException: Failed to get parity block, index=2
>         at org.apache.hadoop.hdfs.DFSStripedOutputStream.allocateNewBlock(DFSStripedOutputStream.java:500) ~[hadoop-hdfs-client-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
>         at org.apache.hadoop.hdfs.DFSStripedOutputStream.writeChunk(DFSStripedOutputStream.java:524) ~[hadoop-hdfs-client-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> {noformat}
> This is because in DFSStripedOutputStream#closeImpl, if the stream is closed, 
> we throw an exception if any of the striped streamers had an exception:
> {code}
>   protected synchronized void closeImpl() throws IOException {
>     if (isClosed()) {
>       final MultipleIOException.Builder b = new MultipleIOException.Builder();
>       for(int i = 0; i < streamers.size(); i++) {
>         final StripedDataStreamer si = getStripedDataStreamer(i);
>         try {
>           si.getLastException().check(true);
>         } catch (IOException e) {
>           b.add(e);
>         }
>       }
>       final IOException ioe = b.build();
>       if (ioe != null) {
>         throw ioe;
>       }
>       return;
>     }
> {code}
> I think this is incorrect, since we only need to throw in this situation if 
> we have too many failed streamers. close should also be idempotent, so it 
> should throw the first time we call close if it's going to throw at all.
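The behaviour requested in the description can be sketched roughly as follows. This is only an illustration and not the committed patch: the class {{IdempotentStripedCloseSketch}} and its methods are hypothetical stand-ins, and "too many failed streamers" is assumed to mean that fewer than {{dataUnits}} streamers are still healthy.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a striped output stream; NOT the Hadoop class.
class IdempotentStripedCloseSketch {
  private final int dataUnits;      // streamers that must succeed (assumption)
  private final int totalStreamers; // dataUnits + parityUnits
  private final List<IOException> failures = new ArrayList<>();
  private boolean closed = false;

  IdempotentStripedCloseSketch(int dataUnits, int parityUnits) {
    this.dataUnits = dataUnits;
    this.totalStreamers = dataUnits + parityUnits;
  }

  // Record that one streamer failed, e.g. a parity block could not be allocated.
  synchronized void recordStreamerFailure(IOException cause) {
    failures.add(cause);
  }

  synchronized void close() throws IOException {
    if (closed) {
      return; // second and later calls are no-ops
    }
    closed = true; // mark closed before throwing so a repeated call cannot throw again
    int healthy = totalStreamers - failures.size();
    if (healthy < dataUnits) {
      IOException ioe = new IOException(
          "Too many failed streamers: " + failures.size() + " of " + totalStreamers);
      for (IOException e : failures) {
        ioe.addSuppressed(e); // keep the per-streamer causes for diagnosis
      }
      throw ioe;
    }
  }

  synchronized boolean isClosed() {
    return closed;
  }
}
```

With an XOR-2-1 layout and one failed parity streamer, both calls to close() in this sketch return normally; with two failed streamers only the first call throws.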



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12612) DFSStripedOutputStream#close will throw if called a second time with a failed streamer

2017-10-17 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-12612:
-
Attachment: HDFS-12612.03.patch

Fixed checkstyle and findbugs warnings.




[jira] [Updated] (HDFS-12612) DFSStripedOutputStream#close will throw if called a second time with a failed streamer

2017-10-16 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-12612:
-
Attachment: HDFS-12612.02.patch

Thanks for the suggestions, [~andrew.wang].

Updated the patch to fix findbugs and checkstyle warnings, and also extracted 
{{LastException}} into a new class, {{ExceptionLastSeen}} (the name was chosen 
to avoid a findbugs warning).
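For readers unfamiliar with this kind of holder, here is a minimal sketch of what a "last exception seen" class might look like. It is not the code from the patch: the class name {{LastSeenExceptionSketch}} and its methods are illustrative, and it assumes that {{check(true)}} means "rethrow the stored exception and reset it".

```java
import java.io.IOException;

// Hypothetical holder; the real ExceptionLastSeen in the patch may differ.
class LastSeenExceptionSketch {
  private IOException thrown;

  // Remember the most recent failure; non-IOExceptions are wrapped.
  synchronized void set(Throwable t) {
    thrown = (t instanceof IOException) ? (IOException) t : new IOException(t);
  }

  synchronized void clear() {
    thrown = null;
  }

  // Rethrow the stored exception, if any. When resetToNull is true the
  // exception is cleared first, so a later check() returns normally.
  synchronized void check(boolean resetToNull) throws IOException {
    if (thrown != null) {
      final IOException e = thrown;
      if (resetToNull) {
        thrown = null;
      }
      throw e;
    }
  }
}
```

Under these assumptions a stored exception is reported at most once when callers pass {{true}}, which is the property an idempotent close() relies on.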




[jira] [Updated] (HDFS-12612) DFSStripedOutputStream#close will throw if called a second time with a failed streamer

2017-10-13 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-12612:
-
Attachment: HDFS-12612.01.patch

Discussed offline with Andrew. We now consider the {{OutputStream}} successful 
if at least {{dataUnits}} streamers succeed. Also moved {{lastException}} to 
{{DFSStripedOutputStream}} so that it can be set by {{abort()}}.




[jira] [Updated] (HDFS-12612) DFSStripedOutputStream#close will throw if called a second time with a failed streamer

2017-10-10 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-12612:
-
Status: Patch Available  (was: Open)




[jira] [Updated] (HDFS-12612) DFSStripedOutputStream#close will throw if called a second time with a failed streamer

2017-10-10 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-12612:
-
Attachment: HDFS-12612.00.patch

Added a test to verify the bug, and changed the code to only log the remaining 
IOEs from {{streams}} after close() has been called.
