[jira] [Commented] (HDFS-12821) Block invalid IOException causes the DFSClient domain socket being disabled

2017-11-16 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16254904#comment-16254904
 ] 

John Zhuge commented on HDFS-12821:
---

Think so too.

In HDFS-12528, the file was appended to, so the generation stamp of the last 
block changed. Since the block meta file name contains the generation stamp, 
the meta file could no longer be found:
{noformat}
Meta file for BP-810388474-172.31.113.69-1499543341726:blk_1074012183_273087 
not found
{noformat}
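
The naming dependency described above can be illustrated with a minimal sketch. It assumes the on-disk convention blk_<blockId>_<genStamp>.meta for meta files. The class name, the in-memory "directory listing", and the post-append generation stamp 273088 are hypothetical and chosen only for illustration; the block ID and the old generation stamp come from the log line above.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model (not Hadoop code) of how a meta file name depends on the generation stamp. */
class MetaFileNameSketch {

    /** Builds a meta file name of the assumed form blk_<blockId>_<genStamp>.meta. */
    static String metaFileName(long blockId, long genStamp) {
        return "blk_" + blockId + "_" + genStamp + ".meta";
    }

    public static void main(String[] args) {
        long blockId = 1074012183L;
        long oldGenStamp = 273087L; // generation stamp the reader still holds
        long newGenStamp = 273088L; // hypothetical generation stamp after the append

        // Stand-in for the DataNode directory contents after the append:
        // only the meta file carrying the new generation stamp exists.
        Set<String> filesOnDisk = new HashSet<>();
        filesOnDisk.add(metaFileName(blockId, newGenStamp));

        // A lookup by the old generation stamp misses; the new one hits.
        System.out.println("old gen stamp found: "
            + filesOnDisk.contains(metaFileName(blockId, oldGenStamp)));
        System.out.println("new gen stamp found: "
            + filesOnDisk.contains(metaFileName(blockId, newGenStamp)));
    }
}
```

A reader that cached the old generation stamp therefore asks for a file name that no longer exists, even though the block itself is healthy.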

How was the block invalidated in this case?


> Block invalid IOException causes the DFSClient domain socket being disabled
> ---
>
> Key: HDFS-12821
> URL: https://issues.apache.org/jira/browse/HDFS-12821
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 2.4.0, 2.6.0
> Reporter: Gang Xie
>
> We use HDFS 2.4 and 2.6, and recently hit an issue where the DFSClient domain 
> socket is disabled when the DataNode throws a "block invalid" exception. 
> The block is invalidated for some reason on the DataNode, which is fine. Then 
> the DFSClient tries to access this block on this DataNode via the domain 
> socket, which triggers an IOException. On the DFSClient side, when it gets an 
> IOException with error code 'ERROR', it disables the domain socket and falls 
> back to TCP. Worst of all, it seems to never re-enable the socket. 
> I think this is a defect: on such a "block invalid" exception, we should 
> not disable the domain socket, because there is nothing wrong with the domain 
> socket service.
> Any thoughts?
> The code:
> {code}
> private ShortCircuitReplicaInfo requestFileDescriptors(DomainPeer peer,
>     Slot slot) throws IOException {
>   ShortCircuitCache cache = clientContext.getShortCircuitCache();
>   final DataOutputStream out =
>       new DataOutputStream(new BufferedOutputStream(peer.getOutputStream()));
>   SlotId slotId = slot == null ? null : slot.getSlotId();
>   new Sender(out).requestShortCircuitFds(block, token, slotId, 1);
>   DataInputStream in = new DataInputStream(peer.getInputStream());
>   BlockOpResponseProto resp = BlockOpResponseProto.parseFrom(
>       PBHelper.vintPrefixed(in));
>   DomainSocket sock = peer.getDomainSocket();
>   switch (resp.getStatus()) {
>   case SUCCESS:
>     byte buf[] = new byte[1];
>     FileInputStream fis[] = new FileInputStream[2];
>     sock.recvFileInputStreams(fis, buf, 0, buf.length);
>     ShortCircuitReplica replica = null;
>     try {
>       ExtendedBlockId key =
>           new ExtendedBlockId(block.getBlockId(), block.getBlockPoolId());
>       replica = new ShortCircuitReplica(key, fis[0], fis[1], cache,
>           Time.monotonicNow(), slot);
>     } catch (IOException e) {
>       // This indicates an error reading from disk, or a format error.  Since
>       // it's not a socket communication problem, we return null rather than
>       // throwing an exception.
>       LOG.warn(this + ": error creating ShortCircuitReplica.", e);
>       return null;
>     } finally {
>       if (replica == null) {
>         IOUtils.cleanup(DFSClient.LOG, fis[0], fis[1]);
>       }
>     }
>     return new ShortCircuitReplicaInfo(replica);
>   case ERROR_UNSUPPORTED:
>     if (!resp.hasShortCircuitAccessVersion()) {
>       LOG.warn("short-circuit read access is disabled for " +
>           "DataNode " + datanode + ".  reason: " + resp.getMessage());
>       clientContext.getDomainSocketFactory()
>           .disableShortCircuitForPath(pathInfo.getPath());
>     } else {
>       LOG.warn("short-circuit read access for the file " +
>           fileName + " is disabled for DataNode " + datanode +
>           ".  reason: " + resp.getMessage());
>     }
>     return null;
>   case ERROR_ACCESS_TOKEN:
>     String msg = "access control error while " +
>         "attempting to set up short-circuit access to " +
>         fileName + resp.getMessage();
>     if (LOG.isDebugEnabled()) {
>       LOG.debug(this + ":" + msg);
>     }
>     return new ShortCircuitReplicaInfo(new InvalidToken(msg));
>   default:
>     LOG.warn(this + ": unknown response code " + resp.getStatus() +
>         " while attempting to set up short-circuit access. " +
>         resp.getMessage());
>     // <<= the call below is the one flagged by the reporter
>     clientContext.getDomainSocketFactory()
>         .disableShortCircuitForPath(pathInfo.getPath());
>     return null;
>   }
> }
> {code}
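
One way to read the reporter's proposal is sketched below. This is not the DFSClient implementation and none of it comes from the Hadoop code base: the class, the replicaProblem flag (which assumes the caller can somehow tell a "block invalid" reply apart from a socket-level failure), the timed re-enable, and the socket path are all hypothetical. The status names are simplified stand-ins for the codes that appear in the quoted switch and in the description.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical client-side policy for disabling a domain socket path. */
class ShortCircuitDisablePolicySketch {

    /** Simplified stand-ins for the response status codes in the quoted switch. */
    enum Status { SUCCESS, ERROR, ERROR_UNSUPPORTED, ERROR_ACCESS_TOKEN }

    private final long disableMillis;
    // Socket path -> time (ms) until which short-circuit reads stay disabled.
    private final Map<String, Long> disabledUntil = new HashMap<>();

    ShortCircuitDisablePolicySketch(long disableMillis) {
        this.disableMillis = disableMillis;
    }

    /**
     * Records a failed request. Replica-level problems (assumed to be
     * detectable by the caller) never disable the path; other failures
     * disable it for a bounded interval. Returns true if the path was disabled.
     */
    boolean onFailure(String socketPath, Status status,
                      boolean replicaProblem, long nowMillis) {
        if (status == Status.SUCCESS || replicaProblem) {
            return false;
        }
        disabledUntil.put(socketPath, nowMillis + disableMillis);
        return true;
    }

    /** Reports whether the path is still disabled; expired entries are dropped. */
    boolean isDisabled(String socketPath, long nowMillis) {
        Long until = disabledUntil.get(socketPath);
        if (until == null) {
            return false;
        }
        if (nowMillis >= until) {
            disabledUntil.remove(socketPath); // the path recovers on its own
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        String path = "/tmp/example_dn_socket"; // hypothetical socket path
        ShortCircuitDisablePolicySketch policy =
            new ShortCircuitDisablePolicySketch(1000L);

        // An invalid block is a replica problem: the socket stays enabled.
        policy.onFailure(path, Status.ERROR, true, 0L);
        System.out.println("after block-invalid error: "
            + policy.isDisabled(path, 0L));

        // Any other failure disables the path, but only until the interval ends.
        policy.onFailure(path, Status.ERROR, false, 0L);
        System.out.println("during interval: " + policy.isDisabled(path, 500L));
        System.out.println("after interval: " + policy.isDisabled(path, 1000L));
    }
}
```

The sketch separates the two concerns raised in the description: whether a given failure says anything about the socket at all, and whether a disabled path can ever come back.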



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12821) Block invalid IOException causes the DFSClient domain socket being disabled

2017-11-15 Thread Gang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16254712#comment-16254712
 ] 

Gang Xie commented on HDFS-12821:
-

Yes, this should be the same issue.







[jira] [Commented] (HDFS-12821) Block invalid IOException causes the DFSClient domain socket being disabled

2017-11-15 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16254673#comment-16254673
 ] 

Weiwei Yang commented on HDFS-12821:


A dup of HDFS-12528?



