[ https://issues.apache.org/jira/browse/HDFS-3726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon resolved HDFS-3726.
-------------------------------

    Resolution: Fixed
    Fix Version/s: QuorumJournalManager (HDFS-3077)
    Hadoop Flags: Reviewed

Committed to branch, thanks for the review.

> QJM: if a logger misses an RPC, don't retry that logger until next segment
> --------------------------------------------------------------------------
>
>                 Key: HDFS-3726
>                 URL: https://issues.apache.org/jira/browse/HDFS-3726
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha
>    Affects Versions: QuorumJournalManager (HDFS-3077)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: QuorumJournalManager (HDFS-3077)
>
>         Attachments: hdfs-3726.txt, hdfs-3726.txt
>
>
> Currently, if a logger misses an RPC in the middle of a log segment, or
> misses the {{startLogSegment}} RPC (eg it was down or network was
> disconnected during that time period), then it will throw an exception on
> every subsequent {{journal()}} call in that segment, since it knows that it
> missed some edits in the middle.
> We should change this exception to a specific IOE subclass, and have the
> client side of QJM detect the situation and stop sending IPCs until the next
> {{startLogSegment}} call.
> This isn't critical for correctness but will help reduce log spew on both
> sides.
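
The behaviour proposed in the description can be sketched roughly as follows. This is a minimal, self-contained illustration and not the committed patch: the names OutOfSyncException, Logger, LoggerChannel and sendEdits are hypothetical, and the RPC layer is replaced by a direct method call so the example runs on its own.

```java
import java.io.IOException;

public class OutOfSyncSketch {

    /** Hypothetical IOException subclass: the logger knows it missed edits. */
    static class OutOfSyncException extends IOException {
        OutOfSyncException(String msg) {
            super(msg);
        }
    }

    /** Hypothetical stand-in for the logger (server) side. */
    static class Logger {
        private long nextTxId = -1; // -1 means no segment has been started
        int rpcsReceived = 0;       // counts calls that actually reached the logger

        void startLogSegment(long txid) {
            rpcsReceived++;
            nextTxId = txid;
        }

        void journal(long firstTxId, int numTxns) throws IOException {
            rpcsReceived++;
            if (firstTxId != nextTxId) {
                // A gap means an earlier RPC (or startLogSegment) was missed.
                throw new OutOfSyncException(
                    "expected txid " + nextTxId + " but got " + firstTxId);
            }
            nextTxId += numTxns;
        }
    }

    /** Hypothetical stand-in for the client side of QJM for one logger. */
    static class LoggerChannel {
        private final Logger remote;
        private boolean outOfSync = false;

        LoggerChannel(Logger remote) {
            this.remote = remote;
        }

        boolean isOutOfSync() {
            return outOfSync;
        }

        void startLogSegment(long txid) {
            remote.startLogSegment(txid);
            outOfSync = false; // a new segment gives the logger a fresh start
        }

        /** Returns true if the edits were sent and accepted. */
        boolean sendEdits(long firstTxId, int numTxns) {
            if (outOfSync) {
                return false; // skip the call entirely until the next segment
            }
            try {
                remote.journal(firstTxId, numTxns);
                return true;
            } catch (OutOfSyncException e) {
                outOfSync = true; // stop sending for the rest of this segment
                return false;
            } catch (IOException e) {
                return false; // other failures: keep trying on later calls
            }
        }
    }

    public static void main(String[] args) {
        Logger logger = new Logger();
        LoggerChannel ch = new LoggerChannel(logger);
        ch.startLogSegment(1);
        System.out.println("txid 1 accepted: " + ch.sendEdits(1, 1));
        // txid 2 is never delivered, simulating a lost RPC.
        System.out.println("txid 3 accepted: " + ch.sendEdits(3, 1));
        System.out.println("txid 4 accepted: " + ch.sendEdits(4, 1));
        ch.startLogSegment(5);
        System.out.println("txid 5 accepted: " + ch.sendEdits(5, 1));
        System.out.println("RPCs seen by logger: " + logger.rpcsReceived);
    }
}
```

In this sketch the logger rejects exactly one call after the gap. Every later sendEdits in the same segment is dropped on the client side, so neither side logs repeated failures, and the flag is cleared on the next startLogSegment.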