[ https://issues.apache.org/jira/browse/HDFS-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14990167#comment-14990167 ]
Yongjun Zhang commented on HDFS-8965:
-------------------------------------

Hi [~cmccabe], thanks for your earlier work on this issue. It sounds like the edit log was somehow corrupted, and this change avoids out-of-memory errors when loading it. Two questions:
1. What do we do with a corrupted edit log entry? Skip it?
2. Do we know whether the corruption was caused by a bug, or by a disk/network problem?
Thanks.

> Harden edit log reading code against out of memory errors
> ----------------------------------------------------------
>
>                 Key: HDFS-8965
>                 URL: https://issues.apache.org/jira/browse/HDFS-8965
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.0.0-alpha
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>             Fix For: 2.8.0
>
>         Attachments: HDFS-8965.001.patch, HDFS-8965.002.patch, HDFS-8965.003.patch, HDFS-8965.004.patch, HDFS-8965.005.patch, HDFS-8965.006.patch, HDFS-8965.007.patch
>
>
> We should harden the edit log reading code against out of memory errors. Now that each op has a length prefix and a checksum, we can validate the checksum before trying to load the Op data. This should avoid out-of-memory errors when trying to load garbage data as Op data.
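For illustration, here is a minimal sketch of the hardening idea described above: bound the length prefix and verify the checksum before parsing the bytes as an Op. It assumes a record layout of a 4-byte length prefix, the op body, and a trailing 4-byte CRC32; the names (SafeOpReader, readOpBody, MAX_OP_SIZE) and layout are illustrative assumptions, not the actual FSEditLogOp reader code.

{code:java}
import java.io.DataInputStream;
import java.io.IOException;
import java.util.zip.CRC32;

/**
 * Sketch of length-prefix + checksum validation before deserializing
 * an edit log op. Names and record layout are assumptions for
 * illustration, not the real HDFS implementation.
 */
public class SafeOpReader {
  // Cap on a single op's serialized size; a corrupt length prefix
  // larger than this is rejected instead of being allocated.
  private static final int MAX_OP_SIZE = 50 * 1024 * 1024; // 50 MB

  public static byte[] readOpBody(DataInputStream in) throws IOException {
    int length = in.readInt();                    // length prefix
    if (length < 0 || length > MAX_OP_SIZE) {
      throw new IOException("Corrupt op: claimed length " + length
          + " exceeds limit " + MAX_OP_SIZE);
    }
    byte[] body = new byte[length];               // safe: length is bounded
    in.readFully(body);
    long expected = in.readInt() & 0xffffffffL;   // stored CRC32 (assumed layout)
    CRC32 crc = new CRC32();
    crc.update(body, 0, body.length);
    if (crc.getValue() != expected) {
      throw new IOException("Corrupt op: checksum mismatch");
    }
    return body;  // only now hand the bytes to the Op deserializer
  }
}
{code}

With this ordering, a corrupt record can at worst cause one bounded allocation, and garbage bytes never reach the Op parsers, which is what prevents the out-of-memory errors the issue describes.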