[ https://issues.apache.org/jira/browse/HBASE-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754914#comment-13754914 ]
stack commented on HBASE-9373:
------------------------------

nit: there is stuff like the below that does not have to be inside the try (not important):

    if (trailerPresent && originalPosition == this.walEditsStopOffset) return false;

Remove this on commit:

    +      // See if available is any good to us. Record before we start reading. My guess is that it
    +      // does not change once reader has been opened but check see.

...especially as my 'guess' above turns out to be wrong (it's embarrassing to have it persist in code!)

I think I preferred the old repetitive way of doing things rather than the messaging via exceptions that you have here (throwing EOFEs only to catch them locally).

Good to include available and size in this exception message:

    throw new EOFException("Available stream not enough for edit");

Could we already have an EOFE when you do this:

    throw new EOFException("Partial PB while reading WAL, " +
    +    "probably an unexpected EOF, ignoring");

? If so, are you losing info when you throw this one? Add the original exception as 'cause'?

Above is minor stuff. +1 on commit since it works for you.
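The last two suggestions (put available and size in the message, keep the original exception as the cause) could be combined roughly as in the sketch below. This is only an illustration, not the code in the attached patches: the class name `EofCauseSketch`, the helper `partialRead`, and the sample values are hypothetical, and a plain `IOException` stands in for protobuf's `InvalidProtocolBufferException` so the example has no external dependencies. `EOFException` has no constructor that takes a cause, so `Throwable.initCause` is used instead.

```java
import java.io.EOFException;
import java.io.IOException;

class EofCauseSketch {

    /**
     * Hypothetical helper: builds an EOFException whose message reports the
     * stream state and whose cause is the exception that triggered it.
     */
    static EOFException partialRead(long available, long size, IOException original) {
        EOFException eof = new EOFException(
            "Partial PB while reading WAL, probably an unexpected EOF"
                + "; available=" + available + ", size=" + size);
        // EOFException only offers () and (String) constructors,
        // so the cause has to be attached with initCause.
        eof.initCause(original);
        return eof;
    }

    public static void main(String[] args) {
        // Stand-in for the parse failure seen in the log (assumed values).
        IOException original = new IOException("Protocol message tag had invalid wire type.");
        EOFException eof = partialRead(12L, 40L, original);
        System.out.println(eof.getMessage());
        System.out.println("cause: " + eof.getCause().getMessage());
    }
}
```

With the cause attached, a caller that logs the rethrown EOFException still gets the original parse failure in the stack trace.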
> [replication] data loss because replication doesn't expect partial reads
> ------------------------------------------------------------------------
>
>                 Key: HBASE-9373
>                 URL: https://issues.apache.org/jira/browse/HBASE-9373
>             Project: HBase
>          Issue Type: Improvement
>   Affects Versions: 0.95.2
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.98.0, 0.96.0
>
>         Attachments: 9373.txt, 9373-v2.txt, 9373-v3.txt
>
>
> When I see this in the logs it often means we got a partial read and then we
> have the wrong offset when reading the rest of the file.
> {noformat}
> 2013-08-28 23:16:07,182 ERROR [ReplicationExecutor-0.replicationSource,1-jdec2hbase0403-5,60020,1377730319617] org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader: Invalid PB while reading WAL, probably an unexpected EOF, ignoring
> com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.
> 	at com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:99)
> 	at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:498)
> 	at com.google.protobuf.GeneratedMessage.parseUnknownField(GeneratedMessage.java:193)
> 	at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey.<init>(WALProtos.java:686)
> 	at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey.<init>(WALProtos.java:644)
> 	at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:771)
> 	at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:766)
> 	at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$Builder.mergeFrom(WALProtos.java:1444)
> 	at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$Builder.mergeFrom(WALProtos.java:1218)
> 	at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:220)
> 	at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:912)
> 	at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:267)
> 	at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:290)
> 	at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:926)
> 	at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:296)
> 	at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:918)
> 	at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:197)
> 	at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:98)
> 	at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.readNextAndSetPosition(ReplicationHLogReaderManager.java:89)
> 	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:390)
> 	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:298)
> {noformat}