[ 
https://issues.apache.org/jira/browse/HBASE-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754914#comment-13754914
 ] 

stack commented on HBASE-9373:
------------------------------

nit: some statements, like the one below, do not have to be inside the try
(not important):

if (trailerPresent && originalPosition == this.walEditsStopOffset) return false;

Remove this on commit:

+        // See if available is any good to us.  Record before we start reading.  My guess is that it
+        // does not change once reader has been opened but check see.

...especially as my 'guess' above turns out to be wrong (it's embarrassing to 
have it persist in code!)

I think I preferred the old repetitive way of doing things rather than this 
messaging via exceptions that you have here (throwing EOFEs only to catch them 
locally)
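
To make the two styles concrete, here is a minimal sketch. It is not the code from the patch: the class name, method names, and the `available`/`size` parameters are hypothetical stand-ins for the checks in the reader.

```java
import java.io.EOFException;

// Hypothetical illustration only; not taken from ProtobufLogReader.
final class ReadStyles {
    // Style A: "messaging via exceptions". The EOFException is thrown
    // and caught inside the same method, purely as control flow.
    static boolean readViaException(long available, long size) {
        try {
            if (available < size) {
                throw new EOFException("Available stream not enough for edit");
            }
            return true; // the edit would be read here
        } catch (EOFException e) {
            return false; // the real code would also reset the position here
        }
    }

    // Style B: the "old repetitive way". Each check returns directly,
    // at the cost of repeating the cleanup at every exit point.
    static boolean readViaReturn(long available, long size) {
        if (available < size) {
            return false; // the real code would also reset the position here
        }
        return true; // the edit would be read here
    }
}
```

Both methods return the same result for the same inputs; the difference is only in how the early exit is expressed.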

It would be good to include available and size in this exception message:

throw new EOFException("Available stream not enough for edit");
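
Something along these lines would do. This is a sketch, assuming `available` and `size` are numeric values in scope at the throw site; the helper class and method name are hypothetical.

```java
import java.io.EOFException;

// Hypothetical helper; only the message text comes from the patch.
final class EditSizeCheck {
    // Throws an EOFException whose message carries both numbers, so the
    // log shows how far short the stream was.
    static void ensureAvailable(long available, long size) throws EOFException {
        if (available < size) {
            throw new EOFException("Available stream not enough for edit, available="
                + available + ", size=" + size);
        }
    }
}
```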

Could we already have an EOFE when you do this?

throw new EOFException("Partial PB while reading WAL, " +
    "probably an unexpected EOF, ignoring");

If so, are you losing info when you throw this on? Add the original exception
as the 'cause'?
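
A sketch of what keeping the cause could look like. `java.io.EOFException` has no constructor that takes a cause, so `initCause` is used instead. The helper class and method are hypothetical, and passing an existing EOFException through untouched is one possible answer to the question above, not necessarily what the patch should do.

```java
import java.io.EOFException;
import java.io.IOException;

// Hypothetical helper; not taken from ProtobufLogReader.
final class EofWithCause {
    // Returns an EOFException that keeps the original failure attached.
    static EOFException wrapAsEof(String message, IOException original) {
        if (original instanceof EOFException) {
            // Already an EOFE: pass it through so its message and stack
            // trace are not replaced (an assumption, see lead-in).
            return (EOFException) original;
        }
        EOFException eof = new EOFException(message);
        eof.initCause(original); // EOFException has no (String, Throwable) constructor
        return eof;
    }
}
```

The parameter is typed as IOException because protobuf's InvalidProtocolBufferException extends it, so the same helper would accept the exception seen in the stack trace in the issue description.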

Above is minor stuff.  +1 on commit since it works for you.


                
> [replication] data loss because replication doesn't expect partial reads
> ------------------------------------------------------------------------
>
>                 Key: HBASE-9373
>                 URL: https://issues.apache.org/jira/browse/HBASE-9373
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.95.2
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.98.0, 0.96.0
>
>         Attachments: 9373.txt, 9373-v2.txt, 9373-v3.txt
>
>
> When I see this in the logs it often means we got a partial read and then we 
> have the wrong offset when reading the rest of the file
> {noformat}
> 2013-08-28 23:16:07,182 ERROR [ReplicationExecutor-0.replicationSource,1-jdec2hbase0403-5,60020,1377730319617] org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader: Invalid PB while reading WAL, probably an unexpected EOF, ignoring
> com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.
>         at com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:99)
>         at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:498)
>         at com.google.protobuf.GeneratedMessage.parseUnknownField(GeneratedMessage.java:193)
>         at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey.<init>(WALProtos.java:686)
>         at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey.<init>(WALProtos.java:644)
>         at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:771)
>         at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:766)
>         at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$Builder.mergeFrom(WALProtos.java:1444)
>         at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$Builder.mergeFrom(WALProtos.java:1218)
>         at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:220)
>         at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:912)
>         at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:267)
>         at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:290)
>         at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:926)
>         at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:296)
>         at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:918)
>         at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:197)
>         at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:98)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.readNextAndSetPosition(ReplicationHLogReaderManager.java:89)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:390)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:298)
> {noformat}
