[jira] [Commented] (HBASE-9373) [replication] data loss because replication doesn't expect partial reads

2013-08-30 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754914#comment-13754914
 ] 

stack commented on HBASE-9373:
--

nit: there is stuff like the below that does not have to be inside the try (not 
important)

if (trailerPresent && originalPosition == this.walEditsStopOffset) return false;

Remove this on commit:

+// See if available is any good to us.  Record before we start reading.  My guess is that it
+// does not change once reader has been opened but check see.

...especially as my 'guess' above turns out to be wrong (it's embarrassing to 
have it persist in code!)

I think I preferred the old repetitive way of doing things rather than this 
messaging via exceptions that you have here (throwing EOFEs only to catch them 
locally)

Good to include available and size in this exception message:

throw new EOFException("Available stream not enough for edit");

Could we already have an EOFE when you do this:

throw new EOFException("Partial PB while reading WAL, " +
    "probably an unexpected EOF, ignoring");

If so, are you losing info when you throw this on? Add original exception as 
'cause'?

Above is minor stuff.  +1 on commit since it works for you.
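
For illustration only, here is a minimal sketch of the two suggestions above (put available and size in the message, and keep the original exception as the cause). The class and method names are hypothetical and are not taken from the patch; the only library fact relied on is that java.io.EOFException has no (String, Throwable) constructor, so the cause has to be attached with initCause.

```java
import java.io.EOFException;

// Hypothetical helper, not part of HBase: builds an EOFException that carries
// the stream state and, optionally, the exception that triggered it.
final class EofWithContext {
  private EofWithContext() {}

  static EOFException partialRead(String what, long available, long size, Throwable cause) {
    EOFException eof = new EOFException(
        what + ", available=" + available + ", size=" + size);
    // EOFException only offers () and (String) constructors, so the original
    // exception is attached afterwards instead of being dropped.
    if (cause != null) {
      eof.initCause(cause);
    }
    return eof;
  }
}
```

A caller that catches an InvalidProtocolBufferException could pass it as the last argument, so the stack trace of the rethrown EOFE still shows where parsing actually failed.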



 [replication] data loss because replication doesn't expect partial reads
 

 Key: HBASE-9373
 URL: https://issues.apache.org/jira/browse/HBASE-9373
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.95.2
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.98.0, 0.96.0

 Attachments: 9373.txt, 9373-v2.txt, 9373-v3.txt


 When I see this in the logs it often means we got a partial read and then we 
 have the wrong offset when reading the rest of the file
 {noformat}
 2013-08-28 23:16:07,182 ERROR [ReplicationExecutor-0.replicationSource,1-jdec2hbase0403-5,60020,1377730319617] org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader: Invalid PB while reading WAL, probably an unexpected EOF, ignoring
 com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.
     at com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:99)
     at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:498)
     at com.google.protobuf.GeneratedMessage.parseUnknownField(GeneratedMessage.java:193)
     at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey.<init>(WALProtos.java:686)
     at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey.<init>(WALProtos.java:644)
     at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:771)
     at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:766)
     at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$Builder.mergeFrom(WALProtos.java:1444)
     at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$Builder.mergeFrom(WALProtos.java:1218)
     at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:220)
     at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:912)
     at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:267)
     at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:290)
     at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:926)
     at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:296)
     at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:918)
     at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:197)
     at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:98)
     at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.readNextAndSetPosition(ReplicationHLogReaderManager.java:89)
     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:390)
     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:298)
 {noformat}
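
The failure mode described above (a partial read leaves the reader at the wrong offset for the rest of the file) can be sketched in isolation as follows. This is not the HBASE-9373 patch: it assumes a simple 4-byte length prefix per record and uses a java.nio.ByteBuffer as a stand-in for the WAL input stream, and every name in it is hypothetical. The point is only that the position is recorded before the read and restored when the record turns out to be incomplete, so a later retry starts on a record boundary.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a reader that tails a file still being written.
final class TailingRecordReader {
  private TailingRecordReader() {}

  // Returns the next record payload, or null if the record is not fully
  // available yet. On null the buffer position is left where it started.
  static byte[] readRecordOrRewind(ByteBuffer buf) {
    int originalPosition = buf.position();
    if (buf.remaining() < Integer.BYTES) {
      return null; // not even a complete length prefix; nothing consumed
    }
    int size = buf.getInt();
    if (size < 0 || buf.remaining() < size) {
      // Partial record: undo the length read so the next attempt does not
      // start parsing from the middle of this record.
      buf.position(originalPosition);
      return null;
    }
    byte[] payload = new byte[size];
    buf.get(payload);
    return payload;
  }
}
```

Without the rewind, the next call would interpret payload bytes as a length prefix, which is the kind of misaligned parse that could produce an "invalid wire type" error like the one in the log above.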

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-9373) [replication] data loss because replication doesn't expect partial reads

2013-08-30 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754934#comment-13754934
 ] 

Jean-Daniel Cryans commented on HBASE-9373:
---

bq. I think I preferred the old repetitive way of doing things rather than this 
messaging via exceptions that you have here (throwing EOFEs only to catch them 
locally)

I don't like the other way of repeating 2 lines either, too easy to miss... we 
could instead set a message+break, then if the message != null then print it 
and return false? It's all ugly to me anyways.
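
A rough sketch of the "set a message + break" alternative, for comparison with the throw-and-catch-locally approach. It is an assumption-laden illustration rather than proposed code: the record layout (4-byte length prefix), the ByteBuffer standing in for the stream, the log list standing in for LOG, and all names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical illustration of the message+break control flow.
final class MessageAndBreakReader {
  private MessageAndBreakReader() {}

  static boolean readNext(ByteBuffer in, List<byte[]> edits, List<String> log) {
    String problem = null;
    int originalPosition = in.position();
    read: {
      if (in.remaining() < Integer.BYTES) {
        problem = "No length prefix, available=" + in.remaining();
        break read;
      }
      int size = in.getInt();
      if (size < 0) {
        problem = "Negative edit size, size=" + size;
        break read;
      }
      if (in.remaining() < size) {
        problem = "Available stream not enough for edit, available="
            + in.remaining() + ", size=" + size;
        break read;
      }
      byte[] edit = new byte[size];
      in.get(edit);
      edits.add(edit);
    }
    // Single place that reports the problem, rewinds, and returns false.
    if (problem != null) {
      in.position(originalPosition);
      log.add(problem);
      return false;
    }
    return true;
  }
}
```

Each failure site costs two lines (assign, break) and the cleanup lives in one place, which avoids both the repeated log-and-return pairs and the locally caught exceptions.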

bq. Could we already have an EOFE when you do this 

Not following, which EOFE? 



[jira] [Commented] (HBASE-9373) [replication] data loss because replication doesn't expect partial reads

2013-08-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754972#comment-13754972
 ] 

Hadoop QA commented on HBASE-9373:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12600810/9373-v3.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6990//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6990//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6990//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6990//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6990//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6990//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6990//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6990//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6990//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6990//console

This message is automatically generated.

 [replication] data loss because replication doesn't expect partial reads
 

 Key: HBASE-9373
 URL: https://issues.apache.org/jira/browse/HBASE-9373
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.95.2
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.98.0, 0.96.0

 Attachments: 9373.txt, 9373-v2.txt, 9373-v3.txt, 9373-v4.patch



[jira] [Commented] (HBASE-9373) [replication] data loss because replication doesn't expect partial reads

2013-08-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13755043#comment-13755043
 ] 

Hudson commented on HBASE-9373:
---

FAILURE: Integrated in hbase-0.95-on-hadoop2 #282 (See 
[https://builds.apache.org/job/hbase-0.95-on-hadoop2/282/])
HBASE-9373 [replication] data loss because replication doesn't expect partial 
reads (jdcryans: rev 1519038)
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java




[jira] [Commented] (HBASE-9373) [replication] data loss because replication doesn't expect partial reads

2013-08-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13755106#comment-13755106
 ] 

Hudson commented on HBASE-9373:
---

FAILURE: Integrated in hbase-0.95 #510 (See 
[https://builds.apache.org/job/hbase-0.95/510/])
HBASE-9373 [replication] data loss because replication doesn't expect partial 
reads (jdcryans: rev 1519038)
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java




[jira] [Commented] (HBASE-9373) [replication] data loss because replication doesn't expect partial reads

2013-08-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13755118#comment-13755118
 ] 

Hudson commented on HBASE-9373:
---

SUCCESS: Integrated in HBase-TRUNK #4451 (See 
[https://builds.apache.org/job/HBase-TRUNK/4451/])
HBASE-9373 [replication] data loss because replication doesn't expect partial 
reads (jdcryans: rev 1519037)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java




[jira] [Commented] (HBASE-9373) [replication] data loss because replication doesn't expect partial reads

2013-08-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13755315#comment-13755315
 ] 

Hudson commented on HBASE-9373:
---

SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #705 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/705/])
HBASE-9373 [replication] data loss because replication doesn't expect partial 
reads (jdcryans: rev 1519037)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java




[jira] [Commented] (HBASE-9373) [replication] data loss because replication doesn't expect partial reads

2013-08-29 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754329#comment-13754329
 ] 

stack commented on HBASE-9373:
--

Patch is good.  Remove the comment about checking available since it seems to 
work?

 [replication] data loss because replication doesn't expect partial reads
 

 Key: HBASE-9373
 URL: https://issues.apache.org/jira/browse/HBASE-9373
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.95.2
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.98.0, 0.96.0

 Attachments: 9373.txt, 9373-v2.txt


 When I see this in the logs it often means we got a partial read and then we 
 have the wrong offset when reading the rest of the file
 {noformat}
 2013-08-28 23:16:07,182 ERROR 
 [ReplicationExecutor-0.replicationSource,1-jdec2hbase0403-5,60020,1377730319617]
  org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader: Invalid PB while 
 reading WAL, probably an unexpected EOF, ignoring
 com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.
 at com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:99)
 at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:498)
 at com.google.protobuf.GeneratedMessage.parseUnknownField(GeneratedMessage.java:193)
 at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey.<init>(WALProtos.java:686)
 at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey.<init>(WALProtos.java:644)
 at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:771)
 at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:766)
 at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$Builder.mergeFrom(WALProtos.java:1444)
 at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$Builder.mergeFrom(WALProtos.java:1218)
 at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:220)
 at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:912)
 at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:267)
 at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:290)
 at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:926)
 at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:296)
 at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:918)
 at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.readNext(ProtobufLogReader.java:197)
 at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.next(ReaderBase.java:98)
 at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.readNextAndSetPosition(ReplicationHLogReaderManager.java:89)
 at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:390)
 at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:298)
 {noformat}
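
 A minimal sketch of the failure mode described above, assuming a simplified
 record format (a protobuf-style varint length prefix followed by the payload).
 The class and method names are hypothetical; this is neither the
 ProtobufLogReader API nor the attached patch. It only illustrates why a reader
 that hits a partial record should go back to the offset where that record
 started instead of keeping the advanced position.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not HBase code.
class PartialReadSketch {

    /**
     * Tries to read one length-prefixed record starting at buf.position().
     * Returns the payload, or null if the record is not fully available
     * (or the length prefix is unreadable). On null the position is rewound
     * to where the record started, so a later retry begins at the same offset.
     */
    static byte[] readRecordOrRewind(ByteBuffer buf) {
        int start = buf.position();
        int length = 0;
        int shift = 0;
        // Decode a varint length prefix, 7 bits per byte, low bits first.
        while (true) {
            if (!buf.hasRemaining() || shift > 28) {
                buf.position(start); // prefix itself is cut off or too long
                return null;
            }
            byte b = buf.get();
            length |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
        }
        // Partial read: the writer has not flushed the whole record yet.
        if (length < 0 || buf.remaining() < length) {
            buf.position(start);
            return null;
        }
        byte[] payload = new byte[length];
        buf.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        byte[] file = {3, 10, 20, 30};
        // Only the first two bytes are visible to the reader so far.
        ByteBuffer partial = ByteBuffer.wrap(file, 0, 2);
        System.out.println(readRecordOrRewind(partial) == null);
        System.out.println(partial.position());
        // Once the whole record is visible, the same offset parses cleanly.
        ByteBuffer complete = ByteBuffer.wrap(file);
        System.out.println(readRecordOrRewind(complete).length);
    }
}
```

 Without the rewind, the next read would start in the middle of the record and
 interpret payload bytes as a new length prefix or message tag, which is
 consistent with the "invalid wire type" error in the log above.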



[jira] [Commented] (HBASE-9373) [replication] data loss because replication doesn't expect partial reads

2013-08-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13754357#comment-13754357
 ] 

Hadoop QA commented on HBASE-9373:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12600704/9373-v2.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines longer than 100

{color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6975//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6975//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6975//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6975//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6975//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6975//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6975//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6975//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6975//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6975//console

This message is automatically generated.

 [replication] data loss because replication doesn't expect partial reads
 

 Key: HBASE-9373
 URL: https://issues.apache.org/jira/browse/HBASE-9373
 Project: HBase
 Issue Type: Improvement
 Affects Versions: 0.95.2
 Reporter: Jean-Daniel Cryans
 Assignee: Jean-Daniel Cryans
 Priority: Blocker
 Fix For: 0.98.0, 0.96.0

 Attachments: 9373.txt, 9373-v2.txt


 When I see this in the logs it often means we got a partial read and then we 
 have the wrong offset when reading the rest of the file
 {noformat}
 2013-08-28 23:16:07,182 ERROR [ReplicationExecutor-0.replicationSource,1-jdec2hbase0403-5,60020,1377730319617] org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader: Invalid PB while reading WAL, probably an unexpected EOF, ignoring
 com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.
 at com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:99)
 at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:498)
 at com.google.protobuf.GeneratedMessage.parseUnknownField(GeneratedMessage.java:193)
 at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey.<init>(WALProtos.java:686)
 at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey.<init>(WALProtos.java:644)
 at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:771)
 at org.apache.hadoop.hbase.protobuf.generated.WALProtos$WALKey$1.parsePartialFrom(WALProtos.java:766)
 at