[jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

2014-08-15 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099161#comment-14099161
 ] 

Ted Yu commented on HBASE-11620:


I logged HBASE-11762 for writing Codec class name in WAL header.

Initial patch attached.

> Record the class name of Writer in WAL header so that only proper Reader can 
> open the WAL file
> --
>
> Key: HBASE-11620
> URL: https://issues.apache.org/jira/browse/HBASE-11620
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.4
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Critical
> Fix For: 0.99.0, 0.98.5, 2.0.0
>
> Attachments: 11620-0.98-v6.txt, 11620-0.98-v7.txt, 11620-v1.txt, 
> 11620-v2.txt, 11620-v3.txt, 11620-v4.txt, 11620-v5.txt, 11620-v6.txt, 
> 11620-v6.txt, 11620-v7.txt
>
>
> Reported by Kiran in this thread: "HBase file encryption, inconsistencies 
> observed and data loss"
> After step 4 (i.e., disabling WAL encryption, removing 
> SecureProtobufReader/Writer, and restarting), reading the encrypted WAL fails, 
> mainly due to an EOF exception in BaseDecoder. This is not treated as an error, 
> and these WALs are moved to /oldWALs.
> The following is observed in the log files:
> {code}
> 2014-07-30 19:44:29,254 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Splitting hlog: 
> hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017,
>  length=172
> 2014-07-30 19:44:29,254 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: DistributedLogReplay = false
> 2014-07-30 19:44:29,313 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> util.FSHDFSUtils: Recovering lease on dfs file 
> hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017
> 2014-07-30 19:44:29,315 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> util.FSHDFSUtils: recoverLease=true, attempt=0 on 
> file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017
>  after 1ms
> 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0] 
> wal.HLogSplitter: Writer thread 
> Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0,5,main]: starting
> 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1] 
> wal.HLogSplitter: Writer thread 
> Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1,5,main]: starting
> 2014-07-30 19:44:29,430 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2] 
> wal.HLogSplitter: Writer thread 
> Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2,5,main]: starting
> 2014-07-30 19:44:29,591 ERROR [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> codec.BaseDecoder: Partial cell read caused by EOF: java.io.IOException: 
> Premature EOF from inputStream
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Finishing writing output logs and closing down.
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Waiting for split writer threads to finish
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Split writers finished
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Processed 0 edits across 0 regions; log 
> file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017
>  is corrupted = false progress failed = false
> {code}
> To fix this, we need to propagate the EOF exception to HLogSplitter. Any 
> suggestions on the fix?
>  (end of quote from Kiran)
> In BaseDecoder#rethrowEofException() :
> {code}
> if (!isEof) throw ioEx;
> LOG.error("Partial cell read caused by EOF: " + ioEx);
> EOFException eofEx = new EOFException("Partial cell read");
> eofEx.initCause(ioEx);
> throw eofEx;
> {code}
> Throwing EOFException does not propagate the "Partial cell read" condition 
> to HLogSplitter, which doesn't treat EOFException as an error.
> I think an IOException should be thrown instead; HLogSplitter#getNextLogLine() 
> would then translate the IOException to CorruptedLogFileException.
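
The difference between the two exception types can be sketched as below. This is only an illustration under stated assumptions, not the HBase code: `CellSource`, `CorruptedLogException`, `readNext` and `classify` are hypothetical stand-ins for the decoder, CorruptedLogFileException and the reader loop in HLogSplitter. It relies on EOFException being a subclass of IOException, so a reader that handles EOFException first sees a partial cell as a normal end of file.

```java
import java.io.EOFException;
import java.io.IOException;

public class EofPropagationSketch {

    /** Hypothetical stand-in for a decoder that may fail while reading a cell. */
    interface CellSource {
        void readCell() throws IOException;
    }

    /** Hypothetical stand-in for CorruptedLogFileException. */
    static class CorruptedLogException extends Exception {
        CorruptedLogException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Mimics a reader loop that treats EOFException as a normal end of file
     * and translates any other IOException into a corruption error.
     * Returns true if a cell was read, false on end of file.
     */
    static boolean readNext(CellSource source) throws CorruptedLogException {
        try {
            source.readCell();
            return true;
        } catch (EOFException eof) {
            // Handled before IOException because EOFException is a subclass of it.
            return false;
        } catch (IOException ioe) {
            throw new CorruptedLogException("Could not parse log entry", ioe);
        }
    }

    /** Reports the outcome as a string so the two cases are easy to compare. */
    static String classify(CellSource source) {
        try {
            return readNext(source) ? "CELL_READ" : "END_OF_FILE";
        } catch (CorruptedLogException e) {
            return "CORRUPTED";
        }
    }

    public static void main(String[] args) {
        IOException cause = new IOException("Premature EOF from inputStream");
        // Behaviour quoted in the description: wrap the failure in an EOFException.
        System.out.println(classify(() -> {
            EOFException eofEx = new EOFException("Partial cell read");
            eofEx.initCause(cause);
            throw eofEx;
        }));
        // Behaviour proposed in the description: throw a plain IOException.
        System.out.println(classify(() -> {
            throw new IOException("Partial cell read", cause);
        }));
    }
}
```

With the first source the sketch reports an end of file, and with the second it reports corruption, which is the distinction the description is asking for.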



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

2014-08-14 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14098152#comment-14098152
 ] 

ramkrishna.s.vasudevan commented on HBASE-11620:


bq. Related to this, should not we also write the CellCodec that we use in the 
WAL header. Right now, the codec comes from the configuration which means that 
you cannot read back the WAL files if you change the codec

+1. Would be really helpful.



[jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

2014-08-14 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097910#comment-14097910
 ] 

Andrew Purtell commented on HBASE-11620:


bq. Related to this, should not we also write the CellCodec that we use in the 
WAL header. Right now, the codec comes from the configuration which means that 
you cannot read back the WAL files if you change the codec. What do you guys 
think, shall I open an issue?

Sounds good to me



[jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

2014-08-14 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097909#comment-14097909
 ] 

Enis Soztutar commented on HBASE-11620:
---

Related to this, shouldn't we also write the CellCodec that we use into the WAL 
header? Right now the codec comes from the configuration, which means that you 
cannot read back the WAL files if you change the codec. What do you guys think, 
shall I open an issue? 
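
A minimal sketch of the idea, assuming a header that can carry the codec class name. `Header`, `cellCodecClassName` and `resolveCodec` are hypothetical names for illustration and are not taken from WAL.proto or the HBase sources; the codec class names are placeholders as well.

```java
public class CodecHeaderSketch {

    /** Hypothetical, simplified WAL header; the field name is an assumption. */
    static final class Header {
        final String cellCodecClassName; // null when the writer did not record it

        Header(String cellCodecClassName) {
            this.cellCodecClassName = cellCodecClassName;
        }
    }

    /**
     * Picks the codec class name a reader should use: the one recorded in the
     * header if present, otherwise the one from the current configuration.
     */
    static String resolveCodec(Header header, String configuredCodec) {
        String recorded = header.cellCodecClassName;
        if (recorded != null && !recorded.isEmpty()) {
            return recorded;
        }
        return configuredCodec;
    }

    public static void main(String[] args) {
        // The file was written with CodecA, but the configuration now says CodecB.
        Header written = new Header("example.CodecA");
        System.out.println(resolveCodec(written, "example.CodecB"));

        // An older file without the field falls back to the configured codec.
        Header legacy = new Header(null);
        System.out.println(resolveCodec(legacy, "example.CodecB"));
    }
}
```

Falling back to the configured codec when the field is absent is a design choice made here so that files written before such a change would stay readable; the thread itself does not specify that behaviour.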



[jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

2014-08-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082846#comment-14082846
 ] 

Hudson commented on HBASE-11620:


FAILURE: Integrated in HBase-1.0 #80 (See 
[https://builds.apache.org/job/HBase-1.0/80/])
HBASE-11620 Record the class name of Writer in WAL header so that only proper 
Reader can open the WAL file (Ted Yu) (tedyu: rev 
e142961099cda5b3f733cd2239cb22ce150f5c08)
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogReaderOnSecureHLog.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogReader.java
* hbase-protocol/src/main/protobuf/WAL.proto
* hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogWriter.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogFactory.java



[jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

2014-08-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082805#comment-14082805
 ] 

Hudson commented on HBASE-11620:


FAILURE: Integrated in HBase-TRUNK #5361 (See 
[https://builds.apache.org/job/HBase-TRUNK/5361/])
HBASE-11620 Record the class name of Writer in WAL header so that only proper 
Reader can open the WAL file (Ted Yu) (tedyu: rev 
b384c06d35c89642510c097a1afc0228bff774fb)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogWriter.java
* hbase-protocol/src/main/protobuf/WAL.proto
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogFactory.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogReader.java
* hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogReaderOnSecureHLog.java



[jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

2014-08-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082793#comment-14082793
 ] 

Hudson commented on HBASE-11620:


SUCCESS: Integrated in HBase-0.98 #429 (See 
[https://builds.apache.org/job/HBase-0.98/429/])
HBASE-11620 Record the class name of Writer in WAL header so that only proper 
Reader can open the WAL file (Ted Yu) (tedyu: rev 
acc5c13f37c7b16058797e81a6ec4769d8335540)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogFactory.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java
* hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogWriter.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogReaderOnSecureHLog.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogReader.java
* hbase-protocol/src/main/protobuf/WAL.proto



[jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

2014-08-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082768#comment-14082768
 ] 

Hudson commented on HBASE-11620:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #406 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/406/])
HBASE-11620 Record the class name of Writer in WAL header so that only proper 
Reader can open the WAL file (Ted Yu) (tedyu: rev 
acc5c13f37c7b16058797e81a6ec4769d8335540)
* hbase-protocol/src/main/protobuf/WAL.proto
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogReaderOnSecureHLog.java
* hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogReader.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogFactory.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogWriter.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java



[jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

2014-08-01 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082486#comment-14082486
 ] 

Andrew Purtell commented on HBASE-11620:


+1, patch v6

Please update or remove the comments in testSecureHLogReaderOnHLog on commit, 
and fix the assert messages. The meaning of all the checks is reversed, but 
the text hasn't been updated to reflect that.


> Record the class name of Writer in WAL header so that only proper Reader can 
> open the WAL file
> --
>
> Key: HBASE-11620
> URL: https://issues.apache.org/jira/browse/HBASE-11620
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.4
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Critical
> Fix For: 0.99.0, 0.98.5, 2.0.0
>
> Attachments: 11620-0.98-v6.txt, 11620-v1.txt, 11620-v2.txt, 
> 11620-v3.txt, 11620-v4.txt, 11620-v5.txt, 11620-v6.txt, 11620-v6.txt
>
>
> Reported by Kiran in this thread: "HBase file encryption, inconsistencies 
> observed and data loss"
> After step 4 (i.e., disabling WAL encryption, removing 
> SecureProtobufReader/Writer, and restarting), reading an encrypted WAL fails, 
> mainly due to an EOF exception in BaseDecoder. This is not treated as an error, 
> and these WALs are moved to /oldWALs.
> Following is observed in log files:
> {code}
> 2014-07-30 19:44:29,254 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Splitting hlog: 
> hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017,
>  length=172
> 2014-07-30 19:44:29,254 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: DistributedLogReplay = false
> 2014-07-30 19:44:29,313 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> util.FSHDFSUtils: Recovering lease on dfs file 
> hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017
> 2014-07-30 19:44:29,315 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> util.FSHDFSUtils: recoverLease=true, attempt=0 on 
> file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017
>  after 1ms
> 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0] 
> wal.HLogSplitter: Writer thread 
> Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0,5,main]: starting
> 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1] 
> wal.HLogSplitter: Writer thread 
> Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1,5,main]: starting
> 2014-07-30 19:44:29,430 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2] 
> wal.HLogSplitter: Writer thread 
> Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2,5,main]: starting
> 2014-07-30 19:44:29,591 ERROR [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> codec.BaseDecoder: Partial cell read caused by EOF: java.io.IOException: 
> Premature EOF from inputStream
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Finishing writing output logs and closing down.
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Waiting for split writer threads to finish
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Split writers finished
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Processed 0 edits across 0 regions; log 
> file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017
>  is corrupted = false progress failed = false
> {code}
> To fix this, we need to propagate EOF exception to HLogSplitter. Any 
> suggestions on the fix?
>  (end of quote from Kiran)
> In BaseDecoder#rethrowEofException() :
> {code}
> if (!isEof) throw ioEx;
> LOG.error("Partial cell read caused by EOF: " + ioEx);
> EOFException eofEx = new EOFException("Partial cell read");
> eofEx.initCause(ioEx);
> throw eofEx;
> {code}
> Throwing EOFException does not propagate the "Partial cell read" condition 
> to HLogSplitter, which doesn't treat EOFException as an error.
> I think IOException should be thrown above; HLogSplitter#getNextLogLine() 
> would then translate the IOEx to CorruptedLogFileException.
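
The exception-type argument in the quoted description can be illustrated with a minimal Java sketch. This is not HBase code: `EofPropagationSketch`, `Decoder`, `Outcome`, and `readNext` are hypothetical stand-ins. The only assumption carried over from the description is that the caller treats an EOFException as a normal end of file, while any other IOException marks the log as corrupted.

```java
import java.io.EOFException;
import java.io.IOException;

public class EofPropagationSketch {
    /** What the hypothetical caller concludes about the file. */
    public enum Outcome { CLEAN_END, CORRUPTED }

    /** Hypothetical stand-in for a decoder that may fail while reading a cell. */
    public interface Decoder {
        void advance() throws IOException;
    }

    /** Mirrors the quoted snippet: the failure is wrapped in an EOFException. */
    public static void rethrowAsEof(IOException ioEx) throws IOException {
        EOFException eofEx = new EOFException("Partial cell read");
        eofEx.initCause(ioEx);
        throw eofEx;
    }

    /** The alternative suggested in the description: surface a plain IOException. */
    public static void rethrowAsIo(IOException ioEx) throws IOException {
        throw new IOException("Partial cell read", ioEx);
    }

    /** Hypothetical caller: EOF is read as "no more edits", other I/O errors as corruption. */
    public static Outcome readNext(Decoder decoder) {
        try {
            decoder.advance();
            return Outcome.CLEAN_END;
        } catch (EOFException eof) {
            // Assumed behavior from the description: EOF is not treated as an error.
            return Outcome.CLEAN_END;
        } catch (IOException ioe) {
            // Any other IOException is reported as a corrupted log.
            return Outcome.CORRUPTED;
        }
    }

    public static void main(String[] args) {
        IOException cause = new IOException("Premature EOF from inputStream");
        System.out.println("EOFException path: " + readNext(() -> rethrowAsEof(cause)));
        System.out.println("IOException path:  " + readNext(() -> rethrowAsIo(cause)));
    }
}
```

Under these assumptions, the same underlying read failure is silently accepted on the first path and flagged on the second, which matches the "is corrupted = false" line in the log excerpt.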



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

2014-08-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082474#comment-14082474
 ] 

Ted Yu commented on HBASE-11620:


Test suite for 0.98 ran through.

Ping [~enis] for branch-1



[jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

2014-08-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082321#comment-14082321
 ] 

Ted Yu commented on HBASE-11620:


Patch v6 applies to branch-1.
I have run the test suite for branch-1 on Linux; the result looks good.

> Record the class name of Writer in WAL header so that only proper Reader can 
> open the WAL file
> --
>
> Key: HBASE-11620
> URL: https://issues.apache.org/jira/browse/HBASE-11620
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.4
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Critical
> Fix For: 0.99.0, 0.98.5, 2.0.0
>
> Attachments: 11620-v1.txt, 11620-v2.txt, 11620-v3.txt, 11620-v4.txt, 
> 11620-v5.txt, 11620-v6.txt, 11620-v6.txt
>
>


[jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

2014-08-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082061#comment-14082061
 ] 

Hadoop QA commented on HBASE-11620:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12659092/11620-v6.txt
  against trunk revision .
  ATTACHMENT ID: 12659092

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10255//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10255//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/10255//console

This message is automatically generated.

> Record the class name of Writer in WAL header so that only proper Reader can 
> open the WAL file
> --
>
> Key: HBASE-11620
> URL: https://issues.apache.org/jira/browse/HBASE-11620
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.4
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Critical
> Fix For: 0.99.0, 0.98.5, 2.0.0
>
> Attachments: 11620-v1.txt, 11620-v2.txt, 11620-v3.txt, 11620-v4.txt, 
> 11620-v5.txt, 11620-v6.txt
>
>

[jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

2014-07-31 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081690#comment-14081690
 ] 

Andrew Purtell commented on HBASE-11620:


Thanks Ted. The test looks good and it does what you'd expect. The remaining 
issue here is that the SecureProtobufWALReader should be able to read files 
written by the ProtobufWALWriter. The first thing we do in 
SecureProtobufWALReader#readHeader is call super.readHeader(), which will fail 
because we're only checking for a single class name, not a list of valid 
options. After that change this looks good to go in.

Please consider extending the unit test a bit to check that the 
SecureProtobufWALReader can read files written by the ProtobufWALWriter.
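
One way to picture the "list of valid options" idea is the following minimal Java sketch. It is not the patch itself: `WriterClassCheck`, `PlainReader`, `SecureReader`, `acceptedWriterClassNames`, and `canOpen` are hypothetical names, the header is reduced to a single string, and the writer names are simplified stand-ins for the real classes.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class WriterClassCheck {
    /** Hypothetical plain reader: accepts only files written by the plain writer. */
    public static class PlainReader {
        /** Subclasses override this to widen the set of acceptable writers. */
        public List<String> acceptedWriterClassNames() {
            return Arrays.asList("ProtobufLogWriter");
        }

        /** Rejects the file if the writer recorded in the header is not accepted. */
        public void readHeader(String writerClassNameInHeader) throws IOException {
            if (!acceptedWriterClassNames().contains(writerClassNameInHeader)) {
                throw new IOException("Got unknown writer class: " + writerClassNameInHeader);
            }
        }
    }

    /** Hypothetical secure reader: accepts both plain and secure writers. */
    public static class SecureReader extends PlainReader {
        @Override
        public List<String> acceptedWriterClassNames() {
            return Arrays.asList("ProtobufLogWriter", "SecureProtobufLogWriter");
        }
    }

    /** Convenience wrapper that turns the header check into a boolean. */
    public static boolean canOpen(PlainReader reader, String writerClassName) {
        try {
            reader.readHeader(writerClassName);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canOpen(new SecureReader(), "ProtobufLogWriter"));
        System.out.println(canOpen(new PlainReader(), "SecureProtobufLogWriter"));
    }
}
```

Because the check lives in the base class and only the accepted list is overridden, a call to super.readHeader() from the secure reader no longer fails on files produced by the plain writer, while the plain reader still rejects encrypted files.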

> Record the class name of Writer in WAL header so that only proper Reader can 
> open the WAL file
> --
>
> Key: HBASE-11620
> URL: https://issues.apache.org/jira/browse/HBASE-11620
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.4
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Critical
> Fix For: 0.99.0, 0.98.5, 2.0.0
>
> Attachments: 11620-v1.txt, 11620-v2.txt, 11620-v3.txt, 11620-v4.txt, 
> 11620-v5.txt
>
>


[jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

2014-07-31 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081651#comment-14081651
 ] 

Andrew Purtell commented on HBASE-11620:


You could check that the HFile ends up in the corrupt log dir. That's the 
desired outcome, correct [~kiranmr]?

> Record the class name of Writer in WAL header so that only proper Reader can 
> open the WAL file
> --
>
> Key: HBASE-11620
> URL: https://issues.apache.org/jira/browse/HBASE-11620
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.4
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Critical
> Fix For: 0.99.0, 0.98.5, 2.0.0
>
> Attachments: 11620-v1.txt, 11620-v2.txt, 11620-v3.txt, 11620-v4.txt
>
>
> Reported by Kiran in this thread: "HBase file encryption, inconsistencies 
> observed and data loss"
> After step 4 ( i.e disabling of WAL encryption, removing 
> SecureProtobufReader/Writer and restart), read of encrypted WAL fails mainly 
> due to EOF exception at Basedecoder. This is not considered as error and 
> these WAL are being moved to /oldWALs.
> Following is observed in log files:
> {code}
> 2014-07-30 19:44:29,254 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Splitting hlog: 
> hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017,
>  length=172
> 2014-07-30 19:44:29,254 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: DistributedLogReplay = false
> 2014-07-30 19:44:29,313 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> util.FSHDFSUtils: Recovering lease on dfs file 
> hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017
> 2014-07-30 19:44:29,315 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> util.FSHDFSUtils: recoverLease=true, attempt=0 on 
> file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017
>  after 1ms
> 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0] 
> wal.HLogSplitter: Writer thread 
> Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0,5,main]: starting
> 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1] 
> wal.HLogSplitter: Writer thread 
> Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1,5,main]: starting
> 2014-07-30 19:44:29,430 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2] 
> wal.HLogSplitter: Writer thread 
> Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2,5,main]: starting
> 2014-07-30 19:44:29,591 ERROR [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> codec.BaseDecoder: Partial cell read caused by EOF: java.io.IOException: 
> Premature EOF from inputStream
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Finishing writing output logs and closing down.
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Waiting for split writer threads to finish
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Split writers finished
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Processed 0 edits across 0 regions; log 
> file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017
>  is corrupted = false progress failed = false
> {code}
> To fix this, we need to propagate EOF exception to HLogSplitter. Any 
> suggestions on the fix?
>  (end of quote from Kiran)
> In BaseDecoder#rethrowEofException() :
> {code}
> if (!isEof) throw ioEx;
> LOG.error("Partial cell read caused by EOF: " + ioEx);
> EOFException eofEx = new EOFException("Partial cell read");
> eofEx.initCause(ioEx);
> throw eofEx;
> {code}
> Throwing EOFException does not propagate the "Partial cell read" condition 
> to HLogSplitter, which doesn't treat EOFException as an error.
> I think an IOException should be thrown in the code above instead; 
> HLogSplitter#getNextLogLine() would then translate the IOException to 
> CorruptedLogFileException.
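
The difference between the two exception types can be illustrated with a small, self-contained sketch. It does not use HBase code: `EofPropagationSketch`, `CorruptedLogException`, `readCell`, `countEdits` and the 1-byte length-prefixed cell format are all hypothetical stand-ins, and the sketch assumes only what the description above states, namely that the splitter treats EOFException as a normal end of file and converts other IOExceptions into a corruption error.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Illustrative stand-ins only; none of these are HBase classes.
class EofPropagationSketch {

    /** Hypothetical analogue of CorruptedLogFileException. */
    static class CorruptedLogException extends IOException {
        CorruptedLogException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Reads one cell encoded as a 1-byte length followed by that many payload bytes
     * (an assumed toy format). Returns null when the stream ends on a cell boundary.
     */
    static byte[] readCell(DataInputStream in, boolean wrapEofAsIOException) throws IOException {
        int length = in.read();
        if (length == -1) {
            return null; // clean end of stream
        }
        byte[] payload = new byte[length];
        try {
            in.readFully(payload); // throws EOFException if the payload is cut short
        } catch (EOFException eof) {
            if (wrapEofAsIOException) {
                // Suggested behavior: surface the partial read as a plain IOException.
                throw new IOException("Partial cell read", eof);
            }
            throw eof; // Reported behavior: the caller only sees an EOFException.
        }
        return payload;
    }

    /** Counts complete cells; an EOFException is treated as a normal end of file. */
    static int countEdits(byte[] wal, boolean wrapEofAsIOException) throws CorruptedLogException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wal));
        int edits = 0;
        try {
            while (readCell(in, wrapEofAsIOException) != null) {
                edits++;
            }
        } catch (EOFException eof) {
            return edits; // silently accepted, so the truncation goes unnoticed
        } catch (IOException ioe) {
            throw new CorruptedLogException("Could not parse log", ioe);
        }
        return edits;
    }

    /** Convenience wrapper: true if the splitter stand-in flags the log as corrupted. */
    static boolean isFlaggedCorrupt(byte[] wal, boolean wrapEofAsIOException) {
        try {
            countEdits(wal, wrapEofAsIOException);
            return false;
        } catch (CorruptedLogException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] truncated = {3, 97}; // header claims 3 payload bytes, only 1 is present
        System.out.println("EOFException rethrown: corrupted = " + isFlaggedCorrupt(truncated, false));
        System.out.println("IOException thrown:    corrupted = " + isFlaggedCorrupt(truncated, true));
    }
}
```

With the truncated input, the first call reports the log as not corrupted with zero edits processed, which matches the symptom in the log excerpt; the second call flags it as corrupted because the plain IOException bypasses the EOFException handler.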





[jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file

2014-07-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081516#comment-14081516
 ] 

Hadoop QA commented on HBASE-11620:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12658969/11620-v2.txt
  against trunk revision .
  ATTACHMENT ID: 12658969

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestIOFencing
   org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
   org.apache.hadoop.hbase.regionserver.TestRegionReplicas
   org.apache.hadoop.hbase.master.TestMasterOperationsForRegionReplicas
   org.apache.hadoop.hbase.client.TestReplicasClient
   org.apache.hadoop.hbase.master.TestRestartCluster
   org.apache.hadoop.hbase.TestRegionRebalancing

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10248//console

This message is automatically generated.

> Record the class name of Writer in WAL header so that only proper Reader can 
> open the WAL file
> --
>
> Key: HBASE-11620
> URL: https://issues.apache.org/jira/browse/HBASE-11620
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.4
> Reporter: Ted Yu
> Assignee: Ted Yu
> Priority: Critical
> Fix For: 0.99.0, 0.98.5, 2.0.0
>
> Attachments: 11620-v1.txt, 11620-v2.txt, 11620-v3.txt
>
>
> Reported by Kiran in this thread: "HBase file encryption, inconsistencies 
> observed and data loss"
> After step 4 (i.e., disabling WAL encryption, removing 
> SecureProtobufReader/Writer, and restarting), reading the encrypted WAL fails, 
> mainly due to an EOF exception in BaseDecoder. This is not treated as an error, 
> and these WALs are moved to /oldWALs.
> Following is observed in log files:
> {code}
> 2014-07-30 19:44:29,254 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wal.HLogSplitter: Splitting hlog: 
> hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017,
>  length=172
> 2014-07-30 19:44:29,254 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] 
> wa