[ 
https://issues.apache.org/jira/browse/HBASE-27963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740668#comment-17740668
 ] 

Rushabh Shah commented on HBASE-27963:
--------------------------------------

We are also seeing similar errors in our production environment. We are running
a version of 1.7. As a workaround we restart the regionserver, and the new
regionserver is able to replicate. This suggests that some in-memory data
structure is out of sync.

> Replication stuck when switch to new reader
> -------------------------------------------
>
>                 Key: HBASE-27963
>                 URL: https://issues.apache.org/jira/browse/HBASE-27963
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 3.0.0-alpha-4, 2.4.17, 2.5.5
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Major
>
> After creating a new reader for the next WAL, it immediately calls seek() to
> currentPositionOfEntry, but this position may exceed the length of the
> current WAL.
> {code:java}
> WARN  [RpcServer.default.FPRWQ.Fifo.read.handler=101,queue=1,port=16020.replicationSource.wal-reader.XXXXXXX] regionserver.ReplicationSourceWALReader: Failed to read stream of replication entries
> java.io.EOFException: Cannot seek after EOF
>         at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1488)
>         at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:62)
>         at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.seekOnFs(ProtobufLogReader.java:495)
>         at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.seek(ReaderBase.java:138)
>         at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.seek(WALEntryStream.java:399)
>         at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:341)
>         at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.handleFileNotFound(WALEntryStream.java:328)
>         at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:347)
>         at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:310)
>         at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.checkReader(WALEntryStream.java:300)
>         at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:176)
>         at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:102)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.tryAdvanceStreamAndCreateWALBatch(ReplicationSourceWALReader.java:260)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:142)
>  {code}
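
To illustrate the failure mode described in the issue, here is a minimal,
self-contained sketch. It does not use HBase or HDFS classes and is not the
fix applied in HBase. The class SeekGuard and the methods seek and
resolveStartPosition are hypothetical names introduced only for this example.
It assumes that a position saved for a previous WAL is meaningless for the
next WAL once it exceeds that file's length, and that restarting from offset 0
is acceptable in that case.

```java
import java.io.EOFException;

// Hypothetical illustration only; none of these names exist in HBase.
class SeekGuard {

    // Stand-in for a stream that rejects seeks past its length,
    // mirroring the "Cannot seek after EOF" error in the trace above.
    static long seek(long target, long fileLength) throws EOFException {
        if (target > fileLength) {
            throw new EOFException("Cannot seek after EOF");
        }
        return target;
    }

    // Assumed guard: reuse the saved position only if it fits inside the
    // newly opened file, otherwise start from the beginning of that file.
    static long resolveStartPosition(long savedPosition, long newFileLength) {
        if (savedPosition < 0 || savedPosition > newFileLength) {
            return 0L;
        }
        return savedPosition;
    }

    public static void main(String[] args) throws EOFException {
        long savedPosition = 4096L; // position recorded while reading the old WAL
        long newWalLength = 1024L;  // the next WAL is shorter than that position

        try {
            seek(savedPosition, newWalLength); // unguarded seek fails
        } catch (EOFException e) {
            System.out.println("Unguarded seek failed: " + e.getMessage());
        }

        // Guarded seek: the stale position is replaced before seeking.
        long start = resolveStartPosition(savedPosition, newWalLength);
        System.out.println("Guarded seek starts at: " + seek(start, newWalLength));
    }
}
```

Whether resetting to the start of the file is correct depends on how the
replication source tracks positions per WAL, so treat the guard as a way to
reason about the bug rather than as a patch.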



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
