Re: (BUG)ShortCircuitLocalReads Failed when enabled replication
Can anybody test it again to verify whether there is a problem? Thanks!

--
View this message in context: http://apache-hbase.679495.n3.nabble.com/BUG-ShortCircuitLocalReads-Failed-when-enabled-replication-tp4081733p4081968.html
Sent from the HBase User mailing list archive at Nabble.com.
Re: (BUG)ShortCircuitLocalReads Failed when enabled replication
Online cluster configuration:
hdfs-site.xml <http://apache-hbase.679495.n3.nabble.com/file/n4081822/hdfs-site.xml>
hbase-site.xml <http://apache-hbase.679495.n3.nabble.com/file/n4081822/hbase-site.xml>
Re: (BUG)ShortCircuitLocalReads Failed when enabled replication
Online cluster configuration:
hdfs-site.xml <http://apache-hbase.679495.n3.nabble.com/file/n4081821/hdfs-site.xml>
hbase-site.xml <http://apache-hbase.679495.n3.nabble.com/file/n4081821/hbase-site.xml>
Re: (BUG)ShortCircuitLocalReads Failed when enabled replication
> Are you seeing it for all the log files? The local ReplicationSource will
> handle the WALs for replication from this RS. But it may so happen that this
> RS went down and so another took charge of doing replication of the WAL files
> originated at the current down RS. Those WALs might not be local for the
> current replicating RS. So you may see SCR not happening. As added logs here
> I just commented.

Dima Spivak, no RS went down. All server logs are the same.

HDFS log (no error log):

2016-08-12 14:25:49,902 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.7.139:50010, dest: /192.168.7.139:55856, bytes: 66048, op: HDFS_READ, cliID: DFSClient_hb_rs_dn7,60020,1470216726863, offset: 0, srvID: DS-1014379950-192.168.7.139-50010-1416802300444, blockid: blk_-4614053595081055029_54130616, duration: 252426

Src and dest are the same IP.
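The "src and dest are the same IP" observation can be checked mechanically across many clienttrace lines. The following is only a sketch: the class name ClientTraceLocality and the regular expression are made up for this illustration and are not part of Hadoop; the only assumption about the log is the "src: /ip:port, dest: /ip:port" layout shown in the line above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Hadoop class: compares the src and dest hosts
// of a DataNode clienttrace line.
final class ClientTraceLocality {
    // Captures the IPv4 host of "src: /ip:port" and "dest: /ip:port".
    private static final Pattern ENDPOINTS =
        Pattern.compile("src: /([0-9.]+):\\d+, dest: /([0-9.]+):\\d+");

    /** True only if both endpoints are present and share the same host. */
    static boolean sameHost(String traceLine) {
        Matcher m = ENDPOINTS.matcher(traceLine);
        return m.find() && m.group(1).equals(m.group(2));
    }

    public static void main(String[] args) {
        String line = "2016-08-12 14:25:49,902 INFO "
            + "org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: "
            + "src: /192.168.7.139:50010, dest: /192.168.7.139:55856, "
            + "bytes: 66048, op: HDFS_READ";
        System.out.println("same host: " + sameHost(line));
    }
}
```

A line that matches means the read was served by the DataNode on the same machine as the client, which is consistent with the "Try reading via the datanode" fallback in the HBase log rather than a short-circuit read.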
Re: (BUG)ShortCircuitLocalReads Failed when enabled replication
HBase log:

2016-08-12 14:25:56,738 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening log for replication dn7%2C60020%2C1470216726863.1470983077165 at 28175754
2016-08-12 14:25:56,740 WARN org.apache.hadoop.hdfs.DFSClient: BlockReaderLocal requested with incorrect offset: Offset 0 and length 28186400 don't match block blk_-4614053595081055029_54130616 ( blockLen 28175754 )
2016-08-12 14:25:56,740 WARN org.apache.hadoop.hdfs.DFSClient: BlockReaderLocal: Removing blk_-4614053595081055029_54130616 from cache because local file /sdf/hdfs/dfs/data/blocksBeingWritten/blk_-4614053595081055029 could not be opened.
2016-08-12 14:25:56,740 INFO org.apache.hadoop.hdfs.DFSClient: Failed to read block blk_-4614053595081055029_54130616 on local machine
java.io.IOException: Offset 0 and length 28186400 don't match block blk_-4614053595081055029_54130616 ( blockLen 28175754 )
    at org.apache.hadoop.hdfs.BlockReaderLocal.<init>(BlockReaderLocal.java:287)
    at org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:171)
    at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:358)
    at org.apache.hadoop.hdfs.DFSClient.access$800(DFSClient.java:74)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2073)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2224)
    at java.io.DataInputStream.read(DataInputStream.java:149)
    at java.io.DataInputStream.readFully(DataInputStream.java:195)
    at java.io.DataInputStream.readFully(DataInputStream.java:169)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1486)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1475)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1470)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:55)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:178)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:734)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.openReader(ReplicationHLogReaderManager.java:69)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:574)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:364)
2016-08-12 14:25:56,740 INFO org.apache.hadoop.hdfs.DFSClient: Try reading via the datanode on /192.168.7.139:50010

HDFS log (no error log):

2016-08-12 14:25:49,902 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.7.139:50010, dest: /192.168.7.139:55856, bytes: 66048, op: HDFS_READ, cliID: DFSClient_hb_rs_dn7,60020,1470216726863, offset: 0, srvID: DS-1014379950-192.168.7.139-50010-1416802300444, blockid: blk_-4614053595081055029_54130616, duration: 252426
2016-08-12 14:25:49,964 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /192.168.7.139:50010, dest: /192.168.7.139:55858, bytes: 198144, op: HDFS_READ, cliID: DFSClient_hb_rs_dn7,60020,1470216726863, offset: 0, srvID: DS-1014379950-192.168.7.139-50010-1416802300444, blockid: blk_-4614053595081055029_54130616, duration: 329364
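The numbers in the WARN line can be pulled out and compared directly. The sketch below is not the Hadoop 1.2.1 source: the class LocalReadRangeCheck is hypothetical, and the condition in fitsInBlock (the requested range must lie inside the local block file) is only an assumption about what kind of check would produce this message. It has not been verified against BlockReaderLocal.java.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Hadoop class: extracts offset, length and
// blockLen from the DFSClient warning quoted in this thread.
final class LocalReadRangeCheck {
    private static final Pattern RANGE = Pattern.compile(
        "Offset (\\d+) and length (\\d+) don't match block \\S+ \\( blockLen (\\d+) \\)");

    /** Returns {offset, length, blockLen}, or null if the line has another shape. */
    static long[] parse(String line) {
        Matcher m = RANGE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    /** Assumed check: the requested range must fit inside the local block file. */
    static boolean fitsInBlock(long offset, long length, long blockLen) {
        return offset >= 0 && offset <= blockLen && offset + length <= blockLen;
    }

    public static void main(String[] args) {
        String line = "BlockReaderLocal requested with incorrect offset: "
            + "Offset 0 and length 28186400 don't match block "
            + "blk_-4614053595081055029_54130616 ( blockLen 28175754 )";
        long[] f = parse(line);
        long excess = f[0] + f[1] - f[2];
        // Prints: fits=false excess=10646
        System.out.println("fits=" + fitsInBlock(f[0], f[1], f[2]) + " excess=" + excess);
    }
}
```

For the line above, the requested length exceeds blockLen by 10646 bytes. The local file sits under blocksBeingWritten, so the block may still have been open for writing when the replication reader tried to open it, but that is an inference from the path, not something the logs state.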
Re: (BUG)ShortCircuitLocalReads Failed when enabled replication
dfs.domain.socket.path is not configured. Other short-circuit local reads do not fail like this; the failure occurs only for log files (replication).

HDFS version: 1.2.1
dfs.block.local-path-access.user = hadoop
dfs.client.read.shortcircuit = true

HBase version: 0.94.20
dfs.client.read.shortcircuit = true
hbase.replication = true
Re: (BUG)ShortCircuitLocalReads Failed when enabled replication
What's the value of dfs.domain.socket.path? See the explanation in http://hbase.apache.org/book.html for the meaning of this config.

Cheers

On Thu, Aug 11, 2016 at 12:46 AM, Ming Yang wrote:
> The cluster enabled shortCircuitLocalReads.
>
> dfs.client.read.shortcircuit = true
>
> When enabled replication, we found a large number of error logs.
> 1. shortCircuitLocalReads (fail every time).
> 2. Try reading via the datanode on targetAddr (success).
> How to make shortCircuitLocalReads successfully when enabled replication?
Re: (BUG)ShortCircuitLocalReads Failed when enabled replication
Hey Yang,

Looks like HDFS is having trouble with a block. Have you tried running hadoop fsck?

-Dima

On Thursday, August 11, 2016, Ming Yang wrote:
> The cluster enabled shortCircuitLocalReads.
>
> dfs.client.read.shortcircuit = true
>
> When enabled replication, we found a large number of error logs.
> 1. shortCircuitLocalReads (fail every time).
> 2. Try reading via the datanode on targetAddr (success).
> How to make shortCircuitLocalReads successfully when enabled replication?
(BUG)ShortCircuitLocalReads Failed when enabled replication
The cluster has shortCircuitLocalReads enabled:

dfs.client.read.shortcircuit = true

After enabling replication, we found a large number of error logs:
1. shortCircuitLocalReads (fails every time).
2. Try reading via the datanode on targetAddr (succeeds).

How can we make shortCircuitLocalReads succeed when replication is enabled?

2016-08-03 10:46:21,721 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening log for replication dn7%2C60020%2C1470136216957.1470192327030 at 16999670
2016-08-03 10:46:21,723 WARN org.apache.hadoop.hdfs.DFSClient: BlockReaderLocal requested with incorrect offset: Offset 0 and length 17073479 don't match block blk_4137524355009640437_53760530 ( blockLen 16999670 )
2016-08-03 10:46:21,723 WARN org.apache.hadoop.hdfs.DFSClient: BlockReaderLocal: Removing blk_4137524355009640437_53760530 from cache because local file /sdd/hdfs/dfs/data/blocksBeingWritten/blk_4137524355009640437 could not be opened.
2016-08-03 10:46:21,724 INFO org.apache.hadoop.hdfs.DFSClient: Failed to read block blk_4137524355009640437_53760530 on local machine
java.io.IOException: Offset 0 and length 17073479 don't match block blk_4137524355009640437_53760530 ( blockLen 16999670 )
    at org.apache.hadoop.hdfs.BlockReaderLocal.<init>(BlockReaderLocal.java:287)
    at org.apache.hadoop.hdfs.BlockReaderLocal.newBlockReader(BlockReaderLocal.java:171)
    at org.apache.hadoop.hdfs.DFSClient.getLocalBlockReader(DFSClient.java:358)
    at org.apache.hadoop.hdfs.DFSClient.access$800(DFSClient.java:74)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2073)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2224)
    at java.io.DataInputStream.read(DataInputStream.java:149)
    at java.io.DataInputStream.readFully(DataInputStream.java:195)
    at java.io.DataInputStream.readFully(DataInputStream.java:169)
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1508)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1486)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1475)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1470)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.<init>(SequenceFileLogReader.java:55)
    at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:178)
    at org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:734)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationHLogReaderManager.openReader(ReplicationHLogReaderManager.java:69)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.openReader(ReplicationSource.java:574)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:364)
2016-08-03 10:46:21,724 INFO org.apache.hadoop.hdfs.DFSClient: Try reading via the datanode on /192.168.7.139:50010