[ https://issues.apache.org/jira/browse/HDFS-7765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16428532#comment-16428532 ]
Jonathan Eagles commented on HDFS-7765:
---------------------------------------

[~janmejay], I think [~wankunde]'s assessment matches my experience of when this issue happens. Once an IOException happens at max buffer size, this class becomes unusable. Much like this other Apache stream class, used here as a reference: flush if we can't write, then write. That way the state is not modified until it is safe to do so. https://github.com/apache/commons-io/blob/master/src/main/java/org/apache/commons/io/output/ByteArrayOutputStream.java#L171

{code}
public synchronized void write(int b) throws IOException {
    int newcount = count + 1;
    if (newcount > buf.length) {
        flushBuffer();
    }
    buf[count++] = (byte) b;
}
{code}

I haven't checked the rest of FSOutputSummer for correctness. That is worth checking.

> FSOutputSummer throwing ArrayIndexOutOfBoundsException
> ------------------------------------------------------
>
>                 Key: HDFS-7765
>                 URL: https://issues.apache.org/jira/browse/HDFS-7765
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.6.0
>        Environment: Centos 6, Open JDK 7, Amazon EC2, Accumulo 1.6.2RC4
>            Reporter: Keith Turner
>            Assignee: Janmejay Singh
>            Priority: Major
>         Attachments: 0001-PATCH-HDFS-7765-FSOutputSummer-throwing-ArrayIndexOu.patch, HDFS-7765.patch
>
>
> While running an Accumulo test, saw exceptions like the following while trying to write to the write-ahead log in HDFS.
> The exception occurs at [FSOutputSummer.java:76|https://github.com/apache/hadoop/blob/release-2.6.0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSOutputSummer.java#L76], which is attempting to update a byte array.
> {noformat}
> 2015-02-06 19:46:49,769 [log.DfsLogger] WARN : Exception syncing java.lang.reflect.InvocationTargetException
> java.lang.ArrayIndexOutOfBoundsException: 4608
>         at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:76)
>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:50)
>         at java.io.DataOutputStream.write(DataOutputStream.java:88)
>         at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>         at org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
>         at org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:526)
>         at org.apache.accumulo.tserver.log.DfsLogger.logFileData(DfsLogger.java:540)
>         at org.apache.accumulo.tserver.log.DfsLogger.logManyTablets(DfsLogger.java:573)
>         at org.apache.accumulo.tserver.log.TabletServerLogger$6.write(TabletServerLogger.java:373)
>         at org.apache.accumulo.tserver.log.TabletServerLogger.write(TabletServerLogger.java:274)
>         at org.apache.accumulo.tserver.log.TabletServerLogger.logManyTablets(TabletServerLogger.java:365)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.flush(TabletServer.java:1667)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.closeUpdate(TabletServer.java:1754)
>         at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.accumulo.trace.instrument.thrift.RpcServerInvocationHandler.invoke(RpcServerInvocationHandler.java:46)
>         at org.apache.accumulo.server.util.RpcWrapper$1.invoke(RpcWrapper.java:47)
>         at com.sun.proxy.$Proxy22.closeUpdate(Unknown Source)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$closeUpdate.getResult(TabletClientService.java:2370)
>         at
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$closeUpdate.getResult(TabletClientService.java:2354)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:168)
>         at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:516)
>         at org.apache.accumulo.server.util.CustomNonBlockingServer$1.run(CustomNonBlockingServer.java:77)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:744)
> 2015-02-06 19:46:49,769 [log.DfsLogger] WARN : Exception syncing java.lang.reflect.InvocationTargetException
> 2015-02-06 19:46:49,772 [log.DfsLogger] ERROR: java.lang.ArrayIndexOutOfBoundsException: 4609
> java.lang.ArrayIndexOutOfBoundsException: 4609
>         at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:76)
>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:50)
>         at java.io.DataOutputStream.write(DataOutputStream.java:88)
>         at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>         at org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
>         at org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:526)
>         at org.apache.accumulo.tserver.log.DfsLogger.logFileData(DfsLogger.java:540)
>         at org.apache.accumulo.tserver.log.DfsLogger.logManyTablets(DfsLogger.java:573)
>         at org.apache.accumulo.tserver.log.TabletServerLogger$6.write(TabletServerLogger.java:373)
>         at
> org.apache.accumulo.tserver.log.TabletServerLogger.write(TabletServerLogger.java:274)
>         at org.apache.accumulo.tserver.log.TabletServerLogger.logManyTablets(TabletServerLogger.java:365)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.flush(TabletServer.java:1667)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.applyUpdates(TabletServer.java:1574)
>         at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.accumulo.trace.instrument.thrift.RpcServerInvocationHandler.invoke(RpcServerInvocationHandler.java:46)
>         at org.apache.accumulo.server.util.RpcWrapper$1.invoke(RpcWrapper.java:47)
>         at com.sun.proxy.$Proxy22.applyUpdates(Unknown Source)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$applyUpdates.getResult(TabletClientService.java:2349)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$applyUpdates.getResult(TabletClientService.java:2335)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:168)
>         at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:516)
>         at org.apache.accumulo.server.util.CustomNonBlockingServer$1.run(CustomNonBlockingServer.java:77)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:744)
> .
> .
> .
> java.lang.ArrayIndexOutOfBoundsException: 4632
>         at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:76)
>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:50)
>         at java.io.DataOutputStream.write(DataOutputStream.java:88)
>         at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>         at org.apache.accumulo.tserver.logger.LogFileKey.write(LogFileKey.java:87)
>         at org.apache.accumulo.tserver.log.DfsLogger.write(DfsLogger.java:526)
>         at org.apache.accumulo.tserver.log.DfsLogger.logFileData(DfsLogger.java:540)
>         at org.apache.accumulo.tserver.log.DfsLogger.logManyTablets(DfsLogger.java:573)
>         at org.apache.accumulo.tserver.log.TabletServerLogger$6.write(TabletServerLogger.java:373)
>         at org.apache.accumulo.tserver.log.TabletServerLogger.write(TabletServerLogger.java:274)
>         at org.apache.accumulo.tserver.log.TabletServerLogger.logManyTablets(TabletServerLogger.java:365)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.flush(TabletServer.java:1667)
>         at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.applyUpdates(TabletServer.java:1574)
>         at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.accumulo.trace.instrument.thrift.RpcServerInvocationHandler.invoke(RpcServerInvocationHandler.java:46)
>         at org.apache.accumulo.server.util.RpcWrapper$1.invoke(RpcWrapper.java:47)
>         at com.sun.proxy.$Proxy22.applyUpdates(Unknown Source)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$applyUpdates.getResult(TabletClientService.java:2349)
>         at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$applyUpdates.getResult(TabletClientService.java:2335)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at
> org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:168)
>         at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:516)
>         at org.apache.accumulo.server.util.CustomNonBlockingServer$1.run(CustomNonBlockingServer.java:77)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
>         at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>         at java.lang.Thread.run(Thread.java:744)
> {noformat}
> Immediately before the above exception occurred, the following exception was logged by an HDFS client background thread.
> {noformat}
> 2015-02-06 19:46:49,767 [hdfs.DFSClient] WARN : DataStreamer Exception
> java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read.
> ch : java.nio.channels.SocketChannel[connected local=/10.1.2.17:59411 remote=/10.1.2.24:50010]
>         at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
>         at java.io.FilterInputStream.read(FilterInputStream.java:83)
>         at java.io.FilterInputStream.read(FilterInputStream.java:83)
>         at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2201)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:1142)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1112)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1253)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1004)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:548)
> {noformat}
> Information on how Accumulo opened the file:
> {noformat}
> 2015-02-06 19:43:10,051 [fs.VolumeManagerImpl] DEBUG: Found CREATE enum CREATE
> 2015-02-06 19:43:10,051 [fs.VolumeManagerImpl] DEBUG: Found synch enum SYNC_BLOCK
> 2015-02-06 19:43:10,051 [fs.VolumeManagerImpl] DEBUG: CreateFlag set: [CREATE, SYNC_BLOCK]
> 2015-02-06 19:43:10,051 [fs.VolumeManagerImpl] DEBUG: creating hdfs://ip-10-1-2-11:9000/accumulo/wal/ip-10-1-2-17+9997/2e548d95-d075-484d-abac-bdd877ea205b with SYNCH_BLOCK flag
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
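[Editor's note] The flush-then-write ordering suggested in the comment above can be sketched as a small, self-contained example. SafeBufferedStream, flushBuffer, and buffered are hypothetical names invented for this illustration; this is not the actual FSOutputSummer or commons-io code. The point is only that the buffer index advances after the flush has succeeded, so a flush that throws an IOException leaves the stream's state intact and usable for a retry.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch: check for space and flush BEFORE mutating any state,
// so a flush that throws leaves `count` unchanged and the stream usable.
public class SafeBufferedStream {
    private final OutputStream out;
    private final byte[] buf;
    private int count = 0;

    public SafeBufferedStream(OutputStream out, int capacity) {
        this.out = out;
        this.buf = new byte[capacity];
    }

    public synchronized void write(int b) throws IOException {
        if (count + 1 > buf.length) {
            flushBuffer(); // may throw; count has not been touched yet
        }
        buf[count++] = (byte) b; // state changes only after the flush succeeded
    }

    public synchronized void flushBuffer() throws IOException {
        if (count > 0) {
            out.write(buf, 0, count); // if this throws, buf[0..count) is preserved
            count = 0;
        }
    }

    public synchronized int buffered() {
        return count;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // 4608 mirrors the buffer size implied by the exception indices above.
        SafeBufferedStream s = new SafeBufferedStream(sink, 4608);
        for (int i = 0; i < 4609; i++) {
            s.write(7);
        }
        // The 4609th byte forced one flush of 4608 bytes; one byte remains buffered.
        System.out.println(sink.size() + " flushed, " + s.buffered() + " buffered");
    }
}
```

Had the index been incremented before the capacity check, a failed flush would leave `count` pointing past the end of `buf`, and every later write would throw ArrayIndexOutOfBoundsException just as in the traces quoted above.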