[ https://issues.apache.org/jira/browse/HADOOP-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667881#action_12667881 ]
Doug Judd commented on HADOOP-4379:
-----------------------------------
Now when I apply your latest patch, together with the one from HADOOP-5027, to
the 0.19.0 source, the datanodes appear to go into an infinite loop of
NullPointerExceptions. At the top of
hadoop-zvents-datanode-motherlode007.admin.zvents.com.log I'm seeing this:
[d...@motherlode007 logs]$ more hadoop-zvents-datanode-motherlode007.admin.zvents.com.log
2009-01-27 15:32:55,828 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = motherlode007.admin.zvents.com/10.0.30.114
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.19.1-dev
STARTUP_MSG: build = -r ; compiled by 'doug' on Tue Jan 27 15:04:06 PST 2009
************************************************************/
2009-01-27 15:32:57,041 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: motherlode000/10.0.30.100:9000. Already tried 0 time(s).
2009-01-27 15:33:00,505 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
2009-01-27 15:33:00,507 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at 50010
2009-01-27 15:33:00,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
2009-01-27 15:33:00,783 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4
2009-01-27 15:33:00,792 INFO org.mortbay.util.Credential: Checking Resource aliases
2009-01-27 15:33:01,839 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@319c0bd6
2009-01-27 15:33:01,878 INFO org.mortbay.util.Container: Started WebApplicationContext[/static,/static]
2009-01-27 15:33:02,048 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@5a943dc4
2009-01-27 15:33:02,049 INFO org.mortbay.util.Container: Started WebApplicationContext[/logs,/logs]
2009-01-27 15:33:02,754 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@6d581e80
2009-01-27 15:33:02,760 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/]
2009-01-27 15:33:02,763 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:50075
2009-01-27 15:33:02,764 INFO org.mortbay.util.Container: Started org.mortbay.jetty.Server@5c435a3a
2009-01-27 15:33:02,769 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=DataNode, sessionId=null
2009-01-27 15:33:02,825 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=DataNode, port=50020
2009-01-27 15:33:02,831 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2009-01-27 15:33:02,834 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: starting
2009-01-27 15:33:02,834 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2009-01-27 15:33:02,836 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(motherlode007.admin.zvents.com:50010, storageID=DS-745224472-10.0.30.114-50010-1230665635246, infoPort=50075, ipcPort=50020)
2009-01-27 15:33:02,837 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: starting
2009-01-27 15:33:02,837 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: starting
2009-01-27 15:33:02,839 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.0.30.114:50010, storageID=DS-745224472-10.0.30.114-50010-1230665635246, infoPort=50075, ipcPort=50020)In DataNode.run, data = FSDataset{dirpath='/data1/hadoop/dfs/data/current,/data2/hadoop/dfs/data/current,/data3/hadoop/dfs/data/current'}
2009-01-27 15:33:02,840 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
2009-01-27 15:33:02,932 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.hdfs.server.namenode.DatanodeDescriptor.reportDiff(DatanodeDescriptor.java:396)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processReport(FSNamesystem.java:2803)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReport(NameNode.java:636)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)
        at org.apache.hadoop.ipc.Client.call(Client.java:696)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at $Proxy4.blockReport(Unknown Source)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:723)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1100)
        at java.lang.Thread.run(Thread.java:619)
2009-01-27 15:33:02,953 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.hdfs.server.namenode.DatanodeDescriptor.reportDiff(DatanodeDescriptor.java:396)
[...]
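For anyone tracing this on the namenode side: reportDiff is the routine that
reconciles a datanode's block report against the set of blocks the namenode
believes that node holds, sorting them into blocks to add, remove, or
invalidate. The sketch below only illustrates the shape of that
reconciliation (it is not the Hadoop source; the real method works over
BlockListAsLongs and the namenode's BlockInfo structures, and the Block type
here is invented), but it shows how a single null entry in either list
produces exactly this kind of NullPointerException:

import java.util.*;

/** Illustration only -- not the actual DatanodeDescriptor.reportDiff.
 *  A null entry in either block list NPEs on the b.id dereference. */
class ReportDiffSketch {
  static class Block {               // stand-in for the real Block type
    final long id;
    Block(long id) { this.id = id; }
  }

  static void diff(List<Block> stored, List<Block> reported,
                   List<Block> toAdd, List<Block> toRemove) {
    Set<Long> reportedIds = new HashSet<Long>();
    for (Block b : reported) {
      reportedIds.add(b.id);         // NPE here if the report holds a null
    }
    Set<Long> storedIds = new HashSet<Long>();
    for (Block b : stored) {
      storedIds.add(b.id);           // NPE here if the stored list has a hole
      if (!reportedIds.contains(b.id)) {
        toRemove.add(b);             // known to namenode, not reported
      }
    }
    for (Block b : reported) {
      if (!storedIds.contains(b.id)) {
        toAdd.add(b);                // reported, but unknown to namenode
      }
    }
  }
}

If the patch can leave a half-initialized block in one of those lists, every
block report from this node will trip over it.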
And the file is growing rapidly with the following exceptions tacked on to the
end:
2009-01-27 16:10:55,973 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.ipc.Client.call(Client.java:696)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at $Proxy4.blockReport(Unknown Source)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:723)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1100)
        at java.lang.Thread.run(Thread.java:619)
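The rapid growth makes sense given the shape of the datanode's service loop:
the blockReport() call fails with the RemoteException, the loop logs the WARN
and simply tries again, and since nothing on the namenode side recovers,
every retry fails identically. Schematically (an illustration of the retry
shape, not the actual DataNode.offerService code; NamenodeStub is invented):

/** Schematic of the datanode report/retry behavior -- illustration only. */
class RetryLoopSketch {
  interface NamenodeStub {
    void blockReport() throws Exception;   // stand-in for the RPC proxy call
  }

  static void offerService(NamenodeStub namenode) throws InterruptedException {
    while (true) {
      try {
        namenode.blockReport();            // server-side NPE surfaces here,
                                           // wrapped in a RemoteException
      } catch (Exception e) {
        System.err.println("WARN " + e);   // the lines filling the log
        // no state changes on either side, so the next attempt
        // fails the same way: an effectively infinite loop
      }
      Thread.sleep(1000);                  // wait for the next report attempt
    }
  }
}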
These exceptions appear to have started within about 5 seconds of startup, so
it doesn't look like they have anything to do with Hypertable. Is it OK to
apply these patches to the 0.19.0 source, or should I be applying them to
trunk?
- Doug
> In HDFS, sync() not yet guarantees data available to the new readers
> --------------------------------------------------------------------
>
> Key: HADOOP-4379
> URL: https://issues.apache.org/jira/browse/HADOOP-4379
> Project: Hadoop Core
> Issue Type: New Feature
> Components: dfs
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: dhruba borthakur
> Fix For: 0.19.1
>
> Attachments: 4379_20081010TC3.java, fsyncConcurrentReaders.txt,
> fsyncConcurrentReaders3.patch, fsyncConcurrentReaders4.patch, Reader.java,
> Reader.java, Writer.java, Writer.java
>
>
> In the append design doc
> (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it
> says
> * A reader is guaranteed to be able to read data that was 'flushed' before
> the reader opened the file
> However, this feature is not yet implemented. Note that the operation
> 'flushed' is now called "sync".
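For reference, the guarantee being quoted amounts to the following pattern,
which the attached Writer.java/Reader.java presumably exercise across two
processes. This is a minimal single-process sketch against the 0.19-era API
(the path and payload are illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SyncVisibilitySketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path file = new Path("/tmp/sync-visibility");      // illustrative path

    FSDataOutputStream out = fs.create(file);
    byte[] payload = "flushed-before-open".getBytes();
    out.write(payload);
    out.sync();  // the 'flush' the design doc refers to

    // Per the quoted guarantee, a reader that opens the file *after*
    // the sync() must see all of payload, even with the writer still open.
    FSDataInputStream in = fs.open(file);
    byte[] buf = new byte[payload.length];
    in.readFully(0, buf);
    System.out.println(new String(buf));               // flushed-before-open

    in.close();
    out.close();
  }
}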