Re: Namenode failed to start with "FSNamesystem initialization failed" error
I didn't have a space problem that led to it (I think). The corruption started after I bounced the cluster. At the time, I tried to investigate what led to the corruption but didn't find anything useful in the logs besides this line:

saveLeases found path /tmp/temp623789763/tmp659456056/_temporary_attempt_200904211331_0010_r_02_0/part-2 but no matching entry in namespace

I also tried to recover from the secondary name node files, but the corruption was too widespread and I had to format.

Tamir

On Mon, May 4, 2009 at 4:48 PM, Stas Oskin wrote:
> Hi.
>
> Same conditions - where the space has run out and the fs got corrupted?
>
> Or did it get corrupted by itself (which is even more worrying)?
>
> Regards.
>
> 2009/5/4 Tamir Kamara
>
> > I had the same problem a couple of weeks ago with 0.19.1. Had to reformat
> > the cluster too...
> >
> > On Mon, May 4, 2009 at 3:50 PM, Stas Oskin wrote:
> >
> > > Hi.
> > >
> > > After rebooting the NameNode server, I found out the NameNode doesn't
> > > start anymore.
> > >
> > > The logs contained this error:
> > > "FSNamesystem initialization failed"
> > >
> > > I suspected filesystem corruption, so I tried to recover from the
> > > SecondaryNameNode. Problem is, it was completely empty!
> > >
> > > I had an issue that might have caused this - the root mount had run out
> > > of space. But both the NameNode and the SecondaryNameNode directories
> > > were on another mount point with plenty of space there - so it's very
> > > strange that they were impacted in any way.
> > >
> > > Perhaps the logs, which were located on the root mount and as a result
> > > could not be written, caused this?
> > >
> > > To get HDFS running again, I had to format it (including manually
> > > erasing the files from the DataNodes). While this is reasonable in a
> > > test environment, production-wise it would be very bad.
> > >
> > > Any idea why it happened, and what can be done to prevent it in the
> > > future? I'm using the stable 0.18.3 version of Hadoop.
> > >
> > > Thanks in advance!
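As a prevention sketch for the metadata loss discussed above (directory paths here are placeholders, not from the thread): dfs.name.dir accepts a comma-separated list, and the NameNode writes its image and edit log to every listed directory, so putting one copy on a separate disk or NFS mount gives a recovery path even if one volume is corrupted or fills up. fs.checkpoint.dir controls where the SecondaryNameNode keeps its checkpoint.

```xml
<!-- hadoop-site.xml sketch; the paths are examples, adjust to real mounts. -->
<property>
  <name>dfs.name.dir</name>
  <!-- Comma-separated: the NameNode mirrors its metadata to each directory. -->
  <value>/data1/hadoop/name,/data2/hadoop/name,/mnt/nfs/hadoop/name</value>
</property>
<property>
  <name>fs.checkpoint.dir</name>
  <value>/data1/hadoop/namesecondary</value>
</property>
```

Keeping at least one of these directories off the root mount also sidesteps the full-root-filesystem scenario described in the thread.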
java.io.EOFException: while trying to read 65557 bytes
Hello Everyone, I know there's been some chatter about this before, but I am seeing the errors below on just about every one of our nodes. Is there a definitive reason why these are occurring? Is there something that we can do to prevent them?

2009-05-04 21:35:11,764 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.102.0.105:50010, storageID=DS-991582569-127.0.0.1-50010-1240886381606, infoPort=50075, ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 65557 bytes
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:264)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:308)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:372)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:524)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
    at java.lang.Thread.run(Thread.java:619)

Followed by:

2009-05-04 21:35:20,891 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_-7056150840276493498_10885 1 Exception java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/10.102.0.105:37293 remote=/10.102.0.106:50010]. 59756 millis timeout left.
    at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:277)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:155)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at java.io.DataInputStream.readLong(DataInputStream.java:399)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:853)
    at java.lang.Thread.run(Thread.java:619)

Thanks,
Albert
Re: specifying command line args, but getting an NPE
> But if conf.set(...) is called after instantiating job, it doesn't.
>
> Is this intended?

Yes, the Configuration must be set up before instantiating the Job object. However, some job parameters can be changed (before the actual job submission) by calling set methods on the Job object.

- Sharad
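A self-contained sketch of why the ordering Sharad describes matters (plain Java, no Hadoop dependencies; Conf and ToyJob are toy stand-ins for Configuration and Job, not the real classes): the Job constructor takes a snapshot of the Configuration it is handed, so later set() calls on the original object are invisible to the job.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for Hadoop's Configuration: a string-to-string property map.
class Conf {
    final Map<String, String> props = new HashMap<>();
    void set(String key, String value) { props.put(key, value); }
    String get(String key) { return props.get(key); }
}

// Toy stand-in for Job: like Hadoop's Job, it copies the Configuration
// it is handed at construction time.
class ToyJob {
    private final Conf snapshot = new Conf();
    ToyJob(Conf c) { snapshot.props.putAll(c.props); }
    String get(String key) { return snapshot.get(key); }
}

class OrderingDemo {
    public static void main(String[] args) {
        Conf conf = new Conf();
        conf.set("before", "visible");    // set BEFORE constructing the job
        ToyJob job = new ToyJob(conf);
        conf.set("after", "invisible");   // set AFTER: the job never sees it

        System.out.println(job.get("before"));  // prints "visible"
        System.out.println(job.get("after"));   // prints "null"
    }
}
```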
Re: How to configure nodes with different user account?
Hi, Menno and Aseem. Thank you for your help! With your help, I can now use ssh to connect to each node without providing the username.

However, another problem occurs. The directory structures are different among the servers, so when I use the starting script "start-all.sh" to start hadoop, it seems to run bash on every node using exactly the same structure as on the namenode. For example, the namenode (on server0) structure is like "/home/user0/hadoop/...", slave1 (on server1) is like "/home/s/user1/hadoop/...", and slave2 (on server2) is like "/home/u/user2/proj/hadoop/...". When I run start-all.sh, this message is shown:

"
starting namenode, logging to /home/user0/hadoop/bin/...
server1: bash: line 0: cd: /home/user0/hadoop/bin/..: No such file or directory
server1: bash: /home/user0/hadoop/bin/hadoop-daemon.sh: No such file or directory
server2: bash: line 0: cd: /home/user0/hadoop/bin/..: No such file or directory
server2: bash: /home/user0/hadoop/bin/hadoop-daemon.sh: No such file or directory
..
"

It seems all the nodes need the same directory structure to run hadoop. But I have no admin privilege on these servers, so I cannot create the exact same directories as on the namenode. Is there any way to let nodes with different structures run it? How can I configure it?

Again, thank you all for your kind help!!!

Starry

/* Tomorrow is another day. So is today. */

On Mon, May 4, 2009 at 22:09, Puri, Aseem wrote:
> Starry,
>
> In the ".ssh" directory you have to create a file "config" (without
> extension) on every node.
>
> Suppose server1 is your master and server2, server3 are your slaves.
>
> On the master (server1), in the "config" file add the following lines:
>
> Host server2
> User user2
> Host server3
> User user3
>
> On both slave (server2, server3) nodes, in the "config" file add the
> following lines:
>
> Host server1
> User user1
>
> Hope it works for you
>
> Regards
> Aseem Puri
>
> -Original Message-
> From: Menno Luiten [mailto:mlui...@artifix.net]
> Sent: Monday, May 04, 2009 7:27 PM
> To: core-user@hadoop.apache.org
> Subject: RE: How to configure nodes with different user account?
>
> Hi Starry,
>
> What is the content of your 'slaves' file in the hadoop/conf directory
> of your master node? It should say something like:
>
> localhost
> us...@server2
> us...@server3
> us...@server4
>
> This should let the start-up scripts try and log in using the proper
> users.
>
> Hope that helps,
> Menno
>
> -Oorspronkelijk bericht-
> Van: Starry SHI [mailto:starr...@gmail.com]
> Verzonden: maandag 4 mei 2009 10:53
> Aan: core-user@hadoop.apache.org
> Onderwerp: How to configure nodes with different user account?
>
> Hi, all. I am new to Hadoop and I have a question to ask~
>
> I have several accounts located on different linux servers (normal
> user privilege, no admin authority), and I want to use them to form a
> small cluster to run Hadoop applications. However, the usernames for
> these accounts are different. I want to use a shared key to connect all
> the nodes, but I failed after several attempts. Is it possible to
> connect all of them via different accounts?
>
> For example, I have 3 accounts: us...@server1, us...@server2,
> us...@server3. After assigning authorized keys, I can use "ssh
> us...@server2" without entering the password. But when I start hadoop, I
> am asked to input the password for us...@server2 (when I have already
> logged in as user1).
>
> Can my problem be solved easily? I wish to get your help soon.
>
> Thank you for all your attention and help!
>
> Best regards,
> Starry
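For reference, a per-host ~/.ssh/config along the lines Aseem describes might look like the sketch below (host names, user names, and the key path are placeholders; the IdentityFile line is an optional addition for keys stored in non-default locations):

```
# ~/.ssh/config on the master: map each slave host to its login user,
# so "ssh server2" behaves like "ssh user2@server2".
Host server2
    User user2
    IdentityFile ~/.ssh/id_rsa
Host server3
    User user3
    IdentityFile ~/.ssh/id_rsa
```

With this in place, the start-up scripts' plain "ssh serverN" invocations pick up the right user automatically.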
What do we call Hadoop+HBase+Lucene+Zookeeper+etc....
Hey all, I'm going to be speaking at OSCON about my company's experiences with Hadoop and Friends, but I'm having a hard time coming up with a name for the entire software ecosystem. I'm thinking of calling it the "Apache CloudStack". Does this sound legit to you all? :) Is there something more 'official'? Cheers, Bradford
Re: Wrong FS Exception
Are you trying to run a distributed cluster? Does everything have the same config file? If so, every node is going to look at "localhost" instead of the correct host for fs.default.name, mapred.job.tracker, etc.

On Mon, May 4, 2009 at 1:54 PM, Kirk Hunter wrote:
>
> Can someone tell me how to resolve the following error message, found in the
> job tracker log file when trying to start map reduce?
>
> grep FATAL *
> hadoop-hadoop-jobtracker-hadoop-1.log:2009-05-04 16:35:14,176 FATAL
> org.apache.hadoop.mapred.JobTracker: java.lang.IllegalArgumentException:
> Wrong FS: hdfs://usr/local/hadoop-datastore/hadoop-hadoop/mapred/system,
> expected: hdfs://localhost:54310
>
> Here is my hadoop-site.xml as well:
>
> hadoop.tmp.dir
> //usr/local/hadoop-datastore/hadoop-${user.name}
> A base for other temporary directories.
>
> dfs.data.dir
> /usr/local/hadoop-datastore/hadoop-${user.name}/dfs/data
>
> fs.default.name
> hdfs://localhost:54310
> The name of the default file system. A URI whose scheme and authority
> determines the File System implementation. The uri's scheme determines
> the config property (fs.SCHEME.impl) naming the File System
> implementation class. The uri's authority is used to determine the
> host, port, etc. for a filesystem.
>
> mapred.job.tracker
> localhost:54311
> The host and port that the MapReduce job tracker runs at. If "local",
> then jobs are run in-process as a single map and reduce task.
>
> --
> View this message in context:
> http://www.nabble.com/Wrong-FS-Exception-tp23376486p23376486.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
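For a distributed setup, the relevant properties should take roughly the shape below on every node (the hostname "master" is a placeholder, not from the thread). Note also that the leading double slash in "//usr/local/..." is suspicious: when such a value ends up in a URI, the first path component can be parsed as the URI authority, which matches the "hdfs://usr/..." seen in the error.

```xml
<!-- Sketch: same values on every node; "master" is a placeholder hostname. -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:54310</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>master:54311</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <!-- single leading slash -->
  <value>/usr/local/hadoop-datastore/hadoop-${user.name}</value>
</property>
```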
Re: Infinite Loop Resending status from task tracker
Hi Lance,

Two thoughts here that might be the culprit:

1) Is it possible that the partition that your mapred.local.dir is on is out of space on that task tracker?
2) Is it possible that you're using a directory under /tmp for mapred.local.dir and some system cron script cleared out /tmp?

-Todd

On Sat, May 2, 2009 at 9:01 AM, Lance Riedel wrote:
> Hi Todd,
> Not sure if this is related, but our hadoop cluster in general is getting
> more and more unstable. The logs are full of this error message (but having
> trouble tracking down the root problem):
>
> 2009-05-02 11:30:39,294 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_/attempt_200904301103__m_01_1/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:39,294 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_1675/attempt_200904301103_1675_r_12_1/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:44,295 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_0944/attempt_200904301103_0944_r_15_0/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:44,295 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_/attempt_200904301103__m_01_1/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:44,295 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_1675/attempt_200904301103_1675_r_12_1/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:49,296 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_0944/attempt_200904301103_0944_r_15_0/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:49,296 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_/attempt_200904301103__m_01_1/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:49,297 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_1675/attempt_200904301103_1675_r_12_1/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:54,298 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_0944/attempt_200904301103_0944_r_15_0/output/file.out
> in any of the configured local directories
>
> Lance
>
> On Apr 30, 2009, at 12:04 PM, Todd Lipcon wrote:
>
>> Hey Lance,
>>
>> Thanks for the logs. They definitely confirmed my suspicion. There are two
>> problems here:
>>
>> 1) If the JobTracker throws an exception during processing of a heartbeat,
>> the tasktracker retries with no delay, since lastHeartbeat isn't updated
>> in TaskTracker.offerService. This is related to HADOOP-3987
>>
>> 2) If the TaskTracker sends a task in COMMIT_PENDING state with an invalid
>> task id, the jobtracker will trigger a NullPointerException in
>> JobTracker.getTasksToSave. Instead it should probably create a
>> KillTaskAction. I just filed a JIRA to track this issue:
>>
>> https://issues.apache.org/jira/browse/HADOOP-5761
>>
>> 3) The TaskTracker somehow had a task attempt in COMMIT_PENDING state that
>> the JobTracker didn't know about. How it got there is a separate problem
>> that's a bit harder to track down.
>>
>> Thanks
>> -Todd
>>
>> On Thu, Apr 30, 2009 at 11:17 AM, Lance Riedel wrote:
>>
>>> Here are the job tracker logs from the same time (and yes.. there is
>>> something there!!):
>>>
>>> 2009-04-30 02:34:28,484 INFO org.apache.hadoop.mapred.JobTracker: Serious
>>> problem. While updating status, cannot find taskid
>>> attempt_200904291917_0252_r_03_0
>>>
>>> 2009-04-30 02:34:40,215 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 2 on 54311, call
>>> heartbeat(org.apache.hadoop.mapred.tasktrackersta...@1a93388, false, true,
>>> 5341) from 10.253.134.191:42688: error: java.io.IOException:
>>> java.lang.NullPointerException
>>> java.io.IOException: java.lang.NullPointerException
>>>     at org.apache.hadoop.mapred.JobTracker.getTasksToSave(JobTracker.java:2130)
>>>     at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:1923)
>>>     at sun.reflect.GeneratedMethodAccessor72.invoke(
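Following up on Todd's second thought, a configuration sketch for keeping mapred.local.dir out of /tmp and spread over several disks (the paths are placeholders; a comma-separated list lets the TaskTracker round-robin local data across volumes):

```xml
<property>
  <name>mapred.local.dir</name>
  <!-- Directories outside /tmp, so cron cleanup scripts won't touch them. -->
  <value>/data1/hadoop/mapred/local,/data2/hadoop/mapred/local</value>
</property>
```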
Re: specifying command line args, but getting an NPE
On May 4, 2009, at 6:07 PM, Todd Lipcon wrote: Since you have a simple String here, this should be pretty simple. Something like: conf.set("com.example.tool.pattern", otherArgs[2]); then in the configure() function of your Mapper/Reducer, simply retrieve it using conf.get("com.example.tool.pattern"); Trial and error solved the problem. It turns out I need to set the value in the Configuration object before I create the Job object. Thus, the following works and makes the value of net.rguha.dc.data.pattern available to the mappers. Configuration conf = new Configuration(); conf.set("net.rguha.dc.data.pattern", otherArgs[2]); Job job = new Job(conf, "id 1"); But if conf.set(...) is called after instantiating job, it doesn't. Is this intended? --- Rajarshi Guha GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84 --- Q: What's polite and works for the phone company? A: A deferential operator.
Re: specifying command line args, but getting an NPE
On May 4, 2009, at 6:07 PM, Todd Lipcon wrote:

> The issue here is that your mapper and reducer classes are being
> instantiated in a different JVM from your main() function. In order to
> pass data to them, you need to use the Configuration object.
>
> Since you have a simple String here, this should be pretty simple.
> Something like:
>
> conf.set("com.example.tool.pattern", otherArgs[2]);
>
> then in the configure() function of your Mapper/Reducer, simply retrieve
> it using conf.get("com.example.tool.pattern");

Thanks for the pointer. I'm using Hadoop 0.20.0 and my mapper, which extends Mapper, doesn't seem to have a configure() method. Looking at the API I see the superclass has a setup method. Thus in my class I do:

public static class MoleculeMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private Text matches = new Text();
    private String pattern;

    public void setup(Context context) {
        pattern = context.getConfiguration().get("net.rguha.dc.data.pattern");
        System.out.println("pattern = " + pattern);
    }
}

In my main method I have:

Configuration conf = new Configuration();
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
conf.set("net.rguha.dc.data.pattern", otherArgs[2]);

However, even with this, pattern turns out to be null when printed in setup(). I just started on Hadoop a day or two ago, and my understanding is that 0.20.0 had some pretty major refactoring. As a result, a lot of examples I come across on the Net don't seem to work. Could the lack of the configure() method be due to the refactoring?

---
Rajarshi Guha
GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84
---
Q: What's polite and works for the phone company?
A: A deferential operator.
Re: specifying command line args, but getting an NPE
On Mon, May 4, 2009 at 2:59 PM, Rajarshi Guha wrote: > So my question is: if I need to use an argument, specified on the command > line, do I need to do anything special to the variable holding it? In other > words, the simple assignment > >pattern = otherArgs[2]; > > seems to lead to an NPE when run in distributed mode. > Hi Rajarshi, The issue here is that your mapper and reducer classes are being instantiated in a different JVM from your main() function. In order to pass data to them, you need to use the Configuration object. Since you have a simple String here, this should be pretty simple. Something like: conf.set("com.example.tool.pattern", otherArgs[2]); then in the configure() function of your Mapper/Reducer, simply retrieve it using conf.get("com.example.tool.pattern"); Hope that helps, -Todd
specifying command line args, but getting an NPE
Hi, I have a Hadoop program in which main() reads in some command line args:

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 3) {
        System.err.println("Usage: subsearch <in> <out> <pattern>");
        System.exit(2);
    }
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    pattern = otherArgs[2];
}

Here pattern is declared as a static String class variable. When I run the program using the local tracker, it runs fine and uses the value of pattern. However, if I run the code in distributed mode, I get a NullPointerException - as far as I can tell, pattern is turning out to be null in this case. If I hard-code the value of pattern into the code that uses it, the program runs fine.

So my question is: if I need to use an argument specified on the command line, do I need to do anything special to the variable holding it? In other words, the simple assignment

pattern = otherArgs[2];

seems to lead to an NPE when run in distributed mode.

Any pointers would be appreciated.

Thanks,
---
Rajarshi Guha
GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84
---
Q: What's polite and works for the phone company?
A: A deferential operator.
Re: cannot open an hdfs file in O_RDWR mode
> > Hey Philip,
> >
> > how could I enable "append to an existing file" in Hadoop?

Set dfs.support.append to true in your hadoop-site.xml. See also https://issues.apache.org/jira/browse/HADOOP-5332 .

-- Philip
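Concretely, the setting Philip mentions looks like this in hadoop-site.xml (note that HADOOP-5332, linked above, covers why append was disabled by default on these versions, so enable it with care):

```xml
<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>
```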
Wrong FS Exception
Can someone tell me how to resolve the following error message, found in the job tracker log file when trying to start map reduce?

grep FATAL *
hadoop-hadoop-jobtracker-hadoop-1.log:2009-05-04 16:35:14,176 FATAL
org.apache.hadoop.mapred.JobTracker: java.lang.IllegalArgumentException:
Wrong FS: hdfs://usr/local/hadoop-datastore/hadoop-hadoop/mapred/system,
expected: hdfs://localhost:54310

Here is my hadoop-site.xml as well:

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>//usr/local/hadoop-datastore/hadoop-${user.name}</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop-datastore/hadoop-${user.name}/dfs/data</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system. A URI whose scheme and
    authority determines the File System implementation. The uri's scheme
    determines the config property (fs.SCHEME.impl) naming the File System
    implementation class. The uri's authority is used to determine the host,
    port, etc. for a filesystem.</description>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker runs at.
    If "local", then jobs are run in-process as a single map and reduce
    task.</description>
  </property>
</configuration>

--
View this message in context:
http://www.nabble.com/Wrong-FS-Exception-tp23376486p23376486.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: TextInputFormat unique key across files
Hi Rares, You can access the name of the current file by looking at the "mapred.input.file" configuration variable in the Configuration object. If you're using Hadoop Streaming this is available as $MAPRED_INPUT_FILE Hope that helps, -Todd On Mon, May 4, 2009 at 12:46 PM, Rares Vernica wrote: > Hello, > > TextInputFormat is a perfect match for my problem. The only drawback is > that fact that keys are unique only within a file. Is there an easy way > to have keys unique across files. That is, each line in any file should > get a unique key. Is there an unique id for each file? If yes, maybe I > can concatenate them if I can access the file id from the map function. > > Thanks, > Rares >
Re: TextInputFormat unique key across files
I don't think that you can with those classes. If you look at TextInputFormat and LineRecordReader, they should not be hard to use as a basis for copying into your own version which makes the IDs unique, but I presume you would need to make the keys Text and not LongWritable.

Just a thought... Rather than going that route, could you construct the new key in the Map? Just because the LineRecordReader passes this as the input key does not mean you have to use it as the output key in the Map phase. Perhaps concatenate it with a different field?

Cheers,
Tim

On Mon, May 4, 2009 at 9:46 PM, Rares Vernica wrote:
> Hello,
>
> TextInputFormat is a perfect match for my problem. The only drawback is
> the fact that keys are unique only within a file. Is there an easy way
> to have keys unique across files? That is, each line in any file should
> get a unique key. Is there a unique id for each file? If yes, maybe I
> can concatenate them if I can access the file id from the map function.
>
> Thanks,
> Rares
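A minimal sketch of Tim's construct-the-key-in-the-Map idea (plain Java, no Hadoop types; buildKey is a hypothetical helper name): combine the input file's name, available from the job configuration as noted elsewhere in the thread, with the byte-offset key that LineRecordReader already supplies. The pair is unique across files as long as file names are.

```java
// Hypothetical helper: pairs a file name with the LineRecordReader byte
// offset, yielding a key that is unique across the whole input set.
class UniqueKeys {
    static String buildKey(String fileName, long byteOffset) {
        return fileName + ":" + byteOffset;
    }

    public static void main(String[] args) {
        // Two files can share a byte offset, but the combined keys differ.
        System.out.println(buildKey("part-00000", 0));  // part-00000:0
        System.out.println(buildKey("part-00001", 0));  // part-00001:0
    }
}
```

In a real mapper this string would be wrapped in a Text and emitted as the map output key instead of the LongWritable offset.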
Re: TextInputFormat unique key across files
If you can tolerate errors, then a simple idea is to generate a random number in the range 0 ... 2^n and use that as the key. If the number of lines is small relative to 2^n, then with high probability you won't get the same key twice.

Miles

2009/5/4 Rares Vernica:
> Hello,
>
> TextInputFormat is a perfect match for my problem. The only drawback is
> the fact that keys are unique only within a file. Is there an easy way
> to have keys unique across files? That is, each line in any file should
> get a unique key. Is there a unique id for each file? If yes, maybe I
> can concatenate them if I can access the file id from the map function.
>
> Thanks,
> Rares

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
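To quantify "with high probability": the birthday bound says the chance of any collision among m draws from 2^n values is roughly m^2 / 2^(n+1), so a million 62-bit keys collide with probability around 1e-7. A self-contained check (plain Java; the class name and fixed seed are illustrative):

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

class RandomKeyDemo {
    // Draw m random keys uniformly from [0, 2^62) and count collisions.
    static int collisions(int m, long seed) {
        Random rng = new Random(seed);        // fixed seed: deterministic run
        Set<Long> seen = new HashSet<>();
        int collisions = 0;
        for (int i = 0; i < m; i++) {
            long key = rng.nextLong() >>> 2;  // clear top two bits: 62-bit key
            if (!seen.add(key)) collisions++;
        }
        return collisions;
    }

    public static void main(String[] args) {
        // Birthday bound: m^2 / 2^63 ~= 5e-10 for m = 100,000, so this is
        // 0 with overwhelming probability.
        System.out.println(collisions(100_000, 42L));
    }
}
```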
TextInputFormat unique key across files
Hello,

TextInputFormat is a perfect match for my problem. The only drawback is the fact that keys are unique only within a file. Is there an easy way to have keys unique across files? That is, each line in any file should get a unique key. Is there a unique id for each file? If yes, maybe I can concatenate them if I can access the file id from the map function.

Thanks,
Rares
Re: cannot open an hdfs file in O_RDWR mode
Hey Jason,

I tried to uncomment the piece of code you suggested, rebuild everything, umount, mount, and here is what happens:

Line 1092 in fuse-dfs.c:
fuse_dfs: ERROR: hdfs trying to rename /osg/app/robert/deployment-1.3.x/.svn/tmp/entries to /osg/app/robert/deployment-1.3.x/.svn/entries

which is due to a call to is_protected() at line 1018. Both in Hadoop 0.19.1.

Thanks for your help!
Robert

jason hadoop wrote:
> In hadoop 0.19.1 (and 19.0), libhdfs (which is used by the fuse package
> for hdfs access) explicitly denies open requests that pass O_RDWR.
>
> If you have binary applications that pass the flag, but would work
> correctly given the limitations of HDFS, you may alter the code in
> src/c++/libhdfs/hdfs.c to allow it, or build a shared library that you
> preload that changes the flags passed to the real open. Hacking hdfs.c is
> much simpler.
>
> Line 407 of hdfs.c:
>
> jobject jFS = (jobject)fs;
>
> if (flags & O_RDWR) {
>     fprintf(stderr, "ERROR: cannot open an hdfs file in O_RDWR mode\n");
>     errno = ENOTSUP;
>     return NULL;
> }
>
> On Fri, May 1, 2009 at 6:34 PM, Philip Zeyliger wrote:
>
>> HDFS does not allow you to overwrite bytes of a file that have already
>> been written. The only operations it supports are read (an existing
>> file), write (a new file), and (in newer versions, not always enabled)
>> append (to an existing file).
>>
>> -- Philip
>>
>> On Fri, May 1, 2009 at 5:56 PM, Robert Engel wrote:
>>> Hello,
>>>
>>> I am using Hadoop on a small storage cluster (x86_64, CentOS 5.3,
>>> Hadoop-0.19.1). The hdfs is mounted using fuse and everything seemed
>>> to work just fine so far. However, I noticed that I cannot:
>>>
>>> 1) use svn to check out files on the mounted hdfs partition
>>> 2) request that stdout and stderr of Globus jobs is written to the
>>> hdfs partition
>>>
>>> In both cases I see the following error message in /var/log/messages:
>>>
>>> fuse_dfs: ERROR: could not connect open file fuse_dfs.c:1364
>>>
>>> When I run fuse_dfs in debugging mode I get:
>>>
>>> ERROR: cannot open an hdfs file in O_RDWR mode
>>> unique: 169, error: -5 (Input/output error), outsize: 16
>>>
>>> My question is if this is a general limitation of Hadoop or if this
>>> operation is just not currently supported? I searched Google and JIRA
>>> but could not find an answer.
>>>
>>> Thanks,
>>> Robert
Re: cannot open an hdfs file in O_RDWR mode
Hey Philip,

How could I enable "append to an existing file" in Hadoop?

Thanks,
Robert

Philip Zeyliger wrote:
> HDFS does not allow you to overwrite bytes of a file that have already
> been written. The only operations it supports are read (an existing
> file), write (a new file), and (in newer versions, not always enabled)
> append (to an existing file).
>
> -- Philip
>
> On Fri, May 1, 2009 at 5:56 PM, Robert Engel wrote:
>> Hello,
>>
>> I am using Hadoop on a small storage cluster (x86_64, CentOS 5.3,
>> Hadoop-0.19.1). The hdfs is mounted using fuse and everything seemed
>> to work just fine so far. However, I noticed that I cannot:
>>
>> 1) use svn to check out files on the mounted hdfs partition
>> 2) request that stdout and stderr of Globus jobs is written to the
>> hdfs partition
>>
>> In both cases I see the following error message in /var/log/messages:
>>
>> fuse_dfs: ERROR: could not connect open file fuse_dfs.c:1364
>>
>> When I run fuse_dfs in debugging mode I get:
>>
>> ERROR: cannot open an hdfs file in O_RDWR mode
>> unique: 169, error: -5 (Input/output error), outsize: 16
>>
>> My question is if this is a general limitation of Hadoop or if this
>> operation is just not currently supported? I searched Google and JIRA
>> but could not find an answer.
>>
>> Thanks,
>> Robert
RE: How to configure nodes with different user account?
Hi Starry,

What is the content of your 'slaves' file in the hadoop/conf directory of your master node? It should say something like:

localhost
us...@server2
us...@server3
us...@server4

This should let the start-up scripts try and log in using the proper users.

Hope that helps,
Menno

-Oorspronkelijk bericht-
Van: Starry SHI [mailto:starr...@gmail.com]
Verzonden: maandag 4 mei 2009 10:53
Aan: core-user@hadoop.apache.org
Onderwerp: How to configure nodes with different user account?

Hi, all. I am new to Hadoop and I have a question to ask~

I have several accounts located on different linux servers (normal user privilege, no admin authority), and I want to use them to form a small cluster to run Hadoop applications. However, the usernames for these accounts are different. I want to use a shared key to connect all the nodes, but I failed after several attempts. Is it possible to connect all of them via different accounts?

For example, I have 3 accounts: us...@server1, us...@server2, us...@server3. After assigning authorized keys, I can use "ssh us...@server2" without entering the password. But when I start hadoop, I am asked to input the password for us...@server2 (when I have already logged in as user1).

Can my problem be solved easily? I wish to get your help soon.

Thank you for all your attention and help!

Best regards,
Starry
RE: How to configure nodes with different user account?
Starry,

In the ".ssh" directory you have to create a file "config" (without extension) on every node.

Suppose server1 is your master and server2, server3 are your slaves.

On the master (server1), in the "config" file add the following lines:

Host server2
User user2
Host server3
User user3

On both slave (server2, server3) nodes, in the "config" file add the following lines:

Host server1
User user1

Hope it works for you.

Regards,
Aseem Puri

-Original Message-
From: Menno Luiten [mailto:mlui...@artifix.net]
Sent: Monday, May 04, 2009 7:27 PM
To: core-user@hadoop.apache.org
Subject: RE: How to configure nodes with different user account?

Hi Starry,

What is the content of your 'slaves' file in the hadoop/conf directory of your master node? It should say something like:

localhost
us...@server2
us...@server3
us...@server4

This should let the start-up scripts try and log in using the proper users.

Hope that helps,
Menno

-Oorspronkelijk bericht-
Van: Starry SHI [mailto:starr...@gmail.com]
Verzonden: maandag 4 mei 2009 10:53
Aan: core-user@hadoop.apache.org
Onderwerp: How to configure nodes with different user account?

Hi, all. I am new to Hadoop and I have a question to ask~

I have several accounts located on different linux servers (normal user privilege, no admin authority), and I want to use them to form a small cluster to run Hadoop applications. However, the usernames for these accounts are different. I want to use a shared key to connect all the nodes, but I failed after several attempts. Is it possible to connect all of them via different accounts?

For example, I have 3 accounts: us...@server1, us...@server2, us...@server3. After assigning authorized keys, I can use "ssh us...@server2" without entering the password. But when I start hadoop, I am asked to input the password for us...@server2 (when I have already logged in as user1).

Can my problem be solved easily? I wish to get your help soon.

Thank you for all your attention and help!

Best regards,
Starry
Re: Namenode failed to start with "FSNamesystem initialization failed" error
Hi.

Same conditions - where the space ran out and the fs got corrupted? Or did it get corrupted by itself (which is even more worrying)?

Regards.

2009/5/4 Tamir Kamara

> I had the same problem a couple of weeks ago with 0.19.1. Had to reformat
> the cluster too...
>
> On Mon, May 4, 2009 at 3:50 PM, Stas Oskin wrote:
>
> > Hi.
> >
> > After rebooting the NameNode server, I found out the NameNode doesn't
> > start anymore.
> >
> > The logs contained this error:
> > "FSNamesystem initialization failed"
> >
> > I suspected filesystem corruption, so I tried to recover from the
> > SecondaryNameNode. Problem is, it was completely empty!
> >
> > I had an issue that might have caused this - the root mount had run out
> > of space. But both the NameNode and the SecondaryNameNode directories
> > were on another mount point with plenty of space, so it's very strange
> > that they were impacted in any way.
> >
> > Perhaps the logs, which were located on the root mount and as a result
> > could not be written, caused this?
> >
> > To get HDFS running again, I had to format it (including manually
> > erasing the files from the DataNodes). While this is reasonable in a
> > test environment, production-wise it would be very bad.
> >
> > Any idea why it happened, and what can be done to prevent it in the
> > future? I'm using the stable 0.18.3 version of Hadoop.
> >
> > Thanks in advance!
Re: Implementing compareTo in user-written keys where one extends the other is error prone
On Sun, 2009-05-03 at 23:38 -0700, Sharad Agarwal wrote:
> Marshall Schor wrote:
> >
> > public class Super implements WritableComparable {
> >   . . .
> >   public int compareTo(Super o) {
> >     // sort on string value
> >     . . .
> >   }
> >
> > I implemented the 2nd key class (let's call it Sub):
> >
> > public class Sub extends Super {
> >   . . .
> >   public int compareTo(Sub o) {
> >     // sort on boolean value
> >     . . .
> >     // if equal, use the super:
> >     ... else
> >       return super.compareTo(o);
> >   }
> >
> The overridden method must have the same arguments as the parent class
> method. Otherwise it is just another method, not an overriding one.
> In your case, if the current code looks error prone, you can make
> Super generic (a template) as well. Then you can use the Sub class in
> the compareTo method; however, you will have to cast in the Super class.

In this particular case, I _think_ making Sub implement Comparable<Sub> will be sufficient, since then javac will also generate

  public volatile int compareTo(Object o) {
    return compareTo((Sub) o);
  }

which overrides the volatile method in the superclass. Overriding compareTo(Super) is not required. See my post to gene...@hadoop for more details.

S.

> class Super<T> implements WritableComparable<T> {
>   public int compareTo(T o) {
>     Super<T> other = (Super<T>) o;
>     ...
>   }
> }
>
> class Sub extends Super<Sub> {
>   public int compareTo(Sub o) {
>     ...
>   }
> }
>
> -Sharad
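[Editor's note] The overload-vs-override trap Sharad describes can be reproduced outside Hadoop with plain java.util sorting. Class names below are illustrative; WritableComparable dispatches compareTo the same way:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Superclass sorts on a string value.
class Super implements Comparable<Super> {
    String s;
    Super(String s) { this.s = s; }
    public int compareTo(Super o) { return s.compareTo(o.s); }
}

class Sub extends Super {
    boolean flag;
    Sub(String s, boolean flag) { super(s); this.flag = flag; }
    // This OVERLOADS compareTo; it does not override compareTo(Super),
    // so any caller that goes through Comparable<Super> never reaches it.
    public int compareTo(Sub o) {
        int c = Boolean.compare(flag, o.flag); // sort on boolean first
        return c != 0 ? c : super.compareTo(o);
    }
}

public class Demo {
    public static void main(String[] args) {
        List<Sub> list = new ArrayList<>();
        list.add(new Sub("b", false));
        list.add(new Sub("a", true));
        // Collections.sort dispatches through Comparable<Super>.compareTo,
        // so the boolean field is silently ignored and string order wins.
        Collections.sort(list);
        System.out.println(list.get(0).s + " " + list.get(0).flag); // prints "a true"
    }
}
```

If Sub's compareTo actually overrode the superclass method (parameter type Super, with a cast inside), the boolean comparison would apply and ("b", false) would sort first.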
Re: Namenode failed to start with "FSNamesystem initialization failed" error
I had the same problem a couple of weeks ago with 0.19.1. Had to reformat the cluster too...

On Mon, May 4, 2009 at 3:50 PM, Stas Oskin wrote:

> Hi.
>
> After rebooting the NameNode server, I found out the NameNode doesn't
> start anymore.
>
> The logs contained this error:
> "FSNamesystem initialization failed"
>
> I suspected filesystem corruption, so I tried to recover from the
> SecondaryNameNode. Problem is, it was completely empty!
>
> I had an issue that might have caused this - the root mount had run out
> of space. But both the NameNode and the SecondaryNameNode directories
> were on another mount point with plenty of space, so it's very strange
> that they were impacted in any way.
>
> Perhaps the logs, which were located on the root mount and as a result
> could not be written, caused this?
>
> To get HDFS running again, I had to format it (including manually
> erasing the files from the DataNodes). While this is reasonable in a
> test environment, production-wise it would be very bad.
>
> Any idea why it happened, and what can be done to prevent it in the
> future? I'm using the stable 0.18.3 version of Hadoop.
>
> Thanks in advance!
Namenode failed to start with "FSNamesystem initialization failed" error
Hi.

After rebooting the NameNode server, I found out the NameNode doesn't start anymore.

The logs contained this error:
"FSNamesystem initialization failed"

I suspected filesystem corruption, so I tried to recover from the SecondaryNameNode. Problem is, it was completely empty!

I had an issue that might have caused this - the root mount had run out of space. But both the NameNode and the SecondaryNameNode directories were on another mount point with plenty of space, so it's very strange that they were impacted in any way.

Perhaps the logs, which were located on the root mount and as a result could not be written, caused this?

To get HDFS running again, I had to format it (including manually erasing the files from the DataNodes). While this is reasonable in a test environment, production-wise it would be very bad.

Any idea why it happened, and what can be done to prevent it in the future? I'm using the stable 0.18.3 version of Hadoop.

Thanks in advance!
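[Editor's note] A standard safeguard for this failure mode in Hadoop of this era (general practice, not something suggested in the thread): dfs.name.dir accepts a comma-separated list of directories, and the NameNode mirrors its image and edit log to every directory listed, so one copy can live on a separate disk or NFS mount. A minimal sketch for hadoop-site.xml, with example paths:

```xml
<property>
  <name>dfs.name.dir</name>
  <!-- NameNode metadata is written to each comma-separated directory;
       paths below are examples - point them at independent mounts -->
  <value>/data1/hadoop/name,/mnt/nfs/hadoop/name</value>
</property>
```

With a surviving copy in any one of the listed directories, the NameNode can be restarted without reformatting HDFS.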
How to configure nodes with different user account?
Hi, all. I am new to Hadoop and I have a question to ask~ I have several accounts located in different linux servers (normal user privilege, no admin authority), and i want to use them to form a small cluster to run Hadoop applications. However, the usernames for these accounts are different. I want to use shared key to connect all the nodes, but I failed after several attempts. Is it possible to connect all of them via different account? For example, I have 3 account: us...@server1, us...@server2, us...@server3. After assigning authorized keys, I can use "ssh us...@server2" without input the password. But when I start hadoop, I was asked to input the password for us...@server2 (when I have already logged in as user1). Can my problem be solved easily? I wish to get your help soon. Thank you for all your attention and help! Best regards, Starry