Re: Namenode failed to start with "FSNamesystem initialization failed" error

2009-05-04 Thread Tamir Kamara
I didn't have a space problem which led to it (I think). The corruption
started after I bounced the cluster.
At the time, I tried to investigate what led to the corruption but didn't
find anything useful in the logs besides this line:
saveLeases found path
/tmp/temp623789763/tmp659456056/_temporary_attempt_200904211331_0010_r_02_0/part-2
but no matching entry in namespace

I also tried to recover from the secondary name node files but the
corruption was too widespread and I had to format.

Tamir

On Mon, May 4, 2009 at 4:48 PM, Stas Oskin  wrote:

> Hi.
>
> Same conditions - where the space has run out and the fs got corrupted?
>
> Or it got corrupted by itself (which is even more worrying)?
>
> Regards.
>
> 2009/5/4 Tamir Kamara 
>
> > I had the same problem a couple of weeks ago with 0.19.1. Had to reformat
> > the cluster too...
> >
> > On Mon, May 4, 2009 at 3:50 PM, Stas Oskin  wrote:
> >
> > > Hi.
> > >
> > > After rebooting the NameNode server, I found out the NameNode doesn't
> > start
> > > anymore.
> > >
> > > The logs contained this error:
> > > "FSNamesystem initialization failed"
> > >
> > >
> > > I suspected filesystem corruption, so I tried to recover from
> > > SecondaryNameNode. Problem is, it was completely empty!
> > >
> > > I had an issue that might have caused this - the root mount has run out
> > of
> > > space. But, both the NameNode and the SecondaryNameNode directories
> were
> > on
> > > another mount point with plenty of space there - so it's very strange
> > that
> > > they were impacted in any way.
> > >
> > > Perhaps the logs, which were located on root mount and as a result,
> could
> > > not be written, have caused this?
> > >
> > >
> > > To get back HDFS running, I had to format the HDFS (including manually
> > > erasing the files from the DataNodes). While this is reasonable in a test
> > > environment, production-wise it would be very bad.
> > >
> > > Any idea why it happened, and what can be done to prevent it in the
> > future?
> > > I'm using the stable 0.18.3 version of Hadoop.
> > >
> > > Thanks in advance!
> > >
> >
>


java.io.EOFException: while trying to read 65557 bytes

2009-05-04 Thread Albert Sunwoo
Hello Everyone,

I know there's been some chatter about this before, but I am seeing the errors
below on just about every one of our nodes.  Is there a definitive reason why
these are occurring, and is there something we can do to prevent them?

2009-05-04 21:35:11,764 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(10.102.0.105:50010, 
storageID=DS-991582569-127.0.0.1-50010-1240886381606, infoPort=50075, 
ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 65557 bytes
at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:264)
at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:308)
at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:372)
at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:524)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
at java.lang.Thread.run(Thread.java:619)

Followed by:
2009-05-04 21:35:20,891 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
PacketResponder blk_-7056150840276493498_10885 1 Exception 
java.io.InterruptedIOException: Interruped while waiting for IO on channel 
java.nio.channels.SocketChannel[connected local=/10.102.0.105:37293
remote=/10.102.0.106:50010]. 59756 millis timeout left.
at 
org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:277)
at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:155)
at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
at java.io.DataInputStream.readFully(DataInputStream.java:178)
at java.io.DataInputStream.readLong(DataInputStream.java:399)
at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:853)
at java.lang.Thread.run(Thread.java:619)

Thanks,
Albert


Re: specifying command line args, but getting an NPE

2009-05-04 Thread Sharad Agarwal


> But if conf.set(...) is called after instantiating job, it doesn't.
>
> Is this intended?
>
Yes, the Configuration must be set up before instantiating the Job object. However,
some job parameters can still be changed (before the actual job submission) by
calling the set methods on the Job object.
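
For illustration, a minimal sketch of that ordering with the 0.20 API (the
property key is the one used elsewhere in this thread; the class and job names
are made up):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SubmitSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Anything the tasks read via context.getConfiguration() must be set
        // BEFORE the Job is created, because Job copies the Configuration at
        // construction time.
        conf.set("net.rguha.dc.data.pattern", args[2]);

        Job job = new Job(conf, "subsearch");
        // From here on, changes to conf are not seen by the job; parameters
        // that can still be tuned before submission go through the Job's own
        // setters.
        job.setNumReduceTasks(1);
        // ... set mapper/reducer/input/output, then job.waitForCompletion(true)
    }
}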

- Sharad


Re: How to configure nodes with different user account?

2009-05-04 Thread Starry SHI
Hi, Menno and Aseem. Thank you for your help! I can now use ssh to
connect to each node without providing the username.

However, another problem occurs. The directory structures are
different among the servers, so when I use the start-up script
"start-all.sh" to start Hadoop, it seems to run bash on the slaves
using exactly the same directory structure as on the namenode.

For example, the namenode(in server0) structure is like
"/home/user0/hadoop/...", slave1(in server1) is like
"/home/s/user1/hadoop/...", and slave2(in server2) is like
"/home/u/user2/proj/hadoop/...". When I run start-all.sh, this message
is shown:

"

  starting namenode, logging to /home/user0/hadoop/bin/...
  server1: bash: line 0: cd: /home/user0/hadoop/bin/..: No such file
or directory
  server1: bash: /home/user0/hadoop/bin/hadoop-daemon.sh: No such file
or directory
  server2: bash: line 0: cd: /home/user0/hadoop/bin/..: No such file
or directory
  server2: bash: /home/user0/hadoop/bin/hadoop-daemon.sh: No such file
or directory
  ..

"
It seems all the nodes need the same directory structure to run Hadoop.
But I have no admin privilege on these servers, so I cannot create
the same directory path as on the namenode. Is there any way to let nodes
with different structures run it? How can I configure it?

Again, thank you all for your kind help!!!

Starry

/* Tomorrow is another day. So is today. */



On Mon, May 4, 2009 at 22:09, Puri, Aseem  wrote:
> Starry,
>
>        In the ".ssh" directory you have to create a file named "config"
> (without an extension) on every node.
>
> Suppose server1 is your master and server2 and server3 are your slaves.
>
> On the master (server1), add the following lines to the "config" file:
>
> Host server2
> User user2
> Host server3
> User user3
>
> On both slave nodes (server2, server3), add the following lines to the
> "config" file:
>
> Host server1
> User user1
>
> Hope it works for you
>
> Regards
> Aseem Puri
>
>
> -Original Message-
> From: Menno Luiten [mailto:mlui...@artifix.net]
> Sent: Monday, May 04, 2009 7:27 PM
> To: core-user@hadoop.apache.org
> Subject: RE: How to configure nodes with different user account?
>
> Hi Starry,
>
> What is the content of your 'slaves' file in the hadoop/conf directory
> of your master node?
> It should say something like:
>
> localhost
> us...@server2
> us...@server3
> us...@server4
>
> This should let the start-up scripts try and login using the proper
> users.
>
> Hope that helps,
> Menno
>
> -Oorspronkelijk bericht-
> Van: Starry SHI [mailto:starr...@gmail.com]
> Verzonden: maandag 4 mei 2009 10:53
> Aan: core-user@hadoop.apache.org
> Onderwerp: How to configure nodes with different user account?
>
> Hi, all. I am new to Hadoop and I have a question to ask~
>
> I have several accounts located on different Linux servers (normal
> user privileges, no admin authority), and I want to use them to form a
> small cluster to run Hadoop applications. However, the usernames for
> these accounts are different. I want to use a shared key to connect all
> the nodes, but I failed after several attempts. Is it possible to
> connect all of them via different accounts?
>
> For example, I have 3 accounts: us...@server1, us...@server2,
> us...@server3. After assigning authorized keys, I can use "ssh
> us...@server2" without entering the password. But when I start Hadoop, I
> am asked to enter the password for us...@server2 (even though I have already
> logged in as user1).
>
> Can my problem be solved easily? I wish to get your help soon.
>
> Thank you for all your attention and help!
>
> Best regards,
> Starry
>
>


What do we call Hadoop+HBase+Lucene+Zookeeper+etc....

2009-05-04 Thread Bradford Stephens
Hey all,

I'm going to be speaking at OSCON about my company's experiences with
Hadoop and Friends, but I'm having a hard time coming up with a name
for the entire software ecosystem. I'm thinking of calling it the
"Apache CloudStack". Does this sound legit to you all? :) Is there
something more 'official'?

Cheers,
Bradford


Re: Wrong FS Exception

2009-05-04 Thread Bradford Stephens
Are you trying to run a distributed cluster? Does everything have the
same config file? If so, every node is going to look at "localhost"
instead of the correct host for fs.default.name, mapred.job.tracker,
etc.
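
For reference, a rough sketch of how that mismatch surfaces (host names here
are made up):

Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://namenode-host:54310");  // what the daemons expect
FileSystem fs = FileSystem.get(conf);

// Any path whose scheme/authority differs from the default filesystem makes
// FileSystem.checkPath() throw:
//   IllegalArgumentException: Wrong FS: ..., expected: hdfs://namenode-host:54310
fs.exists(new Path("hdfs://some-other-host:54310/mapred/system"));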

On Mon, May 4, 2009 at 1:54 PM, Kirk Hunter  wrote:
>
> Can someone tell me how to resolve the following error message, found in the
> job tracker log file when trying to start MapReduce?
>
> grep FATAL *
> hadoop-hadoop-jobtracker-hadoop-1.log:2009-05-04 16:35:14,176 FATAL
> org.apache.hadoop.mapred.JobTracker: java.lang.IllegalArgumentException:
> Wrong FS: hdfs://usr/local/hadoop-datastore/hadoop-hadoop/mapred/system,
> expected: hdfs://localhost:54310
>
>
>
> Here is my hadoop-site.xml as well:
>
> <?xml version="1.0"?>
>
> <configuration>
>
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>//usr/local/hadoop-datastore/hadoop-${user.name}</value>
>   <description>A base for other temporary directories.</description>
> </property>
>
> <property>
>   <name>dfs.data.dir</name>
>   <value>/usr/local/hadoop-datastore/hadoop-${user.name}/dfs/data</value>
> </property>
>
> <property>
>   <name>fs.default.name</name>
>   <value>hdfs://localhost:54310</value>
>   <description>The name of the default file system. A URI whose scheme and
>   authority determines the File System implementation. The uri's scheme
>   determines the config property (fs.SCHEME.impl) naming the File System
>   implementation class. The uri's authority is used to determine the host,
>   port, etc. for a filesystem.</description>
> </property>
>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>localhost:54311</value>
>   <description>The host and port that the MapReduce job tracker runs at. If
>   "local", then jobs are run in-process as a single map and reduce task.
>   </description>
> </property>
>
> </configuration>
>
> --
> View this message in context: 
> http://www.nabble.com/Wrong-FS-Exception-tp23376486p23376486.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>


Re: Infinite Loop Resending status from task tracker

2009-05-04 Thread Todd Lipcon
Hi Lance,

Two thoughts here that might be the culprit:

1) Is it possible that the partition that your mapred.local.dir is on is out
of space on that task tracker?

2) Is it possible that you're using a directory under /tmp for
mapred.local.dir and some system cron script cleared out /tmp?

-Todd

On Sat, May 2, 2009 at 9:01 AM, Lance Riedel  wrote:

> Hi Todd,
> Not sure if this is related, but our hadoop cluster in general is getting
> more and more unstable.  The logs are full of this error message (but I'm
> having trouble tracking down the root problem):
>
> 2009-05-02 11:30:39,294 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_/attempt_200904301103__m_01_1/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:39,294 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_1675/attempt_200904301103_1675_r_12_1/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:44,295 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_0944/attempt_200904301103_0944_r_15_0/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:44,295 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_/attempt_200904301103__m_01_1/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:44,295 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_1675/attempt_200904301103_1675_r_12_1/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:49,296 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_0944/attempt_200904301103_0944_r_15_0/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:49,296 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_/attempt_200904301103__m_01_1/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:49,297 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_1675/attempt_200904301103_1675_r_12_1/output/file.out
> in any of the configured local directories
> 2009-05-02 11:30:54,298 INFO org.apache.hadoop.mapred.TaskTracker:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_200904301103_0944/attempt_200904301103_0944_r_15_0/output/file.out
> in any of the configured local directories
>
>
> Lance
>
>
> On Apr 30, 2009, at 12:04 PM, Todd Lipcon wrote:
>
>  Hey Lance,
>>
>> Thanks for the logs. They definitely confirmed my suspicion. There are two
>> problems here:
>>
>> 1) If the JobTracker throws an exception during processing of a heartbeat,
>> the tasktracker retries with no delay, since lastHeartbeat isn't updated
>> in
>> TaskTracker.offerService. This is related to HADOOP-3987
>>
>> 2) If the TaskTracker sends a task in COMMIT_PENDING state with an invalid
>> task id, the jobtracker will trigger a NullPointerException in
>> JobTracker.getTasksToSave. Instead it should probably create a
>> KillTaskAction. I just filed a JIRA to track this issue:
>>
>> https://issues.apache.org/jira/browse/HADOOP-5761
>>
>> 3) The TaskTracker somehow had a task attempt in COMMIT_PENDING state that
>> the JobTracker didn't know about. How it got there is a separate problem
>> that's a bit harder to track down.
>>
>> Thanks
>> -Todd
>>
>> On Thu, Apr 30, 2009 at 11:17 AM, Lance Riedel 
>> wrote:
>>
>>  Here are the job tracker logs from the same time (and yes.. there is
>>> something there!!):
>>>
>>>
>>> 2009-04-30 02:34:28,484 INFO org.apache.hadoop.mapred.JobTracker: Serious
>>> problem.  While updating status, cannot find taskid
>>> attempt_200904291917_0252_r_03_0
>>>
>>>
>>> 2009-04-30 02:34:40,215 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 2 on 54311, call
>>> heartbeat(org.apache.hadoop.mapred.tasktrackersta...@1a93388, false,
>>> true,
>>> 5341) from 10.253.134.191:42688: error: java.io.IOException:
>>> java.lang.NullPointerException
>>> java.io.IOException: java.lang.NullPointerException
>>>  at
>>> org.apache.hadoop.mapred.JobTracker.getTasksToSave(JobTracker.java:2130)
>>>  at
>>> org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:1923)
>>>  at sun.reflect.GeneratedMethodAccessor72.invoke(

Re: specifying command line args, but getting an NPE

2009-05-04 Thread Rajarshi Guha


On May 4, 2009, at 6:07 PM, Todd Lipcon wrote:



Since you have a simple String here, this should be pretty simple. Something like:

conf.set("com.example.tool.pattern", otherArgs[2]);

then in the configure() function of your Mapper/Reducer, simply retrieve it
using conf.get("com.example.tool.pattern");



Trial and error solved the problem. It turns out I need to set the  
value in the Configuration object before I create the Job object.  
Thus, the following works and makes the value of  
net.rguha.dc.data.pattern available to the mappers.


Configuration conf = new Configuration();
conf.set("net.rguha.dc.data.pattern", otherArgs[2]);
Job job = new Job(conf, "id 1");

But if conf.set(...) is called after instantiating job, it doesn't.

Is this intended?

---
Rajarshi Guha  
GPG Fingerprint: D070 5427 CC5B 7938 929C  DD13 66A1 922C 51E7 9E84
---
Q:  What's polite and works for the phone company?
A:  A deferential operator.




Re: specifying command line args, but getting an NPE

2009-05-04 Thread Rajarshi Guha


On May 4, 2009, at 6:07 PM, Todd Lipcon wrote:


The issue here is that your mapper and reducer classes are being
instantiated in a different JVM from your main() function. In order to pass
data to them, you need to use the Configuration object.

Since you have a simple String here, this should be pretty simple. Something like:

conf.set("com.example.tool.pattern", otherArgs[2]);

then in the configure() function of your Mapper/Reducer, simply retrieve it
using conf.get("com.example.tool.pattern");



Thanks for the pointer. I'm using Hadoop 0.20.0 and my mapper which  
extends Mapper doesn't seem to have a  
configure() method.


Looking at the API I see the superclass has a setup method. Thus in my  
class I do:


public static class MoleculeMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private Text matches = new Text();
    private String pattern;

    public void setup(Context context) {
        pattern = context.getConfiguration().get("net.rguha.dc.data.pattern");
        System.out.println("pattern = " + pattern);
    }
    ...
}

In my main method I have

Configuration conf = new Configuration();
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
conf.set("net.rguha.dc.data.pattern", otherArgs[2]);

However, even with this, pattern turns out to be null when printed in  
setup().


I just started on Hadoop a day or two ago, and my understanding is  
that 0.20.0 had some pretty major refactoring. As a result a lot of  
examples I come across on the Net don't seem to work. Could the lack  
of the configure() method be due to the refactoring?


---
Rajarshi Guha  
GPG Fingerprint: D070 5427 CC5B 7938 929C  DD13 66A1 922C 51E7 9E84
---
Q:  What's polite and works for the phone company?
A:  A deferential operator.




Re: specifying command line args, but getting an NPE

2009-05-04 Thread Todd Lipcon
On Mon, May 4, 2009 at 2:59 PM, Rajarshi Guha  wrote:

> So my question is: if I need to use an argument, specified on the command
> line, do I need to do anything special to the variable holding it? In other
> words, the simple assignment
>
>pattern = otherArgs[2];
>
> seems to lead to an NPE when run in distributed mode.
>

Hi Rajarshi,

The issue here is that your mapper and reducer classes are being
instantiated in a different JVM from your main() function. In order to pass
data to them, you need to use the Configuration object.

Since you have a simple String here, this should be pretty simple. Something
like:

conf.set("com.example.tool.pattern", otherArgs[2]);

then in the configure() function of your Mapper/Reducer, simply retrieve it
using conf.get("com.example.tool.pattern");
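
For completeness, a sketch of that pattern with the old (pre-0.20) API; the
class name, property key, and types below are illustrative:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class PatternMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    private String pattern;

    @Override
    public void configure(JobConf job) {
        // Runs once in the task JVM, before any map() calls.
        pattern = job.get("com.example.tool.pattern");
    }

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        if (pattern != null && value.toString().contains(pattern)) {
            output.collect(value, new IntWritable(1));
        }
    }
}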

Hope that helps,
-Todd


specifying command line args, but getting an NPE

2009-05-04 Thread Rajarshi Guha
Hi, I have a Hadoop program in which main() reads in some command line  
args:


    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();

        if (otherArgs.length != 3) {
            System.err.println("Usage: subsearch <in> <out> <pattern>");
            System.exit(2);
        }

        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        pattern = otherArgs[2];
        ...
    }

Here pattern is declared as a static String class variable.

When I run the program using the local tracker, it runs fine and uses  
the value of pattern. However, if I run the code in distributed mode,  
I get a NullPointerException - as far as I can tell, pattern is  
turning out to be null in this case.


If I hard code the value of pattern in to the code that uses it, the  
program runs fine.


So my question is: if I need to use an argument, specified on the  
command line, do I need to do anything special to the variable holding  
it? In other words, the simple assignment


pattern = otherArgs[2];

seems to lead to an NPE when run in distributed mode.

Any pointers would be appreciated

Thanks,


---
Rajarshi Guha  
GPG Fingerprint: D070 5427 CC5B 7938 929C  DD13 66A1 922C 51E7 9E84
---
Q:  What's polite and works for the phone company?
A:  A deferential operator.




Re: cannot open an hdfs file in O_RDWR mode

2009-05-04 Thread Philip Zeyliger
>
> Hey Philip,
>
> how could I enable "append to an existing file" in Hadoop?


Set dfs.support.append to true in your hadoop-site.xml.  See also
https://issues.apache.org/jira/browse/HADOOP-5332 .
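
A minimal sketch of appending from Java once that flag is enabled (the path is
made up; append is still considered experimental in these releases):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Re-opens an existing file for append; fails if dfs.support.append
        // is not enabled on the cluster.
        FSDataOutputStream out = fs.append(new Path("/user/robert/job.stdout"));
        out.write("another line\n".getBytes("UTF-8"));
        out.close();
    }
}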

-- Philip


Wrong FS Exception

2009-05-04 Thread Kirk Hunter

Can someone tell me how to resolve the following error message, found in the
job tracker log file when trying to start MapReduce?

grep FATAL *
hadoop-hadoop-jobtracker-hadoop-1.log:2009-05-04 16:35:14,176 FATAL
org.apache.hadoop.mapred.JobTracker: java.lang.IllegalArgumentException:
Wrong FS: hdfs://usr/local/hadoop-datastore/hadoop-hadoop/mapred/system,
expected: hdfs://localhost:54310



Here is my hadoop-site.xml as well:

<?xml version="1.0"?>

<configuration>

<property>
  <name>hadoop.tmp.dir</name>
  <value>//usr/local/hadoop-datastore/hadoop-${user.name}</value>
  <description>A base for other temporary directories.</description>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>/usr/local/hadoop-datastore/hadoop-${user.name}/dfs/data</value>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system. A URI whose scheme and
  authority determines the File System implementation. The uri's scheme
  determines the config property (fs.SCHEME.impl) naming the File System
  implementation class. The uri's authority is used to determine the host,
  port, etc. for a filesystem.</description>
</property>

<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs at. If
  "local", then jobs are run in-process as a single map and reduce task.
  </description>
</property>

</configuration>

-- 
View this message in context: 
http://www.nabble.com/Wrong-FS-Exception-tp23376486p23376486.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



Re: TextInputFormat unique key across files

2009-05-04 Thread Todd Lipcon
Hi Rares,

You can access the name of the current file by looking at the
"mapred.input.file" configuration variable in the Configuration object.

If you're using Hadoop Streaming this is available as $MAPRED_INPUT_FILE

Hope that helps,
-Todd

On Mon, May 4, 2009 at 12:46 PM, Rares Vernica  wrote:

> Hello,
>
> TextInputFormat is a perfect match for my problem. The only drawback is
> the fact that keys are unique only within a file. Is there an easy way
> to have keys unique across files? That is, each line in any file should
> get a unique key. Is there a unique id for each file? If yes, maybe I
> can concatenate them if I can access the file id from the map function.
>
> Thanks,
> Rares
>


Re: TextInputFormat unique key across files

2009-05-04 Thread tim robertson
I don't think you can with those classes.  If you look at
TextInputFormat and LineRecordReader, they should not be hard to use
as a basis for copying into your own version that makes the IDs unique, but
I presume you would need to make the keys Text rather than LongWritable.

Just a thought... Rather than going that route, could you construct
the new key in the Map?  Just because the LineRecordReader passes this
as the input key, does not mean you have to use it as the output key
in the Map phase.  Perhaps concatenate it with a different field?
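
A rough sketch of that route, combining the input file name (via the
configuration variable Todd mentioned, whose exact key has varied between
releases) with the byte offset; class and variable names are made up:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class UniqueKeyMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

    private String fileName;

    @Override
    public void configure(JobConf job) {
        // Name of the file backing this task's split ("map.input.file" in
        // some releases, "mapred.input.file" in others).
        fileName = job.get("map.input.file");
    }

    public void map(LongWritable offset, Text line,
                    OutputCollector<Text, Text> output, Reporter reporter)
            throws IOException {
        // File name + byte offset is unique across all input files.
        output.collect(new Text(fileName + ":" + offset.get()), line);
    }
}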

Cheers,
Tim


On Mon, May 4, 2009 at 9:46 PM, Rares Vernica  wrote:
> Hello,
>
> TextInputFormat is a perfect match for my problem. The only drawback is
> the fact that keys are unique only within a file. Is there an easy way
> to have keys unique across files? That is, each line in any file should
> get a unique key. Is there a unique id for each file? If yes, maybe I
> can concatenate them if I can access the file id from the map function.
>
> Thanks,
> Rares
>


Re: TextInputFormat unique key across files

2009-05-04 Thread Miles Osborne
if you can tolerate errors then a simple idea is to generate a random
number in the range 0 ... 2 ^n and use that as the key.  if the number
of lines is small relative to 2 ^ n then with high probability you
won't get the same key twice.
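
A tiny sketch of that idea with 64-bit keys (the helper class is illustrative):

import java.util.Random;
import org.apache.hadoop.io.LongWritable;

public class RandomKeys {
    private static final Random RNG = new Random();

    // Call once per line from map(). With m lines, the chance of any two keys
    // colliding is roughly m*m / 2^64, which only becomes appreciable once m
    // reaches the billions.
    public static LongWritable next() {
        return new LongWritable(RNG.nextLong() >>> 1);  // non-negative, 0 .. 2^63-1
    }
}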

Miles

2009/5/4 Rares Vernica :
> Hello,
>
> TextInputFormat is a perfect match for my problem. The only drawback is
> the fact that keys are unique only within a file. Is there an easy way
> to have keys unique across files? That is, each line in any file should
> get a unique key. Is there a unique id for each file? If yes, maybe I
> can concatenate them if I can access the file id from the map function.
>
> Thanks,
> Rares
>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


TextInputFormat unique key across files

2009-05-04 Thread Rares Vernica
Hello,

TextInputFormat is a perfect match for my problem. The only drawback is
the fact that keys are unique only within a file. Is there an easy way
to have keys unique across files? That is, each line in any file should
get a unique key. Is there a unique id for each file? If yes, maybe I
can concatenate them if I can access the file id from the map function.

Thanks,
Rares


Re: cannot open an hdfs file in O_RDWR mode

2009-05-04 Thread Robert Engel
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hey Jason,

I tried to comment out the piece of code you suggested, rebuild
everything, umount, mount, and here is what happens:

Line: 1092 in fuse-dfs.c:

fuse_dfs: ERROR: hdfs trying to rename
/osg/app/robert/deployment-1.3.x/.svn/tmp/entries to
/osg/app/robert/deployment-1.3.x/.svn/entries

which is due to a call to is_protected() at Line 1018. Both in Hadoop
0.19.1.

Thanks for your help!
Robert

jason hadoop wrote:
> In hadoop 0.19.1, (and 19.0) libhdfs (which is used by the fuse package for
> hdfs access) explicitly denies open requests that pass O_RDWR
> 
> If you have binary applications that pass the flag, but would work correctly
> given the limitations of HDFS, you may alter the code in
> src/c++/libhdfs/hdfs.c to allow it, or build a shared library that you
> preload that changes the flags passed to the real open. Hacking hdfs.c is
> much simpler.
> 
> Line 407 of hdfs.c
> 
> jobject jFS = (jobject)fs;
> 
> if (flags & O_RDWR) {
>   fprintf(stderr, "ERROR: cannot open an hdfs file in O_RDWR mode\n");
>   errno = ENOTSUP;
>   return NULL;
> }
> 
> 
> 
> 
> On Fri, May 1, 2009 at 6:34 PM, Philip Zeyliger  wrote:
> 
>> HDFS does not allow you to overwrite bytes of a file that have already been
>> written.  The only operations it supports are read (an existing file),
>> write
>> (a new file), and (in newer versions, not always enabled) append (to an
>> existing file).
>>
>> -- Philip
>>
>> On Fri, May 1, 2009 at 5:56 PM, Robert Engel >> wrote:
> Hello,
> 
>I am using Hadoop on a small storage cluster (x86_64, CentOS 5.3,
> Hadoop-0.19.1). The hdfs is mounted using fuse and everything seemed
> to work just fine so far. However, I noticed that I cannot:
> 
> 1) use svn to check out files on the mounted hdfs partition
> 2) request that stdout and stderr of Globus jobs is written to the
> hdfs partition
> 
> In both cases I see the following error message in /var/log/messages:
> 
> fuse_dfs: ERROR: could not connect open file fuse_dfs.c:1364
> 
> When I run fuse_dfs in debugging mode I get:
> 
> ERROR: cannot open an hdfs file in O_RDWR mode
> unique: 169, error: -5 (Input/output error), outsize: 16
> 
> My question is if this is a general limitation of Hadoop or if this
> operation is just not currently supported? I searched Google and JIRA
> but could not find an answer.
> 
> Thanks,
> Robert
> 
>>>
>>>

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkn/Lr0ACgkQrxCAtr5BXdMZUQCeLEKI2msbgEgQoT0KwihilEKO
7DkAmwSgPmB7Cth/QsFlV3rEAV6wikbf
=MNW6
-END PGP SIGNATURE-


Re: cannot open an hdfs file in O_RDWR mode

2009-05-04 Thread Robert Engel
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hey Philip,

how could I enable "append to an existing file" in Hadoop?

Thanks,
Robert

Philip Zeyliger wrote:
> HDFS does not allow you to overwrite bytes of a file that have already been
> written.  The only operations it supports are read (an existing file), write
> (a new file), and (in newer versions, not always enabled) append (to an
> existing file).
> 
> -- Philip
> 
> On Fri, May 1, 2009 at 5:56 PM, Robert Engel wrote:
> 
> Hello,
> 
>I am using Hadoop on a small storage cluster (x86_64, CentOS 5.3,
> Hadoop-0.19.1). The hdfs is mounted using fuse and everything seemed
> to work just fine so far. However, I noticed that I cannot:
> 
> 1) use svn to check out files on the mounted hdfs partition
> 2) request that stdout and stderr of Globus jobs is written to the
> hdfs partition
> 
> In both cases I see the following error message in /var/log/messages:
> 
> fuse_dfs: ERROR: could not connect open file fuse_dfs.c:1364
> 
> When I run fuse_dfs in debugging mode I get:
> 
> ERROR: cannot open an hdfs file in O_RDWR mode
> unique: 169, error: -5 (Input/output error), outsize: 16
> 
> My question is if this is a general limitation of Hadoop or if this
> operation is just not currently supported? I searched Google and JIRA
> but could not find an answer.
> 
> Thanks,
> Robert
> 
>>
>>

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkn/K+UACgkQrxCAtr5BXdNfFwCfU8pz7gV6zi8aLOLTjEAb8fIS
j4kAn1/3DnGZZP7TTewV4QTB0S43/tNV
=3BF/
-END PGP SIGNATURE-


RE: How to configure nodes with different user account?

2009-05-04 Thread Menno Luiten
Hi Starry,

What is the content of your 'slaves' file in the hadoop/conf directory of your 
master node?
It should say something like:

localhost
us...@server2
us...@server3
us...@server4

This should let the start-up scripts try and login using the proper users.

Hope that helps,
Menno

-Oorspronkelijk bericht-
Van: Starry SHI [mailto:starr...@gmail.com] 
Verzonden: maandag 4 mei 2009 10:53
Aan: core-user@hadoop.apache.org
Onderwerp: How to configure nodes with different user account?

Hi, all. I am new to Hadoop and I have a question to ask~

I have several accounts located on different Linux servers (normal
user privileges, no admin authority), and I want to use them to form a
small cluster to run Hadoop applications. However, the usernames for
these accounts are different. I want to use a shared key to connect all
the nodes, but I failed after several attempts. Is it possible to
connect all of them via different accounts?

For example, I have 3 accounts: us...@server1, us...@server2,
us...@server3. After assigning authorized keys, I can use "ssh
us...@server2" without entering the password. But when I start Hadoop, I
am asked to enter the password for us...@server2 (even though I have already
logged in as user1).

Can my problem be solved easily? I wish to get your help soon.

Thank you for all your attention and help!

Best regards,
Starry



RE: How to configure nodes with different user account?

2009-05-04 Thread Puri, Aseem
Starry,

In ".ssh" directory you have to create a file "config" (without
extension) on every node.

Suppose server1 is your master and server2, server3 is your slave. 

On the master (server1), in the "config" file and add the following
lines:

Host server2
User user2
Host server3
User user3

On both slave nodes (server2, server3), add the following lines to the
"config" file:

Host server1
User user1

Hope it works for you

Regards
Aseem Puri


-Original Message-
From: Menno Luiten [mailto:mlui...@artifix.net] 
Sent: Monday, May 04, 2009 7:27 PM
To: core-user@hadoop.apache.org
Subject: RE: How to configure nodes with different user account?

Hi Starry,

What is the content of your 'slaves' file in the hadoop/conf directory
of your master node?
It should say something like:

localhost
us...@server2
us...@server3
us...@server4

This should let the start-up scripts try and login using the proper
users.

Hope that helps,
Menno

-Oorspronkelijk bericht-
Van: Starry SHI [mailto:starr...@gmail.com] 
Verzonden: maandag 4 mei 2009 10:53
Aan: core-user@hadoop.apache.org
Onderwerp: How to configure nodes with different user account?

Hi, all. I am new to Hadoop and I have a question to ask~

I have several accounts located on different Linux servers (normal
user privileges, no admin authority), and I want to use them to form a
small cluster to run Hadoop applications. However, the usernames for
these accounts are different. I want to use a shared key to connect all
the nodes, but I failed after several attempts. Is it possible to
connect all of them via different accounts?

For example, I have 3 accounts: us...@server1, us...@server2,
us...@server3. After assigning authorized keys, I can use "ssh
us...@server2" without entering the password. But when I start Hadoop, I
am asked to enter the password for us...@server2 (even though I have already
logged in as user1).

Can my problem be solved easily? I wish to get your help soon.

Thank you for all your attention and help!

Best regards,
Starry



Re: Namenode failed to start with "FSNamesystem initialization failed" error

2009-05-04 Thread Stas Oskin
Hi.

Same conditions - where the space has run out and the fs got corrupted?

Or it got corrupted by itself (which is even more worrying)?

Regards.

2009/5/4 Tamir Kamara 

> I had the same problem a couple of weeks ago with 0.19.1. Had to reformat
> the cluster too...
>
> On Mon, May 4, 2009 at 3:50 PM, Stas Oskin  wrote:
>
> > Hi.
> >
> > After rebooting the NameNode server, I found out the NameNode doesn't
> start
> > anymore.
> >
> > The logs contained this error:
> > "FSNamesystem initialization failed"
> >
> >
> > I suspected filesystem corruption, so I tried to recover from
> > SecondaryNameNode. Problem is, it was completely empty!
> >
> > I had an issue that might have caused this - the root mount has run out
> of
> > space. But, both the NameNode and the SecondaryNameNode directories were
> on
> > another mount point with plenty of space there - so it's very strange
> that
> > they were impacted in any way.
> >
> > Perhaps the logs, which were located on root mount and as a result, could
> > not be written, have caused this?
> >
> >
> > To get back HDFS running, I had to format the HDFS (including manually
> > erasing the files from the DataNodes). While this is reasonable in a test
> > environment, production-wise it would be very bad.
> >
> > Any idea why it happened, and what can be done to prevent it in the
> future?
> > I'm using the stable 0.18.3 version of Hadoop.
> >
> > Thanks in advance!
> >
>


Re: Implementing compareTo in user-written keys where one extends the other is error prone

2009-05-04 Thread Shevek
On Sun, 2009-05-03 at 23:38 -0700, Sharad Agarwal wrote:
> Marshall Schor wrote:
> >
> > public class Super implements WritableComparable {
> >  . . .
> >   public int compareTo(Super o) {
> > // sort on string value
> > . . .
> >   }
> >
> > I implemented the 2nd key class (let's call it Sub)
> >
> > public class Sub extends Super {
> >  . . .
> >   public int compareTo(Sub o) {
> > // sort on boolean value
> > . . .
> > // if equal, use the super:
> > ... else
> >  return super.compareTo(o);
> >   }
> >
> The overriding method must have the same arguments as the parent class
> method. Otherwise it is just another method, not an override.
> In your case, if the current code looks error prone, you can
> make Super a template (generic) class as well. Then you can use the Sub
> class in the compareTo method. However, you will have to cast in the
> Super class.

In this particular case, I _think_ making Sub implement Comparable<Sub>
will be sufficient, since then javac will also generate public volatile
int compareTo(Object o) { compareTo((Sub) o); }, which overrides the
volatile (bridge) method in the superclass. Overriding compareTo(Super) is not
required. See my post to gene...@hadoop for more details.

S.

> class Super<T> implements WritableComparable<T> {
>   public int compareTo(T o) {
>     Super<T> other = (Super<T>) o;
>
>   }
> }
> 
> class Sub extends Super<Sub> {
>   public int compareTo(Sub o) {
>     ...
>   }
> }
> 
> -Sharad



Re: Namenode failed to start with "FSNamesystem initialization failed" error

2009-05-04 Thread Tamir Kamara
I had the same problem a couple of weeks ago with 0.19.1. Had to reformat
the cluster too...

On Mon, May 4, 2009 at 3:50 PM, Stas Oskin  wrote:

> Hi.
>
> After rebooting the NameNode server, I found out the NameNode doesn't start
> anymore.
>
> The logs contained this error:
> "FSNamesystem initialization failed"
>
>
> I suspected filesystem corruption, so I tried to recover from
> SecondaryNameNode. Problem is, it was completely empty!
>
> I had an issue that might have caused this - the root mount has run out of
> space. But, both the NameNode and the SecondaryNameNode directories were on
> another mount point with plenty of space there - so it's very strange that
> they were impacted in any way.
>
> Perhaps the logs, which were located on root mount and as a result, could
> not be written, have caused this?
>
>
> To get back HDFS running, I had to format the HDFS (including manually
> erasing the files from the DataNodes). While this is reasonable in a test
> environment, production-wise it would be very bad.
>
> Any idea why it happened, and what can be done to prevent it in the future?
> I'm using the stable 0.18.3 version of Hadoop.
>
> Thanks in advance!
>


Namenode failed to start with "FSNamesystem initialization failed" error

2009-05-04 Thread Stas Oskin
Hi.

After rebooting the NameNode server, I found out the NameNode doesn't start
anymore.

The logs contained this error:
"FSNamesystem initialization failed"


I suspected filesystem corruption, so I tried to recover from
SecondaryNameNode. Problem is, it was completely empty!

I had an issue that might have caused this - the root mount has run out of
space. But, both the NameNode and the SecondaryNameNode directories were on
another mount point with plenty of space there - so it's very strange that
they were impacted in any way.

Perhaps the logs, which were located on root mount and as a result, could
not be written, have caused this?


To get back HDFS running, I had to format the HDFS (including manually
erasing the files from the DataNodes). While this is reasonable in a test
environment, production-wise it would be very bad.

Any idea why it happened, and what can be done to prevent it in the future?
I'm using the stable 0.18.3 version of Hadoop.

Thanks in advance!


How to configure nodes with different user account?

2009-05-04 Thread Starry SHI
Hi, all. I am new to Hadoop and I have a question to ask~

I have several accounts located on different Linux servers (normal
user privileges, no admin authority), and I want to use them to form a
small cluster to run Hadoop applications. However, the usernames for
these accounts are different. I want to use a shared key to connect all
the nodes, but I failed after several attempts. Is it possible to
connect all of them via different accounts?

For example, I have 3 accounts: us...@server1, us...@server2,
us...@server3. After assigning authorized keys, I can use "ssh
us...@server2" without entering the password. But when I start Hadoop, I
am asked to enter the password for us...@server2 (even though I have already
logged in as user1).

Can my problem be solved easily? I wish to get your help soon.

Thank you for all your attention and help!

Best regards,
Starry