Re: Trying to write to HDFS from mapreduce.

2008-07-24 Thread s29752-hadoopuser
I think your conf is set incorrectly and your job ran locally.  Also, have you 
called jobconf.setNumReduceTasks(0)?  Try running some of the example jobs to 
verify your setup.
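
For reference, here is a minimal sketch of a map-only job that writes its output 
with TextOutputFormat (the mapper class and paths are just placeholders, and the 
exact API may differ slightly between releases):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapred.lib.IdentityMapper;

public class MapOnlyJob {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(MapOnlyJob.class);
    conf.setJobName("map-only");
    conf.setNumReduceTasks(0);                  // 0 reducers: map output goes straight to the output format
    conf.setMapperClass(IdentityMapper.class);  // replace with your own mapper
    conf.setOutputFormat(TextOutputFormat.class);
    FileInputFormat.setInputPaths(conf, new Path("input"));
    FileOutputFormat.setOutputPath(conf, new Path("user"));
    JobClient.runJob(conf);
  }
}

If fs.default.name in the configuration still points at the local file system (the 
default), the output lands on the local disk no matter what; it has to point at 
your namenode (e.g. hdfs://namenode:9000/) for the output to end up in HDFS.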

Nicholas Sze




- Original Message 
> From: Erik Holstad <[EMAIL PROTECTED]>
> To: core-user@hadoop.apache.org
> Sent: Thursday, July 24, 2008 3:17:40 PM
> Subject: Trying to write to HDFS from mapreduce.
> 
> Hi!
> I'm writing a mapreduce job where I want the output from the mapper to go
> straight to HDFS without passing through the reduce method. I have been told
> that I can do c.setOutputFormat(TextOutputFormat.class); I also added
> Path path = new Path("user");
> FileOutputFormat.setOutputPath(c, path);
> 
> But I still ended up with the result in the local filesystem instead.
> 
> Regards Erik



Re: File permissions issue

2008-07-09 Thread s29752-hadoopuser
Hi Joman,

The temp directory we are talking about here is the temp directory in the local 
file system (i.e. Unix in your case).  There is a config property hadoop.tmp.dir 
(see hadoop-default.xml) which specifies the path of the temp directory.  Before 
you start the cluster, you should set this property and chmod the temp directory 
to make sure that all users have permission to create files under it.
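
As a concrete illustration, here is a small sketch of that preparation step (the 
default path below is only an example, and File.setReadable/setWritable/setExecutable 
need Java 6):

import java.io.File;
import org.apache.hadoop.conf.Configuration;

public class PrepareLocalTmpDir {
  public static void main(String[] args) {
    Configuration conf = new Configuration();   // reads hadoop-default.xml / hadoop-site.xml
    String tmp = conf.get("hadoop.tmp.dir", "/tmp/hadoop");
    File dir = new File(tmp);
    if (!dir.exists()) {
      dir.mkdirs();
    }
    // roughly the equivalent of "chmod a+rwx" on the local temp directory
    dir.setReadable(true, false);
    dir.setWritable(true, false);
    dir.setExecutable(true, false);
  }
}

In practice most people simply run mkdir and chmod 777 on that directory on every 
node before starting the daemons.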

Hope it helps.

Nicholas Sze




- Original Message 
> From: Joman Chu <[EMAIL PROTECTED]>
> To: core-user@hadoop.apache.org
> Sent: Wednesday, July 9, 2008 4:15:39 AM
> Subject: Re: File permissions issue
> 
> So we can fix this issue by putting all three users in a common group? We did
> that after we encountered the issue, but we still got the errors. Note that we
> had not restarted Hadoop, so the permissions were still as described earlier.
> Should we have restarted Hadoop after the grouping?
> 
> On Wed, July 9, 2008 2:05 am, heyongqiang said:
> > Because in your permission settings, the "other" role cannot write to the temp
> > directory, and user3 is not in the same group as user2.
> > 
> > 
> > 
> > 
> > 
> > heyongqiang 2008-07-09
> > 
> > 
> > 
> > From: Joman Chu  Sent: 2008-07-09 13:06:51  To:
> > core-user@hadoop.apache.org  Cc:  Subject: File permissions issue
> > 
> > Hello,
> > 
> > On a cluster where I run Hadoop, it seems that the temp directory created
> > by Hadoop (in our case, /tmp/hadoop/) gets its permissions set to
> > "drwxrwxr-x" owned by the first person that runs a job after the Hadoop
> > services are started. This causes file permissions problems as we try to
> > run jobs.
> > 
> > For example, user1:user1 starts Hadoop using ./start-all.sh. Then
> > user2:user2 runs a Hadoop job. Temp directories (/tmp/hadoop/) are now
> > created in all nodes in the cluster owned by user2 with permissions
> > "drwxrwxr-x". Now user3:user3 tries to run a job and gets the following
> > exception:
> > 
> > java.io.IOException: Permission denied
> >     at java.io.UnixFileSystem.createFileExclusively(Native Method)
> >     at java.io.File.checkAndCreate(File.java:1704)
> >     at java.io.File.createTempFile(File.java:1793)
> >     at org.apache.hadoop.util.RunJar.main(RunJar.java:115)
> >     at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >     at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
> > 
> > Why does this happen and how can we fix this? Our current stop gap
> > measure is to run a job as the user that started Hadoop. That is, in our
> > example, after user1 starts Hadoop, user1 runs a job. Everything seems to
> > work fine then.
> > 
> > Thanks, Joman Chu
> > 
> 
> 
> -- 
> Joman Chu
> AIM: ARcanUSNUMquam
> IRC: irc.liquid-silver.net



Re: Task failing, cause FileSystem close?

2008-06-17 Thread s29752-hadoopuser
Hi Christophe,

This exception happens when you access the FileSystem after calling 
FileSystem.close().  From the error message below, a FileSystem input stream 
was accessed after FileSystem.close().  I guess the FileSystem was closed 
manually (and too early).  In most cases, you don't have to call 
FileSystem.close() since it will be closed automatically.
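
To make the failure mode concrete, here is a rough sketch of the pattern that 
triggers it (the path and configuration are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ClosedTooEarly {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);          // typically a shared instance
    FSDataInputStream in = fs.open(new Path("/some/file"));
    fs.close();                                    // closes the underlying DFS client...
    in.read();                                     // ...so this now fails with "Filesystem closed"
  }
}

Inside a map task the framework (e.g. the SequenceFile record reader in the trace 
below) holds streams on that same FileSystem, so closing it from user code makes 
the task fail in the same way.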

Nicholas


- Original Message 
> From: Christophe Taton <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Tuesday, June 17, 2008 4:18:45 AM
> Subject: Task failing, cause FileSystem close?
> 
> Hi all,
> 
> I am experiencing (through my students) the following error on a 28
> nodes cluster running Hadoop 0.16.4.
> Some jobs fail with many map tasks aborting with this error message:
> 
> 2008-06-17 12:25:01,512 WARN org.apache.hadoop.mapred.TaskTracker:
> Error running child
> java.io.IOException: Filesystem closed
> at org.apache.hadoop.dfs.DFSClient.checkOpen(DFSClient.java:166)
> at org.apache.hadoop.dfs.DFSClient.access$500(DFSClient.java:58)
> at org.apache.hadoop.dfs.DFSClient$DFSInputStream.close(DFSClient.java:1103)
> at java.io.FilterInputStream.close(FilterInputStream.java:155)
> at org.apache.hadoop.io.SequenceFile$Reader.close(SequenceFile.java:1541)
> at org.apache.hadoop.mapred.SequenceFileRecordReader.close(SequenceFileRecordReader.java:125)
> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.close(MapTask.java:155)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:212)
> at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2084)
> 
> Any clue why this would happen?
> 
> Thanks in advance,
> Christophe



Re: client connect as different username?

2008-06-11 Thread s29752-hadoopuser
This information can be found in 
http://hadoop.apache.org/core/docs/current/hdfs_permissions_guide.html
Nicholas


- Original Message 
> From: Chris Collins <[EMAIL PROTECTED]>
> To: core-user@hadoop.apache.org
> Sent: Wednesday, June 11, 2008 9:31:18 PM
> Subject: Re: client connect as different username?
> 
> Thanks Doug, should this be added to the permissions doc or to the  
> faq?  See you in Sonoma.
> 
> C
> On Jun 11, 2008, at 9:15 PM, Doug Cutting wrote:
> 
> > Chris Collins wrote:
> >> You are referring to creating a directory in hdfs?  Because if I am  
> >> user chris and the hdfs only has user foo, then I cant create a  
> >> directory because I dont have perms, infact I cant even connect.
> >
> > Today, users and groups are declared by the client.  The namenode  
> > only records and checks against user and group names provided by the  
> > client.  So if someone named "foo" writes a file, then that file is  
> > owned by someone named "foo" and anyone named "foo" is the owner of  
> > that file. No "foo" account need exist on the namenode.
> >
> > The one (important) exception is the "superuser".  Whatever user  
> > name starts the namenode is the superuser for that filesystem.  And  
> > if "/" is not world writable, a new filesystem will not contain a  
> > home directory (or anywhere else) writable by other users.  So, in a  
> > multiuser Hadoop installation, the superuser needs to create home  
> > directories and project directories for other users and set their  
> > protections accordingly before other users can do anything.  Perhaps  
> > this is what you've run into?
> >
> > Doug
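
For completeness, the administrative step Doug describes looks roughly like the 
sketch below when done through the Java API (user names and paths are only 
examples; the hadoop fs -mkdir/-chown/-chmod shell commands do the same thing):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class CreateHomeDir {
  public static void main(String[] args) throws Exception {
    // run this as the user that started the namenode (the superuser)
    FileSystem fs = FileSystem.get(new Configuration());
    Path home = new Path("/user/chris");
    fs.mkdirs(home);                                        // created by the superuser
    fs.setOwner(home, "chris", "chris");                    // hand it over to the user
    fs.setPermission(home, new FsPermission((short) 0755));
  }
}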



Re: client connect as different username?

2008-06-11 Thread s29752-hadoopuser
The best way is to use the sudo command to execute the hadoop client as the other 
user.  Does that work for you?

Nicholas


- Original Message 
> From: Bob Remeika <[EMAIL PROTECTED]>
> To: core-user@hadoop.apache.org
> Sent: Wednesday, June 11, 2008 12:56:14 PM
> Subject: client connect as different username?
> 
> Apologies if this is an RTM response, but I looked and wasn't able to find
> anything concrete.  Is it possible to connect to HDFS via the HDFS client
> under a different username than I am currently logged in as?
> 
> Here is our situation, I am user bobr on the client machine.  I need to add
> something to the HDFS cluster as the user "companyuser".  Is this possible
> with the current set of APIs or do I have to upload and "chown"?
> 
> Thanks,
> Bob



Re: JAVA_HOME Cygwin problem (solution doesn't work)

2008-05-23 Thread s29752-hadoopuser
The following works for me (Progra~1 is the 8.3 short name for "Program Files", 
which avoids the space that breaks the scripts):
set JAVA_HOME=/cygdrive/c/Progra~1/Java/jdk1.5.0_14

Nicholas



- Original Message 
From: vatsan <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Friday, May 23, 2008 5:41:05 PM
Subject: JAVA_HOME Cygwin problem (solution doesn't work)


I have installed Hadoop on Cygwin; I am running Windows XP.

My Java directory is C:\Program Files\Java\jre1.6.0_06

I am not able to run Hadoop, as it complains of a "no such file or directory"
error.

I did some searching and found out someone had proposed a solution of doing

SET JAVA_HOME=C:\Program Files\Java\jre1.6.0_06

in the Cygwin.bat file,

but that doesn't work for me.

Neither does using the absolute path name "\cygwin\c\Program Files\Java" OR
using  \cygwin\c\"Program Files"\Java

Can someone guide me here?

(I understand that the problem is because of the path convention conflicts
in windows and Cygwin, I found some stuff on fixes for the path issues that
spoke of using cygpath.exe as a fix ... for example while running a java
program on cygwin, but could not find anything that addressed my problem.)


Re: Hadoop Permission Problem

2008-05-09 Thread s29752-hadoopuser
Hi Senthil,

drwxrwxrwx  5 hadoop   hadoop   4096 May  7 18:02 datastore

This one is your local directory.  I think you might have mixed up the local 
and hdfs directories.

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org" 
Sent: Friday, May 9, 2008 1:11:01 PM
Subject: RE: Hadoop Permission Problem

Hi Nicholas,
You are right, the permission problem is with datastore; that's what I 
mentioned in the previous mails.
But I gave it 777 permission. Here is the datastore permission on the master.

drwxrwxrwx  5 hadoop   hadoop   4096 May  7 18:02 datastore

I am not seeing any datastore in the slave machines.



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Friday, May 09, 2008 3:29 PM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop Permission Problem

Hi Senthil,

Let me explain the error message " Permission denied: user=test, access=WRITE, 
inode="datastore":hadoop:supergroup:rwxr-xr-x".  It says that the current user 
"test" is trying to WRITE to the inode "datastore" with owner hadoop:supergroup 
and permission 755.  So the problem is in the directory "datastore".  Could you 
check it?

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org" 
Sent: Friday, May 9, 2008 12:23:52 PM
Subject: RE: Hadoop Permission Problem

Hi Nicholas,
Here is what I tried as user test after I got the error (does the exception come 
from the slave machine?)

Exception in thread "main" org.apache.hadoop.ipc.RemoteException: 
org.apache.hadoop.fs.permission.AccessControlException: Permission denied: 
user=test, access=WRITE, inode="datastore":hadoop:supergroup:rwxr-xr-x

/usr/local/hadoop/bin/hadoop fs -ls
Found 1 items
/user/test/myapps   2674   2008-05-07 17:55   rw-r--r--   test   supergroup

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Friday, May 09, 2008 3:13 PM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop Permission Problem

Hi Senthil,

I cannot see why it does not work.  Could you try again, do a fs -ls right 
after you see the error message?

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org" 
Sent: Friday, May 9, 2008 11:49:49 AM
Subject: RE: Hadoop Permission Problem

Hi Nicholas,
No, I am running map/red jobs over HDFS files.
That permission is for datastore (hadoop.tmp.dir).

Here is the HDFS listing:
/usr/local/hadoop/bin/hadoop dfs -ls /
Found 2 items
/user          2008-05-07 17:55   rwxrwxrwx   hadoop   supergroup
/usr           2008-05-07 17:18   rwxr-xr-x   hadoop   supergroup
[EMAIL PROTECTED] .ssh]$ /usr/local/hadoop/bin/hadoop dfs -ls /user
Found 2 items
/user/hadoop   2008-05-08 16:36   rwxr-xr-x   hadoop   supergroup
/user/test     2008-05-07 17:55   rwxrwxrwx   test     supergroup

Thanks,
Senthil

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Friday, May 09, 2008 2:40 PM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop Permission Problem

Hi Senthil,

drwxrwxrwx  4 hadoop   hadoop   4096 May  8 16:31 hadoop-hadoop
drwxrwxrwx  2 test     test     4096 May  9 09:29 hadoop-test

From the output format, the directories above do not seem to be HDFS directories.  Are 
you running map/red jobs over the local file system (e.g. Linux)?

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org" 
Sent: Friday, May 9, 2008 6:36:27 AM
Subject: RE: Hadoop Permission Problem

Hi Nicholas,
That's what I was wondering. Here is the datastore directory permission in the 
master machine.

drwxrwxrwx  5 hadoop   hadoop   4096 May  7 18:02 datastore

This datastore directory is present only on the master, right, not on the slaves?
I couldn't find it there.

After I changed the permission for datastore I restarted dfs and mapred, but it
still complains about the permission.

I even changed all the directories in datastore to 777:

drwxrwxrwx  4 hadoop   hadoop   4096 May  8 16:31 hadoop-hadoop
drwxrwxrwx  2 test     test     4096 May  9 09:29 hadoop-test

Where do I need to change permissions so that UserB can submit jobs using the
jobtracker and tasktracker started by UserA?

Thanks,
Senthil


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 08, 2008 8:32 PM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop Permission Problem

Hi Senthil,

In the error message, it says that the permission for "datastore" is 755.  Are 
you sure that you have changed it to 777?

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org" 
Sent: Thursday, May 8, 2008 11:57:46 AM
Subject: RE: Hadoop Permission Problem

Hi Nicholas,
Thanks it helped.

I gave permissi

Re: Hadoop Permission Problem

2008-05-09 Thread s29752-hadoopuser
Hi Senthil,

Let me explain the error message " Permission denied: user=test, access=WRITE, 
inode="datastore":hadoop:supergroup:rwxr-xr-x".  It says that the current user 
"test" is trying to WRITE to the inode "datastore" with owner hadoop:supergroup 
and permission 755.  So the problem is in the directory "datastore".  Could you 
check it?
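
If it helps, a quick way to look at (and, as the superuser, open up) that inode 
through the API is sketched below; the path is only a guess based on your 
hadoop.tmp.dir setting, so adjust it as needed:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class CheckDatastore {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path p = new Path("/usr/local/hadoop/datastore");
    FileStatus st = fs.getFileStatus(p);
    // print owner, group and permission of the inode named in the error
    System.out.println(st.getPath() + " " + st.getPermission() + " "
        + st.getOwner() + ":" + st.getGroup());
    // run as the "hadoop" superuser, this is the equivalent of chmod 777
    fs.setPermission(p, new FsPermission((short) 0777));
  }
}

The same check can of course be done with hadoop fs -ls and hadoop fs -chmod.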

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org" 
Sent: Friday, May 9, 2008 12:23:52 PM
Subject: RE: Hadoop Permission Problem

Hi Nicholas,
Here is what I tried as user test after I got the error (does the exception come 
from the slave machine?)

Exception in thread "main" org.apache.hadoop.ipc.RemoteException: 
org.apache.hadoop.fs.permission.AccessControlException: Permission denied: 
user=test, access=WRITE, inode="datastore":hadoop:supergroup:rwxr-xr-x

/usr/local/hadoop/bin/hadoop fs -ls
Found 1 items
/user/test/myapps   2674   2008-05-07 17:55   rw-r--r--   test   supergroup

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Friday, May 09, 2008 3:13 PM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop Permission Problem

Hi Senthil,

I cannot see why it does not work.  Could you try again, do a fs -ls right 
after you see the error message?

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org" 
Sent: Friday, May 9, 2008 11:49:49 AM
Subject: RE: Hadoop Permission Problem

Hi Nicholas,
No, I am running map/red jobs over HDFS files.
That permission is for datastore (hadoop.tmp.dir).

Here is the HDFS listing:
/usr/local/hadoop/bin/hadoop dfs -ls /
Found 2 items
/user          2008-05-07 17:55   rwxrwxrwx   hadoop   supergroup
/usr           2008-05-07 17:18   rwxr-xr-x   hadoop   supergroup
[EMAIL PROTECTED] .ssh]$ /usr/local/hadoop/bin/hadoop dfs -ls /user
Found 2 items
/user/hadoop   2008-05-08 16:36   rwxr-xr-x   hadoop   supergroup
/user/test     2008-05-07 17:55   rwxrwxrwx   test     supergroup

Thanks,
Senthil

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Friday, May 09, 2008 2:40 PM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop Permission Problem

Hi Senthil,

drwxrwxrwx  4 hadoop   hadoop   4096 May  8 16:31 hadoop-hadoop
drwxrwxrwx  2 test     test     4096 May  9 09:29 hadoop-test

From the output format, the directories above do not seem to be HDFS directories.  Are 
you running map/red jobs over the local file system (e.g. Linux)?

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org" 
Sent: Friday, May 9, 2008 6:36:27 AM
Subject: RE: Hadoop Permission Problem

Hi Nicholas,
That's what I was wondering. Here is the datastore directory permission in the 
master machine.

drwxrwxrwx  5 hadoop   hadoop   4096 May  7 18:02 datastore

This datastore directory is present only on the master, right, not on the slaves?
I couldn't find it there.

After I changed the permission for datastore I restarted dfs and mapred, but it
still complains about the permission.

I even changed all the directories in datastore to 777:

drwxrwxrwx  4 hadoop   hadoop   4096 May  8 16:31 hadoop-hadoop
drwxrwxrwx  2 test     test     4096 May  9 09:29 hadoop-test

Where do I need to change permissions so that UserB can submit jobs using the
jobtracker and tasktracker started by UserA?

Thanks,
Senthil


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 08, 2008 8:32 PM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop Permission Problem

Hi Senthil,

In the error message, it says that the permission for "datastore" is 755.  Are 
you sure that you have changed it to 777?

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org" 
Sent: Thursday, May 8, 2008 11:57:46 AM
Subject: RE: Hadoop Permission Problem

Hi Nicholas,
Thanks it helped.

I gave permission 777 for /user
So now user "Test" can perform HDFS operations.

And also I gave permission 777 for /usr/local/hadoop/datastore on the master.

When user "Test" tries to submit the MapReduce job, getting this error

Exception in thread "main" org.apache.hadoop.ipc.RemoteException: 
org.apache.hadoop.fs.permission.AccessControlException: Permission denied: 
user=test, access=WRITE, inode="datastore":hadoop:supergroup:rwxr-xr-x

Where else I need to give permission so that user "Test" can submit jobs using 
jobtracker and Datanode started by user "hadoop".

Thanks,
Senthil

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 07, 2008 5:49 PM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop Permission Problem

Hi Senthil,

Since the path "myapps" is relative, copyFromLocal will copy the file to the 
home directory, i.e. /user/

Re: Hadoop Permission Problem

2008-05-09 Thread s29752-hadoopuser
Hi Senthil,

I cannot see why it does not work.  Could you try again, do a fs -ls right 
after you see the error message?

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org" 
Sent: Friday, May 9, 2008 11:49:49 AM
Subject: RE: Hadoop Permission Problem

Hi Nicholas,
No, I am running map/red jobs over HDFS files.
That permission is for datastore (hadoop.tmp.dir).

Here is the HDFS listing:
/usr/local/hadoop/bin/hadoop dfs -ls /
Found 2 items
/user          2008-05-07 17:55   rwxrwxrwx   hadoop   supergroup
/usr           2008-05-07 17:18   rwxr-xr-x   hadoop   supergroup
[EMAIL PROTECTED] .ssh]$ /usr/local/hadoop/bin/hadoop dfs -ls /user
Found 2 items
/user/hadoop   2008-05-08 16:36   rwxr-xr-x   hadoop   supergroup
/user/test     2008-05-07 17:55   rwxrwxrwx   test     supergroup

Thanks,
Senthil

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Friday, May 09, 2008 2:40 PM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop Permission Problem

Hi Senthil,

drwxrwxrwx  4 hadoop   hadoop   4096 May  8 16:31 hadoop-hadoop
drwxrwxrwx  2 test     test     4096 May  9 09:29 hadoop-test

From the output format, the directories above do not seem to be HDFS directories.  Are 
you running map/red jobs over the local file system (e.g. Linux)?

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org" 
Sent: Friday, May 9, 2008 6:36:27 AM
Subject: RE: Hadoop Permission Problem

Hi Nicholas,
That's what I was wondering. Here is the datastore directory permission in the 
master machine.

drwxrwxrwx  5 hadoop   hadoop   4096 May  7 18:02 datastore

This datastore directory is present only on the master, right, not on the slaves?
I couldn't find it there.

After I changed the permission for datastore I restarted dfs and mapred, but it
still complains about the permission.

I even changed all the directories in datastore to 777:

drwxrwxrwx  4 hadoop   hadoop   4096 May  8 16:31 hadoop-hadoop
drwxrwxrwx  2 test     test     4096 May  9 09:29 hadoop-test

Where do I need to change permissions so that UserB can submit jobs using the
jobtracker and tasktracker started by UserA?

Thanks,
Senthil


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 08, 2008 8:32 PM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop Permission Problem

Hi Senthil,

In the error message, it says that the permission for "datastore" is 755.  Are 
you sure that you have changed it to 777?

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org" 
Sent: Thursday, May 8, 2008 11:57:46 AM
Subject: RE: Hadoop Permission Problem

Hi Nicholas,
Thanks it helped.

I gave permission 777 for /user
So now user "Test" can perform HDFS operations.

And also I gave permission 777 for /usr/local/hadoop/datastore on the master.

When user "Test" tries to submit the MapReduce job, getting this error

Exception in thread "main" org.apache.hadoop.ipc.RemoteException: 
org.apache.hadoop.fs.permission.AccessControlException: Permission denied: 
user=test, access=WRITE, inode="datastore":hadoop:supergroup:rwxr-xr-x

Where else I need to give permission so that user "Test" can submit jobs using 
jobtracker and Datanode started by user "hadoop".

Thanks,
Senthil

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 07, 2008 5:49 PM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop Permission Problem

Hi Senthil,

Since the path "myapps" is relative, copyFromLocal will copy the file to the 
home directory, i.e. /user/Test/myapps in your case.  If /user/Test does not 
exist, it will first try to create it.  You got AccessControlException because 
the permission of /user is 755.

Hope this helps.

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Wednesday, May 7, 2008 2:36:22 PM
Subject: Hadoop Permission Problem

Hi,
My datanode and jobtracker are started by user "hadoop".
And user "Test" needs to submit the job. So if the user "Test" copies file to 
HDFS, there is a permission error.
/usr/local/hadoop/bin/hadoop dfs -copyFromLocal /home/Test/somefile.txt myapps
copyFromLocal: org.apache.hadoop.fs.permission.AccessControlException: 
Permission denied: user=Test, access=WRITE, 
inode="user":hadoop:supergroup:rwxr-xr-x
Could you please let me know how other users (other than hadoop) can access 
HDFS and then submit MapReduce jobs. Where to configure or what default 
configuration needs to be changed.

Thanks,
Senthil


Re: Hadoop Permissions Question -> [Fwd: Hbase on hadoop]

2008-05-09 Thread s29752-hadoopuser
Hi Rick,

> the hbase master must be run on the same machine as the hadoop hdfs (what 
> part of it?) if one wants to use the hdfs permissions system or that right 
> now we must run without permissions?
Hdfs and hbase (and all clients) should run under the same administrative 
domain, but they do not have to run on the same machine.

The stack trace is good enough.  HMaster does 
DistributedFileSystem.setSafeMode(...), which requires superuser privilege.

Nicholas



- Original Message 
From: Rick Hangartner <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Friday, May 9, 2008 11:51:55 AM
Subject: Re: Hadoop Permissions Question -> [Fwd: Hbase on hadoop]

Hi Nicholas,

I was the original poster of this question.  Thanks for your  
response.  (And thanks for elevating attention to this Stack).

Am I missing something or is one implication of how hdfs determines  
privileges from the Linux filesystem that the hbase master must be run  
on the same machine as the hadoop hdfs (what part of it?) if one wants  
to use the hdfs permissions system or that right now we must run  
without permissions?

Here's most of the full Java trace for the exception that might be  
helpful in determining why superuser privilege is required to run  
HMaster.  Unfortunately log4j appears to have chopped off the last 6  
entries.  (This is from the hbase log).

Thanks for the help.

2008-05-08 10:13:28,670 ERROR org.apache.hadoop.hbase.HMaster: Can not start master
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
    at org.apache.hadoop.hbase.HMaster.doMain(HMaster.java:3312)
    at org.apache.hadoop.hbase.HMaster.main(HMaster.java:3346)
Caused by: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.fs.permission.AccessControlException: Superuser privilege is required
    at org.apache.hadoop.dfs.FSNamesystem.checkSuperuserPrivilege(FSNamesystem.java:4020)
    at org.apache.hadoop.dfs.FSNamesystem.setSafeMode(FSNamesystem.java:3794)
    at org.apache.hadoop.dfs.NameNode.setSafeMode(NameNode.java:473)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:409)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:901)

    at org.apache.hadoop.ipc.Client.call(Client.java:512)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:198)
    at org.apache.hadoop.dfs.$Proxy0.setSafeMode(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at org.apache.hadoop.dfs.$Proxy0.setSafeMode(Unknown Source)
    at org.apache.hadoop.dfs.DFSClient.setSafeMode(DFSClient.java:486)
    at org.apache.hadoop.dfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:257)
    at org.apache.hadoop.hbase.HMaster.<init>(HMaster.java:893)
    at org.apache.hadoop.hbase.HMaster.<init>(HMaster.java:859)
    ... 6 more

On May 9, 2008, at 11:34 AM, [EMAIL PROTECTED] wrote:

> Hi Stack,
>
>> One question this raises is if the "hbase:hbase" user and group are  
>> being derived from the Linux file system user and group, or if they  
>> are the hdfs user and group?
> HDFS currently does not manage user and group information.  User and  
> group in HDFS are being derived from the underlying OS (Linux in  
> your case) user and group.
>
>> Otherwise, how can we indicate that "hbase" user is in the hdfs  
>> group "supergroup"?
> In Hadoop conf, the property dfs.permissions.supergroup specifies  
> the super-user group and the default value is "supergroup".  
> Administrator should set this property to a dedicated group in the  
> underlying OS for HDFS superuser.  For example, you could create a  
> group "hdfs-superuser" in Linux, set dfs.permissions.supergroup to  
> "hdfs-superuser" and add "hdfs-superuser" to hbase's group list.  
> Then, "hbase" becomes a HDFS superuser.
>
> I don't know why superuser privilege is required to run HMaster.  I  
> might be able to tell if a comp

Re: Hadoop Permission Problem

2008-05-09 Thread s29752-hadoopuser
Hi Senthil,

drwxrwxrwx  4 hadoop   hadoop   4096 May  8 16:31 hadoop-hadoop
drwxrwxrwx  2 test     test     4096 May  9 09:29 hadoop-test

From the output format, the directories above do not seem to be HDFS directories.  Are 
you running map/red jobs over the local file system (e.g. Linux)?

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org" 
Sent: Friday, May 9, 2008 6:36:27 AM
Subject: RE: Hadoop Permission Problem

Hi Nicholas,
That's what I was wondering. Here is the datastore directory permission in the 
master machine.

drwxrwxrwx  5 hadoop   hadoop   4096 May  7 18:02 datastore

This datastore directory is present only on the master, right, not on the slaves?
I couldn't find it there.

After I changed the permission for datastore I restarted dfs and mapred, but it
still complains about the permission.

I even changed all the directories in datastore to 777:

drwxrwxrwx  4 hadoop   hadoop   4096 May  8 16:31 hadoop-hadoop
drwxrwxrwx  2 test     test     4096 May  9 09:29 hadoop-test

Where do I need to change permissions so that UserB can submit jobs using the
jobtracker and tasktracker started by UserA?

Thanks,
Senthil


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 08, 2008 8:32 PM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop Permission Problem

Hi Senthil,

In the error message, it says that the permission for "datastore" is 755.  Are 
you sure that you have changed it to 777?

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org" 
Sent: Thursday, May 8, 2008 11:57:46 AM
Subject: RE: Hadoop Permission Problem

Hi Nicholas,
Thanks it helped.

I gave permission 777 for /user
So now user "Test" can perform HDFS operations.

And also I gave permission 777 for /usr/local/hadoop/datastore on the master.

When user "Test" tries to submit the MapReduce job, getting this error

Exception in thread "main" org.apache.hadoop.ipc.RemoteException: 
org.apache.hadoop.fs.permission.AccessControlException: Permission denied: 
user=test, access=WRITE, inode="datastore":hadoop:supergroup:rwxr-xr-x

Where else I need to give permission so that user "Test" can submit jobs using 
jobtracker and Datanode started by user "hadoop".

Thanks,
Senthil

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 07, 2008 5:49 PM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop Permission Problem

Hi Senthil,

Since the path "myapps" is relative, copyFromLocal will copy the file to the 
home directory, i.e. /user/Test/myapps in your case.  If /user/Test does not 
exist, it will first try to create it.  You got AccessControlException because 
the permission of /user is 755.

Hope this helps.

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Wednesday, May 7, 2008 2:36:22 PM
Subject: Hadoop Permission Problem

Hi,
My datanode and jobtracker are started by user "hadoop".
And user "Test" needs to submit the job. So if the user "Test" copies file to 
HDFS, there is a permission error.
/usr/local/hadoop/bin/hadoop dfs -copyFromLocal /home/Test/somefile.txt myapps
copyFromLocal: org.apache.hadoop.fs.permission.AccessControlException: 
Permission denied: user=Test, access=WRITE, 
inode="user":hadoop:supergroup:rwxr-xr-x
Could you please let me know how other users (other than hadoop) can access 
HDFS and then submit MapReduce jobs. Where to configure or what default 
configuration needs to be changed.

Thanks,
Senthil


Re: Hadoop Permissions Question -> [Fwd: Hbase on hadoop]

2008-05-09 Thread s29752-hadoopuser
Hi Stack,

> One question this raises is if the "hbase:hbase" user and group are being 
> derived from the Linux file system user and group, or if they are the hdfs 
> user and group?
HDFS currently does not manage user and group information.  User and group in 
HDFS are being derived from the underlying OS (Linux in your case) user and 
group.

> Otherwise, how can we indicate that "hbase" user is in the hdfs group 
> "supergroup"? 
In the Hadoop conf, the property dfs.permissions.supergroup specifies the 
super-user group and the default value is "supergroup".  The administrator should 
set this property to a dedicated group in the underlying OS for the HDFS 
superuser.  For example, you could create a group "hdfs-superuser" in Linux, set 
dfs.permissions.supergroup to "hdfs-superuser" and add "hdfs-superuser" to 
hbase's group list.  Then, "hbase" becomes an HDFS superuser.
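
As a rough sketch of how this ends up being evaluated (it mirrors the 
PermissionChecker logic quoted further down; the group name below is just the 
default), you can check whether the user running a client would be treated as an 
HDFS superuser by group membership:

import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UnixUserGroupInformation;

public class CheckSupergroup {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String supergroup = conf.get("dfs.permissions.supergroup", "supergroup");
    // user and groups are taken from the underlying OS, as described above
    UnixUserGroupInformation ugi = UnixUserGroupInformation.login(conf);
    boolean inSupergroup = Arrays.asList(ugi.getGroupNames()).contains(supergroup);
    System.out.println(ugi.getUserName() + " in " + supergroup + "? " + inSupergroup);
  }
}

(The user that started the namenode is also a superuser, independent of any group.)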

I don't know why superuser privilege is required to run HMaster.  I might be 
able to tell if a complete stack trace is given.

Nicholas



- Original Message 
From: stack <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Thursday, May 8, 2008 8:44:42 PM
Subject: Hadoop Permissions Question -> [Fwd: Hbase on hadoop]

Can someone familiar with permissions offer an opinion on the below?
Thanks,
St.Ack
Hi,

We have an issue with hbase on hadoop and file system permissions we  
hope someone already knows the answer to.  Our apologies if we missed  
that this issue has already been addressed on this list.

We are running hbase-0.1.2 on top of hadoop-0.16.3, starting the hbase 
daemon from an "hbase" user account and the hadoop daemon from a "hadoop" 
user account, and have observed this "feature".  That is, we are running 
hbase in a separate "hbase" user account and hadoop in its own "hadoop" 
user account on a single machine.

When we try to start up hbase, we see this error message in the log:

2008-05-06 12:09:02,845 ERROR org.apache.hadoop.hbase.HMaster: Can not start master
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
    at org.apache.hadoop.hbase.HMaster.doMain(HMaster.java:3329)
    at org.apache.hadoop.hbase.HMaster.main(HMaster.java:3363)
Caused by: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.fs.permission.AccessControlException: Superuser privilege is required
 ... (etc)

If we run hbase in the hadoop user account we don't have any problems.

We think we've narrowed the issue down a bit from the debug logs.

The method "FSNameSystem.checkPermission()" method is throwing the  
exception because the "PermissionChecker()" constructor is returning  
that the hbase user is not a superuser or in the same supergroup as  
hadoop.

    private void checkSuperuserPrivilege() throws AccessControlException {
      if (isPermissionEnabled) {
        PermissionChecker pc = new PermissionChecker(
            fsOwner.getUserName(), supergroup);
        if (!pc.isSuper) {
          throw new AccessControlException("Superuser privilege is required");
        }
      }
    }

If we look at the "PermissionChecker()" constructor, we see that it 
compares the hdfs owner name (which should be "hadoop") and the 
hdfs file system owner's group ("supergroup") to the current user and 
groups; the log seems to indicate that the user is "hbase" and the 
groups for user "hbase" only include "hbase":
    PermissionChecker(String fsOwner, String supergroup
        ) throws AccessControlException {
      UserGroupInformation ugi = UserGroupInformation.getCurrentUGI();
      if (LOG.isDebugEnabled()) {
        LOG.debug("ugi=" + ugi);
      }

      if (ugi != null) {
        user = ugi.getUserName();
        groups.addAll(Arrays.asList(ugi.getGroupNames()));
        isSuper = user.equals(fsOwner) || groups.contains(supergroup);
      }
      else {
        throw new AccessControlException("ugi = null");
      }
    }

The current user and group are derived from thread-local information:

    private static final ThreadLocal<UserGroupInformation> currentUGI
      = new ThreadLocal<UserGroupInformation>();

    /** @return the {@link UserGroupInformation} for the current thread */
    public static UserGroupInformation getCurrentUGI() {
      return currentUGI.get();
    }

which we're hoping might be enough to illuminate the problem.

One question this raises is if the "hbase:hbase" user and group are  
being derived from the Linux file system user and group, or if they  
are the hdfs user and group?
Otherwise, how can we indicate that "hbase" user is in the hdfs group  
"supergroup"? Is there a parameter in a hadoop configuration file?  
Apparently setting the groups of the web server to include  
"supergroup" didn't have any effect, although perhaps t

Re: Hadoop Permission Problem

2008-05-08 Thread s29752-hadoopuser
Hi Senthil,

In the error message, it says that the permission for "datastore" is 755.  Are 
you sure that you have changed it to 777?

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "core-user@hadoop.apache.org" 
Sent: Thursday, May 8, 2008 11:57:46 AM
Subject: RE: Hadoop Permission Problem

Hi Nicholas,
Thanks it helped.

I gave permission 777 for /user
So now user "Test" can perform HDFS operations.

And also I gave permission 777 for /usr/local/hadoop/datastore on the master.

When user "Test" tries to submit the MapReduce job, getting this error

Exception in thread "main" org.apache.hadoop.ipc.RemoteException: 
org.apache.hadoop.fs.permission.AccessControlException: Permission denied: 
user=test, access=WRITE, inode="datastore":hadoop:supergroup:rwxr-xr-x

Where else I need to give permission so that user "Test" can submit jobs using 
jobtracker and Datanode started by user "hadoop".

Thanks,
Senthil

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 07, 2008 5:49 PM
To: core-user@hadoop.apache.org
Subject: Re: Hadoop Permission Problem

Hi Senthil,

Since the path "myapps" is relative, copyFromLocal will copy the file to the 
home directory, i.e. /user/Test/myapps in your case.  If /user/Test does not 
exist, it will first try to create it.  You got AccessControlException because 
the permission of /user is 755.

Hope this helps.

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Wednesday, May 7, 2008 2:36:22 PM
Subject: Hadoop Permission Problem

Hi,
My datanode and jobtracker are started by user "hadoop".
And user "Test" needs to submit the job. So if the user "Test" copies file to 
HDFS, there is a permission error.
/usr/local/hadoop/bin/hadoop dfs -copyFromLocal /home/Test/somefile.txt myapps
copyFromLocal: org.apache.hadoop.fs.permission.AccessControlException: 
Permission denied: user=Test, access=WRITE, 
inode="user":hadoop:supergroup:rwxr-xr-x
Could you please let me know how other users (other than hadoop) can access 
HDFS and then submit MapReduce jobs. Where to configure or what default 
configuration needs to be changed.

Thanks,
Senthil


Re: Hadoop Permission Problem

2008-05-07 Thread s29752-hadoopuser
Hi Senthil,

Since the path "myapps" is relative, copyFromLocal will copy the file to the 
home directory, i.e. /user/Test/myapps in your case.  If /user/Test does not 
exist, it will first try to create it.  You got AccessControlException because 
the permission of /user is 755.
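
As a small illustration of that path resolution (the namenode URI is only an 
example, and the user is whatever the client runs as), the relative path is 
resolved against the HDFS home directory:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RelativePaths {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // For user "Test" this prints something like hdfs://namenode:9000/user/Test/myapps
    System.out.println(new Path("myapps").makeQualified(fs));
    // copyFromLocal does essentially this, creating /user/Test first if it does not exist:
    fs.copyFromLocalFile(new Path("/home/Test/somefile.txt"), new Path("myapps"));
  }
}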

Hope this helps.

Nicholas



- Original Message 
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Sent: Wednesday, May 7, 2008 2:36:22 PM
Subject: Hadoop Permission Problem

Hi,
My datanode and jobtracker are started by user "hadoop".
And user "Test" needs to submit the job. So if the user "Test" copies file to 
HDFS, there is a permission error.
/usr/local/hadoop/bin/hadoop dfs -copyFromLocal /home/Test/somefile.txt myapps
copyFromLocal: org.apache.hadoop.fs.permission.AccessControlException: 
Permission denied: user=Test, access=WRITE, 
inode="user":hadoop:supergroup:rwxr-xr-x
Could you please let me know how other users (other than hadoop) can access 
HDFS and then submit MapReduce jobs. Where to configure or what default 
configuration needs to be changed.

Thanks,
Senthil

Re: distcp fails when copying from s3 to hdfs

2008-04-04 Thread s29752-hadoopuser
Your distcp command looks correct.  distcp may have created some log files 
(e.g. inside /_distcp_logs_5vzva5 from your previous email).  Could you check 
the logs and see whether there are error messages?

If you could send me the distcp output and the logs, I may be able to find out 
the problem.  (remember to remove the id:secret   :)

Nicholas


- Original Message 
From: Siddhartha Reddy <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Friday, April 4, 2008 12:23:17 PM
Subject: Re: distcp fails when copying from s3 to hdfs

I am sorry, that was a mistype in my mail. The second command was (please
note the / at the end):

bin/hadoop fs -fs s3://id:[EMAIL PROTECTED] -ls /


I guess you are right, Nicholas. The file
s3://id:[EMAIL PROTECTED]/file.txt indeed does not seem to be there.
But the earlier distcp command to copy the file to S3 finished without errors.
Once again, the command I am using to copy the file to S3 is:

bin/hadoop distcp file.txt s3://id:[EMAIL PROTECTED]/file.txt

Am I doing anything wrong here?

Thanks,
Siddhartha


On Fri, Apr 4, 2008 at 11:38 PM, <[EMAIL PROTECTED]> wrote:

> >To check that the file actually exists on S3, I tried the following
> commands:
> >
> >bin/hadoop fs -fs s3://id:[EMAIL PROTECTED] -ls
> >bin/hadoop fs -fs s3://id:[EMAIL PROTECTED] -ls
> >
> >The first returned nothing, while the second returned the following:
> >
> >Found 1 items
> >/_distcp_logs_5vzva5   1969-12-31 19:00   rwxrwxrwx
>
>
> Are the first and the second commands the same?  (Why do they return
> different results?)  It seems that distcp is right:
> s3://id:[EMAIL PROTECTED]/file.txt indeed does not exist, judging from the
> output of your second command.
>
> Nicholas
>
>


-- 
http://sids.in
"If you are not having fun, you are not doing it right."





Re: distcp fails when copying from s3 to hdfs

2008-04-04 Thread s29752-hadoopuser
>To check that the file actually exists on S3, I tried the following commands:
>
>bin/hadoop fs -fs s3://id:[EMAIL PROTECTED] -ls
>bin/hadoop fs -fs s3://id:[EMAIL PROTECTED] -ls
>
>The first returned nothing, while the second returned the following:
>
>Found 1 items
>/_distcp_logs_5vzva5   1969-12-31 19:00   rwxrwxrwx


Are the first and the second commands the same?  (Why do they return different 
results?)  It seems that distcp is right: s3://id:[EMAIL PROTECTED]/file.txt 
indeed does not exist, judging from the output of your second command.

Nicholas



Re: distcp fails :Input source not found

2008-04-03 Thread s29752-hadoopuser
distcp supports multiple sources (like Unix cp) and, if the specified source is 
a directory, it copies the entire directory.  So, you could either do
  distcp src1 src2 ... src100 dst
or
  first copy all srcs to srcdir, and then
  distcp srcdir dstdir

I have no experience with S3 and EC2.  Not sure whether it will work.

Nicholas


- Original Message 
From: Prasan Ary <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Thursday, April 3, 2008 10:06:35 AM
Subject: Re: distcp fails :Input source not found

I found it was a slight oversight on my part. I was copying the files into S3 
using Firefox EC2 UI, and then trying to access those files on S3 using hadoop. 
 The S3 filesystem provided by hadoop doesn't work with standard files. When I 
used hadoop to upload the files into S3 instead of Firefox EC2 UI, things 
sorted out.
   
  But then I had a hard time copying a whole folder from S3 onto EC2 cluster. 
The following article suggests that "distcp" can be used to copy folder from S3 
bucket onto EC2 hdfs :
  http://developer.amazonwebservices.com/connect/entry.jspa?externalID=873
   
  However, when I try it on 0.15.3, it doesn't allow a folder copy. I have 100+ 
files in my S3 bucket, and I had to run "distcp" on each one of them to get 
them on HDFS on EC2 . Not a nice experience!
  Can anyone suggest more elegant way that we can transfer 100s of files from 
S3 to HDFS on EC2 without having to iterate through each file?
   
   
  
[EMAIL PROTECTED] wrote:
  It might be a bug. Could you try the following?
bin/hadoop fs -ls s3://ID:[EMAIL PROTECTED]/InputFileFormat.xml

Nicholas


- Original Message 
From: Prasan Ary 
To: core-user@hadoop.apache.org
Sent: Wednesday, April 2, 2008 7:41:50 AM
Subject: Re: distcp fails :Input source not found

Anybody ? Any thoughts why this might be happening?

Here is what is happening directly from the ec2 screen. The ID and
Secret Key are the only things changed.

I'm running hadoop 15.3 from the public ami. I launched a 2 machine
cluster using the ec2 scripts in the src/contrib/ec2/bin . . .

The file I try and copy is 9KB (I noticed previous discussion on
empty files and files that are > 10MB)

> First I make sure that we can copy the file from s3
[EMAIL PROTECTED] hadoop-0.15.3]# bin/hadoop fs
-copyToLocal s3://ID:[EMAIL PROTECTED]/InputFileFormat.xml
/usr/InputFileFormat.xml

> Now I see that the file is copied to the ec2 master (where I'm
logged in)
[EMAIL PROTECTED] hadoop-0.15.3]# dir /usr/Input*
/usr/InputFileFormat.xml

> Next I make sure I can access the HDFS and that the input
directory is there
[EMAIL PROTECTED] hadoop-0.15.3]# bin/hadoop fs -ls /
Found 2 items
/input   2008-04-01 15:45
/mnt   2008-04-01 15:42
[EMAIL PROTECTED] hadoop-0.15.3]# bin/hadoop fs -ls
/input/
Found 0 items

> I make sure hadoop is running just fine by running an example
[EMAIL PROTECTED] hadoop-0.15.3]# bin/hadoop jar
hadoop-0.15.3-examples.jar pi 10 1000
Number of Maps = 10 Samples per Map = 1000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
08/04/01 17:38:14 INFO mapred.FileInputFormat: Total input paths to
process : 10
08/04/01 17:38:14 INFO mapred.JobClient: Running job:
job_200804011542_0001
08/04/01 17:38:15 INFO mapred.JobClient: map 0% reduce 0%
08/04/01 17:38:22 INFO mapred.JobClient: map 20% reduce 0%
08/04/01 17:38:24 INFO mapred.JobClient: map 30% reduce 0%
08/04/01 17:38:25 INFO mapred.JobClient: map 40% reduce 0%
08/04/01 17:38:27 INFO mapred.JobClient: map 50% reduce 0%
08/04/01 17:38:28 INFO mapred.JobClient: map 60% reduce 0%
08/04/01 17:38:31 INFO mapred.JobClient: map 80% reduce 0%
08/04/01 17:38:33 INFO mapred.JobClient: map 90% reduce 0%
08/04/01 17:38:34 INFO mapred.JobClient: map 100% reduce 0%
08/04/01 17:38:43 INFO mapred.JobClient: map 100% reduce 20%
08/04/01 17:38:44 INFO mapred.JobClient: map 100% reduce 100%
08/04/01 17:38:45 INFO mapred.JobClient: Job complete:
job_200804011542_0001
08/04/01 17:38:45 INFO mapred.JobClient: Counters: 9
08/04/01 17:38:45 INFO mapred.JobClient: Job Counters 
08/04/01 17:38:45 INFO mapred.JobClient: Launched map tasks=10
08/04/01 17:38:45 INFO mapred.JobClient: Launched reduce tasks=1
08/04/01 17:38:45 INFO mapred.JobClient: Data-local map tasks=10
08/04/01 17:38:45 INFO mapred.JobClient: Map-Reduce Framework
08/04/01 17:38:45 INFO mapred.JobClient: Map input records=10
08/04/01 17:38:45 INFO mapred.JobClient: Map output records=20
08/04/01 17:38:45 INFO mapred.JobClient: Map input bytes=240
08/04/01 17:38:45 INFO mapred.JobClient: Map output bytes=320
08/04/01 17:38:45 INFO mapred.JobClient: Reduce input groups=2
08/04/01 17:38:45 INFO mapred.JobClient: Reduce input records=20
Job Finished in 31.028 seconds
Estimated value of PI is 3.1556

> Finally, I try and copy the file over
[EMAIL PROTECTED

Re: distcp fails :Input source not found

2008-04-02 Thread s29752-hadoopuser
It might be a bug.  Could you try the following?
bin/hadoop fs -ls s3://ID:[EMAIL PROTECTED]/InputFileFormat.xml

Nicholas


- Original Message 
From: Prasan Ary <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Wednesday, April 2, 2008 7:41:50 AM
Subject: Re: distcp fails :Input source not found

Anybody ? Any thoughts why this might be happening?
   
  Here is what is happening directly from the ec2 screen. The ID and
 Secret Key are the only things changed.
  
  I'm running hadoop 15.3 from the public ami. I launched a 2 machine
 cluster using the ec2 scripts in  the src/contrib/ec2/bin . . .

The file I try and copy is 9KB (I noticed previous discussion on
 empty files and files that are > 10MB)
   
  > First I make sure that we can copy the file from s3
  [EMAIL PROTECTED] hadoop-0.15.3]# bin/hadoop fs
 -copyToLocal s3://ID:[EMAIL PROTECTED]/InputFileFormat.xml
 /usr/InputFileFormat.xml
 
   > Now I see that the file is copied to the ec2 master (where I'm
 logged in)
  [EMAIL PROTECTED] hadoop-0.15.3]# dir /usr/Input*
  /usr/InputFileFormat.xml
   
  > Next I make sure I can access the HDFS and that the input
 directory is there
  [EMAIL PROTECTED] hadoop-0.15.3]# bin/hadoop fs -ls /
  Found 2 items
  /input  2008-04-01 15:45
  /mnt  2008-04-01 15:42
  [EMAIL PROTECTED] hadoop-0.15.3]# bin/hadoop fs -ls
 /input/
  Found 0 items
   
  > I make sure hadoop is running just fine by running an example
  [EMAIL PROTECTED] hadoop-0.15.3]# bin/hadoop jar
 hadoop-0.15.3-examples.jar pi 10 1000
  Number of Maps = 10 Samples per Map = 1000
  Wrote input for Map #0
  Wrote input for Map #1
  Wrote input for Map #2
  Wrote input for Map #3
  Wrote input for Map #4
  Wrote input for Map #5
  Wrote input for Map #6
  Wrote input for Map #7
  Wrote input for Map #8
  Wrote input for Map #9
  Starting Job
  08/04/01 17:38:14 INFO mapred.FileInputFormat: Total input paths to
 process : 10
  08/04/01 17:38:14 INFO mapred.JobClient: Running job:
 job_200804011542_0001
  08/04/01 17:38:15 INFO mapred.JobClient: map 0% reduce 0%
  08/04/01 17:38:22 INFO mapred.JobClient: map 20% reduce 0%
  08/04/01 17:38:24 INFO mapred.JobClient: map 30% reduce 0%
  08/04/01 17:38:25 INFO mapred.JobClient: map 40% reduce 0%
  08/04/01 17:38:27 INFO mapred.JobClient: map 50% reduce 0%
  08/04/01 17:38:28 INFO mapred.JobClient: map 60% reduce 0%
  08/04/01 17:38:31 INFO mapred.JobClient: map 80% reduce 0%
  08/04/01 17:38:33 INFO mapred.JobClient: map 90% reduce 0%
  08/04/01 17:38:34 INFO mapred.JobClient: map 100% reduce 0%
  08/04/01 17:38:43 INFO mapred.JobClient: map 100% reduce 20%
  08/04/01 17:38:44 INFO mapred.JobClient: map 100% reduce 100%
  08/04/01 17:38:45 INFO mapred.JobClient: Job complete:
 job_200804011542_0001
  08/04/01 17:38:45 INFO mapred.JobClient: Counters: 9
  08/04/01 17:38:45 INFO mapred.JobClient: Job Counters 
  08/04/01 17:38:45 INFO mapred.JobClient: Launched map tasks=10
  08/04/01 17:38:45 INFO mapred.JobClient: Launched reduce tasks=1
  08/04/01 17:38:45 INFO mapred.JobClient: Data-local map tasks=10
  08/04/01 17:38:45 INFO mapred.JobClient: Map-Reduce Framework
  08/04/01 17:38:45 INFO mapred.JobClient: Map input records=10
  08/04/01 17:38:45 INFO mapred.JobClient: Map output records=20
  08/04/01 17:38:45 INFO mapred.JobClient: Map input bytes=240
  08/04/01 17:38:45 INFO mapred.JobClient: Map output bytes=320
  08/04/01 17:38:45 INFO mapred.JobClient: Reduce input groups=2
  08/04/01 17:38:45 INFO mapred.JobClient: Reduce input records=20
  Job Finished in 31.028 seconds
  Estimated value of PI is 3.1556
   
  > Finally, I try and copy the file over
  [EMAIL PROTECTED] hadoop-0.15.3]# bin/hadoop distcp
 s3://ID:[EMAIL PROTECTED]/InputFileFormat.xml
 /input/InputFileFormat.xml
  With failures, global counters are inaccurate; consider running with
 -i
  Copy failed: org.apache.hadoop.mapred.InvalidInputException: Input
 source s3://ID:[EMAIL PROTECTED]/InputFileFormat.xml does not
 exist.
  at org.apache.hadoop.util.CopyFiles.copy(CopyFiles.java:470)
  at org.apache.hadoop.util.CopyFiles.run(CopyFiles.java:550)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
  at org.apache.hadoop.util.CopyFiles.main(CopyFiles.java:563)


   




Re: distcp fails :Input source not found

2008-04-01 Thread s29752-hadoopuser
> That was a typo in my email. I do have s3:// in my command when it fails.
  

Not sure what's wrong.  Your command looks right to me.  Would you mind showing 
me the exact error message you see?

Nicholas



Re: distcp fails :Input source not found

2008-04-01 Thread s29752-hadoopuser
   > bin/hadoop distcp s3//:@/fileone.txt /somefolder_on_hdfs/fileone.txt : Fails - Input source doesn't exist.
Should "s3//..." be "s3://..."?

Nicholas





Re: [some bugs] Re: file permission problem

2008-03-17 Thread s29752-hadoopuser
Hi Stefan,

> any magic we can do with hadoop.dfs.umask?
>
dfs.umask is similar to Unix umask.

> Or is there any other off switch for the file security?
>
If dfs.permissions is set to false, then the security will be turned off.  

For the two questions above, see 
http://hadoop.apache.org/core/docs/r0.16.1/hdfs_permissions_guide.html for more 
details

> I definitely can reproduce the problem Johannes describes ...
>
I guess you are using one of the nightly builds which have the bug.  Please try 
the 0.16.1 release or the current trunk.

> Beside of that I had some interesting observations.
> If I have permissions to write to a folder A I can delete folder A and 
> file B that is inside of folder A even if I do have no permissions for B.
>
This is also true for POSIX/Unix, on which the Hadoop permission model is based.

> Also I noticed following in my dfs
> [EMAIL PROTECTED] hadoop]$ bin/hadoop fs -ls /user/joa23/myApp-1205474968598
> Found 1 items
> /user/joa23/myApp-1205474968598/VOICE_CALL   2008-03-13 16:00   rwxr-xr-x   hadoop   supergroup
> [EMAIL PROTECTED] hadoop]$ bin/hadoop fs -ls /user/joa23/myApp-1205474968598/VOICE_CALL
> Found 1 items
> /user/joa23/myApp-1205474968598/VOICE_CALL/part-027311   2008-03-13 16:00   rw-r--r--   joa23   supergroup
>
> Do I miss something or was I able to write as user joa23 into a 
> folder owned by hadoop where I should have no permissions. :-O.
> Should I open some jira issues?
>
Suppose joa23 is not a superuser.  Then, no.

The output above only shows that a file owned by joa23 exists in a directory owned 
by hadoop.  This can definitely be done by a sequence of commands with chmod/chown.

Suppose joa23 is not a superuser.  If joa23 can create a file, say by "hadoop 
fs -put ...", under hadoop's directory with rwxr-xr-x, then it is a bug.  But I 
don't think we can do this.

Hope this helps.

Nicholas






Re: [some bugs] Re: file permission problem

2008-03-17 Thread s29752-hadoopuser
Hi,

Let me clarify which versions have this problem.

0.16.0 release, 0.16.1 release, current trunk: no problem

Nightly builds between 0.16.0 and 0.16.1 before HADOOP-2391 or after HADOOP-2915: no problem

Nightly builds between 0.16.0 and 0.16.1 after HADOOP-2391 and before HADOOP-2915: bug exists

Similarly, code checked out from trunk before HADOOP-2391 or after HADOOP-2915: no problem

Code checked out from trunk after HADOOP-2391 and before HADOOP-2915: bug exists

Sorry for the confusion.

Nicholas


- Original Message 
From: Stefan Groschupf <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Saturday, March 15, 2008 8:02:07 PM
Subject: Re: [some bugs] Re: file permission problem

Great - it is even already fixed in 0.16.1!
Thanks for the hint!
Stefan

On Mar 14, 2008, at 2:49 PM, Andy Li wrote:

> I think this is the same problem related to this mail thread.
>
> http://www.mail-archive.com/[EMAIL PROTECTED]/msg02759.html
>
> A JIRA has been filed, please see HADOOP-2915.




Re: file permission problem

2008-03-13 Thread s29752-hadoopuser
Hi Johannes,

> i'm using the 0.16.0 distribution.
I assume you mean the 0.16.0 release 
(http://hadoop.apache.org/core/releases.html) without any additional patch.

I have just tried it but cannot reproduce the problem you described.  I did the 
following:
1) start a cluster with "tsz"
2) run a job with "nicholas"

The output directory and files are owned by "nicholas".  Am I doing the same 
thing you did?  Could you try again?

Nicholas


> - Original Message 
> From: Johannes Zillmann <[EMAIL PROTECTED]>
> To: core-user@hadoop.apache.org
> Sent: Wednesday, March 12, 2008 5:47:27 PM
> Subject: file permission problem
>
> Hi,
>
> i have a question regarding the file permissions.
> I have a kind of workflow where i submit a job from my laptop to a 
> remote hadoop cluster.
> After the job finished i do some file operations on the generated output.
> The "cluster-user" is different to the "laptop-user". As output i 
> specify a directory inside the users home. This output directory, 
> created through the map-reduce job has "cluster-user" permissions, so 
> this does not allow me to move or delete the output folder with my 
> "laptop-user".
>
> So it looks as follow:
> /user/jz/         rwxrwxrwx   jz       supergroup
> /user/jz/output   rwxr-xr-x   hadoop   supergroup
>
> I tried different things to achieve what i want (moving/deleting the 
> output folder):
> - jobConf.setUser("hadoop") on the client side
> - System.setProperty("user.name","hadoop") before jobConf instantiation 
> on the client side
> - add user.name node in the hadoop-site.xml on the client side
> - setPermission(777) on the home folder on the client side (does not work 
> recursively)
> - setPermission(777) on the output folder on the client side (permission 
> denied)
> - create the output folder before running the job (Output directory 
> already exists exception)
>
> None of the things i tried worked. Is there a way to achieve what i want ?
> Any ideas appreciated!
>
> cheers
> Johannes
>
>
>   


-- 
~~~ 
101tec GmbH

Halle (Saale), Saxony-Anhalt, Germany
http://www.101tec.com






Re: file permission problem

2008-03-12 Thread s29752-hadoopuser
Hi Johannes,

Which version of hadoop are you using?  There is a known bug in some nightly 
builds.

Nicholas


- Original Message 
From: Johannes Zillmann <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Wednesday, March 12, 2008 5:47:27 PM
Subject: file permission problem

Hi,

i have a question regarding the file permissions.
I have a kind of workflow where i submit a job from my laptop to a 
remote hadoop cluster.
After the job finished i do some file operations on the generated output.
The "cluster-user" is different to the "laptop-user". As output i 
specify a directory inside the users home. This output directory, 
created through the map-reduce job has "cluster-user" permissions, so 
this does not allow me to move or delete the output folder with my 
"laptop-user".

So it looks as follow:
/user/jz/         rwxrwxrwx   jz       supergroup
/user/jz/output   rwxr-xr-x   hadoop   supergroup

I tried different things to achieve what i want (moving/deleting the 
output folder):
- jobConf.setUser("hadoop") on the client side
- System.setProperty("user.name","hadoop") before jobConf instantiation 
on the client side
- add user.name node in the hadoop-site.xml on the client side
- setPermission(777) on the home folder on the client side (does not work 
recursively)
- setPermission(777) on the output folder on the client side (permission 
denied)
- create the output folder before running the job (Output directory 
already exists exception)

None of the things i tried worked. Is there a way to achieve what i want ?
Any ideas appreciated!

cheers
Johannes


-- 
~~~ 
101tec GmbH

Halle (Saale), Saxony-Anhalt, Germany
http://www.101tec.com