Re: datanode not being started

2009-02-16 Thread Rasit OZDAS
Sandy, I have no idea about your issue :(

Zander,
Your problem is probably about this JIRA issue:
http://issues.apache.org/jira/browse/HADOOP-1212

Two workarounds are explained here:
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)#java.io.IOException:_Incompatible_namespaceIDs
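
In short, the two workarounds there boil down to something like this
(a rough sketch, untested, assuming the dfs.data.dir shown in the log below):

bin/stop-all.sh     # from the master, stop everything first

# Workaround 1: wipe the bad datanode's storage (the blocks on that node are lost)
rm -rf /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data/*

# Workaround 2: keep the blocks and only fix the recorded ID. Edit
# /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data/current/VERSION on the
# datanode and set namespaceID to the namenode's value (1050914495 in the log).

bin/start-all.sh    # from the master, start again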

I haven't tried it, hope it helps.
Rasit

2009/2/17 zander1013 :
>
> hi,
>
> i am not seeing the DataNode run either. but i am seeing an extra process
> TaskTracker run.
>
> here is what happens when i start the cluster, run jps, and stop the cluster...
>
> had...@node0:/usr/local/hadoop$ bin/start-all.sh
> starting namenode, logging to
> /usr/local/hadoop/bin/../logs/hadoop-hadoop-namenode-node0.out
> node0.local: starting datanode, logging to
> /usr/local/hadoop/bin/../logs/hadoop-hadoop-datanode-node0.out
> node1.local: starting datanode, logging to
> /usr/local/hadoop/bin/../logs/hadoop-hadoop-datanode-node1.out
> node0.local: starting secondarynamenode, logging to
> /usr/local/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-node0.out
> starting jobtracker, logging to
> /usr/local/hadoop/bin/../logs/hadoop-hadoop-jobtracker-node0.out
> node0.local: starting tasktracker, logging to
> /usr/local/hadoop/bin/../logs/hadoop-hadoop-tasktracker-node0.out
> node1.local: starting tasktracker, logging to
> /usr/local/hadoop/bin/../logs/hadoop-hadoop-tasktracker-node1.out
> had...@node0:/usr/local/hadoop$ jps
> 13353 TaskTracker
> 13126 SecondaryNameNode
> 12846 NameNode
> 13455 Jps
> 13232 JobTracker
> had...@node0:/usr/local/hadoop$ bin/stop-all.sh
> stopping jobtracker
> node0.local: stopping tasktracker
> node1.local: stopping tasktracker
> stopping namenode
> node0.local: no datanode to stop
> node1.local: no datanode to stop
> node0.local: stopping secondarynamenode
> had...@node0:/usr/local/hadoop$
>
> here is the tail of the log file for the session above...
> /
> 2009-02-16 19:35:13,999 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> /
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG:   host = node1/127.0.1.1
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.19.0
> STARTUP_MSG:   build =
> https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r 713890;
> compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
> /
> 2009-02-16 19:35:18,999 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException:
> Incompatible namespaceIDs in
> /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data: namenode namespaceID =
> 1050914495; datanode namespaceID = 722953254
>at
> org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:233)
>at
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:148)
>at
> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:287)
>at
> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:205)
>at
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1199)
>at
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1154)
>at
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1162)
>at
> org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1284)
>
> 2009-02-16 19:35:19,000 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
> /
> SHUTDOWN_MSG: Shutting down DataNode at node1/127.0.1.1
> /
>
> i have not seen DataNode run yet. i have only started and stopped the
> cluster a couple of times.
>
> i tried to reformat datanode and namenode with bin/hadoop datanode -format
> and bin/hadoop namenode -format from /usr/local/hadoop dir.
>
> please advise
>
> zander
>
>
>
> Mithila Nagendra wrote:
>>
>> Hey Sandy
>> I had a similar problem with Hadoop. All I did was I stopped all the
>> daemons
>> using stop-all.sh. Then formatted the namenode again using hadoop namenode
>> -format. After this I went on to restarting everything by using
>> start-all.sh
>>
> > I hope you don't have much data on the datanode; reformatting it would
> > erase everything.
>>
>> Hope this helps!
>> Mithila
>>
>>
>>
>> On Sat, Feb 14, 2009 at 2:39 AM, james warren  wrote:
>>
>>> Sandy -
>>>
>>> I suggest you take a look into your NameNode and DataNode logs.  From the
>>> information posted, these likely would be at
>>>
>>>
>>> /Users/hadoop/hadoop-0.18.2/bin/../logs/hadoop-hadoop-namenode-loteria.cs.tamu.edu.log
>>>
>>> /Users/hadoop/hadoop-0.18.2/bin/../logs/hadoop-hadoop-jobtracker-loteria.cs.tamu.edu.log
>>>
>>> If the cause isn't obvious from what you see there, could you please post
>>> the last few lines from each log?
>>>
>>> -

Re: HDFS architecture based on GFS?

2009-02-16 Thread Matei Zaharia
As far as I know, datanodes just know the block ID, and the namenode knows
which file this belongs to.
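
(You can see that mapping from the namenode's side with fsck, e.g.:

bin/hadoop fsck /user/hadoop/somefile -files -blocks -locations

which lists each block of the file and the datanodes holding its replicas;
the path here is just an example.)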

On Mon, Feb 16, 2009 at 4:54 PM, Amandeep Khurana  wrote:

> Ok. Thanks..
>
> Another question now. Do the datanodes have any way of linking a particular
> block of data to a global file identifier?
>
> Amandeep
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Sun, Feb 15, 2009 at 9:37 PM, Matei Zaharia  wrote:
>
> > In general, yeah, the scripts can access any resource they want (within
> the
> > permissions of the user that the task runs as). It's also possible to
> > access
> > HDFS from scripts because HDFS provides a FUSE interface that can make it
> > look like a regular file system on the machine. (The FUSE module in turn
> > talks to the namenode as a regular HDFS client.)
> >
> > On Sun, Feb 15, 2009 at 8:43 PM, Amandeep Khurana 
> > wrote:
> >
> > > I dont know much about Hadoop streaming and have a quick question here.
> > >
> > > The snippets of code/programs that you attach into the map reduce job
> > might
> > > want to access outside resources (like you mentioned). Now these might
> > not
> > > need to go to the namenode right? For example a python script. How
> would
> > it
> > > access the data? Would it ask the parent java process (in the
> > tasktracker)
> > > to get the data or would it go and do stuff on its own?
> > >
> > >
> > > Amandeep Khurana
> > > Computer Science Graduate Student
> > > University of California, Santa Cruz
> > >
> > >
> > > On Sun, Feb 15, 2009 at 8:23 PM, Matei Zaharia 
> > wrote:
> > >
> > > > Nope, typically the JobTracker just starts the process, and the
> > > tasktracker
> > > > talks directly to the namenode to get a pointer to the datanode, and
> > then
> > > > directly to the datanode.
> > > >
> > > > On Sun, Feb 15, 2009 at 8:07 PM, Amandeep Khurana 
> > > > wrote:
> > > >
> > > > > Alright.. Got it.
> > > > >
> > > > > Now, do the task trackers talk to the namenode and the data node
> > > directly
> > > > > or
> > > > > do they go through the job tracker for it? So, if my code is such
> > that
> > > I
> > > > > need to access more files from the hdfs, would the job tracker get
> > > > involved
> > > > > or not?
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > Amandeep Khurana
> > > > > Computer Science Graduate Student
> > > > > University of California, Santa Cruz
> > > > >
> > > > >
> > > > > On Sun, Feb 15, 2009 at 7:20 PM, Matei Zaharia  >
> > > > wrote:
> > > > >
> > > > > > Normally, HDFS files are accessed through the namenode. If there
> > was
> > > a
> > > > > > malicious process though, then I imagine it could talk to a
> > datanode
> > > > > > directly and request a specific block.
> > > > > >
> > > > > > On Sun, Feb 15, 2009 at 7:15 PM, Amandeep Khurana <
> > ama...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Ok. Got it.
> > > > > > >
> > > > > > > Now, when my job needs to access another file, does it go to
> the
> > > > > Namenode
> > > > > > > to
> > > > > > > get the block ids? How does the java process know where the
> files
> > > are
> > > > > and
> > > > > > > how to access them?
> > > > > > >
> > > > > > >
> > > > > > > Amandeep Khurana
> > > > > > > Computer Science Graduate Student
> > > > > > > University of California, Santa Cruz
> > > > > > >
> > > > > > >
> > > > > > > On Sun, Feb 15, 2009 at 7:05 PM, Matei Zaharia <
> > ma...@cloudera.com
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > I mentioned this case because even jobs written in Java can
> use
> > > the
> > > > > > HDFS
> > > > > > > > API
> > > > > > > > to talk to the NameNode and access the filesystem. People
> often
> > > do
> > > > > this
> > > > > > > > because their job needs to read a config file, some small
> data
> > > > table,
> > > > > > etc
> > > > > > > > and use this information in its map or reduce functions. In
> > this
> > > > > case,
> > > > > > > you
> > > > > > > > open the second file separately in your mapper's init
> function
> > > and
> > > > > read
> > > > > > > > whatever you need from it. In general I wanted to point out
> > that
> > > > you
> > > > > > > can't
> > > > > > > > know which files a job will access unless you look at its
> > source
> > > > code
> > > > > > or
> > > > > > > > monitor the calls it makes; the input file(s) you provide in
> > the
> > > > job
> > > > > > > > description are a hint to the MapReduce framework to place
> your
> > > job
> > > > > on
> > > > > > > > certain nodes, but it's reasonable for the job to access
> other
> > > > files
> > > > > as
> > > > > > > > well.
> > > > > > > >
> > > > > > > > On Sun, Feb 15, 2009 at 6:14 PM, Amandeep Khurana <
> > > > ama...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Another question that I have here - When the jobs run
> > arbitrary
> > > > > code
> > > > > > > and
> > > > > > > > > access data from the HDFS, do they go to the namenode to
> get
> > > the
> > > > > > 

Re: Can never restart HDFS after a day or two

2009-02-16 Thread Amandeep Khurana
Where are your namenode and datanode storing the data? By default, it goes
into the /tmp directory. You might want to move that out of there.
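
Roughly, the idea is to point the storage directories somewhere permanent and
then reformat once, for example (the paths here are only illustrative):

sudo mkdir -p /usr/local/hadoop-datastore/hadoop-hadoop
sudo chown hadoop:hadoop /usr/local/hadoop-datastore/hadoop-hadoop

# then set the base directory in conf/hadoop-site.xml (inside <configuration>):
#   <property>
#     <name>hadoop.tmp.dir</name>
#     <value>/usr/local/hadoop-datastore/hadoop-${user.name}</value>
#   </property>
# (dfs.name.dir and dfs.data.dir default to subdirectories of hadoop.tmp.dir)

bin/stop-all.sh
bin/hadoop namenode -format
bin/start-all.sh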

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Mon, Feb 16, 2009 at 8:11 PM, Mark Kerzner  wrote:

> Hi all,
>
> I consistently have this problem that I can run HDFS and restart it after
> short breaks of a few hours, but the next day I always have to reformat
> HDFS
> before the daemons begin to work.
>
> Is that normal? Maybe this is treated as temporary data, and the results
> need to be copied out of HDFS and not stored for long periods of time? I
> verified that the files in /tmp related to hadoop are seemingly intact.
>
> Thank you,
> Mark
>


Can never restart HDFS after a day or two

2009-02-16 Thread Mark Kerzner
Hi all,

I consistently have this problem that I can run HDFS and restart it after
short breaks of a few hours, but the next day I always have to reformat HDFS
before the daemons begin to work.

Is that normal? Maybe this is treated as temporary data, and the results
need to be copied out of HDFS and not stored for long periods of time? I
verified that the files in /tmp related to hadoop are seemingly intact.

Thank you,
Mark


Re: datanode not being started

2009-02-16 Thread zander1013

hi,

i am not seeing the DataNode run either. but i am seeing an extra process
TaskTracker run.

here is what happens when i start the cluster, run jps, and stop the cluster...

had...@node0:/usr/local/hadoop$ bin/start-all.sh
starting namenode, logging to
/usr/local/hadoop/bin/../logs/hadoop-hadoop-namenode-node0.out
node0.local: starting datanode, logging to
/usr/local/hadoop/bin/../logs/hadoop-hadoop-datanode-node0.out
node1.local: starting datanode, logging to
/usr/local/hadoop/bin/../logs/hadoop-hadoop-datanode-node1.out
node0.local: starting secondarynamenode, logging to
/usr/local/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-node0.out
starting jobtracker, logging to
/usr/local/hadoop/bin/../logs/hadoop-hadoop-jobtracker-node0.out
node0.local: starting tasktracker, logging to
/usr/local/hadoop/bin/../logs/hadoop-hadoop-tasktracker-node0.out
node1.local: starting tasktracker, logging to
/usr/local/hadoop/bin/../logs/hadoop-hadoop-tasktracker-node1.out
had...@node0:/usr/local/hadoop$ jps
13353 TaskTracker
13126 SecondaryNameNode
12846 NameNode
13455 Jps
13232 JobTracker
had...@node0:/usr/local/hadoop$ bin/stop-all.sh
stopping jobtracker
node0.local: stopping tasktracker
node1.local: stopping tasktracker
stopping namenode
node0.local: no datanode to stop
node1.local: no datanode to stop
node0.local: stopping secondarynamenode
had...@node0:/usr/local/hadoop$ 

here is the tail of the log file for the session above...
/
2009-02-16 19:35:13,999 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = node1/127.0.1.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.19.0
STARTUP_MSG:   build =
https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r 713890;
compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
/
2009-02-16 19:35:18,999 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException:
Incompatible namespaceIDs in
/usr/local/hadoop-datastore/hadoop-hadoop/dfs/data: namenode namespaceID =
1050914495; datanode namespaceID = 722953254
at
org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:233)
at
org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:148)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:287)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:205)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1199)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1154)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1162)
at
org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1284)

2009-02-16 19:35:19,000 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down DataNode at node1/127.0.1.1
/

i have not seen DataNode run yet. i have only started and stopped the
cluster a couple of times.

i tried to reformat datanode and namenode with bin/hadoop datanode -format
and bin/hadoop namenode -format from /usr/local/hadoop dir.

please advise

zander



Mithila Nagendra wrote:
> 
> Hey Sandy
> I had a similar problem with Hadoop. All I did was I stopped all the
> daemons
> using stop-all.sh. Then formatted the namenode again using hadoop namenode
> -format. After this I went on to restarting everything by using
> start-all.sh
> 
> I hope you don't have much data on the datanode; reformatting it would
> erase everything.
> 
> Hope this helps!
> Mithila
> 
> 
> 
> On Sat, Feb 14, 2009 at 2:39 AM, james warren  wrote:
> 
>> Sandy -
>>
>> I suggest you take a look into your NameNode and DataNode logs.  From the
>> information posted, these likely would be at
>>
>>
>> /Users/hadoop/hadoop-0.18.2/bin/../logs/hadoop-hadoop-namenode-loteria.cs.tamu.edu.log
>>
>> /Users/hadoop/hadoop-0.18.2/bin/../logs/hadoop-hadoop-jobtracker-loteria.cs.tamu.edu.log
>>
>> If the cause isn't obvious from what you see there, could you please post
>> the last few lines from each log?
>>
>> -jw
>>
>> On Fri, Feb 13, 2009 at 3:28 PM, Sandy  wrote:
>>
>> > Hello,
>> >
>> > I would really appreciate any help I can get on this! I've suddenly ran
>> > into
>> > a very strange error.
>> >
>> > when I do:
>> > bin/start-all
>> > I get:
>> > hadoop$ bin/start-all.sh
>> > starting namenode, logging to
>> >
>> >
>> /Users/hadoop/hadoop-0.18.2/bin/../logs/hadoop-hadoop-namenode-loteria.cs.tamu.edu.out
>> > starting jobtracker, logging to
>> >
>> >
>> /Users/hadoop/hadoop-0.18.2/bin/../logs/hadoop-hadoop-jobtracker-loteria.cs.tamu.edu.out
>> >
>> > No datanode, s

Re: HDFS architecture based on GFS?

2009-02-16 Thread Amandeep Khurana
Ok. Thanks..

Another question now. Do the datanodes have any way of linking a particular
block of data to a global file identifier?

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Sun, Feb 15, 2009 at 9:37 PM, Matei Zaharia  wrote:

> In general, yeah, the scripts can access any resource they want (within the
> permissions of the user that the task runs as). It's also possible to
> access
> HDFS from scripts because HDFS provides a FUSE interface that can make it
> look like a regular file system on the machine. (The FUSE module in turn
> talks to the namenode as a regular HDFS client.)
>
> On Sun, Feb 15, 2009 at 8:43 PM, Amandeep Khurana 
> wrote:
>
> > I dont know much about Hadoop streaming and have a quick question here.
> >
> > The snippets of code/programs that you attach into the map reduce job
> might
> > want to access outside resources (like you mentioned). Now these might
> not
> > need to go to the namenode right? For example a python script. How would
> it
> > access the data? Would it ask the parent java process (in the
> tasktracker)
> > to get the data or would it go and do stuff on its own?
> >
> >
> > Amandeep Khurana
> > Computer Science Graduate Student
> > University of California, Santa Cruz
> >
> >
> > On Sun, Feb 15, 2009 at 8:23 PM, Matei Zaharia 
> wrote:
> >
> > > Nope, typically the JobTracker just starts the process, and the
> > tasktracker
> > > talks directly to the namenode to get a pointer to the datanode, and
> then
> > > directly to the datanode.
> > >
> > > On Sun, Feb 15, 2009 at 8:07 PM, Amandeep Khurana 
> > > wrote:
> > >
> > > > Alright.. Got it.
> > > >
> > > > Now, do the task trackers talk to the namenode and the data node
> > directly
> > > > or
> > > > do they go through the job tracker for it? So, if my code is such
> that
> > I
> > > > need to access more files from the hdfs, would the job tracker get
> > > involved
> > > > or not?
> > > >
> > > >
> > > >
> > > >
> > > > Amandeep Khurana
> > > > Computer Science Graduate Student
> > > > University of California, Santa Cruz
> > > >
> > > >
> > > > On Sun, Feb 15, 2009 at 7:20 PM, Matei Zaharia 
> > > wrote:
> > > >
> > > > > Normally, HDFS files are accessed through the namenode. If there
> was
> > a
> > > > > malicious process though, then I imagine it could talk to a
> datanode
> > > > > directly and request a specific block.
> > > > >
> > > > > On Sun, Feb 15, 2009 at 7:15 PM, Amandeep Khurana <
> ama...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Ok. Got it.
> > > > > >
> > > > > > Now, when my job needs to access another file, does it go to the
> > > > Namenode
> > > > > > to
> > > > > > get the block ids? How does the java process know where the files
> > are
> > > > and
> > > > > > how to access them?
> > > > > >
> > > > > >
> > > > > > Amandeep Khurana
> > > > > > Computer Science Graduate Student
> > > > > > University of California, Santa Cruz
> > > > > >
> > > > > >
> > > > > > On Sun, Feb 15, 2009 at 7:05 PM, Matei Zaharia <
> ma...@cloudera.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > I mentioned this case because even jobs written in Java can use
> > the
> > > > > HDFS
> > > > > > > API
> > > > > > > to talk to the NameNode and access the filesystem. People often
> > do
> > > > this
> > > > > > > because their job needs to read a config file, some small data
> > > table,
> > > > > etc
> > > > > > > and use this information in its map or reduce functions. In
> this
> > > > case,
> > > > > > you
> > > > > > > open the second file separately in your mapper's init function
> > and
> > > > read
> > > > > > > whatever you need from it. In general I wanted to point out
> that
> > > you
> > > > > > can't
> > > > > > > know which files a job will access unless you look at its
> source
> > > code
> > > > > or
> > > > > > > monitor the calls it makes; the input file(s) you provide in
> the
> > > job
> > > > > > > description are a hint to the MapReduce framework to place your
> > job
> > > > on
> > > > > > > certain nodes, but it's reasonable for the job to access other
> > > files
> > > > as
> > > > > > > well.
> > > > > > >
> > > > > > > On Sun, Feb 15, 2009 at 6:14 PM, Amandeep Khurana <
> > > ama...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Another question that I have here - When the jobs run
> arbitrary
> > > > code
> > > > > > and
> > > > > > > > access data from the HDFS, do they go to the namenode to get
> > the
> > > > > block
> > > > > > > > information?
> > > > > > > >
> > > > > > > >
> > > > > > > > Amandeep Khurana
> > > > > > > > Computer Science Graduate Student
> > > > > > > > University of California, Santa Cruz
> > > > > > > >
> > > > > > > >
> > > > > > > > On Sun, Feb 15, 2009 at 6:00 PM, Amandeep Khurana <
> > > > ama...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Assuming that the job is purely in Java and not involving
> > > > streaming
> > > > > > or
> > > > > > > > >

Re: setting blank passwords for user hadoop...

2009-02-16 Thread nitesh bhatia
http://linuxproblem.org/art_9.html
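
For what it's worth, the usual fix is passphrase-less SSH keys rather than an
actually blank account password; roughly (the hadoop user and the
node0/node1.local hostnames here are just the ones from your message):

# as the hadoop user on node0
ssh-keygen -t rsa -P ""
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys      # lets node0 ssh to itself
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@node1.local  # copies the key to node1

ssh node0.local    # both should now log in without asking for a password
ssh node1.local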

On Tue, Feb 17, 2009 at 5:51 AM, zander1013  wrote:
>
> hi,
>
> i am going through the tutorial for a multi-node cluster by m. noll. i am to
> the section where i try to start the cluster but when i run bin/start-dfs.sh
> i get the error...
>
> had...@node1.local's password: node1.local: Permission denied, please try
> again.
> node1.local: Connection closed by 169.254.7.81
> node0.local: secondarynamenode running as process 11215. Stop it first.
> ahd...@node0:/usr/local/hadoop$
>
> it seems i am unable to set the passwordless hadoop user. i have tried using
> passwd and the user manager in ubuntu 8.10. but it seems to be hanging me
> up.
>
> please advise.
>
> zander
> --
> View this message in context: 
> http://www.nabble.com/setting-blank-passwords-for-user-hadoop...-tp22048732p22048732.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>



-- 
Nitesh Bhatia
Dhirubhai Ambani Institute of Information & Communication Technology
Gandhinagar
Gujarat

"Life is never perfect. It just depends where you draw the line."

visit:
http://www.awaaaz.com - connecting through music
http://www.volstreet.com - lets volunteer for better tomorrow
http://www.instibuzz.com - Voice opinions, Transact easily, Have fun


setting blank passwords for user hadoop...

2009-02-16 Thread zander1013

hi,

i am going through the tutorial for a multi-node cluster by m. noll. i am to
the section where i try to start the cluster but when i run bin/start-dfs.sh
i get the error...

had...@node1.local's password: node1.local: Permission denied, please try
again.
node1.local: Connection closed by 169.254.7.81
node0.local: secondarynamenode running as process 11215. Stop it first.
ahd...@node0:/usr/local/hadoop$

it seems i am unable to set the passwordless hadoop user. i have tried using
passwd and the user manager in ubuntu 8.10. but it seems to be hanging me
up.

please advise.

zander
-- 
View this message in context: 
http://www.nabble.com/setting-blank-passwords-for-user-hadoop...-tp22048732p22048732.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



Re: HDFS architecture based on GFS?

2009-02-16 Thread Amr Awadallah
> I didn't understand usage of "malicious" here,
> but any process using HDFS api should first ask NameNode where the

Rasit,

  Matei is referring to the fact that a malicious piece of code can bypass the
Name Node and connect to any data node directly, or probe all data nodes for
that matter. There is no strong authentication for RPC at this layer of
HDFS, which is one of the current shortcomings that will be addressed in
hadoop 1.0.

-- amr

On Sun, Feb 15, 2009 at 11:41 PM, Rasit OZDAS  wrote:

> "If there was a
> malicious process though, then I imagine it could talk to a datanode
> directly and request a specific block."
>
> I didn't understand usage of "malicious" here,
> but any process using HDFS api should first ask NameNode where the
> file replications are.
> Then - I assume - namenode returns the IP of best DataNode (or all IPs),
> then call to specific DataNode is made.
> Please correct me if I'm wrong.
>
> Cheers,
> Rasit
>
> 2009/2/16 Matei Zaharia :
> > In general, yeah, the scripts can access any resource they want (within
> the
> > permissions of the user that the task runs as). It's also possible to
> access
> > HDFS from scripts because HDFS provides a FUSE interface that can make it
> > look like a regular file system on the machine. (The FUSE module in turn
> > talks to the namenode as a regular HDFS client.)
> >
> > On Sun, Feb 15, 2009 at 8:43 PM, Amandeep Khurana 
> wrote:
> >
> >> I dont know much about Hadoop streaming and have a quick question here.
> >>
> >> The snippets of code/programs that you attach into the map reduce job
> might
> >> want to access outside resources (like you mentioned). Now these might
> not
> >> need to go to the namenode right? For example a python script. How would
> it
> >> access the data? Would it ask the parent java process (in the
> tasktracker)
> >> to get the data or would it go and do stuff on its own?
> >>
> >>
> >> Amandeep Khurana
> >> Computer Science Graduate Student
> >> University of California, Santa Cruz
> >>
> >>
> >> On Sun, Feb 15, 2009 at 8:23 PM, Matei Zaharia 
> wrote:
> >>
> >> > Nope, typically the JobTracker just starts the process, and the
> >> tasktracker
> >> > talks directly to the namenode to get a pointer to the datanode, and
> then
> >> > directly to the datanode.
> >> >
> >> > On Sun, Feb 15, 2009 at 8:07 PM, Amandeep Khurana 
> >> > wrote:
> >> >
> >> > > Alright.. Got it.
> >> > >
> >> > > Now, do the task trackers talk to the namenode and the data node
> >> directly
> >> > > or
> >> > > do they go through the job tracker for it? So, if my code is such
> that
> >> I
> >> > > need to access more files from the hdfs, would the job tracker get
> >> > involved
> >> > > or not?
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > Amandeep Khurana
> >> > > Computer Science Graduate Student
> >> > > University of California, Santa Cruz
> >> > >
> >> > >
> >> > > On Sun, Feb 15, 2009 at 7:20 PM, Matei Zaharia 
> >> > wrote:
> >> > >
> >> > > > Normally, HDFS files are accessed through the namenode. If there
> was
> >> a
> >> > > > malicious process though, then I imagine it could talk to a
> datanode
> >> > > > directly and request a specific block.
> >> > > >
> >> > > > On Sun, Feb 15, 2009 at 7:15 PM, Amandeep Khurana <
> ama...@gmail.com>
> >> > > > wrote:
> >> > > >
> >> > > > > Ok. Got it.
> >> > > > >
> >> > > > > Now, when my job needs to access another file, does it go to the
> >> > > Namenode
> >> > > > > to
> >> > > > > get the block ids? How does the java process know where the
> files
> >> are
> >> > > and
> >> > > > > how to access them?
> >> > > > >
> >> > > > >
> >> > > > > Amandeep Khurana
> >> > > > > Computer Science Graduate Student
> >> > > > > University of California, Santa Cruz
> >> > > > >
> >> > > > >
> >> > > > > On Sun, Feb 15, 2009 at 7:05 PM, Matei Zaharia <
> ma...@cloudera.com
> >> >
> >> > > > wrote:
> >> > > > >
> >> > > > > > I mentioned this case because even jobs written in Java can
> use
> >> the
> >> > > > HDFS
> >> > > > > > API
> >> > > > > > to talk to the NameNode and access the filesystem. People
> often
> >> do
> >> > > this
> >> > > > > > because their job needs to read a config file, some small data
> >> > table,
> >> > > > etc
> >> > > > > > and use this information in its map or reduce functions. In
> this
> >> > > case,
> >> > > > > you
> >> > > > > > open the second file separately in your mapper's init function
> >> and
> >> > > read
> >> > > > > > whatever you need from it. In general I wanted to point out
> that
> >> > you
> >> > > > > can't
> >> > > > > > know which files a job will access unless you look at its
> source
> >> > code
> >> > > > or
> >> > > > > > monitor the calls it makes; the input file(s) you provide in
> the
> >> > job
> >> > > > > > description are a hint to the MapReduce framework to place
> your
> >> job
> >> > > on
> >> > > > > > certain nodes, but it's reasonable for the job to access other
> >> > files
> >> > > as
> >> > > > > > well.
> >> > > > > >
> >> > 

Re: setting up networking and ssh on multnode cluster...

2009-02-16 Thread zander1013

i resolved this issue by appending .local to the target hostname when i ssh.
for example, my nodes are node0 and node1, so i am successful when i ssh
node0.local and node1.local. output is as given in the tutorial.


alonzo

Anum wrote:
> 
> I got nearly the same issue, can't ssh or telnet to the node, and I don't
> think the provided link works for Fedora.
> 
> 
> 
> 
> On Mon, Feb 16, 2009 at 5:17 AM, Norbert Burger
> wrote:
> 
>> If you can't ssh directly to node1's IP address, then it seems you have a
>> basic network configuration issue which is really outside the scope of
>> Hadoop setup.  In general, you should make sure that:
>>
>> 0) nodes are physically connected (use crossover cable if necessary)
>> 1) your nodes are configured for unique static IPs, not DHCP
>> 2) nodes are reachable from all other nodes (eg., ping 
>> reports
>> success)
>> 3) all hostnames are resolvable, either via /etc/hosts, or entries in
>> etc/resolv.conf (eg., nslookup  reports what you expect)
>>
>> Once your basic network configuration is complete, you should be able to
>> continue with
>>
>> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)
>> <
>> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29
>> >
>>
>> You might find some useful info here:
>> https://help.ubuntu.com/8.10/serverguide/C/network-configuration.html
>>
>> Norbert
>>
>> On Sun, Feb 15, 2009 at 10:20 PM, zander1013 
>> wrote:
>>
>> >
>> > okay,
>> >
>> > i will heed the tip on the 127 address set. here is the result of ssh
>> > 192.168.0.2...
>> > a...@node0:~$ ssh 192.168.0.2
>> > ssh: connect to host 192.168.0.2 port 22: Connection timed out
>> > a...@node0:~$
>> >
>> > the boxes are just connected with a cat5 cable.
>> >
>> > i have not done this with the hadoop account but af is my normal
>> account
>> > and
>> > i figure it should work too.
>> >
>> > /etc/init.d/interfaces is empty/does not exist on the machines. (i am
>> using
>> > ubuntu 8.10)
>> >
>> > please advise.
>> >
>> >
>> > Norbert Burger wrote:
>> > >
>> > > Fwiw, the extra references to 127.0.1.1 in each host file aren't
>> > > necessary.
>> > >
>> > > From node0, does 'ssh 192.168.0.2' work?  If not, then the issue
>> isn't
>> > > name
>> > > resolution -- take look at the network configs (eg.,
>> > > /etc/init.d/interfaces)
>> > > on each machine.
>> > >
>> > > Norbert
>> > >
>> > > On Sun, Feb 15, 2009 at 7:31 PM, zander1013 
>> > wrote:
>> > >
>> > >>
>> > >> okay,
>> > >>
>> > >> i have changed /etc/hosts to look like this for node0...
>> > >>
>> > >> 127.0.0.1   localhost
>> > >> 127.0.1.1   node0
>> > >>
>> > >> # /etc/hosts (for hadoop master and slave)
>> > >> 192.168.0.1 node0
>> > >> 192.168.0.2 node1
>> > >> #end hadoop section
>> > >>
>> > >> # The following lines are desirable for IPv6 capable hosts
>> > >> ::1 ip6-localhost ip6-loopback
>> > >> fe00::0 ip6-localnet
>> > >> ff00::0 ip6-mcastprefix
>> > >> ff02::1 ip6-allnodes
>> > >> ff02::2 ip6-allrouters
>> > >> ff02::3 ip6-allhosts
>> > >>
>> > >> ...and this for node1...
>> > >>
>> > >> 127.0.0.1   localhost
>> > >> 127.0.1.1   node1
>> > >>
>> > >> # /etc/hosts (for hadoop master and slave)
>> > >> 192.168.0.1 node0
>> > >> 192.168.0.2 node1
>> > >> #end hadoop section
>> > >>
>> > >> # The following lines are desirable for IPv6 capable hosts
>> > >> ::1 ip6-localhost ip6-loopback
>> > >> fe00::0 ip6-localnet
>> > >> ff00::0 ip6-mcastprefix
>> > >> ff02::1 ip6-allnodes
>> > >> ff02::2 ip6-allrouters
>> > >> ff02::3 ip6-allhosts
>> > >>
>> > >> ... the machines are connected by a cat5 cable, they have wifi and
>> are
>> > >> showing that they are connected to my wlan. also i have enabled all
>> the
>> > >> user
> >> privileges in the user manager on both machines. here are the results
>> > from
>> > >> ssh on node0...
>> > >>
>> > >> had...@node0:~$ ssh node0
>> > >> Linux node0 2.6.27-11-generic #1 SMP Thu Jan 29 19:24:39 UTC 2009
>> i686
>> > >>
>> > >> The programs included with the Ubuntu system are free software;
>> > >> the exact distribution terms for each program are described in the
>> > >> individual files in /usr/share/doc/*/copyright.
>> > >>
>> > >> Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
>> > >> applicable law.
>> > >>
>> > >> To access official Ubuntu documentation, please visit:
>> > >> http://help.ubuntu.com/
>> > >> Last login: Sun Feb 15 16:00:28 2009 from node0
>> > >> had...@node0:~$ exit
>> > >> logout
>> > >> Connection to node0 closed.
>> > >> had...@node0:~$ ssh node1
>> > >> ssh: connect to host node1 port 22: Connection timed out
>> > >> had...@node0:~$
>> > >>
>> > >> i will look into the link that you gave.
>> > >>
>> > >> -zander
>> > >>
>> > >>
>> > >> Norbert Burger wrote:
>> > >> >
>> > >> >>
>> > >> >> i have commented out the 192. addresses and changed 127.0.1.1 for
>> > >> node0
>> > >> >> and
>> > >> >> 127.0.1.2 for node0 (in /etc/hosts). with 

Re: AlredyBeingCreatedExceptions after upgrade to 0.19.0

2009-02-16 Thread Thibaut_

I have the same problem.

is there any solution to this?

Thibaut


-- 
View this message in context: 
http://www.nabble.com/AlredyBeingCreatedExceptions-after-upgrade-to-0.19.0-tp21631077p22043484.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



Re: datanode not being started

2009-02-16 Thread Sandy
Hi Rasit,

Thanks for your response!

I saw the previous threads by Jerro and Mithila, but I think my problem is
slightly different. My datanodes are not being started, period. From a
previous thread:

"The common reasons for this case are configuration errors, installation
errors, or network connectivity issues due to firewalls blocking ports, or
dns lookup errors (either failure or incorrect address returned) for the
namenode hostname on the datanodes."

I'm going to reinstall hadoop once again on this machine (this will be the
third reinstall for this problem), but it's hard for me to believe that it's
configuration and/or installation. The configuration and everything worked
fine the last time I used this machine. If anything were to happen, the HDFS
would have gotten corrupted, and a reformat should have fixed it. I tried
checking the logs for the datanode, but there is nothing there. I can ssh
into localhost and my server name fine, but I can see if there are further
problems with DNS or firewall.
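
For reference, the checks I'm planning to run (the port and log locations
assume the defaults from my setup):

jps                                           # is a DataNode process running at all?
bin/hadoop dfsadmin -report                   # how many datanodes the namenode sees
tail -n 50 logs/hadoop-hadoop-datanode-*.log  # the datanode's side of the story
telnet localhost 9000                         # is the namenode port reachable?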

Since I last used this machine, Parallels Desktop was installed by the
admin. I am currently suspecting that somehow this is interfering with the
function of Hadoop (though JAVA_HOME still seems to be OK). Has anyone had
any experience with this being a cause of interference?

Thanks,
-SM

On Mon, Feb 16, 2009 at 2:32 AM, Rasit OZDAS  wrote:

> Sandy, as far as I remember, there were some threads about the same
> problem (I don't know if it's solved). Searching the mailing list for
> this error: "could only be replicated to 0 nodes, instead of 1" may
> help.
>
> Cheers,
> Rasit
>
> 2009/2/16 Sandy :
> > just some more information:
> > hadoop fsck produces:
> > Status: HEALTHY
> >  Total size: 0 B
> >  Total dirs: 9
> >  Total files: 0 (Files currently being written: 1)
> >  Total blocks (validated): 0
> >  Minimally replicated blocks: 0
> >  Over-replicated blocks: 0
> >  Under-replicated blocks: 0
> >  Mis-replicated blocks: 0
> >  Default replication factor: 1
> >  Average block replication: 0.0
> >  Corrupt blocks: 0
> >  Missing replicas: 0
> >  Number of data-nodes: 0
> >  Number of racks: 0
> >
> >
> > The filesystem under path '/' is HEALTHY
> >
> > on the newly formatted hdfs.
> >
> > jps says:
> > 4723 Jps
> > 4527 NameNode
> > 4653 JobTracker
> >
> >
> > I can't copy files onto the dfs since I get "NotReplicatedYetExceptions",
> > which I suspect has to do with the fact that there are no datanodes. My
> > "cluster" is a single MacPro with 8 cores. I haven't had to do anything
> > extra before in order to get datanodes to be generated.
> >
> > 09/02/15 15:56:27 WARN dfs.DFSClient: Error Recovery for block null bad
> > datanode[0]
> > copyFromLocal: Could not get block locations. Aborting...
> >
> >
> > The corresponding error in the logs is:
> >
> > 2009-02-15 15:56:27,123 INFO org.apache.hadoop.ipc.Server: IPC Server
> > handler 1 on 9000, call addBlock(/user/hadoop/input/.DS_Store,
> > DFSClient_755366230) from 127.0.0.1:49796: error: java.io.IOException:
> File
> > /user/hadoop/input/.DS_Store could only be replicated to 0 nodes, instead
> of
> > 1
> > java.io.IOException: File /user/hadoop/input/.DS_Store could only be
> > replicated to 0 nodes, instead of 1
> > at
> >
> org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1120)
> > at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > at java.lang.reflect.Method.invoke(Method.java:597)
> > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
> > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
> >
> > On Sun, Feb 15, 2009 at 3:26 PM, Sandy 
> wrote:
> >
> >> Thanks for your responses.
> >>
> >> I checked in the namenode and jobtracker logs and both say:
> >>
> >> INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000, call
> >> delete(/Users/hadoop/hadoop-0.18.2/hadoop-hadoop/mapred/system, true)
> from
> >> 127.0.0.1:61086: error: org.apache.hadoop.dfs.SafeModeException: Cannot
> >> delete /Users/hadoop/hadoop-0.18.2/hadoop-hadoop/mapred/system. Name
> node
> >> is in safe mode.
> >> The ratio of reported blocks 0. has not reached the threshold
> 0.9990.
> >> Safe mode will be turned off automatically.
> >> org.apache.hadoop.dfs.SafeModeException: Cannot delete
> >> /Users/hadoop/hadoop-0.18.2/hadoop-hadoop/mapred/system. Name node is in
> >> safe mode.
> >> The ratio of reported blocks 0. has not reached the threshold
> 0.9990.
> >> Safe mode will be turned off automatically.
> >> at
> >>
> org.apache.hadoop.dfs.FSNamesystem.deleteInternal(FSNamesystem.java:1505)
> >> at
> >> org.apache.hadoop.dfs.FSNamesystem.delete(FSNamesystem.java:1477)
> >> at org.apache.hadoop.dfs.NameNode.delete(NameNode.java:42

Re: setting up networking and ssh on multnode cluster...

2009-02-16 Thread Anum Ali
I got nearly the same issue, can't ssh or telnet to the node, and I don't think
the provided link works for Fedora.




On Mon, Feb 16, 2009 at 5:17 AM, Norbert Burger wrote:

> If you can't ssh directly to node1's IP address, then it seems you have a
> basic network configuration issue which is really outside the scope of
> Hadoop setup.  In general, you should make sure that:
>
> 0) nodes are physically connected (use crossover cable if necessary)
> 1) your nodes are configured for unique static IPs, not DHCP
> 2) nodes are reachable from all other nodes (eg., ping  reports
> success)
> 3) all hostnames are resolvable, either via /etc/hosts, or entries in
> etc/resolv.conf (eg., nslookup  reports what you expect)
>
> Once your basic network configuration is complete, you should be able to
> continue with
>
> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)
> <
> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29
> >
>
> You might find some useful info here:
> https://help.ubuntu.com/8.10/serverguide/C/network-configuration.html
>
> Norbert
>
> On Sun, Feb 15, 2009 at 10:20 PM, zander1013  wrote:
>
> >
> > okay,
> >
> > i will heed the tip on the 127 address set. here is the result of ssh
> > 192.168.0.2...
> > a...@node0:~$ ssh 192.168.0.2
> > ssh: connect to host 192.168.0.2 port 22: Connection timed out
> > a...@node0:~$
> >
> > the boxes are just connected with a cat5 cable.
> >
> > i have not done this with the hadoop account but af is my normal account
> > and
> > i figure it should work too.
> >
> > /etc/init.d/interfaces is empty/does not exist on the machines. (i am
> using
> > ubuntu 8.10)
> >
> > please advise.
> >
> >
> > Norbert Burger wrote:
> > >
> > > Fwiw, the extra references to 127.0.1.1 in each host file aren't
> > > necessary.
> > >
> > > From node0, does 'ssh 192.168.0.2' work?  If not, then the issue isn't
> > > name
> > > resolution -- take look at the network configs (eg.,
> > > /etc/init.d/interfaces)
> > > on each machine.
> > >
> > > Norbert
> > >
> > > On Sun, Feb 15, 2009 at 7:31 PM, zander1013 
> > wrote:
> > >
> > >>
> > >> okay,
> > >>
> > >> i have changed /etc/hosts to look like this for node0...
> > >>
> > >> 127.0.0.1   localhost
> > >> 127.0.1.1   node0
> > >>
> > >> # /etc/hosts (for hadoop master and slave)
> > >> 192.168.0.1 node0
> > >> 192.168.0.2 node1
> > >> #end hadoop section
> > >>
> > >> # The following lines are desirable for IPv6 capable hosts
> > >> ::1 ip6-localhost ip6-loopback
> > >> fe00::0 ip6-localnet
> > >> ff00::0 ip6-mcastprefix
> > >> ff02::1 ip6-allnodes
> > >> ff02::2 ip6-allrouters
> > >> ff02::3 ip6-allhosts
> > >>
> > >> ...and this for node1...
> > >>
> > >> 127.0.0.1   localhost
> > >> 127.0.1.1   node1
> > >>
> > >> # /etc/hosts (for hadoop master and slave)
> > >> 192.168.0.1 node0
> > >> 192.168.0.2 node1
> > >> #end hadoop section
> > >>
> > >> # The following lines are desirable for IPv6 capable hosts
> > >> ::1 ip6-localhost ip6-loopback
> > >> fe00::0 ip6-localnet
> > >> ff00::0 ip6-mcastprefix
> > >> ff02::1 ip6-allnodes
> > >> ff02::2 ip6-allrouters
> > >> ff02::3 ip6-allhosts
> > >>
> > >> ... the machines are connected by a cat5 cable, they have wifi and are
> > >> showing that they are connected to my wlan. also i have enabled all
> the
> > >> user
> > >> privileges in the user manager on both machines. here are the results
> > from
> > >> ssh on node0...
> > >>
> > >> had...@node0:~$ ssh node0
> > >> Linux node0 2.6.27-11-generic #1 SMP Thu Jan 29 19:24:39 UTC 2009 i686
> > >>
> > >> The programs included with the Ubuntu system are free software;
> > >> the exact distribution terms for each program are described in the
> > >> individual files in /usr/share/doc/*/copyright.
> > >>
> > >> Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
> > >> applicable law.
> > >>
> > >> To access official Ubuntu documentation, please visit:
> > >> http://help.ubuntu.com/
> > >> Last login: Sun Feb 15 16:00:28 2009 from node0
> > >> had...@node0:~$ exit
> > >> logout
> > >> Connection to node0 closed.
> > >> had...@node0:~$ ssh node1
> > >> ssh: connect to host node1 port 22: Connection timed out
> > >> had...@node0:~$
> > >>
> > >> i will look into the link that you gave.
> > >>
> > >> -zander
> > >>
> > >>
> > >> Norbert Burger wrote:
> > >> >
> > >> >>
> > >> >> i have commented out the 192. addresses and changed 127.0.1.1 for
> > >> node0
> > >> >> and
> > >> >> 127.0.1.2 for node0 (in /etc/hosts). with this done i can ssh from
> > one
> > >> >> machine to itself and to the other but the prompt does not change
> > when
> > >> i
> > >> >> ssh
> > >> >> to the other machine. i don't know if there is a firewall
> preventing
> > >> me
> > >> >> from
> > >> >> ssh or not. i have not set any up to prevent ssh and i have not
> taken
> > >> >> action
> > >> >> to specifically allow ssh other than what was prescribed in the

Re: Copying a file to specified nodes

2009-02-16 Thread Rasit OZDAS
Yes, I've tried the long solution;
when I execute ./hadoop dfs -put ... from a datanode,
one copy always gets written to that datanode.

But I think I would need to use SSH for this.
Does anybody know a better way?
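
For the record, the SSH-based version I have in mind is roughly this (node
names, user names and paths are only illustrative):

# push each of user A's files from the datanode that should hold the first replica
for i in 1 2 3; do
  ssh hadoop@node$i \
    "/usr/local/hadoop/bin/hadoop dfs -put /local/data/userA/file$i /user/A/file$i"
done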

Thanks,
Rasit

2009/2/16 Rasit OZDAS :
> Thanks, Jeff.
> After considering JIRA link you've given and making some investigation:
>
> It seems that this JIRA ticket didn't draw much attention, so it will
> take a long time to be considered.
> After some more investigation I found out that when I copy the file to
> HDFS from a specific DataNode, the first copy will be written to that
> DataNode itself. This solution will take a long time to implement, I think.
> But we definitely need this feature, so if we have no other choice,
> we'll go through with it.
>
> Any further info (or comments on my solution) is appreciated.
>
> Cheers,
> Rasit
>
> 2009/2/10 Jeff Hammerbacher :
>> Hey Rasit,
>>
>> I'm not sure I fully understand your description of the problem, but
>> you might want to check out the JIRA ticket for making the replica
>> placement algorithms in HDFS pluggable
>> (https://issues.apache.org/jira/browse/HADOOP-3799) and add your use
>> case there.
>>
>> Regards,
>> Jeff
>>
>> On Tue, Feb 10, 2009 at 5:05 AM, Rasit OZDAS  wrote:
>>>
>>> Hi,
>>>
>>> We have thousands of files, each dedicated to a user.  (Each user has
>>> access to other users' files, but they do so only rarely.)
>>> Each user runs map-reduce jobs on the cluster.
>>> So we should separate his/her files equally across the cluster,
>>> so that every machine can take part in the process (assuming he/she is
>>> the only user running jobs).
>>> For this we should initially copy files to specified nodes:
>>> User A :   first file : Node 1, second file: Node 2, .. etc.
>>> User B :   first file : Node 1, second file: Node 2, .. etc.
>>>
>>> I know Hadoop also creates replicas, but in our solution at least one
>>> file will be in the right place
>>> (or we're willing to control other replicas too).
>>>
>>> Rebalancing is also not a problem, assuming it uses the information
>>> about how much a computer is in use.
>>> It even helps for a better organization of files.
>>>
>>> How can we copy files to specified nodes?
>>> Or do you have a better solution for us?
>>>
>>> I couldn't find a solution to this, probably such an option doesn't exist.
>>> But I wanted to take an expert's opinion about this.
>>>
>>> Thanks in advance..
>>> Rasit
>>
>
>
>
> --
> M. Raşit ÖZDAŞ
>



-- 
M. Raşit ÖZDAŞ


Re: setting up networking and ssh on multnode cluster...

2009-02-16 Thread Norbert Burger
If you can't ssh directly to node1's IP address, then it seems you have a
basic network configuration issue which is really outside the scope of
Hadoop setup.  In general, you should make sure that:

0) nodes are physically connected (use crossover cable if necessary)
1) your nodes are configured for unique static IPs, not DHCP
2) nodes are reachable from all other nodes (eg., ping  reports
success)
3) all hostnames are resolvable, either via /etc/hosts, or entries in
etc/resolv.conf (eg., nslookup  reports what you expect)
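
A quick way to run through those checks from node0 (the addresses and
hostnames here are just the ones used elsewhere in this thread):

ping -c 3 192.168.0.2    # 2) raw IP reachability
ping -c 3 node1          # 3) name resolution via /etc/hosts
ssh -v hadoop@node1      # verbose output shows where the connection stalls
# if port 22 times out, check that sshd is actually installed and running on
# node1 (on a stock Ubuntu desktop install: sudo apt-get install openssh-server)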

Once your basic network configuration is complete, you should be able to
continue with
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)

You might find some useful info here:
https://help.ubuntu.com/8.10/serverguide/C/network-configuration.html

Norbert

On Sun, Feb 15, 2009 at 10:20 PM, zander1013  wrote:

>
> okay,
>
> i will heed the tip on the 127 address set. here is the result of ssh
> 192.168.0.2...
> a...@node0:~$ ssh 192.168.0.2
> ssh: connect to host 192.168.0.2 port 22: Connection timed out
> a...@node0:~$
>
> the boxes are just connected with a cat5 cable.
>
> i have not done this with the hadoop account but af is my normal account
> and
> i figure it should work too.
>
> /etc/init.d/interfaces is empty/does not exist on the machines. (i am using
> ubuntu 8.10)
>
> please advise.
>
>
> Norbert Burger wrote:
> >
> > Fwiw, the extra references to 127.0.1.1 in each host file aren't
> > necessary.
> >
> > From node0, does 'ssh 192.168.0.2' work?  If not, then the issue isn't
> > name
> > resolution -- take look at the network configs (eg.,
> > /etc/init.d/interfaces)
> > on each machine.
> >
> > Norbert
> >
> > On Sun, Feb 15, 2009 at 7:31 PM, zander1013 
> wrote:
> >
> >>
> >> okay,
> >>
> >> i have changed /etc/hosts to look like this for node0...
> >>
> >> 127.0.0.1   localhost
> >> 127.0.1.1   node0
> >>
> >> # /etc/hosts (for hadoop master and slave)
> >> 192.168.0.1 node0
> >> 192.168.0.2 node1
> >> #end hadoop section
> >>
> >> # The following lines are desirable for IPv6 capable hosts
> >> ::1 ip6-localhost ip6-loopback
> >> fe00::0 ip6-localnet
> >> ff00::0 ip6-mcastprefix
> >> ff02::1 ip6-allnodes
> >> ff02::2 ip6-allrouters
> >> ff02::3 ip6-allhosts
> >>
> >> ...and this for node1...
> >>
> >> 127.0.0.1   localhost
> >> 127.0.1.1   node1
> >>
> >> # /etc/hosts (for hadoop master and slave)
> >> 192.168.0.1 node0
> >> 192.168.0.2 node1
> >> #end hadoop section
> >>
> >> # The following lines are desirable for IPv6 capable hosts
> >> ::1 ip6-localhost ip6-loopback
> >> fe00::0 ip6-localnet
> >> ff00::0 ip6-mcastprefix
> >> ff02::1 ip6-allnodes
> >> ff02::2 ip6-allrouters
> >> ff02::3 ip6-allhosts
> >>
> >> ... the machines are connected by a cat5 cable, they have wifi and are
> >> showing that they are connected to my wlan. also i have enabled all the
> >> user
> >> privileges in the user manager on both machines. here are the results
> from
> >> ssh on node0...
> >>
> >> had...@node0:~$ ssh node0
> >> Linux node0 2.6.27-11-generic #1 SMP Thu Jan 29 19:24:39 UTC 2009 i686
> >>
> >> The programs included with the Ubuntu system are free software;
> >> the exact distribution terms for each program are described in the
> >> individual files in /usr/share/doc/*/copyright.
> >>
> >> Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
> >> applicable law.
> >>
> >> To access official Ubuntu documentation, please visit:
> >> http://help.ubuntu.com/
> >> Last login: Sun Feb 15 16:00:28 2009 from node0
> >> had...@node0:~$ exit
> >> logout
> >> Connection to node0 closed.
> >> had...@node0:~$ ssh node1
> >> ssh: connect to host node1 port 22: Connection timed out
> >> had...@node0:~$
> >>
> >> i will look into the link that you gave.
> >>
> >> -zander
> >>
> >>
> >> Norbert Burger wrote:
> >> >
> >> >>
> >> >> i have commented out the 192. addresses and changed 127.0.1.1 for
> >> node0
> >> >> and
> >> >> 127.0.1.2 for node0 (in /etc/hosts). with this done i can ssh from
> one
> >> >> machine to itself and to the other but the prompt does not change
> when
> >> i
> >> >> ssh
> >> >> to the other machine. i don't know if there is a firewall preventing
> >> me
> >> >> from
> >> >> ssh or not. i have not set any up to prevent ssh and i have not taken
> >> >> action
> >> >> to specifically allow ssh other than what was prescribed in the
> >> tutorial
> >> >> for
> >> >> a single node cluster for both machines.
> >> >
> >> >
> >> > Why are you using 127.* addresses for your nodes?  These fall into the
> >> > block
> >> > of IPs reserved for the loopback network (see
> >> > http://en.wikipedia.org/wiki/Loopback).
> >> >
> >> > Try changing both nodes back to 192.168, and re-starting Hadoop.
> >> >
> >> > Norbert
> >> >
> >> >
> >>
> >> --
> >> View this message in context:
> >>
> http://www.nabble.com/sett

Re: stable version

2009-02-16 Thread Anum Ali
That's awkward, the site went down!

and OK, I'll note these points for the future.


Thanks.




On 2/16/09, Steve Loughran  wrote:
> Anum Ali wrote:
> The parser problem is related to jar files; it can be resolved, so it's not a bug.
>>
>> Forwarding link for its solution
>>
>>
>> http://www.jroller.com/navanee/entry/unsupportedoperationexception_this_parser_does_not
>>
>
> this site is down; can't see it
>
> It is a bug, because I view all operations problems as defects to be
> opened in the bug tracker, stack traces stuck in, the problem resolved.
> That's software or hardware -because that issue DB is your searchable
> history of what went wrong. Given that on my system I was seeing a
> ClassNotFoundException for loading FSConstants, there was no easy way to
> work out what went wrong, and it's cost me a couple of days' work.
>
> furthermore, in the OSS world, every person who can't get your app to
> work is either going to walk away unhappy (=lost customer, lost
> developer and risk they compete with you), or they are going to get on
> the email list and ask questions, questions which may get answered, but
> it will cost them time.
>
> Hence
> * happyaxis.jsp: axis' diagnostics page, prints out useful stuff and
> warns if it knows it is unwell (and returns 500 error code so your
> monitoring tools can recognise this)
> * ant -diagnostics: detailed look at your ant system including xml
> parser experiments.
>
> Good open source tools have to be easy for people to get started with,
> and that means helpful error messages. If we left the code alone,
> knowing that the cause of a ClassNotFoundException was the fault of the
> user sticking the wrong XML parser on the classpath -and yet refusing to
> add the four lines of code needed to handle this- then we are letting
> down the users
>
>>
>>
>> On 2/13/09, Steve Loughran  wrote:
>>> Anum Ali wrote:
  This only occurs in Linux; in Windows it's fine.
>>> do a java -version for me, and an ant -diagnostics, stick both on the
>>> bugrep
>>>
>>> https://issues.apache.org/jira/browse/HADOOP-5254
>>>
>>> It may be that XInclude only went live in java1.6u5; I'm running a
>>> JRockit JVM which predates that and I'm seeing it (linux again);
>>>
>>> I will also try sticking xerces on the classpath to see what happens next
>>>
>
>


Re: stable version

2009-02-16 Thread Steve Loughran

Anum Ali wrote:

The parser problem is related to jar files; it can be resolved, so it's not a bug.

Forwarding link for its solution


http://www.jroller.com/navanee/entry/unsupportedoperationexception_this_parser_does_not



this site is down; can't see it

It is a bug, because I view all operations problems as defects to be 
opened in the bug tracker, stack traces stuck in, the problem resolved. 
That's software or hardware -because that issue DB is your searchable 
history of what went wrong. Given that on my system I was seeing a
ClassNotFoundException for loading FSConstants, there was no easy way to
work out what went wrong, and it's cost me a couple of days' work.


furthermore, in the OSS world, every person who can't get your app to 
work is either going to walk away unhappy (=lost customer, lost 
developer and risk they compete with you), or they are going to get on 
the email list and ask questions, questions which may get answered, but 
it will cost them time.


Hence
* happyaxis.jsp: axis' diagnostics page, prints out useful stuff and 
warns if it knows it is unwell (and returns 500 error code so your 
monitoring tools can recognise this)
* ant -diagnostics: detailed look at your ant system including xml 
parser experiments.


Good open source tools have to be easy for people to get started with, 
and that means helpful error messages. If we left the code alone, 
knowing that the cause of a ClassNotFoundException was the fault of the 
user sticking the wrong XML parser on the classpath -and yet refusing to 
add the four lines of code needed to handle this- then we are letting 
down the users





On 2/13/09, Steve Loughran  wrote:

Anum Ali wrote:

 This only occurs in Linux; in Windows it's fine.

do a java -version for me, and an ant -diagnostics, stick both on the bugrep

https://issues.apache.org/jira/browse/HADOOP-5254

It may be that XInclude only went live in java1.6u5; I'm running a
JRockit JVM which predates that and I'm seeing it (linux again);

I will also try sticking xerces on the classpath to see what happens next





Cannot execute the start-mapred script

2009-02-16 Thread Arijit Mukherjee
Hi All

I'm trying to create a tiny 2-node cluster (both on linux FC7) with Hadoop
0.19.0 - previously, I was able to install and run hadoop on a single node.
Now I'm trying it on 2 nodes - my idea was to put the name node and the job
tracker on separate nodes, and initially use these two as the data nodes. So
basically, the "master" and "slave" files - both have the names of these two
nodes. When I start the dfs from the name node, it seems to work. But when I
try to run the start-mapred.sh script, I get the following exception -

blueberry: Exception in thread "main" java.lang.NoClassDefFoundError:
Could_not_reserve_enough_space_for_the_card_marking_array
blueberry: Caused by: java.lang.ClassNotFoundException:
Could_not_reserve_enough_space_for_the_card_marking_array
blueberry: at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
blueberry: at java.security.AccessController.doPrivileged(Native Method)
blueberry: at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
blueberry: at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
blueberry: at
sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
blueberry: at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
blueberry: at
java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
blueberry: Could not find the main class:
Could_not_reserve_enough_space_for_the_card_marking_array.  Program will
exit.

Is it related to the heap space I allocated in the hadoop-env.sh? Or is it
something else?

Regards
Arijit

-- 
"And when the night is cloudy,
There is still a light that shines on me,
Shine on until tomorrow, let it be."
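
A note on the trace above: that bogus "class name" reads like the words of 
a JVM startup failure. "Could not reserve enough space for the card marking 
array" is what HotSpot prints when it cannot reserve the requested heap, 
and the launcher then ends up treating the message as the class to run. So 
the heap setting in hadoop-env.sh is indeed the first thing to check. A 
minimal sketch, assuming the stock conf layout; 512 is an example value, 
not a recommendation:

# conf/hadoop-env.sh
# The shipped default is 1000 (MB); pick something the machine can actually
# reserve as a contiguous heap, then restart the daemons.
export HADOOP_HEAPSIZE=512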


Re: datanode not being started

2009-02-16 Thread Rasit OZDAS
Sandy, as far as I remember, there were some threads about the same
problem (I don't know if it's solved). Searching the mailing list for
this error: "could only be replicated to 0 nodes, instead of 1" may
help.

Cheers,
Rasit

2009/2/16 Sandy :
> just some more information:
> hadoop fsck produces:
> Status: HEALTHY
>  Total size: 0 B
>  Total dirs: 9
>  Total files: 0 (Files currently being written: 1)
>  Total blocks (validated): 0
>  Minimally replicated blocks: 0
>  Over-replicated blocks: 0
>  Under-replicated blocks: 0
>  Mis-replicated blocks: 0
>  Default replication factor: 1
>  Average block replication: 0.0
>  Corrupt blocks: 0
>  Missing replicas: 0
>  Number of data-nodes: 0
>  Number of racks: 0
>
>
> The filesystem under path '/' is HEALTHY
>
> on the newly formatted hdfs.
>
> jps says:
> 4723 Jps
> 4527 NameNode
> 4653 JobTracker
>
>
> I can't copy files onto the dfs since I get "NotReplicatedYetExceptions",
> which I suspect has to do with the fact that there are no datanodes. My
> "cluster" is a single MacPro with 8 cores. I haven't had to do anything
> extra before in order to get datanodes to be generated.
>
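
A quick way to confirm that picture from the command line; a sketch 
assuming the default script and log locations used elsewhere in this 
thread:

$ bin/hadoop dfsadmin -report
# the report lists how many datanodes have registered (zero here)
$ tail -50 logs/hadoop-*-datanode-*.log
# the datanode's own log usually says why it failed to start or register
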
> 09/02/15 15:56:27 WARN dfs.DFSClient: Error Recovery for block null bad
> datanode[0]
> copyFromLocal: Could not get block locations. Aborting...
>
>
> The corresponding error in the logs is:
>
> 2009-02-15 15:56:27,123 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 1 on 9000, call addBlock(/user/hadoop/input/.DS_Store,
> DFSClient_755366230) from 127.0.0.1:49796: error: java.io.IOException: File
> /user/hadoop/input/.DS_Store could only be replicated to 0 nodes, instead of
> 1
> java.io.IOException: File /user/hadoop/input/.DS_Store could only be
> replicated to 0 nodes, instead of 1
> at
> org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1120)
> at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
>
> On Sun, Feb 15, 2009 at 3:26 PM, Sandy  wrote:
>
>> Thanks for your responses.
>>
>> I checked in the namenode and jobtracker logs and both say:
>>
>> INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000, call
>> delete(/Users/hadoop/hadoop-0.18.2/hadoop-hadoop/mapred/system, true) from
>> 127.0.0.1:61086: error: org.apache.hadoop.dfs.SafeModeException: Cannot
>> delete /Users/hadoop/hadoop-0.18.2/hadoop-hadoop/mapred/system. Name node
>> is in safe mode.
>> The ratio of reported blocks 0. has not reached the threshold 0.9990.
>> Safe mode will be turned off automatically.
>> org.apache.hadoop.dfs.SafeModeException: Cannot delete
>> /Users/hadoop/hadoop-0.18.2/hadoop-hadoop/mapred/system. Name node is in
>> safe mode.
>> The ratio of reported blocks 0. has not reached the threshold 0.9990.
>> Safe mode will be turned off automatically.
>> at
>> org.apache.hadoop.dfs.FSNamesystem.deleteInternal(FSNamesystem.java:1505)
>> at
>> org.apache.hadoop.dfs.FSNamesystem.delete(FSNamesystem.java:1477)
>> at org.apache.hadoop.dfs.NameNode.delete(NameNode.java:425)
>> at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> at java.lang.reflect.Method.invoke(Method.java:597)
>> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
>> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
>>
>>
>> I think this is a continuation of my running problem. The nodes stay in
>> safe mode, but won't come out, even after several minutes. I believe this
>> is because it keeps trying to contact a datanode that does not exist. Any
>> suggestions on what I can do?
>>
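
For reference, the safe-mode state can be inspected and cleared by hand; 
this won't bring a datanode up, but it makes the situation explicit. A 
sketch:

$ bin/hadoop dfsadmin -safemode get
Safe mode is ON
$ bin/hadoop dfsadmin -safemode leave
Safe mode is OFF
# with no datanodes registered, writes will still fail with
# "could only be replicated to 0 nodes" until a datanode starts
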
>> I have recently tried to reformat the hdfs, using bin/hadoop namenode
>> -format. From the output directed to standard out, I thought this completed
>> correctly:
>>
>> Re-format filesystem in /Users/hadoop/hadoop-0.18.2/hadoop-hadoop/dfs/name
>> ? (Y or N) Y
>> 09/02/15 15:16:39 INFO fs.FSNamesystem:
>> fsOwner=hadoop,staff,_lpadmin,com.apple.sharepoint.group.8,com.apple.sharepoint.group.3,com.apple.sharepoint.group.4,com.apple.sharepoint.group.2,com.apple.sharepoint.group.6,com.apple.sharepoint.group.9,com.apple.sharepoint.group.1,com.apple.sharepoint.group.5
>> 09/02/15 15:16:39 INFO fs.FSNamesystem: supergroup=supergroup
>> 09/02/15 15:16:39 INFO fs.FSNamesystem: isPermissionEnabled=true
>> 09/02/15 15:16:39 INFO dfs.Storage: Image file of size 80 saved in 0
>> seconds.
>> 09/02/15 15:16:39 INFO dfs.Storage: Storage directory
>> /Users/hadoop/hadoop-0.18.2/hadoop-hadoop/dfs/name has been

Re: HADOOP-2536 supports Oracle too?

2009-02-16 Thread Fredrik Hedberg

Hi,

Although it's not MySQL, this might be of use:

http://svn.apache.org/repos/asf/hadoop/core/trunk/src/examples/org/apache/hadoop/examples/DBCountPageView.java


Fredrik
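
To make the shape of that example concrete, here is a rough, hypothetical 
sketch of the DBInputFormat wiring in the 0.19-era "mapred" API. The table 
name "access", its columns, the JDBC URL and the credentials are invented 
for illustration; see the linked DBCountPageView source for the real thing.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapred.lib.db.DBConfiguration;
import org.apache.hadoop.mapred.lib.db.DBInputFormat;
import org.apache.hadoop.mapred.lib.db.DBWritable;

public class DbReadSketch {

  // One row of a hypothetical "access" table (url VARCHAR, hits INT).
  public static class AccessRecord implements Writable, DBWritable {
    String url;
    int hits;

    public void readFields(ResultSet rs) throws SQLException {
      url = rs.getString(1);
      hits = rs.getInt(2);
    }
    public void write(PreparedStatement ps) throws SQLException {
      ps.setString(1, url);
      ps.setInt(2, hits);
    }
    public void readFields(DataInput in) throws IOException {
      url = Text.readString(in);
      hits = in.readInt();
    }
    public void write(DataOutput out) throws IOException {
      Text.writeString(out, url);
      out.writeInt(hits);
    }
  }

  // DBInputFormat hands the mapper (row id, record) pairs; just dump them as text.
  public static class DumpMapper extends MapReduceBase
      implements Mapper<LongWritable, AccessRecord, Text, IntWritable> {
    public void map(LongWritable rowId, AccessRecord rec,
        OutputCollector<Text, IntWritable> out, Reporter reporter) throws IOException {
      out.collect(new Text(rec.url), new IntWritable(rec.hits));
    }
  }

  public static void main(String[] args) throws IOException {
    JobConf job = new JobConf(DbReadSketch.class);
    job.setInputFormat(DBInputFormat.class);

    // Driver class, URL and credentials are placeholders.
    DBConfiguration.configureDB(job, "com.mysql.jdbc.Driver",
        "jdbc:mysql://dbhost:3306/mydb", "user", "password");

    // SELECT url, hits FROM access ORDER BY url, split across the mappers.
    DBInputFormat.setInput(job, AccessRecord.class,
        "access", null /* conditions */, "url" /* order by */, "url", "hits");

    job.setMapperClass(DumpMapper.class);
    job.setNumReduceTasks(0);                 // map-only job
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setOutputFormat(TextOutputFormat.class);
    FileOutputFormat.setOutputPath(job, new Path("db-read-output"));

    JobClient.runJob(job);
  }
}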

On Feb 16, 2009, at 8:33 AM, sandhiya wrote:



@Amandeep
Hi,
I'm new to Hadoop and am trying to run a simple database connectivity
program on it. Could you please tell me how you went about it? My mail id
is "sandys_cr...@yahoo.com". A copy of your code that successfully
connected to MySQL will also be helpful.
Thanks,
Sandhiya

Enis Soztutar-2 wrote:


From the exception:

java.io.IOException: ORA-00933: SQL command not properly ended

I would broadly guess that the Oracle JDBC driver might be complaining that
the statement does not end with ";", or something similar. You can:
1. download the latest source code of Hadoop
2. add a print statement printing the query (probably in DBInputFormat:119)
3. build the Hadoop jar
4. use the new Hadoop jar to see the actual SQL query
5. run the query against Oracle to see if it gives an error (a sketch for
this step follows below).

Enis
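
A hedged sketch for step 5: once the debug print shows the generated 
statement, feed it to Oracle through plain JDBC and see whether it parses. 
One plausible culprit worth checking for: the 0.19 DBInputFormat pages its 
results with a "LIMIT ... OFFSET ..." clause, which Oracle does not accept, 
and ORA-00933 is how Oracle complains about trailing syntax it does not 
understand. The driver class, URL and credentials below are placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class OracleQueryCheck {
  public static void main(String[] args) throws Exception {
    // Pass the query printed by the debug statement as the first argument.
    String query = args[0];
    Class.forName("oracle.jdbc.driver.OracleDriver");
    Connection conn = DriverManager.getConnection(
        "jdbc:oracle:thin:@dbhost:1521:orcl", "user", "password");
    try {
      Statement st = conn.createStatement();
      ResultSet rs = st.executeQuery(query);
      System.out.println("Oracle accepted the statement.");
      rs.close();
      st.close();
    } catch (SQLException e) {
      System.out.println("Oracle rejected it: " + e.getMessage());
    } finally {
      conn.close();
    }
  }
}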


Amandeep Khurana wrote:

Ok. I created the same database in MySQL and ran the same Hadoop job
against it. It worked. So that means there is some Oracle-specific issue.
It can't be an issue with the JDBC drivers, since I am using the same
drivers in a simple JDBC client.

What could it be?

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Wed, Feb 4, 2009 at 10:26 AM, Amandeep Khurana 
wrote:


Ok. I'm not sure if I got it correct. Are you saying I should test the
statement that Hadoop creates directly against the database?

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Wed, Feb 4, 2009 at 7:13 AM, Enis Soztutar 
wrote:


HADOOP-2536 connects to the database via JDBC, so in theory it should work
with proper JDBC drivers.
It has been tested against MySQL, Hsqldb, and PostgreSQL, but not Oracle.

To answer your earlier question, the actual SQL statements might not be
recognized by Oracle, so I suggest the best way to test this is to insert
print statements, and run the actual SQL statements against Oracle to see
if the syntax is accepted.

We would appreciate it if you publish your results.

Enis


Amandeep Khurana wrote:


Does the patch HADOOP-2536 support connecting to Oracle databases as well?
Or is it just limited to MySQL?

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

















RE: capacity scheduler for 0.18.x?

2009-02-16 Thread Vivek Ratan
There is no patch for the Capacity Scheduler for 0.18.x.  

> -Original Message-
> From: Bill Au [mailto:bill.w...@gmail.com] 
> Sent: Saturday, February 14, 2009 1:00 AM
> To: core-user@hadoop.apache.org
> Subject: capacity scheduler for 0.18.x?
> 
> I see that there is a patch for the fair scheduler for 0.18.1 
> in HADOOP-3746.  Does anyone know if there is a similar patch 
> for the capacity scheduler?  I did a search on JIRA but 
> didn't find anything.
> 
> Bill
> 


Re: Copying a file to specified nodes

2009-02-16 Thread Rasit OZDAS
Thanks, Jeff.
After considering the JIRA link you've given and doing some investigation:

It seems that this JIRA ticket didn't draw much attention, so it will
take a long time to be considered.
After some more investigation I found out that when I copy a file to
HDFS from a specific DataNode, the first copy will be written to that
DataNode itself. This solution will take a while to implement, I think.
But we definitely need this feature, so if we have no other choice,
we'll go through it.

Any further info (or comments on my solution) is appreciated.

Cheers,
Rasit
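
A sketch of the interim approach described above: run the upload on the 
node that should hold the first replica, so the default placement policy 
puts the first copy there. The hostname and paths are placeholders.

$ ssh node1 '/usr/local/hadoop/bin/hadoop fs -put /local/data/userA/file1 /users/userA/file1'
# the client runs on node1, so the first replica of file1 lands on node1's
# datanode; the remaining replicas still go wherever HDFS chooses
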

2009/2/10 Jeff Hammerbacher :
> Hey Rasit,
>
> I'm not sure I fully understand your description of the problem, but
> you might want to check out the JIRA ticket for making the replica
> placement algorithms in HDFS pluggable
> (https://issues.apache.org/jira/browse/HADOOP-3799) and add your use
> case there.
>
> Regards,
> Jeff
>
> On Tue, Feb 10, 2009 at 5:05 AM, Rasit OZDAS  wrote:
>>
>> Hi,
>>
>> We have thousands of files, each dedicated to a user.  (Each user has
>> access to other users' files, but they don't use this very often.)
>> Each user runs map-reduce jobs on the cluster.
>> So we should separate his/her files equally across the cluster,
>> so that every machine can take part in the process (assuming he/she is
>> the only user running jobs).
>> For this we should initially copy files to specified nodes:
>> User A :   first file : Node 1, second file: Node 2, .. etc.
>> User B :   first file : Node 1, second file: Node 2, .. etc.
>>
>> I know Hadoop also creates replicas, but in our solution at least one
>> file will be in the right place
>> (or we're willing to control the other replicas too).
>>
>> Rebalancing is also not a problem, assuming it uses the information
>> about how much a computer is in use.
>> It even helps for a better organization of files.
>>
>> How can we copy files to specified nodes?
>> Or do you have a better solution for us?
>>
>> I couldn't find a solution to this, probably such an option doesn't exist.
>> But I wanted to take an expert's opinion about this.
>>
>> Thanks in advance..
>> Rasit
>



-- 
M. Raşit ÖZDAŞ