Stopping datanodes dynamically

2011-01-31 Thread Jam Ram

How can I remove multiple datanodes dynamically from the master node without
stopping it?
-- 
View this message in context: 
http://old.nabble.com/Stopping-datanodes-dynamically-tp30804859p30804859.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



Re: Stopping datanodes dynamically

2011-01-31 Thread Rekha Joshi
I think if you add the datanode host to the dfs.hosts.exclude setting in the HDFS
conf and refresh the nodes [hadoop dfsadmin -refreshNodes], it might work. Thanks,
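Roughly, it looks like this (the exclude-file path and hostname below are just
examples; note the namenode typically needs dfs.hosts.exclude in its conf when it
starts for the refresh to see the file):

<!-- hdfs-site.xml -->
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.exclude</value>
</property>

# one hostname per line in the exclude file
echo "datanode-07.example.com" >> /etc/hadoop/conf/dfs.exclude
# ask the namenode to re-read its include/exclude lists
hadoop dfsadmin -refreshNodes

The excluded nodes should then show as decommissioning in the namenode web UI until
their blocks are re-replicated, after which they can be stopped.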





Re: Draining/Decommissioning a tasktracker

2011-01-31 Thread Koji Noguchi
Rishi,

> Using an exclude list for the TT will not help, as Koji has already mentioned
>
It'll help a bit, in the sense that no more tasks are assigned to that TaskTracker
once it is excluded.
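For reference, the exclude-list mechanics look roughly like this - the file path is
illustrative, and 'hadoop mradmin -refreshNodes' only exists on releases that support
TT decommissioning, so on 0.20.2 a JobTracker restart may be needed instead; treat
this as a sketch:

<!-- mapred-site.xml -->
<property>
  <name>mapred.hosts.exclude</name>
  <value>/etc/hadoop/conf/mapred.exclude</value>
</property>

echo "tt-node-01.example.com" >> /etc/hadoop/conf/mapred.exclude
hadoop mradmin -refreshNodes    # where supported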

As for TT decommissioning and map output handling, I opened a Jira for further
discussion.
https://issues.apache.org/jira/browse/MAPREDUCE-2291

Koji


On 1/29/11 5:37 AM, "rishi pathak"  wrote:

Hi,
Here is a description of what we are trying to achieve (whether it is
possible or not is still not clear):
We have large computing clusters used mostly for MPI jobs. We use PBS/Torque
and Maui for resource allocation and scheduling.
At most times utilization is very high, except for very small resource pockets
of, say, 16 cores for 2-5 hrs. We are trying to establish the feasibility of using
these small (but fixed-size) resource pockets for nutch crawls. Our configuration is:

# Hadoop 0.20.2 (packaged with nutch)
# Lustre parallel filesystem for data storage
# No HDFS

We have the JT running on one of the login nodes at all times.
A request for resources (nodes=16, walltime=05 hrs.) is made through the batch system,
and TTs are provisioned as part of the job. The problem is that when a job expires,
user processes are cleaned up and thus the TT gets killed. With that, completed and
running map/reduce tasks for the nutch job are killed and rescheduled. Possible
solutions as we see them:

1. As the filesystem is shared (and persistent), restart tasks on another TT and
make the intermediate task data available, i.e. a sort of checkpointing.
2. TT draining - based on a speculative estimate of task completion time, a TT whose
walltime is nearing expiry goes into draining mode, i.e. no new tasks will be
scheduled on that TT.

For '1', it is very far-fetched (we are no Hadoop experts).
'2' seems to be the more sensible approach.

Using an exclude list for the TT will not help, as Koji has already mentioned.
We looked into the capacity scheduler but didn't find any pointers. Phil, which
versions of hadoop have these hooks in the scheduler?

On Sat, Jan 29, 2011 at 3:34 AM, phil young  wrote:
There are some hooks available in the schedulers that could be useful also.
I think they were expected to be used to allow you to schedule tasks based
on load average on the host, but I'd expect you can customize them for your
purpose.


On Fri, Jan 28, 2011 at 6:46 AM, Harsh J  wrote:

> Moving discussion to the MapReduce-User list:
> mapreduce-u...@hadoop.apache.org
>
> Reply inline:
>
> On Fri, Jan 28, 2011 at 2:39 PM, rishi pathak 
> wrote:
> > Hi,
> > Is there a way to drain a tasktracker? What we require is not to
> > schedule any more map/reduce tasks onto a tasktracker (mark it offline), but
> > the running tasks should not be affected.
>
> You could simply shut the TT down. MapReduce was designed with faults
> in mind and thus tasks that are running on a particular TaskTracker
> can be re-run elsewhere if they failed. Is this not usable in your
> case?
>
> --
> Harsh J
> www.harshj.com 
>




Re: message transmission in Hadoop

2011-01-31 Thread Da Zheng

Yes, this is exactly what I observed. Reading is another problem. Thanks.
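In case it helps anyone reproduce the observation without systemtap, a rough
strace-based count of the 1-byte writes (the PID is illustrative):

# attach to the DataNode/DFSClient JVM and count write()/sendto() calls that
# transferred exactly one byte; stop with Ctrl-C to see the count
strace -f -e trace=write,sendto -p 12345 2>&1 | grep -c '= 1$'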

Best,
Da

On 01/30/2011 05:25 PM, Jeff Hammerbacher wrote:

Hey Da,

You may have observed https://issues.apache.org/jira/browse/HDFS-1601.

Regards,
Jeff

On Fri, Jan 28, 2011 at 7:08 PM, Da Zheng  wrote:


Hello,

I monitored the system calls of HDFS with systemtap and found that HDFS
actually sends many 1-byte writes to the network. I could also see many 8-byte
and 64-byte writes handed to the OS, though I don't know whether they go to
disk or to the network. I did see many 8-byte writes sent to the network. The
number of these small writes is several times larger than the number of 64KB
data packets sent by HDFS.

Could anyone tell me why HDFS sends so many small packets? Heartbeat messages?
RPCs? It doesn't seem to me that these messages can be just 1 byte.

Thanks,
Da





Re: Draining/Decommissioning a tasktracker

2011-01-31 Thread rishi pathak
Hi Koji,
   Thanks for opening the feature request. Right now, for the purpose stated
earlier, I have upgraded to hadoop 0.21 and am trying to see whether creating
individual leaf-level queues for every tasktracker and changing a queue's state
to 'stopped' before the node's walltime expires will do the job. Seems like it
will work for now.
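Roughly, the idea is a mapred-queues.xml along these lines (queue names are
illustrative, and the exact schema is worth checking against the 0.21 docs - this is
just a sketch):

<queues>
  <queue>
    <name>tt-node-01</name>
    <!-- flip to "stopped" shortly before this node's walltime expires -->
    <state>running</state>
  </queue>
  <queue>
    <name>tt-node-02</name>
    <state>stopped</state>
  </queue>
</queues>

followed by something like 'hadoop mradmin -refreshQueues' so the JobTracker picks up
the change.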

P.S. - What credentials are required for commenting on an issue in Jira?



-- 
---
Rishi Pathak
National PARAM Supercomputing Facility
C-DAC, Pune, India


Re: Draining/Decommissioning a tasktracker

2011-01-31 Thread rishi pathak
Still need to figure out whether a queue can be associated with a TT, i.e. a
per-queue TT ACL such that tasks submitted to that queue are only dispatched to
the TTs in the queue's ACL list.



-- 
---
Rishi Pathak
National PARAM Supercomputing Facility
C-DAC, Pune, India


Re: Draining/Decommissioning a tasktracker

2011-01-31 Thread Koji Noguchi
Hi Rishi,

> P.S. - What credentials are required for commenting on an issue in Jira
>
It's open source.  I'd say none :)

My feature request is for regular hadoop clusters, whereas yours is pretty
unique. Not sure whether that Jira applies to your need or not.

Koji



Pseudo-Distributed Mode installation failure: Need help

2011-01-31 Thread sharath jagannath
Hey All,

I am trying to install hadoop 0.20.2 and got as far as starting the daemons with
bin/start-all.sh, but I see no datanode created.

jps output:
43323 JobTracker
43281 SecondaryNameNode
43162 NameNode
43403 Jps
43381 TaskTracker
19747

Namenode log:
2011-01-31 15:08:47,343 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 6 on 9000, call addBlock(/tmp/hadoop-sjagannath/mapred/system/
jobtracker.info, DFSClient_-462680310) from 127.0.0.1:61167: error:
java.io.IOException: File /tmp/hadoop-sjagannath/mapred/system/
jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /tmp/hadoop-sjagannath/mapred/system/
jobtracker.info could only be replicated to 0 nodes, instead of 1
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
at
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)


I am sure I am doing whatever the quickstart asks me to do.
I am following:
a.
http://hadoop-tutorial.blogspot.com/2010/11/running-hadoop-in-pseudo-distributed.html
b. http://hadoop.apache.org/common/docs/current/single_node_setup.html

I am not sure what is going on.
I would love to clear this hurdle and get my hands dirty with the actual
work.
Please let me know what is happening.


Thanks,
Sharath


Benchmarking performance in Amazon EC2/EMR environment

2011-01-31 Thread Aaron Eng
Hi all,

I was wondering if any of you have had a similar experience working with
Hadoop in Amazon's environment.  I've been running a few jobs over the last
few months and have noticed them taking more and more time.  For instance, I
was running teragen/terasort/teravalidate as a benchmark and I've noticed
the average execution times of all three jobs have increased by 25-33% this
month vs. what I was seeing in December.  When I was able to quantify this I
started collecting some disk IO stats using SAR and dd.  I found that on any
given node in an EMR cluster, the throughput to the ephemeral storage ranged
from <30MB/s to >400MB/s.  I also noticed that when using EBS volumes, the
throughput would range from ~20MB/s up to 100MB/s.  Since those jobs are I/O
bound I would have to assume that these huge swings in speed are causing my
jobs to take longer.  Unfortunately I wasn't collecting the SAR/dd info in
December so I don't have anything to compare it to.

Just wondering if others have done these types of performance benchmarks and
how they went about tuning Hadoop or tuning how you run your jobs to mitigate
the effects.  If these were small variations in performance I wouldn't be
too concerned.  But in any given test, I can have a drive running >20x
faster/slower than another drive.
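For anyone who wants to reproduce the measurement, something like the following works
(mount point and sizes are illustrative, not exactly what I ran):

# crude sequential-write throughput check on an ephemeral or EBS mount
dd if=/dev/zero of=/mnt/ddtest bs=1M count=1024 conv=fsync
# sample device-level throughput every 5 seconds for 5 minutes (needs sysstat)
sar -d 5 60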





Re: Pseudo-Distributed Mode installation failure: Need help

2011-01-31 Thread Harsh J
Check your DataNode logs under $HADOOP_HOME/logs/. It would have the
reason why it did not start.

You can also issue `hadoop datanode` and watch the exceptional movie play.
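Something along these lines, assuming the stock tarball layout (log file names may
differ on your install):

# confirm there really are zero live datanodes
hadoop dfsadmin -report
# read the DataNode's own log for the startup failure
less $HADOOP_HOME/logs/hadoop-*-datanode-*.log
# or run it in the foreground and read the exception directly
bin/hadoop datanode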




-- 
Harsh J
www.harshj.com


file:/// has no authority

2011-01-31 Thread danoomistmatiste

Hi,  I have set up a Hadoop cluster as per the instructions for CDH3.  When I
try to start the datanode on the slave, I get this error:

org.apache.hadoop.hdfs.server.datanode.DataNode:
java.lang.IllegalArgumentException: Invalid URI for NameNode address
(check fs.defaultFS): file:/// has no authority.

I have set up the right parameters in core-site.xml, where <namenode-ip> is the
IP address where the namenode is running:

<property>
  <name>fs.default.name</name>
  <value>hdfs://<namenode-ip>:54310</value>
</property>


-- 
View this message in context: 
http://old.nabble.com/file%3Ahas-no-authority-tp30813534p30813534.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



Re: file:/// has no authority

2011-01-31 Thread Todd Lipcon
Hi,

Double check that your configuration XML files are well-formed. You can do
this easily using a validator like "tidy". My guess is that one of the tags
is mismatched so the configuration isn't being read.
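For example, something like this (xmllint ships with libxml2; the paths assume a
CDH-style layout and the hostname is just a placeholder):

xmllint --noout /etc/hadoop/conf/core-site.xml
xmllint --noout /etc/hadoop/conf/hdfs-site.xml

A well-formed core-site.xml would look like:

<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:54310</value>
  </property>
</configuration>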

-Todd



-- 
Todd Lipcon
Software Engineer, Cloudera