Dear Hadoop Users and Developers,
It looks like the getTracker() method in the JobTracker class (used to get hold
of a running JobTracker instance) no longer exists in the 0.15.1 release.
The reason I want an instance of JobTracker is to get some information about
the current and old job status.
Is there any o
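Roughly, what I am after is something like the sketch below, written against
JobClient instead of JobTracker. The method names here are from memory, so
please check them against the 0.15.1 API docs before relying on them:

    import java.io.IOException;

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.JobStatus;

    public class ListRunningJobs {
      public static void main(String[] args) throws IOException {
        // Connects to the JobTracker named in hadoop-site.xml on the classpath.
        JobClient client = new JobClient(new JobConf());
        // Jobs that have been submitted but have not completed yet.
        JobStatus[] running = client.jobsToComplete();
        for (JobStatus status : running) {
          System.out.println(status.getJobId()
              + "  map=" + status.mapProgress()
              + "  reduce=" + status.reduceProgress());
        }
      }
    }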
/etc/hosts may be buggered as well. What is the entry for localhost?
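For reference, a sane /etc/hosts normally has something along the lines of

    127.0.0.1   localhost.localdomain localhost

plus a second line mapping the machine's real IP address to its hostname.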
On 1/2/08 3:48 PM, "Billy Pearson" <[EMAIL PROTECTED]> wrote:
>
>
>> localhost: ssh: localhost: Name or service not known
>
> that error looks like ssh is not running
>
> make sure it's running and working
> try to ssh to
I think the best option would be to be able to set the max per node in its
config file.
I think someone is working, or has worked, on this; I've seen something in Jira.
For the new option I would think a job override would work something like this
(sketched below):
1) Check the node config; if the job override is lower than the node's setting,
then use j
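Just to make the idea concrete, the resolution logic I am picturing is roughly
the sketch below. This is not existing Hadoop code, and the two limits are only
illustrative parameters:

    // Sketch of the proposed per-job override; not actual Hadoop code.
    public class TaskLimitSketch {
        // nodeMax would come from the tasktracker's own config,
        // jobOverride from the submitted job (both hypothetical here).
        static int effectiveMaxTasks(int nodeMax, int jobOverride) {
            // The job may only lower the node's limit, never raise it.
            return Math.min(nodeMax, jobOverride);
        }

        public static void main(String[] args) {
            System.out.println(effectiveMaxTasks(4, 1)); // job wants 1 per node -> 1
            System.out.println(effectiveMaxTasks(4, 8)); // job asks for too many -> 4
        }
    }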
On Thu, Jan 03, 2008 at 10:12:04AM +0530, Arun C Murthy wrote:
>On Wed, Jan 02, 2008 at 12:08:53PM -0800, Jason Venner wrote:
>>In our case, we have specific jobs that due to resource constraints can
>>only be run serially (ie: 1 instance per machine).
>
>I see, at this point there isn't anything
On Wed, Jan 02, 2008 at 12:08:53PM -0800, Jason Venner wrote:
>In our case, we have specific jobs that due to resource constraints can
>only be run serially (ie: 1 instance per machine).
I see, at this point there isn't anything in Hadoop which can help you out
here...
Having said that, could y
localhost: ssh: localhost: Name or service not known
that error looks like ssh is not running
make sure it's running and working
try to ssh to localhost from the server
ssh localhost
and see if it works.
Billy
- Original Message -
From: "Natarajan, Senthil" <[EMAIL PROTECTED]>
i crawl "http://lucene.apache.org"; and in conf/crawl-urlfilter.txt i set that
"+^http://([a-z0-9]*\.)*apache.org/" when i use command "bin/nutch crawl
urls -dir crawled -depth 3" have error that
- crawl started in: crawled
- rootUrlDir = urls
- threads = 10
- depth = 3
- Injector: starting
Some of the tasks I have will overrun the servers if I run, say, 2 of them per
node, but I have other tasks where I can run 4 on a server, so I was looking to
configure this on the command line to better spread the work the way we want to.
Billy
"Arun C Murthy" <[EMAIL PROTECTED]> wrote in
message news
If you run the Hadoop processes under the 'hadoop' account and have set the
Hadoop data directory to a particular directory, you need to make sure that
the 'hadoop' account can write to that directory.
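For example, if the data directory were /home/hadoop/dfs (substitute whatever
path you actually configured), running something like

    chown -R hadoop:hadoop /home/hadoop/dfs

as root should give the 'hadoop' account the write access it needs.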
On Jan 2, 2008 2:06 PM, Natarajan, Senthil <[EMAIL PROTECTED]> wrote:
> I Just uncommented and changed the JAVA_H
I just uncommented and changed JAVA_HOME; that's all I did in hadoop-env.sh.
Do I need to configure anything else?
Here is the hadoop-env.sh
# Set Hadoop-specific environment variables here.
# The only required environment variable is JAVA_HOME. All others are
# optional. When running a d
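The JAVA_HOME line I uncommented is of the form below; the path here is just a
placeholder, and mine points at the local JDK install directory:

    export JAVA_HOME=/path/to/your/jdk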
Well, you have something very strange going on in your scripts. Have you
looked at hadoop-env.sh?
On 1/2/08 1:58 PM, "Natarajan, Senthil" <[EMAIL PROTECTED]> wrote:
>> /bin/bash: /root/.bashrc: Permission denied
>> localhost: ssh: localhost: Name or service not known
>> /bin/bash: /root/.bashr
No, I am running the processes as user "hadoop"; I created a separate user for
running the Hadoop daemons.
-Original Message-
From: Ted Dunning [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 02, 2008 4:55 PM
To: hadoop-user@lucene.apache.org
Subject: Re: Datanode Problem
I don't know wh
I don't know what your problem is, but I note that you appear to be running
processes as root.
This is a REALLY bad idea. It may also be related to your problem.
On 1/2/08 1:33 PM, "Natarajan, Senthil" <[EMAIL PROTECTED]> wrote:
> Hi,
> I am new to Hadoop. I just downloaded release 0.14.4 (ha
> From: Joydeep Sen Sarma [mailto:[EMAIL PROTECTED]
>
> HDFS doesn't allow random overwrites or appends. So even if
> HDFS were mountable, I am guessing we couldn't just do an
> rsync to a DFS mount (never looked at the rsync code, but
> assuming it does appends/random writes). Any emulation of
> rsyn
Hi,
I am new to Hadoop. I just downloaded release 0.14.4 (hadoop-0.14.4.tar.gz) and
am trying to set up Hadoop on a single machine (Red Hat Linux 9) by following
the link
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
It looks like the datanode is not starting,
In our case, we have specific jobs that due to resource constraints can
only be run serially (ie: 1 instance per machine).
Most of our jobs are more normal and can be run in parallel on the machines.
Arun C Murthy wrote:
Billy,
On Wed, Jan 02, 2008 at 01:38:06PM -0600, Billy wrote:
If I a
I believe you get this ability in about 0.16.0.
As of 0.15.1 this is a per-cluster value set at start time.
Billy wrote:
If I add this to a command line as a -jobconf should it be enforced?
Say I have a job that I want to run only 1 map at a time per server
I have tried this and look in the job.x
Billy,
On Wed, Jan 02, 2008 at 01:38:06PM -0600, Billy wrote:
>If I add this to a command line as a -jobconf should it be enforced?
>
This is a property of the TaskTracker and hence cannot be set on a per-job
basis...
>Say I have a job that I want to run only 1 map at a time per server
>
Coul
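(For reference, the tasktracker-level knob in the 0.15 line is, as far as I
recall, mapred.tasktracker.tasks.maximum, set in each node's hadoop-site.xml
before the tasktracker starts. Please check hadoop-default.xml in your release
for the exact name and default:)

    <property>
      <name>mapred.tasktracker.tasks.maximum</name>
      <value>1</value>
      <description>Maximum number of tasks this tasktracker runs at once.</description>
    </property>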
HDFS doesn't allow random overwrites or appends. So even if HDFS were mountable,
I am guessing we couldn't just do an rsync to a DFS mount (I've never looked at
the rsync code, but I assume it does appends/random writes). Any emulation of
rsync would end up having to delete and recreate changed files in
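(To make that concrete, the crudest emulation I can picture would just delete
and re-upload each file in full through the FileSystem API; a rough, untested
sketch, not a real sync tool:)

    import java.io.File;
    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CrudeDfsSync {
        // Upload one local file, deleting any existing DFS copy first,
        // since HDFS files cannot be overwritten or appended to in place.
        static void push(FileSystem fs, File local, Path remote) throws IOException {
            if (fs.exists(remote)) {
                fs.delete(remote);  // no in-place update possible
            }
            fs.copyFromLocalFile(new Path(local.getPath()), remote);
        }

        public static void main(String[] args) throws IOException {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);  // default fs from hadoop-site.xml
            push(fs, new File(args[0]), new Path(args[1]));
        }
    }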
If I add this to the command line as a -jobconf, should it be enforced?
Say I have a job that I want to run only 1 map at a time per server.
I have tried this and looked in the job.xml file; it is set correctly but not
enforced.
Billy
That is a good idea. I currently use a shell script that does the rough
equivalent of rsync -av, but it wouldn't be bad to have a one-liner that
solves the same problem.
One (slight) benefit to the scripted approach is that I get a list of
directories to which files have been moved. That lets m
> -Original Message-
> From: Stefan Groschupf [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, January 02, 2008 3:46 AM
> To: hadoop-user@lucene.apache.org
> Subject: Re: HBase implementation question
>
> Hi,
> > Reads are probably a bit more complicated than writes. A read
> > operation first
Hello,
Does anyone know of a modified "rsync" that gets/puts files to/from the dfs
instead of the normal, mounted filesystems? I'm guessing that, since the DFS
can't be mounted like a "normal" filesystem, rsync would need to be modified in
order to access it, as with any other program. We use r
Arun C Murthy wrote:
>
> What version of Hadoop are you running?
> Dhaya007:hadoop-0.15.1
>
> http://wiki.apache.org/lucene-hadoop/Help
>
> Dhaya007 wrote:
> > ..datanode-slave.log
>> 2007-12-19 19:30:55,579 WARN org.apache.hadoop.dfs.DataNode: Invalid
>> directory in dfs.data.dir: directory
What version of Hadoop are you running?
http://wiki.apache.org/lucene-hadoop/Help
Dhaya007 wrote:
> ..datanode-slave.log
2007-12-19 19:30:55,579 WARN org.apache.hadoop.dfs.DataNode: Invalid
directory in dfs.data.dir: directory is not writable:
/tmp/hadoop-hdpusr/dfs/data
2007-12-19 19:30:55,579
Hi,
Reads are probably a bit more complicated than writes. A read
operation first checks the cache and may satisfy the request
directly from the cache. If not, the operation checks the
newest MapFile for the data, then the next to newest, ..., down to the oldest,
stopping when the requested data has be
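(Purely to illustrate the order of the lookups, here is a small sketch; these
are not HBase's real classes, just invented types for the example:)

    import java.util.List;
    import java.util.Map;

    // Not HBase code: just the lookup order described above
    // (cache first, then MapFiles from newest to oldest).
    class ReadPathSketch {
        private final Map<String, byte[]> cache;           // in-memory cache (hypothetical)
        private final List<Map<String, byte[]>> mapFiles;  // index 0 = newest MapFile (hypothetical)

        ReadPathSketch(Map<String, byte[]> cache, List<Map<String, byte[]>> mapFiles) {
            this.cache = cache;
            this.mapFiles = mapFiles;
        }

        byte[] get(String key) {
            byte[] value = cache.get(key);                  // 1. try the cache
            if (value != null) {
                return value;
            }
            for (Map<String, byte[]> mapFile : mapFiles) {  // 2. newest ... oldest
                value = mapFile.get(key);
                if (value != null) {
                    return value;                           // stop at the first (newest) hit
                }
            }
            return null;                                    // not found in any MapFile
        }
    }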
Thanks for your reply. I am using passwordless ssh from master to slave, and
the following are the logs (slave):
..datanode-slave.log
2007-12-19 19:30:55,237 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG:
/
STARTUP_MSG: Starting DataNode
STARTUP
i crawl "http://lucene.apache.org"; and in conf/crawl-urlfilter.txt i set that
"+^http://([a-z0-9]*\.)*apache.org/" when i use command "bin/nutch crawl
urls -dir crawled -depth 3" have error that
- crawl started in: crawled
- rootUrlDir = urls
- threads = 10
- depth = 3
- Injector: starting
- Inj