Why is the getTracker() method in the JobTracker class no longer in the 0.15.1 release?

2008-01-02 Thread Taeho Kang
Dear Hadoop Users and Developers, It looks like the getTracker() method in the JobTracker class (used to get hold of a running JobTracker instance) no longer exists in the 0.15.1 release. The reason I want an instance of JobTracker is to get some information about current and old job status. Is there any o
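For reference, one workaround is to query the running JobTracker over RPC through JobClient rather than holding a JobTracker instance. A minimal sketch against the old org.apache.hadoop.mapred API; method names should be verified against the 0.15.1 javadocs:

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.JobStatus;

    public class ListJobs {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();            // picks up hadoop-site.xml
        JobClient client = new JobClient(conf);  // RPC proxy to the JobTracker
        // Jobs currently running or queued; per-job progress comes from the
        // JobStatus objects the JobTracker hands back.
        for (JobStatus status : client.jobsToComplete()) {
          System.out.println(status.getJobId()
              + " map=" + status.mapProgress()
              + " reduce=" + status.reduceProgress());
        }
      }
    }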

Re: Datanode Problem

2008-01-02 Thread Ted Dunning
/etc/hosts may be buggered as well. What is the entry for localhost? On 1/2/08 3:48 PM, "Billy Pearson" <[EMAIL PROTECTED]> wrote: > > >> localhost: ssh: localhost: Name or service not known > > that error looks like ssh is not running > > make sure it's running and working > try to ssh to
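The "Name or service not known" part is a resolver failure rather than an ssh failure, so it can be reproduced outside ssh. A throwaway check assuming only a stock JDK; the /etc/hosts line in the comment is the usual default, not taken from the poster's machine:

    import java.net.InetAddress;
    import java.net.UnknownHostException;

    public class CheckLocalhost {
      public static void main(String[] args) {
        try {
          InetAddress addr = InetAddress.getByName("localhost");
          System.out.println("localhost resolves to " + addr.getHostAddress());
        } catch (UnknownHostException e) {
          // Same failure ssh hits; a typical /etc/hosts fix is the line:
          //   127.0.0.1   localhost
          System.out.println("localhost does not resolve; check /etc/hosts");
        }
      }
    }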

Re: mapred.tasktracker.map.tasks.maximum

2008-01-02 Thread Billy
I think the best option would be to be able to set the max per node in its config file. I think someone is working, or has worked, on this; I saw something in Jira for the new option. I would think a job override would work something like this: 1) Check the node config; if the job override is lower than the node's, then use the j
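That override rule might reduce to taking the smaller of the two values. A sketch of the logic only; the names are illustrative and none of this is Hadoop API:

    public class TaskCap {
      // Honor a per-job max only when it asks for LESS than the node allows;
      // the node's own configured cap is always the upper bound.
      static int effectiveMaxMaps(int nodeConfigMax, Integer jobOverride) {
        if (jobOverride != null && jobOverride < nodeConfigMax) {
          return jobOverride;
        }
        return nodeConfigMax;
      }

      public static void main(String[] args) {
        System.out.println(effectiveMaxMaps(4, 1));    // 1: override honored
        System.out.println(effectiveMaxMaps(4, 8));    // 4: override too high
        System.out.println(effectiveMaxMaps(4, null)); // 4: no override set
      }
    }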

Re: mapred.tasktracker.map.tasks.maximum

2008-01-02 Thread Arun C Murthy
On Thu, Jan 03, 2008 at 10:12:04AM +0530, Arun C Murthy wrote: >On Wed, Jan 02, 2008 at 12:08:53PM -0800, Jason Venner wrote: >>In our case, we have specific jobs that due to resource constraints can >>only be run serially (ie: 1 instance per machine). > >I see, at this point there isn't anything

Re: mapred.tasktracker.map.tasks.maximum

2008-01-02 Thread Arun C Murthy
On Wed, Jan 02, 2008 at 12:08:53PM -0800, Jason Venner wrote: >In our case, we have specific jobs that due to resource constraints can >only be run serially (ie: 1 instance per machine). I see, at this point there isn't anything in Hadoop which can help you out here... Having said that, could y

Re: Datanode Problem

2008-01-02 Thread Billy Pearson
localhost: ssh: localhost: Name or service not known. That error looks like ssh is not running. Make sure it's running and working; try to ssh to localhost from the server (ssh localhost) and see if it works. Billy - Original Message - From: "Natarajan, Senthil" <[EMAIL PROTECTED]>

Re: Nutch crawl problem

2008-01-02 Thread jibjoice
i crawl "http://lucene.apache.org"; and in conf/crawl-urlfilter.txt i set that "+^http://([a-z0-9]*\.)*apache.org/" when i use command "bin/nutch crawl urls -dir crawled -depth 3" have error that - crawl started in: crawled - rootUrlDir = urls - threads = 10 - depth = 3 - Injector: starting

Re: mapred.tasktracker.map.tasks.maximum

2008-01-02 Thread Billy
Some of the tasks I have will overrun the servers if I run, say, 2 of them per node, but there are other tasks where I can run 4 on a server, so I was looking to configure it on the command line to better spread the work the way we want to. Billy "Arun C Murthy" <[EMAIL PROTECTED]> wrote in message news

Re: Datanode Problem

2008-01-02 Thread charles du
If you run the hadoop process under the account 'hadoop' and have set the hadoop data directory to a particular directory, you need to make sure that the hadoop account can write to that directory. On Jan 2, 2008 2:06 PM, Natarajan, Senthil <[EMAIL PROTECTED]> wrote: > I just uncommented and changed the JAVA_H

RE: Datanode Problem

2008-01-02 Thread Natarajan, Senthil
I just uncommented and changed JAVA_HOME; that's all I did in hadoop-env.sh. Do I need to configure anything else? Here is the hadoop-env.sh: # Set Hadoop-specific environment variables here. # The only required environment variable is JAVA_HOME. All others are # optional. When running a d

Re: Datanode Problem

2008-01-02 Thread Ted Dunning
Well, you have something very strange going on in your scripts. Have you looked at hadoop-env.sh? On 1/2/08 1:58 PM, "Natarajan, Senthil" <[EMAIL PROTECTED]> wrote: >> /bin/bash: /root/.bashrc: Permission denied >> localhost: ssh: localhost: Name or service not known >> /bin/bash: /root/.bashr

RE: Datanode Problem

2008-01-02 Thread Natarajan, Senthil
No, I am running the processes as user "hadoop"; I created a separate user for running the hadoop daemons. -Original Message- From: Ted Dunning [mailto:[EMAIL PROTECTED] Sent: Wednesday, January 02, 2008 4:55 PM To: hadoop-user@lucene.apache.org Subject: Re: Datanode Problem I don't know wh

Re: Datanode Problem

2008-01-02 Thread Ted Dunning
I don't know what your problem is, but I note that you appear to be running processes as root. This is a REALLY bad idea. It may also be related to your problem. On 1/2/08 1:33 PM, "Natarajan, Senthil" <[EMAIL PROTECTED]> wrote: > Hi, > I am new to Hadoop. I just downloaded release 0.14.4 (ha

RE: Is there an rsyncd for HDFS

2008-01-02 Thread Greg Connor
> From: Joydeep Sen Sarma [mailto:[EMAIL PROTECTED] > > hdfs doesn't allow random overwrites or appends. so even if > hdfs were mountable - i am guessing we couldn't just do a > rsync to a dfs mount (never looked at rsync code - but > assuming it does appends/random-writes). any emulation of > rsyn

Datanode Problem

2008-01-02 Thread Natarajan, Senthil
Hi, I am new to Hadoop. I just downloaded release 0.14.4 (hadoop-0.14.4.tar.gz) and am trying to set up Hadoop on a single machine (RedHat Linux 9) by following the link http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29 Looks like the datanode is not starting; seems,

Re: mapred.tasktracker.map.tasks.maximum

2008-01-02 Thread Jason Venner
In our case, we have specific jobs that, due to resource constraints, can only be run serially (i.e., 1 instance per machine). Most of our jobs are more normal and can be run in parallel on the machines. Arun C Murthy wrote: Billy, On Wed, Jan 02, 2008 at 01:38:06PM -0600, Billy wrote: If I a

Re: mapred.tasktracker.map.tasks.maximum

2008-01-02 Thread Jason Venner
I believe you get this ability around 0.16.0. As of 0.15.1 this is a per-cluster value set at start time. Billy wrote: If I add this to a command line as a -jobconf should it be enforced? Say I have a job that I want to run only 1 map at a time per server I have tried this and look in the job.x

Re: mapred.tasktracker.map.tasks.maximum

2008-01-02 Thread Arun C Murthy
Billy, On Wed, Jan 02, 2008 at 01:38:06PM -0600, Billy wrote: >If I add this to a command line as a -jobconf should it be enforced? > This is a property of the TaskTracker and hence cannot be set on a per-job basis... >Say I have a job that I want to run only 1 map at a time per server > Coul
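A short sketch of why the value shows up in job.xml yet changes nothing: the JobConf happily stores it, but only each TaskTracker's own config, read once at daemon startup, is ever consulted. JobConf calls are per the old mapred API; verify against your release:

    import org.apache.hadoop.mapred.JobConf;

    public class PerJobCap {
      public static void main(String[] args) {
        JobConf job = new JobConf();
        // Lands in job.xml as expected...
        job.setInt("mapred.tasktracker.map.tasks.maximum", 1);
        // ...but TaskTrackers never read it from there; they use whatever
        // their local hadoop-site.xml said when the daemon was started.
        System.out.println(
            job.getInt("mapred.tasktracker.map.tasks.maximum", 2)); // prints 1
      }
    }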

RE: Is there an rsyncd for HDFS

2008-01-02 Thread Joydeep Sen Sarma
HDFS doesn't allow random overwrites or appends. So even if HDFS were mountable, I am guessing we couldn't just do an rsync to a DFS mount (never looked at the rsync code, but assuming it does appends/random writes). Any emulation of rsync would end up having to delete and recreate changed files in
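An emulation along those lines might look like the following sketch against the FileSystem API. Method names are per roughly 0.16-era Hadoop, and older releases spell some of them differently, so treat this as pseudocode rather than a drop-in tool:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DfsSync {
      // One-way sync of a single file into HDFS: since we cannot patch a
      // changed file in place, delete the old copy and re-upload it whole.
      public static void syncFile(FileSystem localFs, FileSystem dfs,
                                  Path src, Path dst) throws Exception {
        FileStatus srcStat = localFs.getFileStatus(src);
        if (dfs.exists(dst)) {
          FileStatus dstStat = dfs.getFileStatus(dst);
          if (dstStat.getLen() == srcStat.getLen()
              && dstStat.getModificationTime() >= srcStat.getModificationTime()) {
            return;          // same length, not older: assume unchanged
          }
          dfs.delete(dst);   // changed: delete...
        }
        dfs.copyFromLocalFile(src, dst); // ...and recreate the whole file
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        syncFile(FileSystem.getLocal(conf), FileSystem.get(conf),
                 new Path(args[0]), new Path(args[1]));
      }
    }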

mapred.tasktracker.map.tasks.maximum

2008-01-02 Thread Billy
If I add this to the command line as a -jobconf, should it be enforced? Say I have a job that I want to run only 1 map at a time per server. I have tried this and looked in the job.xml file, and it's set correctly but not enforced. Billy

Re: Is there an rsyncd for HDFS

2008-01-02 Thread Ted Dunning
That is a good idea. I currently use a shell script that does the rough equivalent of rsync -av, but it wouldn't be bad to have a one-liner that solves the same problem. One (slight) benefit to the scripted approach is that I get a list of directories to which files have been moved. That lets m

RE: HBase implementation question

2008-01-02 Thread Jim Kellerman
> -Original Message- > From: Stefan Groschupf [mailto:[EMAIL PROTECTED] > Sent: Wednesday, January 02, 2008 3:46 AM > To: hadoop-user@lucene.apache.org > Subject: Re: HBase implementation question > > Hi, > > Reads are probably a bit more complicated than writes. A read > > operation first

Is there an rsyncd for HDFS

2008-01-02 Thread Greg Connor
Hello, Does anyone know of a modified "rsync" that gets/puts files to/from the dfs instead of the normal, mounted filesystems? I'm guessing since the dfs can't be mounted like a "normal" filesystem that rsync would need to be modified in order to access it, as with any other program. We use r

Re: Not able to start Data Node

2008-01-02 Thread Dhaya007
Arun C Murthy wrote: > What version of Hadoop are you running? Dhaya007: hadoop-0.15.1 > http://wiki.apache.org/lucene-hadoop/Help > Dhaya007 wrote: >> ..datanode-slave.log >> 2007-12-19 19:30:55,579 WARN org.apache.hadoop.dfs.DataNode: Invalid directory in dfs.data.dir: directory

Re: Not able to start Data Node

2008-01-02 Thread Arun C Murthy
What version of Hadoop are you running? http://wiki.apache.org/lucene-hadoop/Help Dhaya007 wrote: > ..datanode-slave.log 2007-12-19 19:30:55,579 WARN org.apache.hadoop.dfs.DataNode: Invalid directory in dfs.data.dir: directory is not writable: /tmp/hadoop-hdpusr/dfs/data 2007-12-19 19:30:55,579
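The warning quoted above is the DataNode's startup validation of each dfs.data.dir entry. A plain-JDK sketch of the same writability test; the chown hint in the comment is inferred from the path in the log, not stated in the thread:

    import java.io.File;

    public class CheckDataDir {
      public static void main(String[] args) {
        File dir = new File("/tmp/hadoop-hdpusr/dfs/data"); // path from the log
        System.out.println("exists=" + dir.exists()
            + " isDir=" + dir.isDirectory()
            + " writable=" + dir.canWrite());
        // If writable=false for the daemon's user, something like
        //   chown -R hdpusr /tmp/hadoop-hdpusr
        // (run as root) is the usual fix.
      }
    }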

Re: HBase implementation question

2008-01-02 Thread Stefan Groschupf
Hi, Reads are probably a bit more complicated than writes. A read operation first checks the cache and may satisfy the request directly from the cache. If not, the operation checks the newest MapFile for the data, then the next-to-newest, ..., to the oldest, stopping when the requested data has be
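That read path, reduced to a sketch; the types here are stand-ins (plain Maps), not HBase's actual cache and MapFile classes:

    import java.util.List;
    import java.util.Map;

    public class ReadPath {
      // 1. try the in-memory cache; 2. otherwise scan on-disk MapFiles from
      // newest to oldest, returning on the first hit; 3. null if absent.
      static byte[] read(String key, Map<String, byte[]> memcache,
                         List<Map<String, byte[]>> mapFilesNewestFirst) {
        byte[] hit = memcache.get(key);
        if (hit != null) return hit;
        for (Map<String, byte[]> mapFile : mapFilesNewestFirst) {
          hit = mapFile.get(key);
          if (hit != null) return hit;
        }
        return null;
      }
    }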

Re: Not able to start Data Node

2008-01-02 Thread Dhaya007
Thanks for your reply. I am using passwordless ssh from master to slave, and the following are the logs (slave): ..datanode-slave.log 2007-12-19 19:30:55,237 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG: / STARTUP_MSG: Starting DataNode STARTUP
