Re: How to pass object between jobs

2011-01-25 Thread Joan
Hi Harsh, I want to create many <k, v> pairs where v will be a custom object. This will happen when my reduce has finished, so the output from job1 will be a file with = Then, in the second job, I want to get MyObject from the first job, you know? So now I have: job1: job2; Then I have to convert Text obj

Hadoop environment variable

2011-01-25 Thread praveen.peddi
Hello all, I have set the Hadoop environment variable HADOOP_CONF_DIR and am trying to run a Hadoop job from a Java application, but the job is not looking for the Hadoop config in this HADOOP_CONF_DIR folder. If I copy the XML files from this folder onto the Java application's classpath, it works fine. Sinc
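
As far as I know, a plain Java process does not read HADOOP_CONF_DIR by itself; the files are normally found because that directory is on the classpath. A minimal sketch of one workaround follows: resolve the directory from the environment and collect its XML files so the application can load them explicitly. The class and method names (ConfDirResources, xmlResources) are hypothetical, and the hand-off to Hadoop is only mentioned in a comment because it needs the Hadoop libraries.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
class ConfDirResources {

    // Returns the *.xml files directly inside confDir, sorted by path.
    static List<Path> xmlResources(Path confDir) {
        List<Path> found = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(confDir, "*.xml")) {
            for (Path p : stream) {
                found.add(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) {
        String dir = System.getenv("HADOOP_CONF_DIR");
        if (dir == null || dir.isEmpty()) {
            System.err.println("HADOOP_CONF_DIR is not set");
            return;
        }
        for (Path p : xmlResources(Paths.get(dir))) {
            // Assumption: in a real client each file could be handed to
            // Hadoop's Configuration.addResource(...) as a Hadoop Path.
            System.out.println(p);
        }
    }
}
```

The alternative that needs no code is to add the HADOOP_CONF_DIR directory itself to the application's classpath when launching it.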

Distributed Cache problem

2011-01-25 Thread Jacob R Rideout
Hello all: We've had an intermittent issue on our cluster when using the distributed cache: 11/01/25 13:46:19 INFO mapred.JobClient: Task Id : attempt_201101071032_13017_r_30_2, Status : FAILED java.io.FileNotFoundException: /hadoop.data.1/tmp/mapred/local/taskTracker/archive/hdfs/data/lookup

Re: How to pass object between jobs

2011-01-25 Thread Harsh J
Could you describe your need with an example? If you want the output of a Job (Job1), say being a single file with a single value in it, to be read and used as a scalar in the next Job (Job2), you can do it by reading and creating the object(s) manually from Job1's end, and then using it via a con
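
A minimal sketch of the "create the object manually, then hand it to the next job" idea, using only the JDK. It assumes the object is small enough to travel as a string (for example as a job property set by the driver); the preview above is cut off, so that transport is an assumption rather than the poster's confirmed method. MyObject's fields and the ObjectPassing helper are hypothetical, and the Hadoop calls that would read Job1's output and configure Job2 are not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Base64;

// Hypothetical helper: turns a MyObject into a string and back.
class ObjectPassing {

    // Driver side, after Job1: serialise the object to a Base64 string.
    static String encode(MyObject obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                obj.write(out);
            }
            return Base64.getEncoder().encodeToString(bytes.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Job2 side: rebuild the object from the string.
    static MyObject decode(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            return MyObject.read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String carried = encode(new MyObject("lookup", 42));
        MyObject copy = decode(carried);
        System.out.println(copy.name + " " + copy.count);
    }
}

// Hypothetical value type; the fields are illustrative, not from the thread.
class MyObject {
    final String name;
    final int count;

    MyObject(String name, int count) {
        this.name = name;
        this.count = count;
    }

    // Field-by-field write, in the same spirit as Hadoop's Writable interface.
    void write(DataOutputStream out) throws IOException {
        out.writeUTF(name);
        out.writeInt(count);
    }

    // Reads the fields back in the order they were written.
    static MyObject read(DataInputStream in) throws IOException {
        String name = in.readUTF();
        int count = in.readInt();
        return new MyObject(name, count);
    }
}
```

This only suits small values. For many <k, v> records, having Job2 read Job1's output files directly is likely the better fit.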

Re: task slots problem

2011-01-25 Thread Harsh J
This is fine. 40 may be ending, while the next 40 may be starting. The tasks will run a cleanup operation at their end (and be in such a 'status'), during which the TaskTracker is allowed to schedule the next wave of maps it needs. In these moments, it may appear as if 80 concurrent tasks may be ru

How to pass object between jobs

2011-01-25 Thread Joan
Hi, I would like to pass one object from job1 to job2. Can someone help me, please? Thanks, Joan

task slots problem

2011-01-25 Thread exception
Hi, My cluster contains 5 DataNodes, each with 8 map slots and 2 reduce slots. So there are up to 40 map slots in my cluster and 40 map tasks can run in parallel. But when running a particular job, I have noticed 80 tasks running in parallel. The cluster looks fine when running other jobs. This par
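
The slot arithmetic in this message can be checked with a short sketch. The node and slot counts come from the post; the SlotCapacity class and clusterSlots method are hypothetical names used only for illustration.

```java
// Hypothetical helper for the capacity arithmetic described in the post.
class SlotCapacity {

    // Total slots of one kind across the cluster.
    static int clusterSlots(int nodes, int slotsPerNode) {
        return nodes * slotsPerNode;
    }

    public static void main(String[] args) {
        int nodes = 5;                            // DataNodes in the post
        int mapSlots = clusterSlots(nodes, 8);    // 8 map slots per node
        int reduceSlots = clusterSlots(nodes, 2); // 2 reduce slots per node
        int observed = 80;                        // tasks seen running at once

        System.out.println("map slots: " + mapSlots);       // prints "map slots: 40"
        System.out.println("reduce slots: " + reduceSlots); // prints "reduce slots: 10"
        // The observed count is exactly twice the map capacity.
        System.out.println("observed / map slots: " + (double) observed / mapSlots); // prints 2.0
    }
}
```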