Re: Reading a directory in standalone Hadoop

2012-08-31 Thread Hemanth Yamijala
Hi, The stack trace mentions that it is getting an access denied. Could you check the permissions of the directory /folder/timezone ? Also, are you using the local job tracker, and not a cluster ? In general, please ensure your configuration is pointing to the right cluster where the job needs

Re: no output written to HDFS

2012-08-30 Thread Hemanth Yamijala
Hi, Do both input files contain data that needs to be processed by the mapper in the same fashion ? In which case, you could just put the input files under a directory in HDFS and provide that as input. The -input option does accept a directory as argument. Otherwise, can you please explain a
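
A minimal Java-side sketch of the same point, assuming the 0.20 mapreduce API (the path is hypothetical): FileInputFormat accepts a directory just like streaming's -input, and every file directly under it becomes part of the job's input.

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    public class DirInput {
        public static void main(String[] args) throws Exception {
            Job job = new Job();
            // A directory is a valid input path; its files are enumerated
            // into splits when the job is submitted.
            FileInputFormat.addInputPath(job, new Path("/user/me/input"));
        }
    }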

Re: Integrating hadoop with java UI application deployed on tomcat

2012-08-30 Thread Hemanth Yamijala
Hi, The error is talking about hadoop configuration. So probably you need to put the hadoop core jar in the lib folder. That said, there might be other dependencies you might need as well. But you can try it out once. Thanks hemanth On Thu, Aug 30, 2012 at 3:53 PM, Visioner Sadak

Re: Hadoop in Pseudo-Distributed mode on Mac OS X 10.8

2012-08-30 Thread Hemanth Yamijala
The mapred.local.dir property points to local directories on the file system of the slave nodes. In pseudo-distributed mode, this would be your own machine. If you've specified any configuration for it, it should be in your mapred-site.xml. If not, it defaults to ${hadoop.tmp.dir}/mapred/local. The default

Re: MRBench Maps strange behaviour

2012-08-28 Thread Hemanth Yamijala
Hi, The number of maps specified to any map reduce program (including those part of MRBench) is generally only a hint, and the actual number of maps will be influenced in typical cases by the amount of data being processed. You can take a look at this wiki link to understand more:

Re: Question from a Desperate Java Newbie

2010-12-09 Thread Hemanth Yamijala
Not exactly what you may want - but could you try using an HTTP client in Java ? Some of them have the ability to automatically follow redirects, manage cookies etc. Thanks hemanth On Thu, Dec 9, 2010 at 4:35 PM, edward choi mp2...@gmail.com wrote: Excuse me for asking a general Java question
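
A minimal sketch of the suggestion using the JDK's built-in HttpURLConnection (the URL is hypothetical); richer clients such as Apache HttpClient additionally manage cookies across requests.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class FetchWithRedirects {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://example.com/page");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            // Follow HTTP 3xx redirects automatically (same protocol only).
            conn.setInstanceFollowRedirects(true);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()));
            for (String line; (line = in.readLine()) != null; ) {
                System.out.println(line);
            }
            in.close();
        }
    }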

Re: Hadoop command line arguments

2010-12-03 Thread Hemanth Yamijala
Hi, On Sat, Dec 4, 2010 at 4:50 AM, yogeshv yogeshv.i...@gmail.com wrote: Dear all, Which file in the hadoop svn processes/receives the hadoop command line arguments? For example, during execution: hadoop jar jar_file_path package_namespace inputfolderpath outputfolderpath. 'hadoop' in the

Re: delay the execution of reducers

2010-12-02 Thread Hemanth Yamijala
Hi, Changing the parameter for a specific job works better for me. But I was asking, in general, in which configuration file(s) I should change the value of the parameters. For parameters in hdfs-site.xml, I should change the configuration file on each machine. But for parameters in

Re: Memory config for Hadoop cluster

2010-11-07 Thread Hemanth Yamijala
Amandeep, On Fri, Nov 5, 2010 at 11:54 PM, Amandeep Khurana ama...@gmail.com wrote: On Fri, Nov 5, 2010 at 2:00 AM, Hemanth Yamijala yhema...@gmail.com wrote: Hi, On Fri, Nov 5, 2010 at 2:23 PM, Amandeep Khurana ama...@gmail.com wrote: Right. I meant I'm not using fair or capacity

Re: Memory config for Hadoop cluster

2010-11-05 Thread Hemanth Yamijala
Amandeep, Which scheduler are you using ? Thanks hemanth On Tue, Nov 2, 2010 at 2:44 AM, Amandeep Khurana ama...@gmail.com wrote: How are the following configs supposed to be used? mapred.cluster.map.memory.mb mapred.cluster.reduce.memory.mb mapred.cluster.max.map.memory.mb
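
For context: the mapred.cluster.* values are cluster-wide slot sizes and limits set in mapred-site.xml, while each job declares its own per-task needs. A hedged sketch of the job-side half (the values are illustrative):

    import org.apache.hadoop.mapred.JobConf;

    public class MemoryRequest {
        public static void main(String[] args) {
            JobConf conf = new JobConf();
            // Per-task virtual memory this job asks for. Schedulers that do
            // memory-based scheduling (e.g. the capacity scheduler) compare
            // these against mapred.cluster.{map,reduce}.memory.mb (slot size)
            // and mapred.cluster.max.{map,reduce}.memory.mb (upper bound).
            conf.set("mapred.job.map.memory.mb", "2048");
            conf.set("mapred.job.reduce.memory.mb", "3072");
        }
    }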

Re: Memory config for Hadoop cluster

2010-11-05 Thread Hemanth Yamijala
of the parameters are different, though you can see the correspondence with similar variables in Hadoop 0.20. Thanks Hemanth -Amandeep On Fri, Nov 5, 2010 at 12:21 AM, Hemanth Yamijala yhema...@gmail.com wrote: Amandeep, Which scheduler are you using ? Thanks hemanth On Tue, Nov 2, 2010

Re: Memory config for Hadoop cluster

2010-11-05 Thread Hemanth Yamijala
and the task trackers. Then any submission by the job would not override the settings. Thanks Hemanth -Amandeep On Nov 5, 2010, at 1:43 AM, Hemanth Yamijala yhema...@gmail.com wrote: Hi, I'm not using any scheduler. Don't have multiple jobs running at the same time on the cluster

Re: GC overhead limit exceeded while running Terrior on Hadoop

2010-10-30 Thread Hemanth Yamijala
Hi, On Tue, Oct 26, 2010 at 8:14 PM, siddharth raghuvanshi track009.siddha...@gmail.com wrote: Hi, While running Terrior on Hadoop, I am getting the following error again and again, can someone please point out where the problem is? attempt_201010252225_0001_m_09_2: WARN - Error running

Re: help with rewriting hadoop java code for new API: getPos() and getCounter()

2010-10-30 Thread Hemanth Yamijala
Hi, On Wed, Oct 27, 2010 at 2:19 AM, Bibek Paudel eternalyo...@gmail.com wrote: [Apologies for cross-posting] Hi all, I am rewriting a hadoop java code for the new (0.20.2) API - the code was originally written for versions <= 0.19. 1. What is the equivalent of the getCounter() method ? For

Re: Granting Permissions to HDFS

2010-10-30 Thread Hemanth Yamijala
Hi, On Thu, Oct 28, 2010 at 5:11 PM, Adarsh Sharma adarsh.sha...@orkash.com wrote: Dear all, I am listing all the HDFS details through the -fs shell. I know the superuser owns the privileges to list files. But now I want to grant all read and write privileges to two new users (e.g. Tom and
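
One way to grant such access programmatically as the superuser, sketched against the FileSystem API (the path, owner and group names are hypothetical): put the users in a common group and open the directory to that group. The hadoop fs -chown and -chmod shell commands achieve the same from the command line.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.permission.FsPermission;

    public class GrantAccess {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path dir = new Path("/shared");
            // Owner and group get rwx, others get r-x (0775), so any user
            // in the 'analysts' group can list and write under /shared.
            fs.setOwner(dir, "hdfs", "analysts");
            fs.setPermission(dir, new FsPermission((short) 0775));
        }
    }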

Re: Need help on accessing datanodes local filesystem using hadoop map reduce framework

2010-10-23 Thread Hemanth Yamijala
Hi, On Sat, Oct 23, 2010 at 1:44 AM, Burhan Uddin burhan...@gmail.com wrote: Hello, I am a beginner with the hadoop framework. I am trying to create a distributed crawling application. I have googled a lot but the resources are too few. Can anyone please help me on the following topics. I suppose

Re: nodes with different memory sizes

2010-10-13 Thread Hemanth Yamijala
Hi, You mentioned you'd like to configure different memory settings for the process depending on which nodes the tasks run on. Which process are you referring to here - the Hadoop daemons, or your map/reduce program ? An alternative approach could be to see if you can get only those nodes in

Re: Log4j Logger in MapReduce applications

2010-09-05 Thread Hemanth Yamijala
Hi, On Mon, Sep 6, 2010 at 9:49 AM, Rita Liu crystaldol...@gmail.com wrote: Hi! :) I added some Log4j loggers to the mapper, the reducer, and the main method of WordCount.java. However, after I ran this application on the cluster, I couldn't find any of my log messages from WordCount in any
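
The usual gotcha here is that task-side log output lands on the nodes that ran the tasks, not on the client console. A minimal sketch, assuming the 0.20 mapreduce API and commons-logging (which Hadoop itself uses):

    import java.io.IOException;
    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class LoggingMapper
            extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final Log LOG = LogFactory.getLog(LoggingMapper.class);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Appears in the per-attempt logs under the TaskTracker's
            // userlogs directory; browse them via the JobTracker web UI.
            LOG.info("Processing record at offset " + key.get());
        }
    }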

Re: Obtaining the number of map slots through the API (Hadoop 0.20.2)

2010-09-05 Thread Hemanth Yamijala
Hi, The optimization of one Hadoop job I'm running would benefit from knowing the maximum number of map slots in the Hadoop cluster. This number can be obtained (if my understanding is correct) by: * parsing the mapred-site.xml file to get  the mapred.tasktracker.map.tasks.maximum value
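
A programmatic alternative to parsing configuration files, sketched with the 0.20 JobClient API: the JobTracker itself reports the cluster's aggregate slot capacity, which also sidesteps the problem of per-node config differences.

    import org.apache.hadoop.mapred.ClusterStatus;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class SlotCapacity {
        public static void main(String[] args) throws Exception {
            // Reads mapred-site.xml from the classpath to find the JobTracker.
            JobClient client = new JobClient(new JobConf());
            ClusterStatus status = client.getClusterStatus();
            // Totals across all live TaskTrackers, not per node.
            System.out.println("map slots:    " + status.getMaxMapTasks());
            System.out.println("reduce slots: " + status.getMaxReduceTasks());
        }
    }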

Re: Sorting Numbers using mapreduce

2010-09-05 Thread Hemanth Yamijala
Hi, On Mon, Sep 6, 2010 at 1:47 AM, Neil Ghosh neil.gh...@gmail.com wrote: Hi, I am trying to sort a list of numbers (one per line) using hadoop mapreduce. Kindly suggest any reference and code. How do I implement a custom input format and recordreader so that both key and value are the
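
A minimal sketch of the idea, assuming the 0.20 mapreduce API: no custom input format is strictly required; TextInputFormat plus parsing in the mapper is enough, the shuffle sorts the keys, and running a single reducer yields one totally sorted output file.

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // Emit each number as the key; the framework sorts keys before reduce.
    class SortMapper
            extends Mapper<LongWritable, Text, IntWritable, NullWritable> {
        private final IntWritable num = new IntWritable();

        @Override
        protected void map(LongWritable offset, Text line, Context ctx)
                throws IOException, InterruptedException {
            num.set(Integer.parseInt(line.toString().trim()));
            ctx.write(num, NullWritable.get());
        }
    }

    // Write the keys back out in sorted order, preserving duplicates.
    class SortReducer
            extends Reducer<IntWritable, NullWritable, IntWritable, NullWritable> {
        @Override
        protected void reduce(IntWritable key, Iterable<NullWritable> vals,
                Context ctx) throws IOException, InterruptedException {
            for (NullWritable v : vals) {
                ctx.write(key, v);
            }
        }
    }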

Re: cluster startup problem

2010-08-30 Thread Hemanth Yamijala
Hi, On Mon, Aug 30, 2010 at 8:19 AM, Gang Luo lgpub...@yahoo.com.cn wrote: Hi all, I am trying to configure and start a hadoop cluster on EC2. I got some problems here. 1. Can I share hadoop code and its configuration across nodes? Say I have a distributed file system running in the

Re: main java.lang.UnsupportedClassVersionError: Bad version number in .class

2010-08-30 Thread Hemanth Yamijala
Hi, Can you please confirm if you've set JAVA_HOME in conf-dir/hadoop-env.sh on all the nodes ? Thanks Hemanth On Tue, Aug 31, 2010 at 6:21 AM, Mohit Anchlia mohitanch...@gmail.com wrote: Hi, I am running some basic setup and test to know about hadoop. When I try to start nodes I get this

Re: cluster write permission

2010-08-29 Thread Hemanth Yamijala
Hi, On Sun, Aug 29, 2010 at 10:14 PM, Gang Luo lgpub...@yahoo.com.cn wrote: Hi all, I am setting up a hadoop cluster where I have to specify the local directory for temp files/logs, etc. Should I allow everybody to have write permission to these directories? Who actually does the write

Re: Hadoop startup problem - directory name required

2010-08-25 Thread Hemanth Yamijala
Hmm. Without the / in the property tag, isn't the file malformed XML ? I am pretty sure Hadoop complains in such cases ? On Wed, Aug 25, 2010 at 4:44 AM, cliff palmer palmercl...@gmail.com wrote: Thanks Allen - that has resolved the problem.  Good catch! Cliff On Tue, Aug 24, 2010 at 3:05

Re: Managing configurations

2010-08-18 Thread Hemanth Yamijala
Mark, On Wed, Aug 18, 2010 at 10:59 PM, Mark static.void@gmail.com wrote:  What is the preferred way of managing multiple configurations, i.e. development, production, etc.? Is there some way I can tell hadoop to use a separate conf directory other than ${hadoop_home}/conf? I think I've

Re: Profiling Hadoop Map Reduce with the 20.2 API

2010-08-17 Thread Hemanth Yamijala
. That in turn is because JobConf is still a preferred way of setting parameters in the Hadoop 0.20 major release. Later versions of the documentation will hopefully correct this. Thanks hemanth On Thu, Aug 12, 2010 at 10:16 PM, Hemanth Yamijala yhema...@gmail.com wrote: Hi,   I recently

Re: Hadoop 0.20.2: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_201008131730_0001/attempt_201008131730_0001_m_000000_2/output/file.out.index in any of

2010-08-17 Thread Hemanth Yamijala
Hi, Hi, Hemanth. Thanks for your reply! I tried your recommendation, the absolute path, and it worked; I was able to run the jobs successfully. Thank you! I was wondering why hadoop.tmp.dir (or mapred.local.dir ?) with a relative path didn't work. I am not entirely sure, but when the daemon is

Re: Hadoop 0.20.2: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_201008131730_0001/attempt_201008131730_0001_m_000000_2/output/file.out.index in any of

2010-08-13 Thread Hemanth Yamijala
Hi, 1. I log in through SSH without a password from master and slaves, it's all right :-) 2. <property> <name>hadoop.tmp.dir</name> <value>tmp</value> </property> In fact, 'tmp' is what I want :-) (a tree like $HADOOP_HOME/tmp/dfs/...)

Re: Scheduler recommendation

2010-08-11 Thread Hemanth Yamijala
Hi, On Thu, Aug 12, 2010 at 10:31 AM, Hemanth Yamijala yhema...@gmail.com wrote: Hi, On Thu, Aug 12, 2010 at 3:35 AM, Bobby Dennett bdennett+softw...@gmail.com wrote: From what I've read/seen, it appears that, if not the default scheduler, most installations are using Hadoop's Fair

Re: reuse cached files

2010-08-02 Thread Hemanth Yamijala
. This is *not* to be used by client code, and is not guaranteed to work. In later versions of Hadoop (0.21 and trunk), these methods have been deprecated in the public API and will be removed altogether. Thanks hemanth Thanks, -Gang - Original Message - From: Hemanth Yamijala yhema...@gmail.com To

Re: Set variables in mapper

2010-08-02 Thread Hemanth Yamijala
Hi, It would also be worthwhile to look at the Tool interface (http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Tool), which is used by example programs in the MapReduce examples as well. This would allow any arguments to be passed using the -Dvar.name=var.value convention on
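
A minimal sketch of the Tool pattern being recommended (var.name is the hypothetical parameter from the convention above):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class MyTool extends Configured implements Tool {
        @Override
        public int run(String[] args) throws Exception {
            // Generic options such as -Dvar.name=var.value have already been
            // parsed by ToolRunner and merged into this Configuration.
            Configuration conf = getConf();
            System.out.println("var.name = " + conf.get("var.name"));
            // ... build and submit the job using conf ...
            return 0;
        }

        public static void main(String[] args) throws Exception {
            System.exit(ToolRunner.run(new Configuration(), new MyTool(), args));
        }
    }

It would then be invoked as: hadoop jar mytool.jar MyTool -Dvar.name=var.value <other args>.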

Re: add priority to task

2010-08-02 Thread Hemanth Yamijala
Hi, On Tue, Aug 3, 2010 at 9:42 AM, saurabhsuman8989 saurabhsuman...@gmail.com wrote: By 'tasks' I mean different tasks under one job. When a job is distributed into different tasks, can I add priority to those tasks? It would be interesting to know why you want to do this. Can you please

Re: reuse cached files

2010-08-01 Thread Hemanth Yamijala
Hi, Thanks Hemanth. Is there any way to invalidate the reuse and ask Hadoop to resend exactly the same files to the cache for every job? I may be able to answer this better if I understand the use case. If you need the same files for every job, why would you need to send them afresh each time ? If

Re: jobtracker.jsp reports GC overhead limit exceeded

2010-08-01 Thread Hemanth Yamijala
Hi, Actually I enabled logs at all levels. But I didn't realize I should check the logs in the .out files and only looked at the .log file, so I didn't see any error msgs. Now I opened the .out file and saw the following logged exception: Exception in thread IPC Server handler 5 on 50002

Re: reuse cached files

2010-07-29 Thread Hemanth Yamijala
Hi, if I use the distributed cache to send some files to all the nodes in one MR job, can I reuse these cached files locally in my next job, or will hadoop re-send these files again? Cache files are reused across jobs. From trunk onwards, they will be restricted to be reused across jobs of the
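
A hedged sketch of the usage under discussion, assuming the 0.20 DistributedCache API (the file URI is hypothetical):

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.filecache.DistributedCache;

    public class CacheSetup {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // The file must already be in HDFS. Each node localizes it once,
            // and later jobs referencing the same unchanged file reuse the
            // local copy rather than downloading it again.
            DistributedCache.addCacheFile(new URI("/user/me/lookup.dat"), conf);
            // Inside a task, the localized paths are available via
            // DistributedCache.getLocalCacheFiles(conf).
        }
    }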

Re: Parameters that can be set per job

2010-07-29 Thread Hemanth Yamijala
Hi, Is there a list of configuration parameters that can be set per job. I'm almost certain there's no list that documents per-job settable parameters that well. From 0.21 onwards, I think a convention adopted is to name all job-related or task-related parameters to include 'job' or 'map' or

Re: Setting jar for embedded Job (Hadoop 0.20.2)

2010-07-26 Thread Hemanth Yamijala
Hi, I'd like to run a Hadoop (0.20.2) job from within another application, using ToolRunner. One class of this other application implements the Tool interface. The implemented run() method: * constructs a Job() * sets the input/output/mapper/reducer * sets the jar file by calling
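
The standard way to do this from embedded code is Job.setJarByClass(); a minimal sketch (the class and jar path are hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class EmbeddedLauncher {
        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "embedded-job");
            // Ships the jar that contains this class to the cluster; without
            // a jar, tasks fail with ClassNotFoundException for the mapper.
            job.setJarByClass(EmbeddedLauncher.class);
            // If the classes are not packaged in a jar on the classpath
            // (common when embedded in another app), name one explicitly:
            // job.getConfiguration().set("mapred.jar", "/path/to/job.jar");
        }
    }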

Re: Hadoop's datajoin

2010-07-12 Thread Hemanth Yamijala
Hi, I am trying to use hadoop's datajoin for joining two relations. According to the Readme file of datajoin, it gives the following syntax: $HADOOP_HOME/bin/hadoop jar hadoop-datajoin-examples.jar org.apache.hadoop.contrib.utils.join.DataJoinJob datajoin/input   datajoin/output Text

Re: Pig share schema between projetcs

2010-07-09 Thread Hemanth Yamijala
John, Can you please redirect this to pig-u...@hadoop.apache.org ? You're more likely to get good responses there. Thanks hemanth On Thu, Jul 8, 2010 at 7:01 AM, John Seer pulsph...@yahoo.com wrote: Hello, Is there any way to share a schema file in pig for the same table between projects? --

Re: Is heap size allocation of namenode dynamic or static?

2010-07-09 Thread Hemanth Yamijala
Edward, Overall, I think the consideration should be about how much load do you expect to support on your cluster. For HDFS, there's a good amount of information about how much RAM is required to support a certain amount of data stored in DFS; something similar can be found for Map/Reduce as

Re: Intermediate files generated.

2010-07-01 Thread Hemanth Yamijala
Alex, I don't think this is what I am looking for. Essentially, I wish to run both the mapper as well as the reducer. But at the same time, I wish to make sure that the temp files that are used between mappers and reducers are of my choice. Here, the choice means that I can specify the files in HDFS

Re: how often are hadoop configuration files reloaded?

2010-06-29 Thread Hemanth Yamijala
Michael, Configuration is not reloaded for daemons. There is currently no way to refresh configuration once the cluster is started. Some specific aspects, like queue configuration and blacklisted nodes, can be reloaded with commands like hadoop admin refreshQueues or some such. Thanks Hemanth On

Re: how to figure out the range of a split that failed?

2010-06-29 Thread Hemanth Yamijala
Hi, I am running a mapreduce job on my hadoop cluster. I am processing 10 gigabytes of data and one tiny failed task crashes the whole operation. I am up to 98% complete and throwing away all the finished data seems just like an awful waste. I'd like to save the finished data and run again
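
One commonly used lever for this, sketched against the old (mapred) API: allow the job to tolerate a small percentage of failed maps instead of failing outright.

    import org.apache.hadoop.mapred.JobConf;

    public class TolerantJob {
        public static void main(String[] args) {
            JobConf conf = new JobConf();
            // Declare the job successful even if up to 5% of its map tasks
            // fail, so one bad split does not discard 98% of completed work.
            conf.setMaxMapTaskFailuresPercent(5);
        }
    }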

Re: hybrid map/reducer scheduler?

2010-06-28 Thread Hemanth Yamijala
Michael, In addition to the default FIFO scheduler, there are the fair scheduler and the capacity scheduler. In some sense, the fair scheduler can be considered user-based scheduling while the capacity scheduler does queue-based scheduling. Is there or will there be a hybrid scheduler that combines the

Re: memory management of capacity scheduling

2010-06-26 Thread Hemanth Yamijala
Shashank, Hi, Setup Info: I have a 2 node hadoop (20.2) cluster on Linux boxes. HW info: 16 CPUs (Hyperthreaded), RAM: 32 GB. I am trying to configure capacity scheduling. I want to use the memory management provided by the capacity scheduler. But I am facing a few issues. I have added

Re: Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.

2010-06-24 Thread Hemanth Yamijala
Hi, I've been getting the following error when trying to run a very simple MapReduce job. Map finishes without problem, but the error occurs as soon as it enters the Reduce phase. 10/06/24 18:41:00 INFO mapred.JobClient: Task Id : attempt_201006241812_0001_r_00_0, Status : FAILED Shuffle

Re: Feed hdfs with external data.

2010-06-23 Thread Hemanth Yamijala
23, 2010 at 10:52 AM, Hemanth Yamijala yhema...@gmail.com wrote: Pierre, I have a program that generates the data that's supposed to be treated by hadoop. It's a java program that should write directly to hdfs. So as a test, I do this:            Configuration config = new

Re: Stuck MR job

2010-06-23 Thread Hemanth Yamijala
Vidhya, Hi  This looks like a trivial problem but I would be glad if someone can help. I have been trying to run an m-r job on my cluster. I had modified my configs (primarily reduced the heap sizes for the task tracker and the data nodes) and restarted my hadoop cluster, and the job won't

Re: Feed hdfs with external data.

2010-06-23 Thread Hemanth Yamijala
to point to this path. On Wed, Jun 23, 2010 at 10:52 AM, Hemanth Yamijala yhema...@gmail.com wrote: Pierre, I have a program that generates the data that's supposed to be treated by hadoop. It's a java program that should write directly to hdfs. So as a test, I do

Re: Hadoop JobTracker Hanging

2010-06-22 Thread Hemanth Yamijala
There was also https://issues.apache.org/jira/browse/MAPREDUCE-1316 whose cause hit clusters at Yahoo! very badly last year. The situation was particularly noticeable in the face of lots of jobs with failed tasks and a specific fix that enabled OutOfBand heartbeats. The latter (i.e. the OOB

Re: How to set the number of map tasks? (ver 0.20.2)

2010-06-21 Thread Hemanth Yamijala
Felix, I'm using the new Job class: http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Job.html There is a way to set the number of reduce tasks: setNumReduceTasks(int tasks) However, I don't see how to set the number of MAP tasks? I tried to set it through
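
For background: in both APIs the map count is derived from the number of input splits, so it can only be hinted, not fixed. A sketch of the two usual knobs (the values are illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class MapCountHint {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Only a hint; the actual count equals the number of splits.
            conf.setInt("mapred.map.tasks", 20);
            // More effective: raise the minimum split size so FileInputFormat
            // produces fewer, larger splits (here 256 MB).
            conf.setLong("mapred.min.split.size", 256L * 1024 * 1024);
            Job job = new Job(conf, "controlled-maps");
        }
    }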

Re: AccessControlException when calling copyFromLocalFile()

2010-06-02 Thread Hemanth Yamijala
Ted, When the user calling FileSystem.copyFromLocalFile() doesn't have permission to write to a certain hdfs path: Thread [main] (Suspended (exception AccessControlException)) DFSClient.mkdirs(String, FsPermission) line: 905; DistributedFileSystem.mkdirs(Path, FsPermission) line: 262

Re: JNI native library loading problem in standalone mode

2010-05-31 Thread Hemanth Yamijala
Edward, If it's an option to copy the libraries to a fixed location on all the cluster nodes, you could do that and configure them in the library path via mapred.child.java.opts. Please look at http://bit.ly/ab93Z8 (MapReduce tutorial on Hadoop site) to see how to use this config option for
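
A sketch of the config option being referred to (the library directory is hypothetical). Note the value replaces the default child JVM opts, so keep the heap flag as well:

    import org.apache.hadoop.conf.Configuration;

    public class NativeLibOpts {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Task JVMs launch with these flags; System.loadLibrary() in the
            // tasks then finds the .so files copied to this path on each node.
            conf.set("mapred.child.java.opts",
                     "-Xmx200m -Djava.library.path=/usr/local/hadoop-native");
        }
    }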

Re: Speculative Execution and Streaming

2010-05-28 Thread Hemanth Yamijala
Greg, Does anybody know whether or not speculative execution works with Hadoop streaming? If so, I have a script that does not appear to ever launch redundant mappers for the slow performers. This may be due to the fact that each mapper quickly reports (inaccurately) that it is 100%

Re: TaskTracker and DataNodes cannot connect to master node (NoRouteToHostException)

2010-05-25 Thread Hemanth Yamijala
Erik, I've been unable to resolve this problem on my own so I've decided to ask for help. I've pasted the logs I have for the DataNode on one of the slave nodes. The logs for the TaskTracker are essentially the same (i.e. same exception causing a shutdown). Any suggestions or hints as to what

Re: Ordinary file pointer?

2010-05-22 Thread Hemanth Yamijala
Keith, On Sat, May 22, 2010 at 5:01 AM, Keith Wiley kwi...@keithwiley.com wrote: On May 21, 2010, at 16:07, Mikhail Yakshin wrote: On Fri, May 21, 2010 at 11:09 PM, Keith Wiley wrote: My Java mapper hands its processing off to C++ through JNI.  On the C++ side I need to access a file.  I

Re: Eclipse plugin

2010-05-06 Thread Hemanth Yamijala
Jim, I have two machines, one is Windows XP and the other one is Windows Vista. I did the same thing on both machines. The Hadoop Eclipse Plugin works fine in Windows XP. But I got an error when I run it in Windows Vista. I copied hadoop-0.20.2-eclipse-plugin into the Eclipse/plugins folder and

Re: How to make HOD apply more than one core on each machine?

2010-04-21 Thread Hemanth Yamijala
Song,   I guess you are very close to my point. I mean, can we find a way to set the qsub parameter ppn? From what I could see in the HOD code, it appears you cannot override the ppn value with HOD. You could look at src/contrib/hod/hodlib/NodePools/torque.py, and specifically the

Re: How to make HOD apply more than one core on each machine?

2010-04-16 Thread Hemanth Yamijala
Song,   I know it is the way to set the capacity of each node; however, I want to know how we can make the Torque manager aware that we will run more than 1 mapred task on each machine. Because if we don't do this, torque will assign the other cores on this machine to other tasks, which may cause a

Re: How to make HOD apply more than one core on each machine?

2010-04-15 Thread Hemanth Yamijala
Song,     HOD is good, and can manage a large virtual cluster on a huge physical cluster, but the problem is, it doesn't apply more than one core for each machine, and I have already received complaints from our admin! I assume what you want is for the Map/Reduce cluster that is started by HOD to

Re: Regarding Capacity Scheduler

2009-05-14 Thread Hemanth Yamijala
Manish, The pre-emption code in capacity scheduler was found to require a good relook and due to the inherent complexity of the problem is likely to have issues of the type you have noticed. We have decided to relook at the pre-emption code from scratch and to this effect removed it from the

Re: HOD questions

2008-12-17 Thread Hemanth Yamijala
Craig, Hello, We have two HOD questions: (1) For our current Torque PBS setup, the number of nodes requested by HOD (-l nodes=X) corresponds to the number of CPUs allocated; however, these CPUs can be spread across various partially-used or empty nodes. Unfortunately, HOD does not appear to

Re: Integrate HADOOP and Map/Reduce paradigm into HPC environment

2008-09-05 Thread Hemanth Yamijala
executes the jobtracker on the first node always, which also seems useful to me. It will be nice if you can still try HOD and see if it makes your life simpler in any way. :-) Sorry for my english :-P Regards 2008/9/2 Hemanth Yamijala [EMAIL PROTECTED] Allen Wittenauer wrote: On 8

Re: Integrate HADOOP and Map/Reduce paradigm into HPC environment

2008-09-01 Thread Hemanth Yamijala
Allen Wittenauer wrote: On 8/18/08 11:33 AM, Filippo Spiga [EMAIL PROTECTED] wrote: Well, but I haven't understood how I should configure HOD to work in this manner. For HDFS I follow this sequence of steps - conf/master contains only the master node of my cluster - conf/slaves contains all

Re: HoD and locality of TaskTrackers to data (on DataNodes)

2008-03-23 Thread Hemanth Yamijala
Jiaqi, Hi, I have a question about using HoD and the locality of the assigned TaskTrackers to the data. Suppose I have a long-running HDFS installation with TaskTrackers/JobTracker nodes dynamically allocated by HoD, and I uploaded my data to HDFS prior to running my job/allocating nodes using

Re: [HOD] Collecting MapReduce logs

2008-03-10 Thread Hemanth Yamijala
Luca, Luca wrote: Hello everyone, I wonder what is the meaning of hodring.log-destination-uri versus hodring.log-dir. I'd like to collect MapReduce UI logs after a job has been run and the only attribute seems to be hod.hadoop-ui-log-dir, in the hod section. log-destination-uri is a

Re: [HOD] Example script

2008-03-05 Thread Hemanth Yamijala
Luca, #!/bin/bash hadoop --config /home/luca/hod-test jar /mnt/scratch/hadoop/hadoop-0.16.0-examples.jar wordcount file:///mnt/scratch/hadoop/test/part-0 test.hodscript.out Can you try removing the --config from this script ? While running scripts, HOD automatically allocates a directory

Re: [HOD] hdfs:///mapredsystem directory

2008-02-27 Thread Hemanth Yamijala
Mahadev Konar wrote: Hi Luca, Can you do a ls -l on /mapredsystem and send the output? According to the permissions model for mapreduce, the system directories created by the jobtracker should be world writable, so permissions should have worked as is for hod. No, it doesn't appear to be working that

Re: HOD question wrt to the virtual cluster log files - where do they end up when the job ends

2008-02-27 Thread Hemanth Yamijala
Jason Venner wrote: As you have all read from my previous emails, we are still pretty low on the HOD learning curve. That is understandable. It is new software, so we will improve over time with feedback from our users, like you :-) We are having jobs that terminate and the virtual mapred cluster

Re: More HOD: is there anyway to get HOD to copy back all of the log files to the submit node?

2008-02-27 Thread Hemanth Yamijala
Jason Venner wrote: I have found that HOD writes a series of log files to directories on the virtual cluster master, if you specify log directories. The interesting part is figuring out which machine was the virtual cluster master, if you have a decent sized pool of machines. Can you explain

Re: More HOD questions 0.16.0 - debug log enclosed - help with how to debug

2008-02-26 Thread Hemanth Yamijala
Jason Venner wrote: My hadoop jobs don't start. This is configured to use an existing DFS and to unpack a tarball with a cut-down 0.16.0 config. I have looked in the mom logs on the client machines and am not getting anything meaningful. What is your hod command line ? Specifically, how did

Re: [HOD] Port for MapReduce UI interface

2008-02-26 Thread Hemanth Yamijala
Luca wrote: [hod] xrs-port-range = 1-11000 http-port-range = 1-11000. The Mapred UI port is chosen outside this range. There's no port range option for the Mapred and HDFS sections currently. You seem to have a use-case for specifying the range within which
