Re: HDFS is not loading evenly across all nodes.
Did you run the dfs put commands from the master node? If you're inserting into HDFS from a machine running a DataNode, the local datanode will always be chosen as one of the three replica targets. For more balanced loading, you should use an off-cluster machine as the point of origin. If you experience uneven block distribution, you should also periodically rebalance your cluster by running bin/start-balancer.sh. It will work in the background to move blocks from heavily-laden nodes to underutilized ones. - Aaron On Thu, Jun 18, 2009 at 12:57 PM, openresearch qiming...@openresearchinc.com wrote: Hi all, I ran a dfs put of a large dataset onto a 10-node cluster. When I observe the Hadoop progress (via web:50070) and each local file system (via df -k), I notice that my master node is hit 5-10 times harder than the others, so its hard drive gets full quicker than the others. During last night's load, it actually crashed when the hard drive was full. To my understanding, data should wrap around all nodes evenly (in a round-robin fashion using 64M as a unit). Is this expected behavior of Hadoop? Can anyone suggest a good way to troubleshoot this? Thanks -- View this message in context: http://www.nabble.com/HDFS-is-not-loading-evenly-across-all-nodes.-tp24099585p24099585.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: HDFS is not loading evenly across all nodes.
As an addendum, running a DataNode on the same machine as a NameNode is generally considered a bad idea because it hurts the NameNode's ability to maintain high throughput. - Aaron On Thu, Jun 18, 2009 at 1:26 PM, Aaron Kimball aa...@cloudera.com wrote: Did you run the dfs put commands from the master node? If you're inserting into HDFS from a machine running a DataNode, the local datanode will always be chosen as one of the three replica targets. For more balanced loading, you should use an off-cluster machine as the point of origin. If you experience uneven block distribution, you should also periodically rebalance your cluster by running bin/start-balancer.sh. It will work in the background to move blocks from heavily-laden nodes to underutilized ones. - Aaron On Thu, Jun 18, 2009 at 12:57 PM, openresearch qiming...@openresearchinc.com wrote: Hi all, I ran a dfs put of a large dataset onto a 10-node cluster. When I observe the Hadoop progress (via web:50070) and each local file system (via df -k), I notice that my master node is hit 5-10 times harder than the others, so its hard drive gets full quicker than the others. During last night's load, it actually crashed when the hard drive was full. To my understanding, data should wrap around all nodes evenly (in a round-robin fashion using 64M as a unit). Is this expected behavior of Hadoop? Can anyone suggest a good way to troubleshoot this? Thanks -- View this message in context: http://www.nabble.com/HDFS-is-not-loading-evenly-across-all-nodes.-tp24099585p24099585.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: Trying to setup Cluster
Are you encountering specific problems? I don't think that Hadoop's config files will evaluate environment variables, so $HADOOP_HOME won't be interpreted correctly. For passwordless ssh, see http://rcsg-gsir.imsb-dsgi.nrc-cnrc.gc.ca/documents/internet/node31.html or just check the manpage for ssh-keygen. - Aaron On Wed, Jun 17, 2009 at 9:30 AM, Divij Durve divij.t...@gmail.com wrote: I'm trying to set up a cluster with 3 different machines running Fedora. I can't get them to log into localhost without a password, but that's the least of my worries at the moment. I am posting my config files and the master and slave files; let me know if anyone can spot a problem with the configs...

Hadoop-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>$HADOOP_HOME/dfs-data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>$HADOOP_HOME/dfs-name</value>
    <final>true</final>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$HADOOP_HOME/hadoop-tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://gobi.something.something:54310</value>
    <description>The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The uri's authority is used to determine the host, port, etc. for a FileSystem.</description>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>kalahari.something.something:54311</value>
    <description>The host and port that the MapReduce job tracker runs at. If local, then jobs are run in-process as a single map and reduce task.</description>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>$HADOOP_HOME/mapred-system</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>$HADOOP_HOME/mapred-local</value>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Slave: kongur.something.something master: kalahari.something.something I execute the dfs-start.sh command from gobi.something.something. Is there any other info that I should provide in order to help? Also, Kongur is where I'm running the data node; the master file on Kongur should have localhost in it, right? Thanks for the help Divij
Re: Could I collect results from Map-Reduce then output myself ?
If you can make the decision locally, then it should just be performed in the reducer itself: if (guard) { output.collect(k, v); } If you need to know what results will be generated by other calls to reduce() on that same machine, then you'll need to be a bit more clever. If you know that for all jobs you'll run, your results will always fit in a buffer in RAM, then you can put your values in an ArrayList or something and then override Reducer.close() to dump your values into the output collector. Then call super.close(). If you may need to generate more data than will fit in RAM, or you need the results of multiple nodes to conference together, then this means you almost certainly want a second MapReduce pass. Your first pass should collect() all the results it generates. Then in a second pass, use an identity mapper that causes the shuffler to sort the data along some axis so that the most desirable data comes first. Then output.collect() this data a second time in the second reducer, discarding the data that doesn't meet your criterion. The input path to your second MR is the output path from the first one. - Aaron On Sun, Jun 14, 2009 at 4:02 PM, Kunsheng Chen ke...@yahoo.com wrote: Hi everyone, I am doing a map-reduce program, it is working good. Now I am thinking of inserting my own algorithm to pick the output results after 'Reduce' other than simply use 'output.colllect()' in Reduce to output all results. The only thing I could think is read the output file after JobClient finishing and does some Java program for that, but I am not sure whether there are efficient method provided by hadoop to handle that. Any idea is well appreciated, -Kun
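A rough sketch of the buffer-in-RAM approach described above, written against the old org.apache.hadoop.mapred API used throughout these threads; the class name, key/value types, and selection logic are all placeholders rather than code from the original posters:

import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class BufferingReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {

  // Results held back until all reduce() calls are done; only safe if
  // they are guaranteed to fit in RAM.
  private List<Text[]> buffered = new ArrayList<Text[]>();
  private OutputCollector<Text, Text> savedCollector;

  public void reduce(Text key, Iterator<Text> values,
      OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
    savedCollector = output; // remember the collector so close() can emit later
    while (values.hasNext()) {
      Text value = values.next();
      // Apply whatever local guard/selection logic you need here.
      buffered.add(new Text[] { new Text(key), new Text(value) });
    }
  }

  @Override
  public void close() throws IOException {
    // Emit the selected results once every reduce() call on this node is finished.
    if (savedCollector != null) {
      for (Text[] pair : buffered) {
        savedCollector.collect(pair[0], pair[1]);
      }
    }
    super.close();
  }
}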
Re: how to transfer data from one reduce to another map
You can add multiple paths to the same job -- FileInputFormat.addInputPath() can be called multiple times. - Aaron On Mon, Jun 15, 2009 at 6:56 AM, bharath vissapragada bharathvissapragada1...@gmail.com wrote: If your doubt is related to chaining of MapReduce jobs, then this link might be useful: http://developer.yahoo.com/hadoop/tutorial/module4.html#chaining On Mon, Jun 15, 2009 at 7:22 PM, HRoger hanxianiyongro...@163.com wrote: I'm sorry for my confusing description. Job2 has to use two input sources: one from job1's output and another from elsewhere. TimRobertson100 wrote: Hi, I am not sure I understand the question correctly. If you mean you want to use the output of Job1 as the input of Job2, then you can set the input path of the second job to the output path (e.g. output directory) from the first job. Cheers Tim On Mon, Jun 15, 2009 at 3:30 PM, HRoger hanxianyongro...@163.com wrote: Hi! I am writing an application which has two jobs: the second job uses the same input data source as the first job, plus the output (some objects) of the first job. Can I transfer some objects from one job to another job, or make a job that has two input sources? -- View this message in context: http://www.nabble.com/how-to-transfer-data-from-one-reduce-to-another-map-tp24034706p24034706.html Sent from the Hadoop core-user mailing list archive at Nabble.com. -- View this message in context: http://www.nabble.com/how-to-transfer-data-from-one-reduce-to-another-map-tp24034706p24035057.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
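To make the suggestion concrete, here is a minimal driver sketch showing FileInputFormat.addInputPath() called twice on the same JobConf; the paths and the TwoInputJob class name are invented for illustration:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class TwoInputJob {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(TwoInputJob.class);
    conf.setJobName("job2-with-two-inputs");
    // First input source: the output directory written by job1.
    FileInputFormat.addInputPath(conf, new Path("/user/me/job1-output"));
    // Second input source: any other directory on HDFS.
    FileInputFormat.addInputPath(conf, new Path("/user/me/other-input"));
    FileOutputFormat.setOutputPath(conf, new Path("/user/me/job2-output"));
    // Mapper, reducer, and key/value classes would be set here as usual.
    JobClient.runJob(conf);
  }
}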
Re: Debugging Map-Reduce programs
On Mon, Jun 15, 2009 at 10:01 AM, bharath vissapragada bhara...@students.iiit.ac.in wrote: Hi all, when running Hadoop in local mode, can we use print statements to print something to the terminal? Yes. In distributed mode, each task will write its stdout/stderr to files which you can access through the web-based interface. Also, I am not sure whether the program is reading my input files; if I keep print statements, nothing is displayed. Can anyone tell me how to solve this problem? Is it generating exceptions? Are the files present? If you're running in local mode, you can use a debugger; set a breakpoint in your map() method and see if it gets there. How are you configuring the input files for your job? Thanks in advance,
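Purely as an illustration of the point about print statements (the class and messages are made up): anything a task writes to stdout/stderr appears on the terminal in local mode, and in the per-task log files linked from the web UI in distributed mode.

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class DebugMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {
  public void map(LongWritable offset, Text line,
      OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
    // Visible on the console in local mode; captured in the task's stderr log otherwise.
    System.err.println("map() saw offset " + offset + ": " + line);
    output.collect(line, line);
  }
}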
Re: Help - ClassNotFoundException from tasktrackers
mapred.local.dir can definitely work with multiple paths, but it's a bit strict about the format. You should have one or more paths, separated by commas and no whitespace, e.g. /disk1/mapred,/disk2/mapred. If you've got them on new lines, etc., then it might try to interpret that as part of one of the paths. Could this be your issue? If you paste your hadoop-site.xml into an email we could take a look. - Aaron On Wed, Jun 10, 2009 at 2:41 AM, Harish Mallipeddi harish.mallipe...@gmail.com wrote: Hi, After some amount of debugging, I narrowed down the problem. I'd set multiple paths (multiple disks mounted at different endpoints) for the mapred.local.dir setting. When I changed this to just one path, everything started working - previously the job jar and the job xml weren't being created under the local/taskTracker/jobCache folder. Any idea why? Am I doing something wrong? I've read somewhere else that specifying multiple disks for mapred.local.dir is important to increase disk bandwidth so that the map outputs get written faster to the local disks on the tasktracker nodes. Cheers, Harish On Tue, Jun 9, 2009 at 5:21 PM, Harish Mallipeddi harish.mallipe...@gmail.com wrote: I set up a new cluster (1 namenode + 2 datanodes). I'm trying to run the GenericMRLoadGenerator program from hadoop-0.20.0-test.jar. But I keep getting the ClassNotFoundException. Any reason why this would be happening? It seems to me like the tasktrackers cannot find the class files from the hadoop program, but when you do hadoop jar, it will automatically ship the job jar file to all the tasktracker nodes and put them in the classpath?

$ bin/hadoop jar hadoop-0.20.0-test.jar loadgen -Dtest.randomtextwrite.bytes_per_map=1048576 -Dhadoop.sort.reduce.keep.percent=50.0 -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text
No input path; ignoring InputFormat
Job started: Tue Jun 09 03:16:30 PDT 2009
09/06/09 03:16:30 INFO mapred.JobClient: Running job: job_200906090316_0001
09/06/09 03:16:31 INFO mapred.JobClient: map 0% reduce 0%
09/06/09 03:16:40 INFO mapred.JobClient: Task Id : attempt_200906090316_0001_m_00_0, Status : FAILED
java.io.IOException: Split class org.apache.hadoop.mapred.GenericMRLoadGenerator$IndirectInputFormat$IndirectSplit not found
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:324)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapred.GenericMRLoadGenerator$IndirectInputFormat$IndirectSplit
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:761)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:321)
... 2 more
[more of the same exception from different map tasks...I've removed them]

Actually the same thing happens even if I run the PiEstimator program from hadoop-0.20.0-examples.jar. Thanks, -- Harish Mallipeddi http://blog.poundbang.in -- Harish Mallipeddi http://blog.poundbang.in
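As a small, hypothetical check of the comma-separated format Aaron describes: Configuration splits comma-separated values via getStrings(), so listing the paths with no whitespace is what keeps each one intact. The disk paths below are examples only.

import org.apache.hadoop.conf.Configuration;

public class LocalDirFormatCheck {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Correct form: one or more paths separated by commas, no whitespace.
    conf.set("mapred.local.dir", "/disk1/mapred,/disk2/mapred,/disk3/mapred");
    for (String dir : conf.getStrings("mapred.local.dir")) {
      System.out.println("local dir: [" + dir + "]");
    }
  }
}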
Re: Hadoop benchmarking
Hi Stephen, That will set the maximum heap allowable, but doesn't tell Hadoop's internal systems necessarily to take advantage of it. There's a number of other settings that adjust performance. At Cloudera we have a config tool that generates Hadoop configurations with reasonable first-approximation values for your cluster -- check out http://my.cloudera.com and look at the hadoop-site.xml it generates. If you start from there you might find a better parameter space to explore. Please share back your findings -- we'd love to tweak the tool even more with some external feedback :) - Aaron On Wed, Jun 10, 2009 at 7:39 AM, stephen mulcahy stephen.mulc...@deri.org wrote: Hi, I'm currently doing some testing of different configurations using the Hadoop Sort as follows, bin/hadoop jar hadoop-*-examples.jar randomwriter -Dtest.randomwrite.total_bytes=107374182400 /benchmark100 bin/hadoop jar hadoop-*-examples.jar sort /benchmark100 rand-sort The only changes I've made from the standard config are the following in conf/mapred-site.xml

<property>
<name>mapred.child.java.opts</name>
<value>-Xmx1024M</value>
</property>
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>8</value>
</property>
<property>
<name>mapred.tasktracker.reduce.tasks.maximum</name>
<value>4</value>
</property>

I'm running this on 4 systems, each with 8 processor cores and 4 separate disks. Is there anything else I should change to stress memory more? The systems in question have 16GB of memory, but the most that's getting used during a run of this benchmark is about 2GB (and most of that seems to be os caching). Thanks, -stephen -- Stephen Mulcahy, DI2, Digital Enterprise Research Institute, NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland http://di2.deri.ie http://webstar.deri.ie http://sindice.com
Re: Large size Text file split
The FileSplit boundaries are rough edges -- the mapper responsible for the previous split will continue until it finds a full record, and the next mapper will read ahead and only start on the first record boundary after the byte offset. - Aaron On Wed, Jun 10, 2009 at 7:53 PM, Wenrui Guo wenrui@ericsson.com wrote: I think the default TextInputFormat can meet my requirement. The JavaDoc of TextInputFormat says that TextInputFormat divides the input file into text lines which end with CRLF. But I'd like to know: if the FileSplit size is not N times the line length, what will happen eventually? BR/anderson -Original Message- From: jason hadoop [mailto:jason.had...@gmail.com] Sent: Wednesday, June 10, 2009 8:39 PM To: core-user@hadoop.apache.org Subject: Re: Large size Text file split There is always NLineInputFormat. You specify the number of lines per split. The key is the position of the line start in the file, value is the line itself. The parameter mapred.line.input.format.linespermap controls the number of lines per split On Wed, Jun 10, 2009 at 5:27 AM, Harish Mallipeddi harish.mallipe...@gmail.com wrote: On Wed, Jun 10, 2009 at 5:36 PM, Wenrui Guo wenrui@ericsson.com wrote: Hi, all I have a large csv file (larger than 10 GB). I'd like to use a certain InputFormat to split it into smaller parts so each Mapper can deal with a piece of the csv file. However, as far as I know, FileInputFormat only cares about the byte size of the file; that is, the class can divide the csv file into many parts, and maybe some part is not a well-formed CSV file. For example, one line of the CSV file is not terminated with CRLF, or maybe some text is trimmed. How to ensure each FileSplit is a smaller valid CSV file using a proper InputFormat? BR/anderson If all you care about is the splits occurring at line boundaries, then TextInputFormat will work. http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/TextInputFormat.html If not I guess you can write your own InputFormat class. -- Harish Mallipeddi http://blog.poundbang.in -- Pro Hadoop, a book to guide you from beginner to hadoop mastery, http://www.apress.com/book/view/9781430219422 www.prohadoopbook.com a community for Hadoop Professionals
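A conceptual sketch of the "elastic border" rule described above, using plain java.io rather than the real Hadoop LineRecordReader -- treat it as an illustration of the idea, not the actual implementation:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class SplitLineSketch {
  // Emit the lines "owned" by the byte range [start, end) of a text file.
  static void readSplit(String file, long start, long end) throws IOException {
    BufferedReader in = new BufferedReader(new FileReader(file));
    in.skip(start);
    long pos = start;
    if (start != 0) {
      // A split that does not begin at offset 0 discards its partial first line;
      // the reader for the previous split finishes that line instead.
      int c;
      while ((c = in.read()) != -1) {
        pos++;
        if (c == '\n') break;
      }
    }
    String line;
    // A line whose start falls inside the split belongs to it, even if the line
    // itself runs past 'end' -- that is the elastic border.
    while (pos < end && (line = in.readLine()) != null) {
      System.out.println("record at " + pos + ": " + line);
      pos += line.length() + 1; // approximate: assumes a single '\n' terminator
    }
    in.close();
  }
}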
Re: Hadoop scheduling question
Hi Kristi :) JobControl, as I understand it, runs on the client's machine and sends Job objects to the JobTracker when their dependencies are all met. So once the dependencies are met, they go into the JobTracker's normal scheduler queue. There are three different scheduler implementations that you can choose from. All of these support multiple concurrent jobs, if there are more task slots available than tasks. So if a job contains 50 map tasks, and there are 64 map task slots (e.g., 8 machines with mapred.tasktracker.map.tasks.maximum == 8) available, then the whole job will be scheduled, then the first 14 tasks from the next job can run simultaneously. The schedulers differ in what happens when a cluster is already loaded up with work. The default scheduler is the FIFO scheduler. Jobs are in a queue and all tasks from job 1 will be satisfied before any tasks from job 2 are dispatched to task trackers. There's another scheduler called the FairScheduler. The user configures a set of pools on the cluster. Each job, when created, is bound to a specific pool. Each pool is guaranteed a minimum share of the scheduler. e.g., my pool may guarantee me a minimum of 3 map task slots and 1 reduce slot. If the whole cluster is loaded up, then the scheduler will prioritise my tasks up in its dispatch line until I'm getting my 3 task minimum. If there are multiple jobs which are tagged with different pools, then the fair scheduler's algorithm balances between them. The aaron pool may have a 3 task minimum. The bob pool may have a 6 task minimum. There might be many more task slots available than 9. If aaron and bob both submit jobs at once, then aaron will get 3/9 of the available task slots. bob will get 6/9. But all tasks from a given user are usually put in the same pool, so they will run in the same shared set of resources. Multiple tasks in a pool can run in parallel if the pool has slots available. Finally, there's a third scheduler called the Capacity scheduler. It's similar to the fair scheduler, in that it allows guarantees of minimum availability for different pools. I don't know how it apportions additional extra resources though -- this is the one I'm least familiar with. Someone else will have to chime in here. - Aaron On Fri, Jun 5, 2009 at 12:19 PM, Scott Carey sc...@richrelevance.com wrote: Even more general context: Cascading does something similar, but I am not sure if it uses Hadoop's JobControl or manages dependencies itself. It definitely runs multiple jobs in parallel when the dependencies allow it. On 6/5/09 11:44 AM, Alan Gates ga...@yahoo-inc.com wrote: To add a little context, Pig uses Hadoop's JobControl to schedule its jobs. Pig defines the dependencies between jobs in JobControl, and then submits the entire graph of jobs. So, using JobControl, does Hadoop schedule jobs serially or in parallel (assuming no dependencies)? Alan. On Jun 5, 2009, at 10:50 AM, Kristi Morton wrote: Hi Pankil, Sorry about having to send my question email twice to the list... the first time I sent it I had forgotten to subscribe to the list. I resent it after subscribing, and your response to the first email I sent did not make it into my inbox. I saw your response on the archives list. So, to recap, you said: "We are not able to carry out all joins in a single job. We also tried our Hadoop code using Pig scripts and found that for each join in the Pig script a new job is used. So basically, what I think is that it's a sequential process to handle these types of join, where the output of one job is required as an input to the other one."
I, too, have seen this sequential behavior with joins. However, it seems like it could be possible for there to be two jobs executing in parallel whose output is the input to the subsequent job. Is this possible or are all jobs scheduled sequentially? Thanks, Kristi
Re: Command-line jobConf options in 0.18.3
Can you give an example of the exact arguments you're sending on the command line? - Aaron On Wed, Jun 3, 2009 at 5:46 PM, Ian Soboroff ian.sobor...@nist.gov wrote: If after I call getConf to get the conf object, I manually add the key/value pair, it's there when I need it. So it feels like ToolRunner isn't parsing my args for some reason. Ian On Jun 3, 2009, at 8:45 PM, Ian Soboroff wrote: Yes, and I get the JobConf via 'JobConf job = new JobConf(conf, the.class)'. The conf is the Configuration object that comes from getConf. Pretty much copied from the WordCount example (which this program used to be a long while back...) thanks, Ian On Jun 3, 2009, at 7:09 PM, Aaron Kimball wrote: Are you running your program via ToolRunner.run()? How do you instantiate the JobConf object? - Aaron On Wed, Jun 3, 2009 at 10:19 AM, Ian Soboroff ian.sobor...@nist.gov wrote: I'm backporting some code I wrote for 0.19.1 to 0.18.3 (long story), and I'm finding that when I run a job and try to pass options with -D on the command line, that the option values aren't showing up in my JobConf. I logged all the key/value pairs in the JobConf, and the option I passed through with -D isn't there. This worked in 0.19.1... did something change with command-line options from 18 to 19? Thanks, Ian
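For reference, the ToolRunner pattern being asked about looks roughly like the sketch below (a generic example, not Ian's actual program). ToolRunner hands the arguments to GenericOptionsParser, which is what strips -D key=value pairs out and folds them into the Configuration returned by getConf():

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyTool extends Configured implements Tool {
  public int run(String[] args) throws Exception {
    // getConf() already contains any -D options parsed by ToolRunner.
    JobConf job = new JobConf(getConf(), MyTool.class);
    System.out.println("my.option = " + job.get("my.option"));
    // ... set input/output paths, mapper, reducer, then JobClient.runJob(job);
    return 0;
  }

  public static void main(String[] args) throws Exception {
    // ToolRunner parses the generic options (-D, -fs, -jt, ...) before calling run().
    System.exit(ToolRunner.run(new Configuration(), new MyTool(), args));
  }
}

It would then be invoked as something like: bin/hadoop jar mytool.jar MyTool -D my.option=value <other args>.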
Re: How do I convert DataInput and ResultSet to array of String?
e.g. for readFields():

myItems = new ArrayList<String>();
int numItems = dataInput.readInt();
for (int i = 0; i < numItems; i++) {
  myItems.add(Text.readString(dataInput));
}

then on the serialization (write) side, send:

dataOutput.writeInt(myItems.size());
for (int i = 0; i < myItems.size(); i++) {
  Text.writeString(dataOutput, myItems.get(i));
}

You should look at the source code for ArrayWritable, IntWritable, and Text -- specifically their write() and readFields() methods -- to get a feel for how to write such methods for your own types. - Aaron On Wed, Jun 3, 2009 at 4:19 PM, dealmaker vin...@gmail.com wrote: Would you provide a code sample of it? I don't know how to write a serializer in Hadoop. I am using the following class as the type of my value object: private static class StringArrayWritable extends ArrayWritable { private StringArrayWritable (String[] aSString) { super(aSString); } } Thanks. Aaron Kimball-3 wrote: The text serializer will pull out an entire string by using a null terminator at the end. If you need to know the number of string objects, though, you'll have to serialize that before the strings, then use a for loop to decode the rest of them. - Aaron On Tue, Jun 2, 2009 at 6:01 PM, dealmaker vin...@gmail.com wrote: Thanks. The number of elements in this array of String is unknown until run time. If datainput treats it as a byte array, I still have to know the size of each String. How do I do that? Would you suggest some code samples or links that deal with a similar situation like this? The only examples I got are the ones about counting number of words which deal with integers. Thanks. Aaron Kimball-3 wrote: Hi, You can't just turn either of these two types into arrays of strings automatically, because they are interfaces to underlying streams of data. You are required to know what protocol you are implementing -- i.e., how many fields you are transmitting -- and manually read through that many fields yourself. For example, a DataInput object is effectively a pointer into a byte array. There may be many records in that byte array, but you only want to read the fields of the first record out. For DataInput / DataOutput, you can UTF8-decode the next field by calling Text.readString(dataInput) and Text.writeString(dataOutput, str). For ResultSet, you want resultSet.getString(fieldNum). As (yet another) shameless plug ( :smile: ), check out the tool we just released, which automates database import tasks. It auto-generates the classes necessary for your tables, too. http://www.cloudera.com/blog/2009/06/01/introducing-sqoop/ At the very least, you might want to play with it a bit and read its source code so you have a better idea of how to implement your own class (since you're doing some more creative stuff like building up associative arrays for each field). Cheers, - Aaron On Mon, Jun 1, 2009 at 9:53 PM, dealmaker vin...@gmail.com wrote: bump. Does anyone know? I am using the following class of arraywritable: private static class StringArrayWritable extends ArrayWritable { private StringArrayWritable (String[] aSString) { super(aSString); } } dealmaker wrote: Hi, How do I convert DataInput to array of String? How do I convert ResultSet to array of String? Thanks.
Following is the code:

static class Record implements Writable, DBWritable {
  String[] aSAssoc;
  public void write(DataOutput arg0) throws IOException {
    throw new UnsupportedOperationException("Not supported yet.");
  }
  public void readFields(DataInput in) throws IOException {
    this.aSAssoc = // How to convert DataInput to String Array?
  }
  public void write(PreparedStatement arg0) throws SQLException {
    throw new UnsupportedOperationException("Not supported yet.");
  }
  public void readFields(ResultSet rs) throws SQLException {
    this.aSAssoc = // How to convert ResultSet to String Array?
  }
}

-- View this message in context: http://www.nabble.com/How-do-I-convert-DataInput-and-ResultSet-to-array-of-String--tp23770747p23826464.html Sent from the Hadoop core-user mailing list archive at Nabble.com. -- View this message in context: http://www.nabble.com/How-do-I-convert-DataInput-and-ResultSet-to-array-of-String--tp23770747p23843679.html Sent from the Hadoop core-user mailing list archive at Nabble.com. -- View this message in context: http://www.nabble.com/How-do-I-convert-DataInput-and-ResultSet-to-array-of-String--tp23770747p23861270.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
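Putting the pieces of this thread together, here is one hedged way the Record skeleton above could be completed. It assumes the wire format is a count followed by that many strings, that every ResultSet column holds a string, and that DBWritable is the interface from the mapred.lib.db package; none of this is from the original poster's code beyond the skeleton itself.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapred.lib.db.DBWritable;

public class Record implements Writable, DBWritable {
  String[] aSAssoc;

  public void write(DataOutput out) throws IOException {
    out.writeInt(aSAssoc.length); // count first, then the strings
    for (String s : aSAssoc) {
      Text.writeString(out, s);
    }
  }

  public void readFields(DataInput in) throws IOException {
    int numItems = in.readInt();
    aSAssoc = new String[numItems];
    for (int i = 0; i < numItems; i++) {
      aSAssoc[i] = Text.readString(in);
    }
  }

  public void write(PreparedStatement stmt) throws SQLException {
    for (int i = 0; i < aSAssoc.length; i++) {
      stmt.setString(i + 1, aSAssoc[i]); // JDBC parameters are 1-based
    }
  }

  public void readFields(ResultSet rs) throws SQLException {
    int numCols = rs.getMetaData().getColumnCount();
    aSAssoc = new String[numCols];
    for (int i = 0; i < numCols; i++) {
      aSAssoc[i] = rs.getString(i + 1); // JDBC columns are 1-based
    }
  }
}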
Re: Do I need to implement Readfields and Write Functions If I have Only One Field?
If you don't add any member fields, then no, I don't think you need to change anything. - Aaron On Wed, Jun 3, 2009 at 4:11 PM, dealmaker vin...@gmail.com wrote: I have the following as the type of my value object. Do I need to implement readfields and write functions? private static class StringArrayWritable extends ArrayWritable { private StringArrayWritable (String[] aSString) { super(aSString); } } Aaron Kimball-3 wrote: If you can use an existing serializable type to hold that field (e.g., if it's an integer, then use IntWritable) then you can just get away with that. If you are specifying your own class for a key or value class, then yes, the class must implement readFields() and write(). There's no concept of introspection or other magic to determine how to serialize types by decomposing them into more primitive types - you must manually insert the logic for this in every class you use. - Aaron On Wed, Jun 3, 2009 at 2:41 PM, dealmaker vin...@gmail.com wrote: Hi, I have only one field for the record. I wonder if I even need to define a member variable in class Record. Do I even need to implement readfields and write functions if I have only one field? Can I just use the value object directly instead of a member variable of value object? Thanks. -- View this message in context: http://www.nabble.com/Do-I-need-to-implement-Readfields-and-Write-Functions-If-I-have-Only-One-Field--tp23860009p23860009.html Sent from the Hadoop core-user mailing list archive at Nabble.com. -- View this message in context: http://www.nabble.com/Do-I-need-to-implement-Readfields-and-Write-Functions-If-I-have-Only-One-Field--tp23860009p23861181.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: Subscription
You need to send a message to core-user-subscr...@hadoop.apache.org from the address you want registered. See http://hadoop.apache.org/core/mailing_lists.html - Aaron On Thu, Jun 4, 2009 at 12:10 PM, Akhil langer akhilan...@gmail.com wrote: Please, add me to the hadoop-core user mailing list. email address: *akhilan...@gmail.com* Thank You! Akhil
Re: How do I convert DataInput and ResultSet to array of String?
The text serializer will pull out an entire string by using a null terminator at the end. If you need to know the number of string objects, though, you'll have to serialize that before the strings, then use a for loop to decode the rest of them. - Aaron On Tue, Jun 2, 2009 at 6:01 PM, dealmaker vin...@gmail.com wrote: Thanks. The number of elements in this array of String is unknown until run time. If datainput treats it as a byte array, I still have to know the size of each String. How do I do that? Would you suggest some code samples or links that deal with a similar situation like this? The only examples I got are the ones about counting number of words which deal with integers. Thanks. Aaron Kimball-3 wrote: Hi, You can't just turn either of these two types into arrays of strings automatically, because they are interfaces to underlying streams of data. You are required to know what protocol you are implementing -- i.e., how many fields you are transmitting -- and manually read through that many fields yourself. For example, a DataInput object is effectively a pointer into a byte array. There may be many records in that byte array, but you only want to read the fields of the first record out. For DataInput / DataOutput, you can UTF8-decode the next field by calling Text.readString(dataInput) and Text.writeString(dataOutput, str). For ResultSet, you want resultSet.getString(fieldNum). As (yet another) shameless plug ( :smile: ), check out the tool we just released, which automates database import tasks. It auto-generates the classes necessary for your tables, too. http://www.cloudera.com/blog/2009/06/01/introducing-sqoop/ At the very least, you might want to play with it a bit and read its source code so you have a better idea of how to implement your own class (since you're doing some more creative stuff like building up associative arrays for each field). Cheers, - Aaron On Mon, Jun 1, 2009 at 9:53 PM, dealmaker vin...@gmail.com wrote: bump. Does anyone know? I am using the following class of arraywritable: private static class StringArrayWritable extends ArrayWritable { private StringArrayWritable (String[] aSString) { super(aSString); } } dealmaker wrote: Hi, How do I convert DataInput to array of String? How do I convert ResultSet to array of String? Thanks. Following is the code:

static class Record implements Writable, DBWritable {
  String[] aSAssoc;
  public void write(DataOutput arg0) throws IOException {
    throw new UnsupportedOperationException("Not supported yet.");
  }
  public void readFields(DataInput in) throws IOException {
    this.aSAssoc = // How to convert DataInput to String Array?
  }
  public void write(PreparedStatement arg0) throws SQLException {
    throw new UnsupportedOperationException("Not supported yet.");
  }
  public void readFields(ResultSet rs) throws SQLException {
    this.aSAssoc = // How to convert ResultSet to String Array?
  }
}

-- View this message in context: http://www.nabble.com/How-do-I-convert-DataInput-and-ResultSet-to-array-of-String--tp23770747p23826464.html Sent from the Hadoop core-user mailing list archive at Nabble.com. -- View this message in context: http://www.nabble.com/How-do-I-convert-DataInput-and-ResultSet-to-array-of-String--tp23770747p23843679.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: Do I need to implement Readfields and Write Functions If I have Only One Field?
If you can use an existing serializable type to hold that field (e.g., if it's an integer, then use IntWritable) then you can just get away with that. If you are specifying your own class for a key or value class, then yes, the class must implement readFields() and write(). There's no concept of introspection or other magic to determine how to serialize types by decomposing them into more primitive types - you must manually insert the logic for this in every class you use. - Aaron On Wed, Jun 3, 2009 at 2:41 PM, dealmaker vin...@gmail.com wrote: Hi, I have only one field for the record. I wonder if I even need to define a member variable in class Record. Do I even need to implement readfields and write functions if I have only one field? Can I just use the value object directly instead of a member variable of value object? Thanks. -- View this message in context: http://www.nabble.com/Do-I-need-to-implement-Readfields-and-Write-Functions-If-I-have-Only-One-Field--tp23860009p23860009.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: Hadoop ReInitialization.
You can block for safemode exit by running 'hadoop dfsadmin -safemode wait' rather than sleeping for an arbitrary amount of time. More generally, I'm a bit confused about what you mean by all this. Hadoop daemons may individually crash, but you should never need to reformat HDFS and start from scratch. If you're doing this, that means that you're probably sticking some important hadoop files in a temp dir that's getting cleaned out or something of the like. Are dfs.data.dir and dfs.name.dir suitably well-protected from tmpwatch or other such housekeeping programs? - Aaron On Wed, Jun 3, 2009 at 4:50 AM, Steve Loughran ste...@apache.org wrote: b wrote: But after formatting and starting DFS I need to wait some time (sleep 60) before putting data into HDFS. Else I will receive NotReplicatedYetException. That means the namenode is up but there aren't enough workers yet.
Re: Command-line jobConf options in 0.18.3
Are you running your program via ToolRunner.run()? How do you instantiate the JobConf object? - Aaron On Wed, Jun 3, 2009 at 10:19 AM, Ian Soboroff ian.sobor...@nist.gov wrote: I'm backporting some code I wrote for 0.19.1 to 0.18.3 (long story), and I'm finding that when I run a job and try to pass options with -D on the command line, that the option values aren't showing up in my JobConf. I logged all the key/value pairs in the JobConf, and the option I passed through with -D isn't there. This worked in 0.19.1... did something change with command-line options from 18 to 19? Thanks, Ian
Re: How do I convert DataInput and ResultSet to array of String?
Hi, You can't just turn either of these two types into arrays of strings automatically, because they are interfaces to underlying streams of data. You are required to know what protocol you are implementing -- i.e., how many fields you are transmitting -- and manually read through that many fields yourself. For example, a DataInput object is effectively a pointer into a byte array. There may be many records in that byte array, but you only want to read the fields of the first record out. For DataInput / DataOutput, you can UTF8-decode the next field by calling Text.readString(dataInput) and Text.writeString(dataOutput, str). For ResultSet, you want resultSet.getString(fieldNum). As (yet another) shameless plug ( :smile: ), check out the tool we just released, which automates database import tasks. It auto-generates the classes necessary for your tables, too. http://www.cloudera.com/blog/2009/06/01/introducing-sqoop/ At the very least, you might want to play with it a bit and read its source code so you have a better idea of how to implement your own class (since you're doing some more creative stuff like building up associative arrays for each field). Cheers, - Aaron On Mon, Jun 1, 2009 at 9:53 PM, dealmaker vin...@gmail.com wrote: bump. Does anyone know? I am using the following class of arraywritable: private static class StringArrayWritable extends ArrayWritable { private StringArrayWritable (String[] aSString) { super(aSString); } } dealmaker wrote: Hi, How do I convert DataInput to array of String? How do I convert ResultSet to array of String? Thanks. Following is the code:

static class Record implements Writable, DBWritable {
  String[] aSAssoc;
  public void write(DataOutput arg0) throws IOException {
    throw new UnsupportedOperationException("Not supported yet.");
  }
  public void readFields(DataInput in) throws IOException {
    this.aSAssoc = // How to convert DataInput to String Array?
  }
  public void write(PreparedStatement arg0) throws SQLException {
    throw new UnsupportedOperationException("Not supported yet.");
  }
  public void readFields(ResultSet rs) throws SQLException {
    this.aSAssoc = // How to convert ResultSet to String Array?
  }
}

-- View this message in context: http://www.nabble.com/How-do-I-convert-DataInput-and-ResultSet-to-array-of-String--tp23770747p23826464.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: Pseudo-distributed mode problem
Glad you got it sorted out. - Aaron On Tue, Jun 2, 2009 at 12:49 AM, Vasyl Keretsman vasi...@gmail.com wrote: Sorry, it was my fault. Instead of starting my job as bin/hadoop jar job.jar I ran it as bin/hadoop -cp job.jar. I thought it would be the same. Thanks anyway Vasyl 2009/6/2 Aaron Kimball aa...@cloudera.com: Can you post the contents of your hadoop-site.xml file here? - Aaron On Sat, May 30, 2009 at 2:44 AM, Vasyl Keretsman vasi...@gmail.com wrote: Hi all, I am just getting started with hadoop 0.20 and trying to run a job in pseudo-distributed mode. I configured hadoop according to the tutorial, but it seems it does not work as expected. My map/reduce tasks are running sequentially and the output is stored on the local filesystem instead of the dfs space. Job tracker does not see the running job at all. I have checked the logs but don't see any errors either. I have also copied some files manually to the dfs to make sure it works. The only difference between the manual and my configuration is that I had to change the ports for the job tracker and namenode as 9000 and 9001 are already used by other apps on my workstation. Any hints? Thanks Regards, Vasyl
Re: Subdirectory question revisited
There is no technical limit that prevents Hadoop from operating in this fashion; it's simply the case that the included InputFormat implementations do not do so. This behavior has been set in this fashion for a long time, so it's unlikely that it will change soon, as that might break existing applications. But you can write your own subclass of TextInputFormat or SequenceFileInputFormat that overrides the getSplits() method to recursively descend through directories and search for files. - Aaron On Tue, Jun 2, 2009 at 1:22 PM, David Rosenstrauch dar...@darose.net wrote: As per a previous list question ( http://mail-archives.apache.org/mod_mbox/hadoop-core-user/200804.mbox/%3ce75c02ef0804011433x144813e6x2450da7883de3...@mail.gmail.com%3e) it looks as though it's not possible for hadoop to traverse input directories recursively in order to discover input files. Just wondering a) if there's any particular reason why this functionality doesn't exist, and b) if not, if there's any workaround/hack to make it possible. Like the OP, I was thinking it would be helpful to partition my input data by year, month, and day. I figured this would enable me to run jobs against specific date ranges of input data, and thereby speed up the execution of my jobs since they wouldn't have to process every single record. Any way to make this happen? (Or am I totally going about this the wrong way for what I'm trying to achieve?) TIA, DR
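Besides subclassing an InputFormat as suggested above, another workaround (not from the original thread) is to walk the date-partitioned tree on the client with the FileSystem API and add every file found as an input path before submitting the job; the root path and class name here are invented for illustration.

import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobConf;

public class RecursiveInputPaths {
  // Recursively add every file under 'dir' as an input path for the job.
  static void addInputsRecursively(FileSystem fs, Path dir, JobConf conf)
      throws IOException {
    FileStatus[] entries = fs.listStatus(dir);
    if (entries == null) {
      return; // path does not exist
    }
    for (FileStatus stat : entries) {
      if (stat.isDir()) {
        addInputsRecursively(fs, stat.getPath(), conf);
      } else {
        FileInputFormat.addInputPath(conf, stat.getPath());
      }
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(RecursiveInputPaths.class);
    FileSystem fs = FileSystem.get(conf);
    // e.g. /data/2009/06 to pick up only June 2009 (example layout)
    addInputsRecursively(fs, new Path(args[0]), conf);
    // ... set mapper, reducer, output path, and submit as usual.
  }
}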
Re: start-all.sh not work in hadoop-0.20.0
This exception is logged with a severity level of INFO. I think this is a relatively common exception. The JobTracker should just wait until the DFS exits safemode and then clearing the system directory will proceed as usual. So I don't think that this is killing your JobTracker. Can you confirm that the JobTracker process, in fact, does die? If so, there's probably an exception lower down in the log marked with a severity level of ERROR or FATAL -- do you see any of those? - Aaron On Mon, Jun 1, 2009 at 7:22 AM, Nick Cen cenyo...@gmail.com wrote: Hi All, I can start Hadoop by typing start-dfs.sh and start-mapred.sh. But when I use start-all.sh, only HDFS is started. The jobtracker's log indicates there is an exception when starting the job-tracker. 2009-06-01 22:06:59,675 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: hdfs://localhost:54310/tmp/hadoop-nick/mapred/system org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /tmp/hadoop-nick/mapred/system. Name node is in safe mode. The ratio of reported blocks 0. has not reached the threshold 0.9990. Safe mode will be turned off automatically. at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1681) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1661) at org.apache.hadoop.hdfs.server.namenode.NameNode.delete(NameNode.java:517) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) at org.apache.hadoop.ipc.Client.call(Client.java:739) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220) at $Proxy4.delete(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at $Proxy4.delete(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:550) at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:227) at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:1637) at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:174) at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3528) Is this exception caused by there being no time delay between the start-dfs.sh and the start-mapred.sh scripts? Thanks. -- http://daily.appspot.com/food/
Re: what is the efficient way to implement InputFormat
The major problem with this model is that an InputSplit is the unit of work that comprises a map task. So no map tasks can be created before you know how many InputSplits there are, and what their nature is. This means that the creation of input splits is performed on the client. This is inherently single-threaded (or at least, single-node); and if the client machine is physically far away from the compute cluster, requires that the cluster pass a lot of data to a single node. Given that MapReduce is designed to allow for reading files in parallel via many data-local map tasks, this is something of an anti-pattern. You want to delay reading the data files until after the task starts. To read through your data in a high-performance way, you're going to have to rethink how you compute the input work units. Small tweaks to your data format may be all you need. If one of your two files has lines or other delimited records which reference the other file, maybe you can use the regular TextInputFormat on the first file, and then just do HDFS reads on the second file from within your mappers. Remember, the TextInputFormat doesn't magically know where newlines are. It accepts that there's fuzziness. It creates splits based purely on block boundaries, but the borders are a bit elastic. If a line continues past the end of a block boundary, the previous task will keep reading to the end of the line. And the next task seeks to the block boundary, but then doesn't use any text until it finds a newline and has aligned on a new record. If your data format can be modified so that you can put in some sort of record delimiters in one of the files, then you can leverage similar behavior in your InputFormat. Or else if you know how many splits you are going to have ahead of time (even without knowing their exact contents), you can just write out a file that contains the numbers 1 ... N where N is the number of splits you have. Then use NLineInputFormat to read this file. Each mapper then gets its split id as its official record, then you use the FileSystem interface to access the i'th split of data from within map task i (0 < i <= N). I hope these suggestions help. Good luck, - Aaron On Mon, Jun 1, 2009 at 4:57 PM, Kunsheng Chen ke...@yahoo.com wrote: Hi Sun, Sounds like you are dealing with something I met before. The format of my file is something like this; there is only a space between each element: source destination degree timestamp. For example: http://www.google.com http://something.net 3 1234555 The source is my key for map and reduce, and I just use 'String[] splits = value.toString().split(" ");' to split everything. Maybe you are looking for something more complicated, but hope this helps. --- On Mon, 6/1/09, Zhengguo 'Mike' SUN zhengguo...@yahoo.com wrote: From: Zhengguo 'Mike' SUN zhengguo...@yahoo.com Subject: what is the efficient way to implement InputFormat To: core-user@hadoop.apache.org Date: Monday, June 1, 2009, 8:52 PM Hi, All, The input of my MapReduce job is two large txt files. And an InputSplit consists of a portion of the file from both files. And this Split is content dependent. So I have to read the input file to generate a split. Now the thing is that most of the time is spent in generating these splits. The Map and Reduce phases actually take less time than that. I was wondering if there is an efficient way to generate splits from files. My InputFormat class is based on FileInputFormat. "The getSplits function of FileInputFormat doesn't read input file."
But this is impossible for me because my split depends on the content of the file. Any ideas or comments are appreciated.
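A sketch of the control-file idea at the end of Aaron's reply (nothing below comes from Mike's actual code): the job's real input is a tiny file listing split ids, the driver uses NLineInputFormat with mapred.line.input.format.linespermap=1 so each id becomes one map task, and each mapper then opens its own region of the large data files directly through the FileSystem API. The /data/part-i layout is purely hypothetical.

import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class SplitIdMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  private JobConf conf;

  public void configure(JobConf job) {
    this.conf = job;
  }

  public void map(LongWritable offset, Text splitId,
      OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
    int i = Integer.parseInt(splitId.toString().trim()); // the record is just "i"
    FileSystem fs = FileSystem.get(conf);
    // Hypothetical layout: the i'th piece of the dataset lives at /data/part-i.
    FSDataInputStream in = fs.open(new Path("/data/part-" + i));
    // ... read and parse this region of both input files, emitting records via output ...
    in.close();
  }
}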
Re: Distributed Lucene Questions
According to the JIRA issue mentioned, it doesn't seem to be production ready. The author claims it is a prototype... not ready for inclusion. The patch was posted over a year ago, and there's been no further work or discussion. You might want to email the author Mark Butler ( https://issues.apache.org/jira/secure/ViewProfile.jspa?name=butlermh) directly. This probably renders the performance data question moot. In general, if you're using Hadoop and HDFS to serve some content, updates will need to be performed by rewriting a whole index. So frequent updates are going to be troublesome. Likely what will happen is that updates will need to be batched up, rewritten to new index files, and then those will be installed in place of the outdated ones. I haven't read their design doc, though, so they might do something different, but since HDFS doesn't allow for modification of closed files, it'll be challenging to be more clever than that. You might want to go with option (1) and investigate using something like memcached, etc, to manage interactive query load. - Aaron On Mon, Jun 1, 2009 at 9:54 AM, Tarandeep Singh tarand...@gmail.com wrote: Hi All, I am trying to build a distributed system to build and serve lucene indexes. I came across the Distributed Lucene project- http://wiki.apache.org/hadoop/DistributedLucene https://issues.apache.org/jira/browse/HADOOP-3394 and have a couple of questions. It will be really helpful if someone can provide some insights. 1) Is this code production ready? 2) Does someone have performance data for this project? 3) It allows searches and updates/deletes to be performed at the same time. How well will the system perform if there are frequent updates to the system? Will it handle the search and update load easily or will it be better to rebuild or update the indexes on different machines and then deploy the indexes back to the machines that are serving the indexes? Basically I am trying to choose between the 2 approaches- 1) Use Hadoop to build and/or update Lucene indexes and then deploy them on a separate cluster that will take care of load balancing, fault tolerance etc. There is a package in Hadoop contrib that does this, so I can use that code. 2) Use and/or modify the Distributed Lucene code. I am expecting daily updates to our index so I am not sure if the Distributed Lucene code (which allows searches and updates on the same indexes) will be able to handle search and update load efficiently. Any suggestions? Thanks, Tarandeep
Re: Pseudo-distributed mode problem
Can you post the contents of your hadoop-site.xml file here? - Aaron On Sat, May 30, 2009 at 2:44 AM, Vasyl Keretsman vasi...@gmail.com wrote: Hi all, I am just getting started with hadoop 0.20 and trying to run a job in pseudo-distributed mode. I configured hadoop according to the tutorial, but it seems it does not work as expected. My map/reduce tasks are running sequentially and the output is stored on the local filesystem instead of the dfs space. Job tracker does not see the running job at all. I have checked the logs but don't see any errors either. I have also copied some files manually to the dfs to make sure it works. The only difference between the manual and my configuration is that I had to change the ports for the job tracker and namenode as 9000 and 9001 are already used by other apps on my workstation. Any hints? Thanks Regards, Vasyl
Re: Username in Hadoop cluster
A slightly longer answer: If you're starting daemons with bin/start-dfs.sh or start-all.sh, you'll notice that these defer to hadoop-daemons.sh to do the heavy lifting. This evaluates the string "cd $HADOOP_HOME \; $bin/hadoop-daemon.sh --config $HADOOP_CONF_DIR $@" and passes it to an underlying loop to execute on all the slaves via ssh. $bin and $HADOOP_HOME are thus macro-replaced on the server-side. The more problematic one here is the $bin one, which resolves to the absolute path of the cwd on the server that is starting Hadoop. You've got three basic options:
1) Install Hadoop in the exact same path on all nodes.
2) Modify bin/hadoop-daemons.sh to do something more clever on your system by deferring evaluation of HADOOP_HOME and the bin directory (probably really hairy; you might have to escape the variable names more than once since there's another script named slaves.sh that this goes through).
3) Start the slaves manually on each node by logging in yourself and doing: cd $HADOOP_HOME && bin/hadoop-daemon.sh start datanode
As a shameless plug, Cloudera's distribution for Hadoop ( www.cloudera.com/hadoop ) will also provide init.d scripts so that you can start Hadoop daemons via the 'service' command. By default, the RPM installation will also standardize on the hadoop username. But you can't install this without being root. - Aaron On Tue, May 26, 2009 at 12:30 PM, Alex Loddengaard a...@cloudera.com wrote: It looks to me like you didn't install Hadoop consistently across all nodes. xxx.xx.xx.251: bash: /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/hadoop-daemon.sh: No such file or directory The above makes me suspect that xxx.xx.xx.251 has Hadoop installed differently. Can you try and locate hadoop-daemon.sh on xxx.xx.xx.251 and adjust its location properly? Alex On Mon, May 25, 2009 at 10:25 PM, Pankil Doshi forpan...@gmail.com wrote: Hello, I tried adding usern...@hostname for each entry in the slaves file. My slaves file has 2 data nodes; it looks like below:
localhost
utdhado...@xxx.xx.xx.229
utdhad...@xxx.xx.xx.251
The error I get when I start dfs is as below:
starting namenode, logging to /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/../logs/hadoop-utdhadoop1-namenode-opencirrus-992.hpl.hp.com.out
xxx.xx.xx.229: starting datanode, logging to /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/../logs/hadoop-utdhadoop1-datanode-opencirrus-992.hpl.hp.com.out
xxx.xx.xx.251: bash: line 0: cd: /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/..: No such file or directory
xxx.xx.xx.251: bash: /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/hadoop-daemon.sh: No such file or directory
localhost: datanode running as process 25814. Stop it first.
xxx.xx.xx.229: starting secondarynamenode, logging to /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/../logs/hadoop-utdhadoop1-secondarynamenode-opencirrus-992.hpl.hp.com.out
localhost: secondarynamenode running as process 25959. Stop it first.
Basically it looks for /home/utdhadoop1/Hadoop/hadoop-0.18.3/bin/hadoop-daemon.sh but instead it should look for /home/utdhadoop/Hadoop/ as xxx.xx.xx.251 has username utdhadoop. Any inputs?? Thanks Pankil On Wed, May 20, 2009 at 6:30 PM, Todd Lipcon t...@cloudera.com wrote: On Wed, May 20, 2009 at 4:14 PM, Alex Loddengaard a...@cloudera.com wrote: First of all, if you can get all machines to have the same user, that would greatly simplify things.
If, for whatever reason, you absolutely can't get the same user on all machines, then you could do either of the following: 1) Change the *-all.sh scripts to read from a slaves file that has two fields: a host and a user To add to what Alex said, you should actually already be able to do this with the existing scripts by simply using the format usern...@hostname for each entry in the slaves file. -Todd
Re: Question regarding dwontime model for DataNodes
A DataNode is regarded as dead after 10 minutes of inactivity. If a DataNode is down for less than 10 minutes and then returns, nothing happens. After 10 minutes, its blocks are assumed to be lost forever. So the NameNode then begins scheduling those blocks for re-replication from the two surviving copies. There is no real-time bound on how long this process takes. It's a function of your available network bandwidth, server loads, disk speeds, amount of storage used, etc. Blocks which are down to a single surviving replica are prioritized over blocks which have two surviving replicas (in the event of a simultaneous or near-simultaneous double fault), since they are more vulnerable. If a DataNode does reappear, then its re-replication is cancelled, and over-provisioned blocks are scaled back to the target number of replicas. If that machine comes back with all blocks intact, then the node needs no rebuilding. (In fact, some of the over-provisioned replicas that get removed might be from the original node, if they're available elsewhere too!) Don't forget that machines in Hadoop do not have strong notions of identity. If a particular machine is taken offline and its disks are wiped, the blocks which were there (which also existed in two other places) will be re-replicated elsewhere from the live copies. When that same machine is then brought back online, it has no incentive to copy back all the blocks that it used to have, as there will be three replicas elsewhere in the cluster. Blocks are never permanently bound to particular machines. If you add recommissioned or new nodes to a cluster, you should run the rebalancing script which will take a random sampling of blocks from heavily-laden nodes and move them onto emptier nodes in an attempt to spread the data as evenly as possible. - Aaron On Tue, May 26, 2009 at 3:08 PM, Joe Hammerman jhammer...@videoegg.com wrote: Hello Hadoop Users list: We are running Hadoop version 0.18.2. My team lead has asked me to investigate the answer to a particular question regarding Hadoop's handling of offline DataNodes - specifically, we would like to know how long a node can be offline before it is totally rebuilt when it has been re-added to the cluster. From what I've been able to determine from the documentation it appears to me that the NameNode will simply begin scheduling block replication on its remaining cluster members. If the offline node comes back online, and it reports all its blocks as being uncorrupted, then the NameNode just cleans up the extra blocks. In other words, there is no explicit handling based on the length of the outage - the behavior of the cluster will depend entirely on the outage duration. Anyone care to shed some light on this? Thanks! Regards, Joseph Hammerman
Re: Runtime error for wordcount-like example.
You said the folder may_24 might be there -- but you need to provide the full name of a class, not just a package. A folder denotes a package. So is there a may_24.class file? Or is the main() method in some may_24/Foo.class? If so, then you need to run bin/hadoop jar hadoop_out_degree.jar may_24.Foo (args...) Creating the jar using Sun's jar program should automatically insert the necessary manifest. What does 'jar tf hadoop_out_degree.jar' return? - Aaron On Sun, May 24, 2009 at 4:54 PM, Edward J. Yoon edwardy...@apache.orgwrote: It looks like that you miss some packages (xxx.xxx.xx.may_24). Check the package names. If not, try to run the hadoop_out_degree.jar with main class. On Mon, May 25, 2009 at 8:36 AM, Kunsheng Chen ke...@yahoo.com wrote: Hello everyone, I am getting some issues when compiling a map-reduce program similar to wordcount example. I followed the instruction in the tutorial and succeeded to compile a jar file. The problem won't appear until I run it as below: had...@lobo:/osr/Projects/anansi/mrjobs/src$ /osr/Projects/anansi/hadoop/bin/hadoop jar ./hadoop_out_degree.jar may_24 outout1/ java.lang.ClassNotFoundException: may_24 at java.net.URLClassLoader$1.run(URLClassLoader.java:200) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:188) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at java.lang.ClassLoader.loadClass(ClassLoader.java:251) at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.util.RunJar.main(RunJar.java:148) at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220) The folder 'may_24' is definitely there, I am not sure whether I missing something when compile, do I have to add manifest file in the class folders ? also I am confused about the error info. Could anyone give me some idea ? Thanks in advance, -KUn -- Best Regards, Edward J. Yoon @ NHN, corp. edwardy...@apache.org http://blog.udanax.org
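To make the jar layout concrete, here is a rough sketch of what a driver class matching the hypothetical may_24.Foo above might look like against the 0.18-era API (the job name and the mapper/reducer are placeholders you would fill in yourself):

  package may_24;

  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.FileInputFormat;
  import org.apache.hadoop.mapred.FileOutputFormat;
  import org.apache.hadoop.mapred.JobClient;
  import org.apache.hadoop.mapred.JobConf;

  public class Foo {
    public static void main(String[] args) throws Exception {
      // Passing Foo.class lets Hadoop locate and ship the enclosing jar.
      JobConf conf = new JobConf(Foo.class);
      conf.setJobName("out-degree");
      conf.setOutputKeyClass(Text.class);
      conf.setOutputValueClass(IntWritable.class);
      // conf.setMapperClass(...); conf.setReducerClass(...);  -- your own classes here
      FileInputFormat.setInputPaths(conf, new Path(args[0]));
      FileOutputFormat.setOutputPath(conf, new Path(args[1]));
      JobClient.runJob(conf);
    }
  }

With that compiled into hadoop_out_degree.jar, the invocation becomes bin/hadoop jar hadoop_out_degree.jar may_24.Foo input output, and jar tf hadoop_out_degree.jar should list may_24/Foo.class.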
Re: Specifying NameNode externally to hadoop-site.xml
Same way. Configuration conf = new Configuration(); conf.set(fs.default.name, hdfs://foo); FileSystem fs = FileSystem.get(conf); - Aaron On Mon, May 25, 2009 at 1:02 PM, Stas Oskin stas.os...@gmail.com wrote: Hi. And if I don't use jobs but only DFS for now? Regards. 2009/5/25 jason hadoop jason.had...@gmail.com conf.set(fs.default.name, hdfs://host:port); where conf is the JobConf object of your job, before you submit it. On Mon, May 25, 2009 at 10:16 AM, Stas Oskin stas.os...@gmail.com wrote: Hi. Thanks for the tip, but is it possible to set this in dynamic way via code? Thanks. 2009/5/25 jason hadoop jason.had...@gmail.com if you launch your jobs via bin/hadoop jar jar_file [main class] [options] you can simply specify -fs hdfs://host:port before the jar_file On Sun, May 24, 2009 at 3:02 PM, Stas Oskin stas.os...@gmail.com wrote: Hi. I'm looking to move the Hadoop NameNode URL outside the hadoop-site.xml file, so I could set it at the run-time. Any idea how to do it? Or perhaps there is another configuration that can be applied to the FileSystem object? Regards. -- Alpha Chapters of my book on Hadoop are available http://www.apress.com/book/view/9781430219422 www.prohadoopbook.com a community for Hadoop Professionals -- Alpha Chapters of my book on Hadoop are available http://www.apress.com/book/view/9781430219422 www.prohadoopbook.com a community for Hadoop Professionals
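Written out as a complete, minimal sketch -- the NameNode host and port below are placeholders for your own -- the run-time override looks like this:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class ListRoot {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      // Overrides whatever hadoop-site.xml says (or works with no site file at all).
      conf.set("fs.default.name", "hdfs://namenode.example.com:54310");
      FileSystem fs = FileSystem.get(conf);
      for (FileStatus stat : fs.listStatus(new Path("/"))) {
        System.out.println(stat.getPath());
      }
    }
  }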
Re: how to do pairwise reduce in multiple steps?!
Tim, There is not a direct way to set up an arbitrary-depth chain of reduces as part of the same job. You have two basic options: 1) Do a level of mapping followed by a single level of your tree reduce. Then follow up with an IdentityMapper forwarding your data to a second level of your tree reduce. Repeat this part until you're done. 2) Leverage the sorted nature of keys being sent into your reducer to topographically sort your results before they are reduced up pairwise. The reducer task which received a set of keys receives all keys that return the same partition number from the Partitioner. The default partitioner returns a partition id of key.hashCode() % numReduceTasks. So by playing with the key's hashCode() method, you could force elements of the same subtree to go to the same reducer. The reducer will then crunch a set of keys and their related values, in sorted order by the compareTo() method of the keys. e.g., for the example below, you could make sure that x, y, z, and w all had the same key.hashCode() so they arrive at the same reducer; then have their compareTo() methods sort them in order: x, y, z, w. You can then compute f(x, y) and store that in memory somewhere (call it 'a'), then compute f(z, w) and store that in 'b'. Then since you know that you computed a full level of the tree, you'd go back through your second-level array in the same reduce task, computing f(a, b), and emitting that to the output collector. You could also emit sentinel-values from the mapper to your reducers to denote the end of a level of the tree, so that you don't have to count in the reducer itself. The amount of parallellism you can achieve is limited roughly by the amount of memory you'll need on a task node to buffer the intermediate values. After each reducer computes an aggregate over a subtree, you'll have only as many results as you have reduce tasks -- maybe 64 or 128 or so -- at which point the final aggregation can be done single-threaded in the main program. It's a bit of a stretch, but you could probably flex MapReduce into doing this for you. - Aaron On Mon, May 25, 2009 at 12:19 AM, Tim Kiefer tim-kie...@gmx.de wrote: Hi everyone, I do have an application that requires to combine/reduce results in pairs - and the results of this step again in pairs. The binary way ;). In case it is not clear what is meant, have a look at the schema: Assume to have 4 inputs (in my case from an earlier MR job). Now the first two inputs need to be combined and the second two. The results need to be combined again. x --- |- x*y --- y --- | |- x*y*z*w z --- | |- z*w --- w --- Note that building x*y from x and y is an expensive operation which is why I want to use 2 nodes to combine the 4 inputs in the first step. It is not an option to collect all inputs and do the combinations on 1 node. I guess I basically want to know whether there is a way to have logN reduce steps in a row to combine all inputs to just one final output?! Any suggestions are appreciated :) - Tim
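To make the hashCode()/compareTo() trick a bit more concrete, here is a rough sketch of a key type for it; TreeKey, subtree, and position are made-up names for this example. Everything carrying the same subtree id lands on one reducer under the default partitioner, and position controls the order in which that reducer iterates over the pair members:

  import java.io.DataInput;
  import java.io.DataOutput;
  import java.io.IOException;
  import org.apache.hadoop.io.WritableComparable;

  public class TreeKey implements WritableComparable {
    private int subtree;   // which reducer's subtree this element belongs to
    private int position;  // ordering of the element within that subtree

    public TreeKey() { }

    public TreeKey(int subtree, int position) {
      this.subtree = subtree;
      this.position = position;
    }

    public void write(DataOutput out) throws IOException {
      out.writeInt(subtree);
      out.writeInt(position);
    }

    public void readFields(DataInput in) throws IOException {
      subtree = in.readInt();
      position = in.readInt();
    }

    // The default partitioner computes key.hashCode() % numReduceTasks, so
    // returning only the subtree id keeps a whole subtree on one reducer.
    public int hashCode() {
      return subtree;
    }

    // Sort order seen by the reducer: first by subtree, then by position.
    public int compareTo(Object o) {
      TreeKey other = (TreeKey) o;
      if (subtree != other.subtree) {
        return subtree < other.subtree ? -1 : 1;
      }
      return position < other.position ? -1 : (position == other.position ? 0 : 1);
    }

    public boolean equals(Object o) {
      return (o instanceof TreeKey) && compareTo(o) == 0;
    }
  }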
Re: input/output error while setting up superblock
More specifically: HDFS does not support operations such as opening a file for write/append after it has already been closed, or seeking to a new location in a writer. You can only write files linearly; all other operations will return a not supported error. You'll also find that random-access read performance, while implemented, is not particularly high-throughput. For serving Xen images even in read-only mode, you'll likely have much better luck with a different FS. - Aaron 2009/5/22 Taeho Kang tka...@gmail.com I don't think HDFS is a good place to store your Xen image file as it will likely be updated/appended frequently in small blocks. With the way HDFS is designed for, you can't quite use it like a regular filesystem (e.g. ones that support frequent small block appends/updates in files). My suggestion is to use another filesystem like NAS or SAN. /Taeho 2009/5/22 신승엽 mikas...@naver.com Hi, I have a problem to use hdfs. I mounted hdfs using fuse-dfs. I created a dummy file for 'Xen' in hdfs and then formated the dummy file using 'mke2fs'. But the operation was faced error. The error message is as follows. [r...@localhost hdfs]# mke2fs -j -F ./file_dumy mke2fs 1.40.2 (12-Jul-2007) ./file_dumy: Input/output error while setting up superblock Also, I copyed an image file of xen to hdfs. But Xen couldn't the image files in hdfs. r...@localhost hdfs]# fdisk -l fedora6_demo.img last_lba(): I don't know how to handle files with mode 81a4 You must set cylinders. You can do this from the extra functions menu. Disk fedora6_demo.img: 0 MB, 0 bytes 255 heads, 63 sectors/track, 0 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System fedora6_demo.img1 * 1 156 1253038+ 83 Linux Could you answer me anything about this problem. Thank you.

Re: ssh issues
Pankil, That means that either you're using the wrong ssh key and it's falling back to password authentication, or else you created your ssh keys with passphrases attached; try making new ssh keys with ssh-keygen and distributing those to start again? - Aaron On Thu, May 21, 2009 at 3:49 PM, Pankil Doshi forpan...@gmail.com wrote: The problem is that it also prompts for the pass phrase. On Thu, May 21, 2009 at 2:14 PM, Brian Bockelman bbock...@cse.unl.edu wrote: Hey Pankil, Use ~/.ssh/config to set the default key location to the proper place for each host, if you're going down that route. I'd remind you that SSH is only used as a convenient method to launch daemons. If you have a preferred way to start things up on your cluster, you can use that (I think most large clusters don't use ssh... could be wrong). Brian On May 21, 2009, at 2:07 PM, Pankil Doshi wrote: Hello everyone, I got hint how to solve the problem where clusters have different usernames.but now other problem I face is that i can ssh a machine by using -i path/to key/ ..I cant ssh them directly but I will have to always pass the key. Now i face problem in ssh-ing my machines.Does anyone have any ideas how to deal with that?? Regards Pankil
Re: difference between bytes read and local bytes read?
I believe that bytes read is the total volume of data read into the mappers, whereas local bytes read refers to the bytes read by map tasks that were scheduled on machines already holding local copies of their input. The ratio of the two is thus a rough measure of the scheduler's locality efficiency. - Aaron On Tue, May 19, 2009 at 12:54 PM, Foss User foss...@gmail.com wrote: When we see the job details on the job tracker web interface, we see bytes read as well as local bytes read. What is the difference between the two?
Re: Setting up another machine as secondary node
See this regarding instructions on configuring a 2NN on a separate machine from the NN: http://www.cloudera.com/blog/2009/02/10/multi-host-secondarynamenode-configuration/ - Aaron On Thu, May 14, 2009 at 10:42 AM, Koji Noguchi knogu...@yahoo-inc.comwrote: Before 0.19, fsimage/edits were on the same directory. So whenever secondary finishes checkpointing, it copies back the fsimage while namenode still kept on writing to the edits file. Usually we observed some latency on the namenode side during that time. HADOOP-3948 would probably help after 0.19 or later. Koji -Original Message- From: Brian Bockelman [mailto:bbock...@cse.unl.edu] Sent: Thursday, May 14, 2009 10:32 AM To: core-user@hadoop.apache.org Subject: Re: Setting up another machine as secondary node Hey Koji, It's an expensive operation - for the secondary namenode, not the namenode itself, right? I don't particularly care if I stress out a dedicated node that doesn't have to respond to queries ;) Locally we checkpoint+backup fairly frequently (not 5 minutes ... maybe less than the default hour) due to sheer paranoia of losing metadata. Brian On May 14, 2009, at 12:25 PM, Koji Noguchi wrote: The secondary namenode takes a snapshot at 5 minute (configurable) intervals, This is a bit too aggressive. Checkpointing is still an expensive operation. I'd say every hour or even every day. Isn't the default 3600 seconds? Koji -Original Message- From: jason hadoop [mailto:jason.had...@gmail.com] Sent: Thursday, May 14, 2009 7:46 AM To: core-user@hadoop.apache.org Subject: Re: Setting up another machine as secondary node any machine put in the conf/masters file becomes a secondary namenode. At some point there was confusion on the safety of more than one machine, which I believe was settled, as many are safe. The secondary namenode takes a snapshot at 5 minute (configurable) intervals, rebuilds the fsimage and sends that back to the namenode. There is some performance advantage of having it on the local machine, and some safety advantage of having it on an alternate machine. Could someone who remembers speak up on the single vrs multiple secondary namenodes? On Thu, May 14, 2009 at 6:07 AM, David Ritch david.ri...@gmail.com wrote: First of all, the secondary namenode is not a what you might think a secondary is - it's not failover device. It does make a copy of the filesystem metadata periodically, and it integrates the edits into the image. It does *not* provide failover. Second, you specify its IP address in hadoop-site.xml. This is where you can override the defaults set in hadoop-default.xml. dbr On Thu, May 14, 2009 at 9:03 AM, Rakhi Khatwani rakhi.khatw...@gmail.com wrote: Hi, I wanna set up a cluster of 5 nodes in such a way that node1 - master node2 - secondary namenode node3 - slave node4 - slave node5 - slave How do we go about that? there is no property in hadoop-env where i can set the ip-address for secondary name node. if i set node-1 and node-2 in masters, and when we start dfs, in both the m/cs, the namenode n secondary namenode processes r present. but i think only node1 is active. n my namenode fail over operation fails. ny suggesstions? Regards, Rakhi -- Alpha Chapters of my book on Hadoop are available http://www.apress.com/book/view/9781430219422 www.prohadoopbook.com a community for Hadoop Professionals
Re: How to Rename & Create Mysql DB Table in Hadoop?
For your use case, you'll need to just do a ranged import (i.e., SELECT * FROM foo WHERE id X and id Y), and then delete the same records after the import succeeds (DELETE FROM foo WHERE id X and id Y). Before the import, you can SELECT max(id) FROM foo to establish what Y should be; X is informed by the previous import operation. You said that you're concerned with the performance of DELETE, but I don't know a better way around this if all your input sources are forced to write to the same table. Ideally you could have a current table and a frozen table; writes always go to the current table and the import is done from the frozen table. Then you can DROP TABLE frozen relatively quickly post-import. At the time of next import you change which table is current and which is frozen, and repeat. In MySQL you can create updateable views, so you might want to use a view as an indirection pointer to synchronously change all your writers from one underlying table to the other. I'll put a shameless plug here -- I'm developing a tool called sqoop designed to import from databases into HDFS; patch is available at http://issues.apache.org/jira/browse/hadoop-5815. It doesn't currently have support for WHERE clauses, but it's on the roadmap. Please check it out and let me know what you think. Cheers, - Aaron On Wed, May 20, 2009 at 9:48 AM, dealmaker vin...@gmail.com wrote: No, my prime objective is not to backup db. I am trying to move the records from mysql db to hadoop for processing. Hadoop itself doesn't keep any records. After that, I will remove the same mysql records processed in hadoop from the mysql db. The main point isn't about getting the mysql records, the main point is removing the same mysql records that are processed in hadoop from mysql db. Edward J. Yoon-2 wrote: Oh.. According to my understanding, To maintain a steady DB size, delete and backup the old records. If so, I guess you can continuously do that using WHERE and LIMIT clauses. Then you can reduce the I/O costs.. It should be dumped at once? On Thu, May 21, 2009 at 12:48 AM, dealmaker vin...@gmail.com wrote: Other parts of the non-hadoop system will continue to add records to mysql db when I move those records (and remove the very same records from mysql db at the same time) to hadoop for processing. That's why I am doing those mysql commands. What are you suggesting? If I do it like you suggest, dump all records from mysql db to a file in hdfs, how do I remove those very same records from the mysql db at the same time? Just rename it first and then dump them and then read them from the hdfs file? or should I do it my way? which way is faster? Thanks. Edward J. Yoon-2 wrote: Hadoop is a distributed filesystem. If you wanted to backup your table data to hdfs, you can use SELECT * INTO OUTFILE 'file_name' FROM tbl_name; Then, put it to hadoop dfs. Edward On Thu, May 21, 2009 at 12:08 AM, dealmaker vin...@gmail.com wrote: No, actually I am using mysql. So it doesn't belong to Hive, I think. owen.omalley wrote: On May 19, 2009, at 11:48 PM, dealmaker wrote: Hi, I want to backup a table and then create a new empty one with following commands in Hadoop. How do I do it in java? Thanks. Since this is a question about Hive, you should be asking on hive-u...@hadoop.apache.org . -- Owen -- View this message in context: http://www.nabble.com/How-to-Rename---Create-DB-Table-in-Hadoop--tp23629956p23637131.html Sent from the Hadoop core-user mailing list archive at Nabble.com. -- Best Regards, Edward J. Yoon @ NHN, corp. 
Re: Linking against Hive in Hadoop development tree
I've worked around needing any compile-time dependencies for now. :) No longer an issue. - Aaron On Wed, May 20, 2009 at 10:29 AM, Ashish Thusoo athu...@facebook.comwrote: You could either do what Owen suggested and put the plugin in hive contrib, or you could just put the whole thing in hive contrib as then you would have access to all the lower level api (core, hdfs, hive etc.). Owen's approach makes a lot of sense if you think that the hive dependency is a loose one and you would have plugins for other systems to achieve your goal. However, if this is a hard dependency, then putting it in hive contrib make more sense. Either approach is fine, depending upon your goals. Ashish -Original Message- From: Owen O'Malley [mailto:omal...@apache.org] Sent: Wednesday, May 20, 2009 5:39 AM To: core-user@hadoop.apache.org Subject: Re: Linking against Hive in Hadoop development tree On May 15, 2009, at 3:25 PM, Aaron Kimball wrote: Yikes. So part of sqoop would wind up in one source repository, and part in another? This makes my head hurt a bit. I'd say rather that Sqoop is in Mapred and the adapter to Hive is in Hive. I'm also not convinced how that helps. Clearly, what you need to arrange is to not have a compile time dependence on Hive. Clearly we don't want cycles in the dependence tree, so you need to figure out how to make the adapter for Hive a plugin rather than a part of the Sqoop core. -- Owen
Linking against Hive in Hadoop development tree
Hi all, For the database import tool I'm writing (Sqoop; HADOOP-5815), in addition to uploading data into HDFS and using MapReduce to load/transform the data, I'd like to integrate more closely with Hive. Specifically, to run the CREATE TABLE statements needed to automatically inject table defintions into Hive's metastore for the data files that sqoop loads into HDFS. Doing this requires linking against Hive in some way (either directly by using one of their API libraries, or loosely by piping commands into a Hive instance). In either case, there's a dependency there. I was hoping someone on this list with more Ivy experience than I knows what's the best way to make this happen. Hive isn't in the maven2 repository that Hadoop pulls most of its dependencies from. It might be necessary for sqoop to have access to a full build of Hive. It doesn't seem like a good idea to check that binary distribution into Hadoop svn, but I'm not sure what's the most expedient alternative. Is it acceptable to just require that developers who wish to compile/test/run sqoop have a separate standalone Hive deployment and a proper HIVE_HOME variable? This would keep our source repo clean. The downside here is that it makes it difficult to test Hive-specific integration functionality with Hudson and requires extra leg-work of developers. Thanks, - Aaron Kimball
Re: Linking against Hive in Hadoop development tree
Yikes. So part of sqoop would wind up in one source repository, and part in another? This makes my head hurt a bit. I'm also not convinced how that helps. So if I write (e.g.,) o.a.h.sqoop.HiveImporter and check that into a contrib module in the Hive project, then the main sqoop program (o.a.h.sqoop.Sqoop) still needs to compile against/load at runtime o.a.h.s.HiveImporter. So the net result is the same: building/running a cohesive program requires fetching resources from the hive repo and compiling them in. For the moment, though, I'm finding that the Hive JDBC interface is actually misbehaving more than I care to wrangle with. My current solution is actually to generate script files and run them with hive -f tmpfilename, which doesn't require any compile-time linkage. So maybe this is a nonissue for the moment. - Aaron On Fri, May 15, 2009 at 3:06 PM, Owen O'Malley omal...@apache.org wrote: On May 15, 2009, at 2:05 PM, Aaron Kimball wrote: In either case, there's a dependency there. You need to split it so that there are no cycles in the dependency tree. In the short term it looks like: avro: core: avro hdfs: core mapred: hdfs, core hive: mapred, core pig: mapred, core Adding a dependence from core to hive would be bad. To integrate with Hive, you need to add a contrib module to Hive that adds it. -- Owen
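For anyone curious, the "generate script files and run them with hive -f" approach mentioned above boils down to something like the sketch below; the CREATE TABLE text is obviously a placeholder, and the only coupling to Hive is that the hive binary must be on the PATH (a real version would also drain the child process's stdout/stderr):

  import java.io.File;
  import java.io.FileWriter;

  public class HiveScriptRunner {
    public static void main(String[] args) throws Exception {
      File script = File.createTempFile("table-def", ".q");
      FileWriter w = new FileWriter(script);
      // Placeholder statement; in practice this would be generated from table metadata.
      w.write("CREATE TABLE example (id INT, name STRING);\n");
      w.close();

      // Shell out to the Hive CLI; no compile-time dependency on Hive at all.
      Process p = Runtime.getRuntime().exec(
          new String[] { "hive", "-f", script.getAbsolutePath() });
      int rc = p.waitFor();
      System.out.println("hive -f exited with status " + rc);
    }
  }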
Re: unable to see anything in stdout
First thing I would do is to run the job in the local jobrunner (as a single process on your local machine without involving the cluster): JobConf conf = . // set other params, mapper, etc. here conf.set(mapred.job.tracker, local); // use localjobrunner conf.set(fs.default.name, file:///); // read from local hard disk instead of hdfs JobClient.runJob(conf); This will actually print stdout, stderr, etc. to your local terminal. Try this on a single input file. This will let you confirm that it does, in fact, write to stdout. - Aaron On Thu, Apr 30, 2009 at 9:00 AM, Asim linka...@gmail.com wrote: Hi, I am not able to see any job output in userlogs/task_id/stdout. It remains empty even though I have many println statements. Are there any steps to debug this problem? Regards, Asim
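Filled out a little -- assuming the usual org.apache.hadoop.mapred imports, and with MyJob and the local paths standing in for your own driver class and test data -- the local-runner setup looks roughly like this:

  JobConf conf = new JobConf(MyJob.class);
  conf.set("mapred.job.tracker", "local");     // run maps and reduces in-process (LocalJobRunner)
  conf.set("fs.default.name", "file:///");     // read and write the local disk, not HDFS
  FileInputFormat.setInputPaths(conf, new Path("/tmp/sample-input"));
  FileOutputFormat.setOutputPath(conf, new Path("/tmp/local-output"));
  // set mapper, reducer, and key/value classes as usual ...
  JobClient.runJob(conf);                      // println output now shows up in your terminal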
Re: Specifying System Properties in the had
So you want a different -Dfoo=test on each node? It's probably grabbing the setting from the node where the job was submitted, and this overrides the settings on each task node. Try adding finaltrue/final to the property block on the tasktrackers, then restart Hadoop and try again. This will prevent the job from overriding the setting. - Aaron On Thu, Apr 30, 2009 at 9:25 AM, Marc Limotte mlimo...@feeva.com wrote: I'm trying to set a System Property in the Hadoop config, so my jobs will know which cluster they are running on. I think I should be able to do this with -Dname=value in mapred.child.java.opts (example below), but the setting is ignored. In hadoop-site.xml I have: property namemapred.child.java.opts/name value-Xmx200m -Dfoo=test/value /property But the job conf through the web server indicates: mapred.child.java.opts -Xmx1024M -Duser.timezone=UTC I'm using Hadoop-0.17.2.1. Any tips on why my setting is not picked up? Marc PRIVATE AND CONFIDENTIAL - NOTICE TO RECIPIENT: THIS E-MAIL IS MEANT FOR ONLY THE INTENDED RECIPIENT OF THE TRANSMISSION, AND MAY BE A COMMUNICATION PRIVILEGE BY LAW. IF YOU RECEIVED THIS E-MAIL IN ERROR, ANY REVIEW, USE, DISSEMINATION, DISTRIBUTION, OR COPYING OF THIS EMAIL IS STRICTLY PROHIBITED. PLEASE NOTIFY US IMMEDIATELY OF THE ERROR BY RETURN E-MAIL AND PLEASE DELETE THIS MESSAGE FROM YOUR SYSTEM.
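On the reading side, the child task sees whatever -D flags mapred.child.java.opts ended up with as ordinary system properties, so a quick way to check which value won is to log it from configure(). A minimal sketch, using the example property name foo from this thread:

  import java.io.IOException;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.Mapper;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reporter;

  public class ClusterAwareMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, Text> {

    private String cluster;

    public void configure(JobConf job) {
      // Set via -Dfoo=... in mapred.child.java.opts on each tasktracker.
      cluster = System.getProperty("foo", "unset");
      System.err.println("foo = " + cluster);  // lands in the task's stderr log
    }

    public void map(LongWritable key, Text value,
        OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
      output.collect(new Text(cluster), value);
    }
  }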
Re: Database access in 0.18.3
Cloudera's Distribution for Hadoop is based on Hadoop 0.18.3 but includes a backport of HADOOP-2536. You could switch to this distribution instead. Otherwise, download the 18-branch patch from issue HADOOP-2536 ( http://issues.apache.org/jira/browse/hadoop-2536) and apply it to your local copy and recompile. System.out.println() (and System.err) are written to separate files for each map and reduce task on each machine; there is not a common console. - Aaron On Mon, Apr 27, 2009 at 9:13 PM, rajeev gupta graj1...@yahoo.com wrote: I need to write output of reduce in database. There is support for this in 0.19 but I am using 0.18.3. Any suggestion? I tried to process output myself in the reduce() function by writing some System.out.println; but it is writing output in userlogs corresponding to the map function (intermediate output). -rajeev
Re: Database access in 0.18.3
Should also say, the link to CDH is http://www.cloudera.com/hadoop - Aaron On Tue, Apr 28, 2009 at 5:06 PM, Aaron Kimball aa...@cloudera.com wrote: Cloudera's Distribution for Hadoop is based on Hadoop 0.18.3 but includes a backport of HADOOP-2536. You could switch to this distribution instead. Otherwise, download the 18-branch patch from issue HADOOP-2536 ( http://issues.apache.org/jira/browse/hadoop-2536) and apply it to your local copy and recompile. System.out.println() (and System.err) are writen to separate files for each map and reduce task on each machine; there is not a common console. - Aaron On Mon, Apr 27, 2009 at 9:13 PM, rajeev gupta graj1...@yahoo.com wrote: I need to write output of reduce in database. There is support for this in 0.19 but I am using 0.18.3. Any suggestion? I tried to process output myself in the reduce() function by writing some System.out.println; but it is writing output in userlogs corresponding to the map function (intermediate output). -rajeev
Re: Processing High CPU Memory intensive tasks on Hadoop - Architecture question
Amit, This can be made to work with Hadoop. Basically, in your mapper's configure stage it would do the heavy load-in process, then it would process your individual work items as records during the actual map stage. A map task can be comprised of many records, so you'll be fine here. If you use Hadoop 0.19 or 0.20, you can also enable JVM reuse, where multiple map tasks are performed serially in the same JVM instance. In this case, the first task in the JVM would do the heavy load-in process into static fields or other globally-accessible items; subsequent tasks could recognize that the system state is already initialized and would not need to repeat it. The number of mapper/reducer tasks that run in parallel on a given node can be configured with a simple setting; setting this to 6 will work just fine. The capacity / fairshare schedulers are not what you need here -- their main function is to ensure that multiple jobs (separate sets of tasks) can all make progress simultaneously by sharing cluster resources across jobs rather than running jobs in a FIFO fashion. - Aaron On Sat, Apr 25, 2009 at 2:36 PM, amit handa amha...@gmail.com wrote: Hi, We are planning to use hadoop for some very expensive and long running processing tasks. The computing nodes that we plan to use are very heavy in terms of CPU and memory requirement e.g one process instance takes almost 100% CPU (1 core) and around 300 -400 MB of RAM. The first time the process loads it can take around 1-1:30 minutes but after that we can provide the data to process and it takes few seconds to process. Can I model it on hadoop ? Can I have my processes pre-loaded on the task processing machines and the data be provided by hadoop? This will save the 1-1:30 minutes of intial load time that it would otherwise take for each task. I want to run a number of these processes in parallel based on the machines capacity (e.g 6 instances on a 8 cpu box) or using capacity scheduler. Please let me know if this is possible or any pointers to how it can be done ? Thanks, Amit
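A rough sketch of the load-once pattern described above; ExpensiveModel is a made-up placeholder for whatever library does your heavy lifting, and the null check in configure() is what lets JVM reuse pay off:

  import java.io.IOException;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.Mapper;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reporter;

  public class HeavyMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, Text> {

    // Static, so it survives across tasks when JVM reuse is enabled.
    // ExpensiveModel is a hypothetical stand-in for your own library.
    private static ExpensiveModel model;

    public void configure(JobConf job) {
      synchronized (HeavyMapper.class) {
        if (model == null) {
          model = ExpensiveModel.load();  // the 1-1.5 minute load, done once per JVM
        }
      }
    }

    public void map(LongWritable key, Text value,
        OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
      output.collect(new Text(model.process(value.toString())), value);
    }
  }

If memory serves, the reuse knob in 0.19 and later is mapred.job.reuse.jvm.num.tasks (-1 for unlimited reuse per job), but check the hadoop-default.xml shipped with your release to be sure.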
Re: Advice on restarting HDFS in a cron
If your logs were being written to the root partition (/dev/sda1), that's going to fill up fast. This partition is always = 10 GB on EC2 and much of that space is consumed by the OS install. You should redirect your logs to some place under /mnt (/dev/sdb1); that's 160 GB. - Aaron On Sun, Apr 26, 2009 at 3:21 AM, Rakhi Khatwani rakhi.khatw...@gmail.comwrote: Hi, I have faced somewhat a similar issue... i have a couple of map reduce jobs running on EC2... after a week or so, i get a no space on device exception while performing any linux command... so end up shuttin down hadoop and hbase, clear the logs and then restart them. is there a cleaner way to do it??? thanks Raakhi On Fri, Apr 24, 2009 at 11:59 PM, Todd Lipcon t...@cloudera.com wrote: On Fri, Apr 24, 2009 at 11:18 AM, Marc Limotte mlimo...@feeva.com wrote: Actually, I'm concerned about performance of map/reduce jobs for a long-running cluster. I.e. it seems to get slower the longer it's running. After a restart of HDFS, the jobs seems to run faster. Not concerned about the start-up time of HDFS. Hi Marc, Does it sound like this JIRA describes your problem? https://issues.apache.org/jira/browse/HADOOP-4766 If so, restarting just the JT should help with the symptoms. (I say symptoms because this is clearly a problem! Hadoop should be stable and performant for months without a cluster restart!) -Todd Of course, as you suggest, this could be poor configuration of the cluster on my part; but I'd still like to hear best practices around doing a scheduled restart. Marc -Original Message- From: Allen Wittenauer [mailto:a...@yahoo-inc.com] Sent: Friday, April 24, 2009 10:17 AM To: core-user@hadoop.apache.org Subject: Re: Advice on restarting HDFS in a cron On 4/24/09 9:31 AM, Marc Limotte mlimo...@feeva.com wrote: I've heard that HDFS starts to slow down after it's been running for a long time. And I believe I've experienced this. We did an upgrade (== complete restart) of a 2000 node instance in ~20 minutes on Wednesday. I wouldn't really consider that 'slow', but YMMV. I suspect people aren't running the secondary name node and therefore have massively large edits file. The name node appears slow on restart because it has to apply the edits to the fsimage rather than having the secondary keep it up to date. -Original Message- From: Marc Limotte Hi. I've heard that HDFS starts to slow down after it's been running for a long time. And I believe I've experienced this. So, I was thinking to set up a cron job to execute every week to shutdown HDFS and start it up again. In concept, it would be something like: 0 0 0 0 0 $HADOOP_HOME/bin/stop-dfs.sh; $HADOOP_HOME/bin/start-dfs.sh But I'm wondering if there is a safer way to do this. In particular: * What if a map/reduce job is running when this cron hits. Is there a way to suspend jobs while the HDFS restart happens? * Should I also restart the mapred daemons? * Should I wait some time after stop-dfs.sh for things to settle down, before executing start-dfs.sh? Or maybe I should run a command to verify that it is stopped before I run the start? Thanks for any help. Marc PRIVATE AND CONFIDENTIAL - NOTICE TO RECIPIENT: THIS E-MAIL IS MEANT FOR ONLY THE INTENDED RECIPIENT OF THE TRANSMISSION, AND MAY BE A COMMUNICATION PRIVILEGE BY LAW. IF YOU RECEIVED THIS E-MAIL IN ERROR, ANY REVIEW, USE, DISSEMINATION, DISTRIBUTION, OR COPYING OF THIS EMAIL IS STRICTLY PROHIBITED. PLEASE NOTIFY US IMMEDIATELY OF THE ERROR BY RETURN E-MAIL AND PLEASE DELETE THIS MESSAGE FROM YOUR SYSTEM.
Re: Processing High CPU Memory intensive tasks on Hadoop - Architecture question
I'm not aware of any documentation about this particular use case for Hadoop. I think your best bet is to look into the JNI documentation about loading native libraries, and go from there. - Aaron On Sat, Apr 25, 2009 at 10:44 PM, amit handa amha...@gmail.com wrote: Thanks Aaron, The processing libs that we use, which take time to load are all c++ based .so libs. Can i invoke it from JVM during the configure stage of the mapper and keep it running as you suggested ? Can you point me to some documentation regarding the same ? Regards, Amit On Sat, Apr 25, 2009 at 1:42 PM, Aaron Kimball aa...@cloudera.com wrote: Amit, This can be made to work with Hadoop. Basically, in your mapper's configure stage it would do the heavy load-in process, then it would process your individual work items as records during the actual map stage. A map task can be comprised of many records, so you'll be fine here. If you use Hadoop 0.19 or 0.20, you can also enable JVM reuse, where multiple map tasks are performed serially in the same JVM instance. In this case, the first task in the JVM would do the heavy load-in process into static fields or other globally-accessible items; subsequent tasks could recognize that the system state is already initialized and would not need to repeat it. The number of mapper/reducer tasks that run in parallel on a given node can be configured with a simple setting; setting this to 6 will work just fine. The capacity / fairshare schedulers are not what you need here -- their main function is to ensure that multiple jobs (separate sets of tasks) can all make progress simultaneously by sharing cluster resources across jobs rather than running jobs in a FIFO fashion. - Aaron On Sat, Apr 25, 2009 at 2:36 PM, amit handa amha...@gmail.com wrote: Hi, We are planning to use hadoop for some very expensive and long running processing tasks. The computing nodes that we plan to use are very heavy in terms of CPU and memory requirement e.g one process instance takes almost 100% CPU (1 core) and around 300 -400 MB of RAM. The first time the process loads it can take around 1-1:30 minutes but after that we can provide the data to process and it takes few seconds to process. Can I model it on hadoop ? Can I have my processes pre-loaded on the task processing machines and the data be provided by hadoop? This will save the 1-1:30 minutes of intial load time that it would otherwise take for each task. I want to run a number of these processes in parallel based on the machines capacity (e.g 6 instances on a 8 cpu box) or using capacity scheduler. Please let me know if this is possible or any pointers to how it can be done ? Thanks, Amit
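At its simplest, the JNI side is a static initializer in the mapper class; "engine" and the native method names below are placeholders, and the .so must be reachable through java.library.path / LD_LIBRARY_PATH on every task node (or shipped with the job and loaded by absolute path):

  public class NativeEngineMapper {
    static {
      // Loads libengine.so once per JVM, the first time this class is used.
      System.loadLibrary("engine");
    }

    // Hypothetical native entry points that the C++ library would implement.
    private native long createEngine();
    private native String process(long engineHandle, String record);
  }

In practice you would fold this into the mapper from the earlier sketch, with createEngine() called from configure() so the handle is built once and reused across records.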
Re: When will hadoop 0.19.2 be released?
In general, there is no way to do an automated downgrade of HDFS metadata :\ If you're up an HDFS version, I'm afraid you're stuck there. The only real way to downgrade requires that you have enough free space to distcp from one cluster to the other. If you have 100 TB of free space (!!) then that's easy, if time-consuming. If not, then you'll have to be a bit more clever. e.g., by downreplicating all the files first and storing them in downreplicated fashion on the destination hdfs instance until you're done, then upping the replication on the destination cluster after the source cluster has been drained. Waiting for 0.19.2 might be the better call here. - Aaron On Fri, Apr 24, 2009 at 3:12 PM, Zhou, Yunqing azure...@gmail.com wrote: But there are already 100TB data stored on DFS. Is there a safe solution to do such a downgrade? On Fri, Apr 24, 2009 at 2:08 PM, jason hadoop jason.had...@gmail.com wrote: You could try the cloudera release based on 18.3, with many backported features. http://www.cloudera.com/distribution On Thu, Apr 23, 2009 at 11:06 PM, Zhou, Yunqing azure...@gmail.com wrote: currently I'm managing a 64-nodes hadoop 0.19.1 cluster with 100TB data. and I found 0.19.1 is buggy and I have already applied some patches on hadoop jira to solve problems. But I'm looking forward to a more stable release of hadoop. Do you know when will 0.19.2 be released? Thanks. -- Alpha Chapters of my book on Hadoop are available http://www.apress.com/book/view/9781430219422
Re: which is better Text or Custom Class
In general, serializing to text and then parsing back into a different format will always be slower than using a purpose-built class that can serialize itself. The tradeoff, of course, is that going to text is often more convenient from a developer-time perspective. - Aaron On Mon, Apr 20, 2009 at 2:23 PM, chintan bhatt chin1...@hotmail.com wrote: Hi all, I want to ask you about the performance difference between using the Text class and using a custom Class which implements Writable interface. Lets say in InvertedIndex problem when I emit token and a list of document Ids which contains it , using Text we usually Concat the list of document ids with space as a separator d1 d2 d3 d4 etc..If I need the same values in a later step of map reduce, I need to split the value string to get the list of all document Ids. Is it not better to use Writable List instead?? I need to ask it because I am using too many Concats and Splits in my project to use documents total tokens count, token frequency in a particular document etc.. Thanks in advance, Chintan _ Windows Live Messenger. Multitasking at its finest. http://www.microsoft.com/india/windows/windowslive/messenger.aspx
Re: Are SequenceFiles split? If so, how?
Explicitly controlling your splits will be very challenging. Taking the case where you have expensive (X) and cheap (C) objects to process, you may have a file where the records are lined up X C X C X C X X X X X C C C. In this case, you'll need to scan through the whole file and build splits such that the lengthy run of expensive objects is broken up into separate splits, but the run of cheap objects is consolidated. I'm suspicious that you can do this without scanning through the data (which is what often constitutes the bulk of a time in a mapreduce program). But how much data are you using? I would imagine that if you're operating at the scale where Hadoop makes sense, then the high- and low-cost objects will -- on average -- balance out and tasks will be roughly evenly proportioned. In general, I would just dodge the problem by making sure your splits relatively small compared to the size of your input data. If you have 5 million objects to process, then make each split be roughly equal to say 20,000 of them. Then even if some splits take long to process and others take a short time, then one CPU may dispatch with a dozen cheap splits in the same time where one unlucky JVM had to process a single very expensive split. Now you haven't had to manually balance anything, and you still get to keep all your CPUs full. - Aaron On Mon, Apr 20, 2009 at 11:25 PM, Barnet Wagman b.wag...@comcast.netwrote: Thanks Aaron, that really helps. I probably do need to control the number of splits. My input 'data' consists of Java objects and their size (in bytes) doesn't necessarily reflect the amount of time needed for each map operation. I need to ensure that I have enough map tasks so that all cpus are utilized and the job gets done in a reasonable amount of time. (Currently I'm creating multiple input files and making them unsplitable, but subclassing SequenceFileInputFormat to explicitly control then number of splits sounds like a better approach). Barnet Aaron Kimball wrote: Yes, there can be more than one InputSplit per SequenceFile. The file will be split more-or-less along 64 MB boundaries. (the actual edges of the splits will be adjusted to hit the next block of key-value pairs, so it might be a few kilobytes off.) The SequenceFileInputFormat regards mapred.map.tasks (conf.setNumMapTasks()) as a hint, not a set-in-stone metric. (The number of reduce tasks, though, is always 100% user-controlled.) If you need exact control over the number of map tasks, you'll need to subclass it and modify this behavior. That having been said -- are you sure you actually need to precisely control this value? Or is it enough to know how many splits were created? - Aaron On Sun, Apr 19, 2009 at 7:23 PM, Barnet Wagman b.wag...@comcast.net wrote: Suppose a SequenceFile (containing keys and values that are BytesWritable) is used as input. Will it be divided into InputSplits? If so, what's the criteria use for splitting? I'm interested in this because I need to control the number of map tasks used, which (if I understand it correctly), is equal to the number of InputSplits. thanks, bw
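If you do want to nudge the split count from the driver rather than subclass SequenceFileInputFormat, the map-task hint is the usual lever; the numbers below (5 million records, roughly 20,000 per split) are just the ones from this thread, and numRecords is something you would know or estimate about your own data:

  // Assumes the usual org.apache.hadoop.mapred imports; MyJob is a placeholder driver class.
  long numRecords = 5000000L;
  int recordsPerSplit = 20000;
  JobConf conf = new JobConf(MyJob.class);
  conf.setNumMapTasks((int) (numRecords / recordsPerSplit));  // a hint, not a guarantee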
Re: Error reading task output
Cam, This isn't Hadoop-specific, it's how Linux treats its network configuration. If you look at /etc/host.conf, you'll probably see a line that says order hosts, bind -- this is telling Linux's DNS resolution library to first read your /etc/hosts file, then check an external DNS server. You could probably disable local hostfile checking, but that means that every time a program on your system queries the authoritative hostname for localhost, it'll go out to the network. You'll probably see a big performance hit. The better solution, I think, is to get your nodes' /etc/hosts files squared away. You only need to do so once :) -- Aaron On Thu, Apr 16, 2009 at 11:31 AM, Cam Macdonell c...@cs.ualberta.ca wrote: Cam Macdonell wrote: Hi, I'm getting the following warning when running the simple wordcount and grep examples. 09/04/15 16:54:16 INFO mapred.JobClient: Task Id : attempt_200904151649_0001_m_19_0, Status : FAILED Too many fetch-failures 09/04/15 16:54:16 WARN mapred.JobClient: Error reading task outputhttp://localhost.localdomain:50060/tasklog?plaintext=truetaskid=attempt_200904151649_0001_m_19_0filter=stdout 09/04/15 16:54:16 WARN mapred.JobClient: Error reading task outputhttp://localhost.localdomain:50060/tasklog?plaintext=truetaskid=attempt_200904151649_0001_m_19_0filter=stderr The only advice I could find from other posts with similar errors is to setup /etc/hosts with all slaves and the host IPs. I did this, but I still get the warning above. The output seems to come out alright however (I guess that's why it is a warning). I tried running a wget on the http:// address in the warning message and I get the following back 2009-04-15 16:53:46 ERROR 400: Argument taskid is required. So perhaps the wrong task ID is being passed to the http request. Any ideas on what can get rid of these warnings? Thanks, Cam Well, for future googlers, I'll answer my own post. Watch our for the hostname at the end of localhost lines on slaves. One of my slaves was registering itself as localhost.localdomain with the jobtracker. Is there a way that Hadoop could be made to not be so dependent on /etc/hosts, but on more dynamic hostname resolution? Cam
Re: More Replication on dfs
That setting will instruct future file writes to replicate two-fold. This has no bearing on existing files; replication can be set on a per-file basis, so they already have their replications set in the DFS indivdually. Use the command: bin/hadoop fs -setrep [-R] repl_factor filename... to change the replication factor for files already in HDFS - Aaron On Wed, Apr 15, 2009 at 10:04 PM, Puri, Aseem aseem.p...@honeywell.comwrote: Hi My problem is not that my data is under replicated. I have 3 data nodes. In my hadoop-site.xml I also set the configuration as: property namedfs.replication/name value2/value /property But after this also data is replicated on 3 nodes instead of two nodes. Now, please tell what can be the problem? Thanks Regards Aseem Puri -Original Message- From: Raghu Angadi [mailto:rang...@yahoo-inc.com] Sent: Wednesday, April 15, 2009 2:58 AM To: core-user@hadoop.apache.org Subject: Re: More Replication on dfs Aseem, Regd over-replication, it is mostly app related issue as Alex mentioned. But if you are concerned about under-replicated blocks in fsck output : These blocks should not stay under-replicated if you have enough nodes and enough space on them (check NameNode webui). Try grep-ing for one of the blocks in NameNode log (and datnode logs as well, since you have just 3 nodes). Raghu. Puri, Aseem wrote: Alex, Ouput of $ bin/hadoop fsck / command after running HBase data insert command in a table is: . . . . . /hbase/test/903188508/tags/info/4897652949308499876: Under replicated blk_-5193 695109439554521_3133. Target Replicas is 3 but found 1 replica(s). . /hbase/test/903188508/tags/mapfiles/4897652949308499876/data: Under replicated blk_-1213602857020415242_3132. Target Replicas is 3 but found 1 replica(s). . /hbase/test/903188508/tags/mapfiles/4897652949308499876/index: Under replicated blk_3934493034551838567_3132. Target Replicas is 3 but found 1 replica(s). . /user/HadoopAdmin/hbase table.doc: Under replicated blk_4339521803948458144_103 1. Target Replicas is 3 but found 2 replica(s). . /user/HadoopAdmin/input/bin.doc: Under replicated blk_-3661765932004150973_1030 . Target Replicas is 3 but found 2 replica(s). . /user/HadoopAdmin/input/file01.txt: Under replicated blk_2744169131466786624_10 01. Target Replicas is 3 but found 2 replica(s). . /user/HadoopAdmin/input/file02.txt: Under replicated blk_2021956984317789924_10 02. Target Replicas is 3 but found 2 replica(s). . /user/HadoopAdmin/input/test.txt: Under replicated blk_-3062256167060082648_100 4. Target Replicas is 3 but found 2 replica(s). ... /user/HadoopAdmin/output/part-0: Under replicated blk_8908973033976428484_1 010. Target Replicas is 3 but found 2 replica(s). Status: HEALTHY Total size:48510226 B Total dirs:492 Total files: 439 (Files currently being written: 2) Total blocks (validated): 401 (avg. block size 120973 B) (Total open file blocks (not validated): 2) Minimally replicated blocks: 401 (100.0 %) Over-replicated blocks:0 (0.0 %) Under-replicated blocks: 399 (99.50124 %) Mis-replicated blocks: 0 (0.0 %) Default replication factor:2 Average block replication: 1.3117207 Corrupt blocks:0 Missing replicas: 675 (128.327 %) Number of data-nodes: 2 Number of racks: 1 The filesystem under path '/' is HEALTHY Please tell what is wrong. Aseem -Original Message- From: Alex Loddengaard [mailto:a...@cloudera.com] Sent: Friday, April 10, 2009 11:04 PM To: core-user@hadoop.apache.org Subject: Re: More Replication on dfs Aseem, How are you verifying that blocks are not being replicated? Have you ran fsck? 
*bin/hadoop fsck /* I'd be surprised if replication really wasn't happening. Can you run fsck and pay attention to Under-replicated blocks and Mis-replicated blocks? In fact, can you just copy-paste the output of fsck? Alex On Thu, Apr 9, 2009 at 11:23 PM, Puri, Aseem aseem.p...@honeywell.comwrote: Hi I also tried the command $ bin/hadoop balancer. But still the same problem. Aseem -Original Message- From: Puri, Aseem [mailto:aseem.p...@honeywell.com] Sent: Friday, April 10, 2009 11:18 AM To: core-user@hadoop.apache.org Subject: RE: More Replication on dfs Hi Alex, Thanks for sharing your knowledge. Till now I have three machines and I have to check the behavior of Hadoop so I want replication factor should be 2. I started my Hadoop server with replication factor 3. After that I upload 3 files to implement word count program. But as my all files are stored on one machine and replicated to other datanodes also, so my map reduce program takes input from one Datanode
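The same per-file replication change suggested earlier in this thread (bin/hadoop fs -setrep) can also be made from code if that is more convenient; the path below is one of the files from the fsck output above, and the factor of 2 matches the dfs.replication setting being discussed:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class SetReplication {
    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(new Configuration());
      // Equivalent to: bin/hadoop fs -setrep 2 /user/HadoopAdmin/input/file01.txt
      boolean ok = fs.setReplication(
          new Path("/user/HadoopAdmin/input/file01.txt"), (short) 2);
      System.out.println("replication change scheduled: " + ok);
    }
  }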
Re: Using 3rd party Api in Map class
That certainly works, though if you plan to upgrade the underlying library, you'll find that copying files with the correct versions into $HADOOP_HOME/lib rapidly gets tedious, and subtle mistakes (e.g., forgetting one machine) can lead to frustration. When you consider the fact that you're using a Hadoop cluster to process and transfer around GBs of data on the low end, the difference between a 10 MB and a 20 MB job jar starts to look meaningless. Putting other jars in a lib/ directory inside your job jar keeps the version consistent and doesn't clutter up a shared directory on your cluster (assuming there are other users). - Aaron On Tue, Apr 14, 2009 at 11:15 AM, Farhan Husain russ...@gmail.com wrote: Hello, I got another solution for this. I just pasted all the required jar files in lib folder of each hadoop node. In this way the job jar is not too big and will require less time to distribute in the cluster. Thanks, Farhan On Mon, Apr 13, 2009 at 7:22 PM, Nick Cen cenyo...@gmail.com wrote: create a directroy call 'lib' in your project's root dir, then put all the 3rd party jar in it. 2009/4/14 Farhan Husain russ...@gmail.com Hello, I am trying to use Pellet library for some OWL inferencing in my mapper class. But I can't find a way to bundle the library jar files in my job jar file. I am exporting my project as a jar file from Eclipse IDE. Will it work if I create the jar manually and include all the jar files Pellet library has? Is there any simpler way to include 3rd party library jar files in a hadoop job jar? Without being able to include the library jars I am getting ClassNotFoundException. Thanks, Farhan -- http://daily.appspot.com/food/
Re: Map-Reduce Slow Down
/192.168.0.18:54310. Already tried 1 time(s). 2009-04-14 10:08:37,155 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 2 time(s). Hmmm I still cant figure it out.. Mithila On Tue, Apr 14, 2009 at 10:22 PM, Mithila Nagendra mnage...@asu.edu wrote: Also, Would the way the port is accessed change if all these node are connected through a gateway? I mean in the hadoop-site.xml file? The Ubuntu systems we worked with earlier didnt have a gateway. Mithila On Tue, Apr 14, 2009 at 9:48 PM, Mithila Nagendra mnage...@asu.edu wrote: Aaron: Which log file do I look into - there are alot of them. Here s what the error looks like: [mith...@node19:~]$ cd hadoop [mith...@node19:~/hadoop]$ bin/hadoop dfs -ls 09/04/14 10:09:29 INFO ipc.Client: Retrying connect to server: node18/ 192.168.0.18:54310. Already tried 0 time(s). 09/04/14 10:09:30 INFO ipc.Client: Retrying connect to server: node18/ 192.168.0.18:54310. Already tried 1 time(s). 09/04/14 10:09:31 INFO ipc.Client: Retrying connect to server: node18/ 192.168.0.18:54310. Already tried 2 time(s). 09/04/14 10:09:32 INFO ipc.Client: Retrying connect to server: node18/ 192.168.0.18:54310. Already tried 3 time(s). 09/04/14 10:09:33 INFO ipc.Client: Retrying connect to server: node18/ 192.168.0.18:54310. Already tried 4 time(s). 09/04/14 10:09:34 INFO ipc.Client: Retrying connect to server: node18/ 192.168.0.18:54310. Already tried 5 time(s). 09/04/14 10:09:35 INFO ipc.Client: Retrying connect to server: node18/ 192.168.0.18:54310. Already tried 6 time(s). 09/04/14 10:09:36 INFO ipc.Client: Retrying connect to server: node18/ 192.168.0.18:54310. Already tried 7 time(s). 09/04/14 10:09:37 INFO ipc.Client: Retrying connect to server: node18/ 192.168.0.18:54310. Already tried 8 time(s). 09/04/14 10:09:38 INFO ipc.Client: Retrying connect to server: node18/ 192.168.0.18:54310. Already tried 9 time(s). Bad connection to FS. command aborted. Node19 is a slave and Node18 is the master. Mithila On Tue, Apr 14, 2009 at 8:53 PM, Aaron Kimball aa...@cloudera.com wrote: Are there any error messages in the log files on those nodes? - Aaron On Tue, Apr 14, 2009 at 9:03 AM, Mithila Nagendra mnage...@asu.edu wrote: I ve drawn a blank here! Can't figure out what s wrong with the ports. I can ssh between the nodes but cant access the DFS from the slaves - says Bad connection to DFS. Master seems to be fine. Mithila On Tue, Apr 14, 2009 at 4:28 AM, Mithila Nagendra mnage...@asu.edu wrote: Yes I can.. On Mon, Apr 13, 2009 at 5:12 PM, Jim Twensky jim.twen...@gmail.com wrote: Can you ssh between the nodes? -jim On Mon, Apr 13, 2009 at 6:49 PM, Mithila Nagendra mnage...@asu.edu wrote: Thanks Aaron. Jim: The three clusters I setup had ubuntu running on them and the dfs was accessed at port 54310. The new cluster which I ve setup has Red Hat Linux release 7.2 (Enigma)running on it. Now when I try to access the dfs from one of the slaves i get the following response: dfs cannot be accessed. When I access the DFS throught the master there s no problem. So I feel there a problem with the port. Any ideas? I did check the list of slaves, it looks fine to me. Mithila On Mon, Apr 13, 2009 at 2:58 PM, Jim Twensky jim.twen...@gmail.com wrote: Mithila, You said all the slaves were being utilized in the 3 node cluster. Which application did you run to test that and what was your input size? 
If you tried the word count application on a 516 MB input file on both cluster setups, than some of your nodes in the 15 node cluster may not be running at all. Generally, one map job is assigned to each input split and if you are running your cluster with the defaults, the splits are 64 MB each. I got confused when you said the Namenode seemed to do all the work. Can you check conf/slaves and make sure you put the names of all task trackers there? I also suggest comparing both clusters with a larger input size, say at least 5 GB, to really see a difference. Jim On Mon, Apr 13, 2009 at 4:17 PM, Aaron Kimball aa...@cloudera.com wrote: in hadoop-*-examples.jar, use randomwriter to generate the data and sort to sort it. - Aaron On Sun, Apr 12, 2009 at 9:33 PM, Pankil Doshi forpan...@gmail.com wrote: Your data is too small I guess for 15 clusters ..So it might be overhead time of these clusters making your total MR jobs more time consuming. I guess you
Re: How to submit a project to Hadoop/Apache
Tarandeep, You might want to start by releasing your project as a contrib module for Hadoop. The overhead there is much easier -- just get it compiliing in the contrib/ directory, file a JIRA ticket on Hadoop Core, and attach your patch :) - Aaron On Wed, Apr 15, 2009 at 10:29 AM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: This is how things get into Apache Incubator: http://incubator.apache.org/ But the rules are, I believe, that you can skip the incubator and go straight under a project's wing (e.g. Hadoop) if the project PMC approves. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Tarandeep Singh tarand...@gmail.com To: core-user@hadoop.apache.org Sent: Wednesday, April 15, 2009 1:08:38 PM Subject: How to submit a project to Hadoop/Apache Hi, Can anyone point me to a documentation which explains how to submit a project to Hadoop as a subproject? Also, I will appreciate if someone points me to the documentation on how to submit a project as Apache project. We have a project that is built on Hadoop. It is released to the open source community under GPL license but we are thinking of submitting it as a Hadoop or Apache project. Any help on how to do this is appreciated. Thanks, Tarandeep
Re: Is combiner and map in same JVM?
They're in the same JVM, and I believe in the same thread. - Aaron On Tue, Apr 14, 2009 at 10:25 AM, Saptarshi Guha saptarshi.g...@gmail.comwrote: Hello, Suppose I have a Hadoop job and have set my combiner to the Reducer class. Does the map function and the combiner function run in the same JVM in different threads? or in different JVMs? I ask because I have to load a native library and if they are in the same JVM then the native library is loaded once and I have to take precautions. Thank you Saptarshi Guha
Re: Map-Reduce Slow Down
Are there any error messages in the log files on those nodes? - Aaron On Tue, Apr 14, 2009 at 9:03 AM, Mithila Nagendra mnage...@asu.edu wrote: I ve drawn a blank here! Can't figure out what s wrong with the ports. I can ssh between the nodes but cant access the DFS from the slaves - says Bad connection to DFS. Master seems to be fine. Mithila On Tue, Apr 14, 2009 at 4:28 AM, Mithila Nagendra mnage...@asu.edu wrote: Yes I can.. On Mon, Apr 13, 2009 at 5:12 PM, Jim Twensky jim.twen...@gmail.com wrote: Can you ssh between the nodes? -jim On Mon, Apr 13, 2009 at 6:49 PM, Mithila Nagendra mnage...@asu.edu wrote: Thanks Aaron. Jim: The three clusters I setup had ubuntu running on them and the dfs was accessed at port 54310. The new cluster which I ve setup has Red Hat Linux release 7.2 (Enigma)running on it. Now when I try to access the dfs from one of the slaves i get the following response: dfs cannot be accessed. When I access the DFS throught the master there s no problem. So I feel there a problem with the port. Any ideas? I did check the list of slaves, it looks fine to me. Mithila On Mon, Apr 13, 2009 at 2:58 PM, Jim Twensky jim.twen...@gmail.com wrote: Mithila, You said all the slaves were being utilized in the 3 node cluster. Which application did you run to test that and what was your input size? If you tried the word count application on a 516 MB input file on both cluster setups, than some of your nodes in the 15 node cluster may not be running at all. Generally, one map job is assigned to each input split and if you are running your cluster with the defaults, the splits are 64 MB each. I got confused when you said the Namenode seemed to do all the work. Can you check conf/slaves and make sure you put the names of all task trackers there? I also suggest comparing both clusters with a larger input size, say at least 5 GB, to really see a difference. Jim On Mon, Apr 13, 2009 at 4:17 PM, Aaron Kimball aa...@cloudera.com wrote: in hadoop-*-examples.jar, use randomwriter to generate the data and sort to sort it. - Aaron On Sun, Apr 12, 2009 at 9:33 PM, Pankil Doshi forpan...@gmail.com wrote: Your data is too small I guess for 15 clusters ..So it might be overhead time of these clusters making your total MR jobs more time consuming. I guess you will have to try with larger set of data.. Pankil On Sun, Apr 12, 2009 at 6:54 PM, Mithila Nagendra mnage...@asu.edu wrote: Aaron That could be the issue, my data is just 516MB - wouldn't this see a bit of speed up? Could you guide me to the example? I ll run my cluster on it and see what I get. Also for my program I had a java timer running to record the time taken to complete execution. Does Hadoop have an inbuilt timer? Mithila On Mon, Apr 13, 2009 at 1:13 AM, Aaron Kimball aa...@cloudera.com wrote: Virtually none of the examples that ship with Hadoop are designed to showcase its speed. Hadoop's speedup comes from its ability to process very large volumes of data (starting around, say, tens of GB per job, and going up in orders of magnitude from there). So if you are timing the pi calculator (or something like that), its results won't necessarily be very consistent. If a job doesn't have enough fragments of data to allocate one per each node, some of the nodes will also just go unused. The best example for you to run is to use randomwriter to fill up your cluster with several GB of random data and then run the sort program. If that doesn't scale up performance from 3 nodes to 15, then you've definitely got something strange going on. 
- Aaron On Sun, Apr 12, 2009 at 8:39 AM, Mithila Nagendra mnage...@asu.edu wrote: Hey all I recently set up a three-node Hadoop cluster and ran an example on it. It was pretty fast, and all three nodes were being used (I checked the log files to make sure that the slaves are utilized). Now I've set up another cluster consisting of 15 nodes. I ran the same example, but instead of speeding up, the map-reduce task seems to take forever! The slaves are not being used for some reason. This second cluster has lower per-node processing power, but should that make any difference? How can I ensure that the data is being mapped to all the nodes? Presently, the only node that seems to be doing all the work is the Master node. Do 15 nodes in a cluster increase the network cost? What can I do to set up the cluster to function more efficiently? Thanks! Mithila Nagendra Arizona State University
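For reference, the benchmark Aaron describes boils down to two commands against the examples jar; the output paths here are illustrative, not from the thread:
bin/hadoop jar hadoop-*-examples.jar randomwriter rand-data
bin/hadoop jar hadoop-*-examples.jar sort rand-data rand-data-sorted
randomwriter writes roughly 10 GB of random data per node by default (configurable), and the sort job then exercises every map and reduce slot in the cluster, which makes a 3-node versus 15-node comparison meaningful.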
Re: Map-Reduce Slow Down
in hadoop-*-examples.jar, use randomwriter to generate the data and sort to sort it. - Aaron On Sun, Apr 12, 2009 at 9:33 PM, Pankil Doshi forpan...@gmail.com wrote: Your data is too small I guess for 15 clusters ..So it might be overhead time of these clusters making your total MR jobs more time consuming. I guess you will have to try with larger set of data.. Pankil On Sun, Apr 12, 2009 at 6:54 PM, Mithila Nagendra mnage...@asu.edu wrote: Aaron That could be the issue, my data is just 516MB - wouldn't this see a bit of speed up? Could you guide me to the example? I ll run my cluster on it and see what I get. Also for my program I had a java timer running to record the time taken to complete execution. Does Hadoop have an inbuilt timer? Mithila On Mon, Apr 13, 2009 at 1:13 AM, Aaron Kimball aa...@cloudera.com wrote: Virtually none of the examples that ship with Hadoop are designed to showcase its speed. Hadoop's speedup comes from its ability to process very large volumes of data (starting around, say, tens of GB per job, and going up in orders of magnitude from there). So if you are timing the pi calculator (or something like that), its results won't necessarily be very consistent. If a job doesn't have enough fragments of data to allocate one per each node, some of the nodes will also just go unused. The best example for you to run is to use randomwriter to fill up your cluster with several GB of random data and then run the sort program. If that doesn't scale up performance from 3 nodes to 15, then you've definitely got something strange going on. - Aaron On Sun, Apr 12, 2009 at 8:39 AM, Mithila Nagendra mnage...@asu.edu wrote: Hey all I recently setup a three node hadoop cluster and ran an examples on it. It was pretty fast, and all the three nodes were being used (I checked the log files to make sure that the slaves are utilized). Now I ve setup another cluster consisting of 15 nodes. I ran the same example, but instead of speeding up, the map-reduce task seems to take forever! The slaves are not being used for some reason. This second cluster has a lower, per node processing power, but should that make any difference? How can I ensure that the data is being mapped to all the nodes? Presently, the only node that seems to be doing all the work is the Master node. Does 15 nodes in a cluster increase the network cost? What can I do to setup the cluster to function more efficiently? Thanks! Mithila Nagendra Arizona State University
Re: Changing block size of hadoop
Blocks already written to HDFS will remain at their current size; blocks are immutable objects. That procedure would set the size used for all subsequently-written blocks. I don't think you can change the block size while the cluster is running, because that would require the NameNode and DataNodes to re-read their configurations, which they only do at startup. - Aaron On Sun, Apr 12, 2009 at 6:08 AM, Rakhi Khatwani rakhi.khatw...@gmail.com wrote: Hi, I would like to know if it is feasible to change the block size of Hadoop while map-reduce jobs are executing. And if not, would the following work? 1. stop map-reduce 2. stop HBase 3. stop Hadoop 4. change hadoop-site.xml to reduce the block size 5. restart all Will the data in the HBase tables be safe and automatically split after changing the block size of Hadoop? Thanks, Raakhi
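A minimal sketch of the hadoop-site.xml change being discussed; the value is in bytes, and the 32 MB figure is only an illustration (the shipped default is 64 MB):
<property>
  <name>dfs.block.size</name>
  <value>33554432</value>
  <description>Block size used for files written after this change; existing blocks keep their size.</description>
</property>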
Re: Map-Reduce Slow Down
Virtually none of the examples that ship with Hadoop are designed to showcase its speed. Hadoop's speedup comes from its ability to process very large volumes of data (starting around, say, tens of GB per job, and going up in orders of magnitude from there). So if you are timing the pi calculator (or something like that), its results won't necessarily be very consistent. If a job doesn't have enough fragments of data to allocate one per each node, some of the nodes will also just go unused. The best example for you to run is to use randomwriter to fill up your cluster with several GB of random data and then run the sort program. If that doesn't scale up performance from 3 nodes to 15, then you've definitely got something strange going on. - Aaron On Sun, Apr 12, 2009 at 8:39 AM, Mithila Nagendra mnage...@asu.edu wrote: Hey all I recently setup a three node hadoop cluster and ran an examples on it. It was pretty fast, and all the three nodes were being used (I checked the log files to make sure that the slaves are utilized). Now I ve setup another cluster consisting of 15 nodes. I ran the same example, but instead of speeding up, the map-reduce task seems to take forever! The slaves are not being used for some reason. This second cluster has a lower, per node processing power, but should that make any difference? How can I ensure that the data is being mapped to all the nodes? Presently, the only node that seems to be doing all the work is the Master node. Does 15 nodes in a cluster increase the network cost? What can I do to setup the cluster to function more efficiently? Thanks! Mithila Nagendra Arizona State University
Re: Thin version of Hadoop jar for client
Not currently, sorry. - Aaron On Fri, Apr 10, 2009 at 8:35 AM, Stas Oskin stas.os...@gmail.com wrote: Hi. Is there any thin version of Hadoop jar, specifically for Java client? Regards.
Re: More Replication on dfs
Changing the default replication in hadoop-site.xml does not affect files already loaded into HDFS. File replication factor is controlled on a per-file basis. You need to use the command `hadoop fs -setrep n path...` to set the replication factor to n for a particular path already present in HDFS. It can also take a -R for recursive. - Aaron On Fri, Apr 10, 2009 at 10:34 AM, Alex Loddengaard a...@cloudera.comwrote: Aseem, How are you verifying that blocks are not being replicated? Have you ran fsck? *bin/hadoop fsck /* I'd be surprised if replication really wasn't happening. Can you run fsck and pay attention to Under-replicated blocks and Mis-replicated blocks? In fact, can you just copy-paste the output of fsck? Alex On Thu, Apr 9, 2009 at 11:23 PM, Puri, Aseem aseem.p...@honeywell.com wrote: Hi I also tried the command $ bin/hadoop balancer. But still the same problem. Aseem -Original Message- From: Puri, Aseem [mailto:aseem.p...@honeywell.com] Sent: Friday, April 10, 2009 11:18 AM To: core-user@hadoop.apache.org Subject: RE: More Replication on dfs Hi Alex, Thanks for sharing your knowledge. Till now I have three machines and I have to check the behavior of Hadoop so I want replication factor should be 2. I started my Hadoop server with replication factor 3. After that I upload 3 files to implement word count program. But as my all files are stored on one machine and replicated to other datanodes also, so my map reduce program takes input from one Datanode only. I want my files to be on different data node so to check functionality of map reduce properly. Also before starting my Hadoop server again with replication factor 2 I formatted all Datanodes and deleted all old data manually. Please suggest what I should do now. Regards, Aseem Puri -Original Message- From: Mithila Nagendra [mailto:mnage...@asu.edu] Sent: Friday, April 10, 2009 10:56 AM To: core-user@hadoop.apache.org Subject: Re: More Replication on dfs To add to the question, how does one decide what is the optimal replication factor for a cluster. For instance what would be the appropriate replication factor for a cluster consisting of 5 nodes. Mithila On Fri, Apr 10, 2009 at 8:20 AM, Alex Loddengaard a...@cloudera.com wrote: Did you load any files when replication was set to 3? If so, you'll have to rebalance: http://hadoop.apache.org/core/docs/r0.19.1/commands_manual.html#balance r http://hadoop.apache.org/core/docs/r0.19.1/hdfs_user_guide.html#Rebalanc er Note that most people run HDFS with a replication factor of 3. There have been cases when clusters running with a replication of 2 discovered new bugs, because replication is so often set to 3. That said, if you can do it, it's probably advisable to run with a replication factor of 3 instead of 2. Alex On Thu, Apr 9, 2009 at 9:56 PM, Puri, Aseem aseem.p...@honeywell.com wrote: Hi I am a new Hadoop user. I have a small cluster with 3 Datanodes. In hadoop-site.xml values of dfs.replication property is 2 but then also it is replicating data on 3 machines. Please tell why is it happening? Regards, Aseem Puri
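For reference, the commands mentioned above look like this; the path is an example, not taken from Aseem's cluster:
bin/hadoop fs -setrep -R 2 /user/aseem/data   # set replication to 2 for everything under the path
bin/hadoop fsck / -files -blocks              # per-file block report; the summary includes under- and mis-replicated counts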
Re: Multithreaded Reducer
Rather than implementing a multi-threaded reducer, why not simply increase the number of reducer tasks per machine via mapred.tasktracker.reduce.tasks.maximum, and increase the total number of reduce tasks per job via mapred.reduce.tasks to ensure that they're all filled. This will effectively utilize a higher number of cores. - Aaron On Fri, Apr 10, 2009 at 11:12 AM, Sagar Naik sn...@attributor.com wrote: Hi, I would like to implement a Multi-threaded reducer. As per my understanding , the system does not have one coz we expect the output to be sorted. However, in my case I dont need the output sorted. Can u pl point to me any other issues or it would be safe to do so -Sagar
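As a sketch of the per-job half of that advice (the number is illustrative and should roughly match the total reduce slots in the cluster):
conf.setNumReduceTasks(16);   // e.g. 8 nodes x 2 reduce slots each
The per-node slot count, mapred.tasktracker.reduce.tasks.maximum, is read by each TaskTracker from its own hadoop-site.xml at startup, so it cannot be set from the job.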
Re: Multithreaded Reducer
At that level of parallelism, you're right that the process overhead would be too high. - Aaron On Fri, Apr 10, 2009 at 11:36 AM, Sagar Naik sn...@attributor.com wrote: Two things - multi-threaded is preferred over multi-processes. The process I m planning is IO bound so I can really take advantage of multi-threads (100 threads) - Correct me if I m wrong. The next MR_JOB in the pipeline will have increased number of splits to process as the number of reducer-outputs (from prev job) have increased . This leads to increase in the map-task completion time. -Sagar Aaron Kimball wrote: Rather than implementing a multi-threaded reducer, why not simply increase the number of reducer tasks per machine via mapred.tasktracker.reduce.tasks.maximum, and increase the total number of reduce tasks per job via mapred.reduce.tasks to ensure that they're all filled. This will effectively utilize a higher number of cores. - Aaron On Fri, Apr 10, 2009 at 11:12 AM, Sagar Naik sn...@attributor.com wrote: Hi, I would like to implement a Multi-threaded reducer. As per my understanding , the system does not have one coz we expect the output to be sorted. However, in my case I dont need the output sorted. Can u pl point to me any other issues or it would be safe to do so -Sagar
Re: Getting free and used space
You can insert this property into the JobConf, or specify it on the command line, e.g.: -D hadoop.job.ugi=username,group,group,group. - Aaron On Wed, Apr 8, 2009 at 7:04 AM, Brian Bockelman bbock...@cse.unl.edu wrote: Hey Stas, What we do locally is apply the latest patch for this issue: https://issues.apache.org/jira/browse/HADOOP-4368 This makes getUsed (actually, it switches to FileSystem.getStatus) not a privileged action. As far as specifying the user ... gee, I can't think of it off the top of my head. It's a variable you can insert into the JobConf, but I'd have to poke around Google or the code to remember which one (I try to not override it if possible). Brian On Apr 8, 2009, at 8:51 AM, Stas Oskin wrote: Hi. Thanks for the explanation. Now for the easier part - how do I specify the user when connecting? :) Is it a config file level, or run-time level setting? Regards. 2009/4/8 Brian Bockelman bbock...@cse.unl.edu Hey Stas, Did you try this as a privileged user? There might be some permission errors... in most of the released versions, getUsed() is only available to the Hadoop superuser. It may be that the exception isn't propagating correctly. Brian On Apr 8, 2009, at 3:13 AM, Stas Oskin wrote: Hi. I'm trying to use the API to get the overall used and free space. I tried this function getUsed(), but it always returns 0. Any idea? Thanks.
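A hedged sketch of both forms; the user and group names are placeholders:
conf.set("hadoop.job.ugi", "someuser,somegroup");   // in the JobConf, before submitting the job
// or, for a ToolRunner-based driver, on the command line:
//   bin/hadoop jar myjob.jar MyDriver -D hadoop.job.ugi=someuser,somegroup ...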
Re: How many people is using Hadoop Streaming ?
Excellent. Thanks - A On Tue, Apr 7, 2009 at 2:16 PM, Owen O'Malley omal...@apache.org wrote: On Apr 7, 2009, at 11:41 AM, Aaron Kimball wrote: Owen, Is binary streaming actually readily available? https://issues.apache.org/jira/browse/HADOOP-1722
Re: Example of deploying jars through DistributedCache?
Ooh. The other DCache-based operations assume that you're dcaching files already resident in HDFS. I guess this assumes that the filenames are on the local filesystem. - Aaron On Wed, Apr 8, 2009 at 8:32 AM, Brian MacKay brian.mac...@medecision.comwrote: I use addArchiveToClassPath, and it works for me. DistributedCache.addArchiveToClassPath(new Path(path), conf); I was curious about this block of code. Why are you coping to tmp? FileSystem fs = FileSystem.get(conf); fs.copyFromLocalFile(new Path(aaronTest2.jar), new Path(tmp/aaronTest2.jar)); -Original Message- From: Tom White [mailto:t...@cloudera.com] Sent: Wednesday, April 08, 2009 9:36 AM To: core-user@hadoop.apache.org Subject: Re: Example of deploying jars through DistributedCache? Does it work if you use addArchiveToClassPath()? Also, it may be more convenient to use GenericOptionsParser's -libjars option. Tom On Mon, Mar 2, 2009 at 7:42 AM, Aaron Kimball aa...@cloudera.com wrote: Hi all, I'm stumped as to how to use the distributed cache's classpath feature. I have a library of Java classes I'd like to distribute to jobs and use in my mapper; I figured the DCache's addFileToClassPath() method was the correct means, given the example at http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/filecache/DistributedCache.html . I've boiled it down to the following non-working example: in TestDriver.java: private void runJob() throws IOException { JobConf conf = new JobConf(getConf(), TestDriver.class); // do standard job configuration. FileInputFormat.addInputPath(conf, new Path(input)); FileOutputFormat.setOutputPath(conf, new Path(output)); conf.setMapperClass(TestMapper.class); conf.setNumReduceTasks(0); // load aaronTest2.jar into the dcache; this contains the class ValueProvider FileSystem fs = FileSystem.get(conf); fs.copyFromLocalFile(new Path(aaronTest2.jar), new Path(tmp/aaronTest2.jar)); DistributedCache.addFileToClassPath(new Path(tmp/aaronTest2.jar), conf); // run the job. JobClient.runJob(conf); } and then in TestMapper: public void map(LongWritable key, Text value, OutputCollectorLongWritable, Text output, Reporter reporter) throws IOException { try { ValueProvider vp = (ValueProvider) Class.forName(ValueProvider).newInstance(); Text val = vp.getValue(); output.collect(new LongWritable(1), val); } catch (ClassNotFoundException e) { throw new IOException(not found: + e.toString()); // newInstance() throws to here. } catch (Exception e) { throw new IOException(Exception: + e.toString()); } } The class ValueProvider is to be loaded from aaronTest2.jar. I can verify that this code works if I put ValueProvider into the main jar I deploy. 
I can verify that aaronTest2.jar makes it into the ${mapred.local.dir}/taskTracker/archive/ But when run with ValueProvider in aaronTest2.jar, the job fails with: $ bin/hadoop jar aaronTest1.jar TestDriver 09/03/01 22:36:03 INFO mapred.FileInputFormat: Total input paths to process : 10 09/03/01 22:36:03 INFO mapred.FileInputFormat: Total input paths to process : 10 09/03/01 22:36:04 INFO mapred.JobClient: Running job: job_200903012210_0005 09/03/01 22:36:05 INFO mapred.JobClient: map 0% reduce 0% 09/03/01 22:36:14 INFO mapred.JobClient: Task Id : attempt_200903012210_0005_m_00_0, Status : FAILED java.io.IOException: not found: java.lang.ClassNotFoundException: ValueProvider at TestMapper.map(Unknown Source) at TestMapper.map(Unknown Source) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227) at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207) Do I need to do something else (maybe in Mapper.configure()?) to actually classload the jar? The documentation makes me believe it should already be in the classpath by doing only what I've done above. I'm on Hadoop 0.18.3. Thanks, - Aaron _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this message in error, please contact the sender and delete the material from any computer.
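Assuming the driver is run through ToolRunner/GenericOptionsParser, the -libjars route Tom mentions would look roughly like this; no manual copy into HDFS is needed, since the framework ships the extra jar and adds it to each task's classpath:
bin/hadoop jar aaronTest1.jar TestDriver -libjars aaronTest2.jar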
Re: connecting two clusters
Clusters don't really have identities beyond the addresses of the NameNodes and JobTrackers. In the example below, nn1 and nn2 are the hostnames of the namenodes of the source and destination clusters. The 8020 in each address assumes that they're on the default port. Hadoop provides no inter-task or inter-job synchronization primitives, on purpose (even within a cluster, the most you get in terms of synchronization is the ability to join on the status of a running job to determine that it's completed). The model is designed to be as identity-independent as possible to make it more resiliant to failure. If individual jobs/tasks could lock common resources, then the intermittent failure of tasks could easily cause deadlock. Using a file as a scoreboard or other communication mechanism between multiple jobs is not something explicitly designed for, and likely to end in frustration. Can you describe the goal you're trying to accomplish? It's likely that there's another, more MapReduce-y way of looking at the job and refactoring the code to make it work more cleanly with the intended programming model. - Aaron On Mon, Apr 6, 2009 at 10:08 PM, Mithila Nagendra mnage...@asu.edu wrote: Thanks! I was looking at the link sent by Philip. The copy is done with the following command: hadoop distcp hdfs://nn1:8020/foo/bar \ hdfs://nn2:8020/bar/foo I was wondering if nn1 and nn2 are the names of the clusters or the name of the masters on each cluster. I wanted map/reduce tasks running on each of the two clusters to communicate with each other. I dont know if hadoop provides for synchronization between two map/reduce tasks. The tasks run simultaneouly, and they need to access a common file - something like a map/reduce task at a higher level utilizing the data produced by the map/reduce at the lower level. Mithila On Tue, Apr 7, 2009 at 7:57 AM, Owen O'Malley omal...@apache.org wrote: On Apr 6, 2009, at 9:49 PM, Mithila Nagendra wrote: Hey all I'm trying to connect two separate Hadoop clusters. Is it possible to do so? I need data to be shuttled back and forth between the two clusters. Any suggestions? You should use hadoop distcp. It is a map/reduce program that copies data, typically from one cluster to another. If you have the hftp interface enabled, you can use that to copy between hdfs clusters that are different versions. hadoop distcp hftp://namenode1:1234/foo/bar hdfs://foo/bar -- Owen
Re: connecting two clusters
Hi Mithila, Unfortunately, Hadoop MapReduce jobs determine their inputs as soon as they begin; the inputs for the job are then fixed. So additional files that arrive in the input directory after processing has begun, etc, do not participate in the job. And HDFS does not currently support appends to files, so existing files cannot be updated. A typical way in which this sort of problem is handled is to do processing in incremental wavefronts; process A generates some data which goes in an incoming directory for process B; process B starts on a timer every so often and collects the new input files and works on them. After it's done, it moves those inputs which it processed into a done directory. In the mean time, new files may have arrived. After another time interval, another round of process B starts. The major limitation of this model is that it requires that your process work incrementally, or that you are emitting a small enough volume of data each time in process B that subsequent iterations can load into memory a summary table of results from previous iterations. Look into using the DistributedCache to disseminate such files. Also, why are you using two MapReduce clusters for this, as opposed to one? Is there a common HDFS cluster behind them? You'll probably get much better performance for the overall process if the output data from one job does not need to be transferred to another cluster before it is further processed. Does this model make sense? - Aaron On Tue, Apr 7, 2009 at 1:06 AM, Mithila Nagendra mnage...@asu.edu wrote: Aaron, We hope to achieve a level of pipelining between two clusters - similar to how pipelining is done in executing RDB queries. You can look at it as the producer-consumer problem, one cluster produces some data and the other cluster consumes it. The issue that has to be dealt with here is the data exchange between the clusters - synchronized interaction between the map-reduce jobs on the two clusters is what I m hoping to achieve. Mithila On Tue, Apr 7, 2009 at 10:10 AM, Aaron Kimball aa...@cloudera.com wrote: Clusters don't really have identities beyond the addresses of the NameNodes and JobTrackers. In the example below, nn1 and nn2 are the hostnames of the namenodes of the source and destination clusters. The 8020 in each address assumes that they're on the default port. Hadoop provides no inter-task or inter-job synchronization primitives, on purpose (even within a cluster, the most you get in terms of synchronization is the ability to join on the status of a running job to determine that it's completed). The model is designed to be as identity-independent as possible to make it more resiliant to failure. If individual jobs/tasks could lock common resources, then the intermittent failure of tasks could easily cause deadlock. Using a file as a scoreboard or other communication mechanism between multiple jobs is not something explicitly designed for, and likely to end in frustration. Can you describe the goal you're trying to accomplish? It's likely that there's another, more MapReduce-y way of looking at the job and refactoring the code to make it work more cleanly with the intended programming model. - Aaron On Mon, Apr 6, 2009 at 10:08 PM, Mithila Nagendra mnage...@asu.edu wrote: Thanks! I was looking at the link sent by Philip. The copy is done with the following command: hadoop distcp hdfs://nn1:8020/foo/bar \ hdfs://nn2:8020/bar/foo I was wondering if nn1 and nn2 are the names of the clusters or the name of the masters on each cluster. 
I wanted map/reduce tasks running on each of the two clusters to communicate with each other. I dont know if hadoop provides for synchronization between two map/reduce tasks. The tasks run simultaneouly, and they need to access a common file - something like a map/reduce task at a higher level utilizing the data produced by the map/reduce at the lower level. Mithila On Tue, Apr 7, 2009 at 7:57 AM, Owen O'Malley omal...@apache.org wrote: On Apr 6, 2009, at 9:49 PM, Mithila Nagendra wrote: Hey all I'm trying to connect two separate Hadoop clusters. Is it possible to do so? I need data to be shuttled back and forth between the two clusters. Any suggestions? You should use hadoop distcp. It is a map/reduce program that copies data, typically from one cluster to another. If you have the hftp interface enabled, you can use that to copy between hdfs clusters that are different versions. hadoop distcp hftp://namenode1:1234/foo/bar hdfs://foo/bar -- Owen
Re: Why namenode logs into itself as well as job tracker?
If you're using the start-dfs.sh script, it's ssh'ing into every machine in conf/masters to start a SecondaryNameNode instance. The NameNode itself is started locally. It then ssh's into every machine in conf/slaves to start a DataNode instance. Similarly, the JT is started locally by start-mapred.sh. But it sshs to every machine in conf/slaves to start a TaskTracker. - Aaron On Sat, Apr 4, 2009 at 6:36 AM, Foss User foss...@gmail.com wrote: I have a namenode and job tracker on two different machines. I see that a namenode tries to do an ssh log into itself (name node), job tracker as well as all slave machines. However, the job tracker tries to do an ssh log into the slave machines only. Why this difference in behavior? Could someone please explain what is going on behind the scenes?
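For reference, the two files being described are just lists of hostnames, one per line; these names are placeholders:
# conf/masters -- start-dfs.sh starts a SecondaryNameNode on each host listed here
master.example.com
# conf/slaves -- start-dfs.sh starts a DataNode (and start-mapred.sh a TaskTracker) on each host listed here
slave1.example.com
slave2.example.com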
Re: Question on distribution of classes and jobs
On Fri, Apr 3, 2009 at 11:39 PM, Foss User foss...@gmail.com wrote: If I have written a WordCount.java job in this manner: conf.setMapperClass(Map.class); conf.setCombinerClass(Combine.class); conf.setReducerClass(Reduce.class); So, you can see that three classes are being used here. I have packaged these classes into a jar file called wc.jar and I run it like this: $ bin/hadoop jar wc.jar WordCountJob 1) I want to know when the job runs in a 5 machine cluster, is the whole JAR file distributed across the 5 machines or the individual class files are distributed individually? The whole jar. 2) Also, let us say the number of reducers are 2 while the number of mappers are 5. What happens in this case? How are the class files or jar files distributed? It's uploaded into HDFS; specifically into a subdirectory of wherever you configured mapred.system.dir. 3) Are they distributed via RPC or HTTP? The client uses the HDFS protocol to inject its jar file into HDFS. Then all the TaskTrackers retrieve it with the same protocol
Re: Running MapReduce without setJar
All the nodes in your Hadoop cluster need access to the class files for your MapReduce job. The current mechanism that Hadoop has to distribute classes and attach them to the classpath assumes they're in a JAR together. Thus, merely specifying the names of mapper/reducer classes with setMapperClass(), etc, isn't enough -- you need to actually deliver a jar containing those classes to all your nodes. Since the mapper and reducer classes are separate classes, you'd need to bundle those .class files together somehow. JAR is the standard way to do this, so that's what Hadoop supports. If you're running in fully-local mode (e.g., with jobConf.set(mapred.job.tracker, local)), then no jar is needed since it's all running inside the original process space. - Aaron On Thu, Apr 2, 2009 at 1:13 PM, Farhan Husain russ...@gmail.com wrote: I did all of them i.e. I used setMapClass, setReduceClass and new JobConf(MapReduceWork.class) but still it cannot run the job without a jar file. I understand the reason that it looks for those classes inside a jar but I think there should be some better way to find those classes without using a jar. But I am not sure whether it is possible at all. On Thu, Apr 2, 2009 at 2:56 PM, Rasit OZDAS rasitoz...@gmail.com wrote: You can point to them by using conf.setMapClass(..) and conf.setReduceClass(..) - or something similar, I don't have the source nearby. But something weird has happened to my code. It runs locally when I start it as java process (tries to find input path locally). I'm now using trunk, maybe something has changed with new version. With version 0.19 it was fine. Can somebody point out a clue? Rasit -- Mohammad Farhan Husain Research Assistant Department of Computer Science Erik Jonsson School of Engineering and Computer Science University of Texas at Dallas
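A minimal sketch of the fully-local configuration mentioned above, reusing the class name from this thread; the property values are the standard local-mode settings:
JobConf conf = new JobConf(MapReduceWork.class);
conf.set("mapred.job.tracker", "local");   // run map and reduce in-process; no job jar is shipped
conf.set("fs.default.name", "file:///");   // optionally read and write the local filesystem too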
Re: Handling Non Homogenous tasks via Hadoop
Amit, The mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum properties can be controlled on a per-host basis in their hadoop-site.xml files. With this you can configure nodes with more/fewer cores/RAM/etc to take on varying amounts of work. There's no current mechanism to provide feedback to the task scheduler, though, based on actual machine utilization in real time. - Aaron On Tue, Apr 7, 2009 at 7:54 AM, amit handa amha...@gmail.com wrote: Hi, Is there a way I can control number of tasks that can be spawned on a machine based on the machine capacity and how loaded the machine already is ? My use case is as following: I have to perform task 1,task2,task3 ...task n . These tasks have varied CPU and memory usage patterns. All tasks of type task 1,task3 can take 80-90%CPU and 800 MB of RAM. All type of tasks task2 take only 1-2% of CPU and 5-10 MB of RAM How do i model this using Hadoop ? Can i use only one cluster for running all these type of tasks ? Shall I use different hadoop clusters for each tasktype , if yes, then how do i share data between these tasks (the data can be few MB to few GB) Please suggest or point to any docs which i can dig up. Thanks, Amit
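A sketch of the per-host hadoop-site.xml entries being described; the slot counts are illustrative and would differ between the bigger and smaller machines:
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
</property>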
Re: Too many fetch errors
Xiaolin, Are you certain that the two nodes can fetch mapper outputs from one another? If it's taking that long to complete, it might be the case that what makes it complete is just that eventually it abandons one of your two nodes and runs everything on a single node where it succeeds -- defeating the point, of course. Might there be a firewall between the two nodes that blocks the port used by the reducer to fetch the mapper outputs? (I think this is on 50060 by default.) - Aaron On Tue, Apr 7, 2009 at 8:08 AM, xiaolin guo xiao...@hulu.com wrote: This simple map-recude application will take nearly 1 hour to finish running on the two-node cluster ,due to lots of Failed/Killed task attempts, while in the single node cluster this application only takes 1 minite ... I am quite confusing why there are so many Failed/Killed attempts .. On Tue, Apr 7, 2009 at 10:40 PM, xiaolin guo xiao...@hulu.com wrote: I am trying to setup a small hadoop cluster , everything was ok before I moved from single node cluster to two-node cluster. I followed the article http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29 http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29to config master and slaves.However, when I tried to run the example wordcount map-reduce application , the reduce task got stuck in 19% for a log time . Then I got a notice:INFO mapred.JobClient: TaskId : attempt_200904072219_0001_m_02_0, Status : FAILED too many fetch errors and an error message : Error reading task outputslave. All map tasks in both task nodes had been finished which could be verified in task tracker pages. Both nodes work well in single node mode . And the Hadoop file system seems to be healthy in multi-node mode. Can anyone help me with this issue? Have already got entangled in this issue for a long time ... Thanks very much!
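A quick way to test that hypothesis from each node; the hostname is a placeholder and 50060 is the default TaskTracker HTTP port:
telnet other-slave 50060   # a refused or hanging connection points to a firewall problem between the nodes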
Re: How many people is using Hadoop Streaming ?
Owen, Is binary streaming actually readily available? Looking at http://issues.apache.org/jira/browse/HADOOP-3227, it appears uncommitted. - Aaron On Fri, Apr 3, 2009 at 8:37 PM, Tim Wintle tim.win...@teamrubber.com wrote: On Fri, 2009-04-03 at 09:42 -0700, Ricky Ho wrote: 1) I can pick the language that offers a different programming paradigm (e.g. I may choose a functional language, or logic programming if they suit the problem better). In fact, I can even choose Erlang for the map() and Prolog for the reduce(). Mixing and matching gives me more room to optimize. Agreed (as someone who has written mappers/reducers in Python, Perl, shell script and Scheme before).
Re: skip setting output path for a sequential MR job..
You must remove the existing output directory before running the job. This check is put in to prevent you from inadvertently destroying or muddling your existing output data. You can remove the output directory in advance programmatically with code similar to: FileSystem fs = FileSystem.get(conf); // use your JobConf here fs.delete(new Path("/path/to/output/dir"), true); See http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/fs/FileSystem.html for more details. - Aaron On Mon, Mar 30, 2009 at 9:25 PM, some speed speed.s...@gmail.com wrote: Hello everyone, Is it necessary to redirect the output of reduce to a file? When I am trying to run the same M-R job more than once, it throws an error that the output file already exists. I don't want to use command line args, so I hard-coded the file name into the program. So, is there a way I could delete a file on HDFS programmatically? Or can I skip setting an output file path and just have my output print to the console? Or can I just append to an existing file? Any help is appreciated. Thanks. -Sharath
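A slightly fuller sketch of the same idea, guarding the delete with an existence check; the path is a placeholder:
Path out = new Path("/path/to/output/dir");
FileSystem fs = FileSystem.get(conf);   // conf is your JobConf
if (fs.exists(out)) {
  fs.delete(out, true);                 // true = recursive
}
FileOutputFormat.setOutputPath(conf, out);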
Re: a doubt regarding an appropriate file system
The short version is that the in-memory structures used by the NameNode are heavy on a per-file basis, and light on a per-block basis. So petabytes of files that are only a few hundred KB will require the NameNode to have a huge amount of memory to hold the filesystem data structures. More than you want to pay to fit in a single server. There's more details about this issue in an older post on our blog: http://www.cloudera.com/blog/2009/02/02/the-small-files-problem/ - Aaron On Mon, Mar 30, 2009 at 12:46 AM, deepya m_dee...@yahoo.co.in wrote: Hi, Thanks. Can you please specify in detail what kind of problems I will face if I use Hadoop for this project. SreeDeepya TimRobertson100 wrote: I believe Hadoop is not best suited to many small files like yours but is really geared to handling very large files that get split into many smaller files (like 128M chunks) and HDFS is designed with this in mind. Therefore I could *imagine* that there are other distributed file systems that would far outperform HDFS if they were designed to replicate and track small files without any *split* and *merging* which Hadoop provides. Having not used MogileFS I cant really advise well but a quick read through does look like it might be a candidate for you to consider - it looks like it distributes across machines and tracks replicas like HDFS without the splitting, and offers access through http to the individual files which I could imagine would be ideal for pulling back small images. Please don't just follow my advise though - I am still a relative newbie to DFS's in general. Cheers Tim On Sun, Mar 29, 2009 at 12:51 PM, deepya m_dee...@yahoo.co.in wrote: Hi, I am doing a project scalable storage server to store images.Can Hadoop efficiently support this purpose???Our image size will be around 250 to 300 KB each.But we have many such images.Like the total storage may run upto petabytes( in future) .At present it is in gigabytes. We want to access these images via apache server.I mean,is there any mechanism that we can directly talk to hdfs via apache server??? I went through one of the posts here and got to know that rather than using FUSE it is better to use HDFS API.That is fine.But they also mentioned that mozilefs will be more appropriate. Can some one please clarify why mozilefs is more appropriate.Cant hadoop be used???How is mozile more advantageous.Can you suggest which filesystem would be more appropriate for the project I am doing at present. Thanks in advance SreeDeepya -- View this message in context: http://www.nabble.com/a-doubt-regarding-an-appropriate-file-system-tp22766331p22766331.html Sent from the Hadoop core-user mailing list archive at Nabble.com. -- View this message in context: http://www.nabble.com/a-doubt-regarding-an-appropriate-file-system-tp22766331p22777879.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: corrupt unreplicated block in dfs (0.18.3)
Just because a block is corrupt doesn't mean the entire file is corrupt. Furthermore, the presence/absence of a file in the namespace is a completely separate issue to the data in the file. I think it would be a surprising interface change if files suddenly disappeared just because 1 out of potentially many blocks were corrupt. - Aaron On Thu, Mar 26, 2009 at 1:21 PM, Mike Andrews m...@xoba.com wrote: i noticed that when a file with no replication (i.e., replication=1) develops a corrupt block, hadoop takes no action aside from the datanode throwing an exception to the client trying to read the file. i manually corrupted a block in order to observe this. obviously, with replication=1 its impossible to fix the block, but i thought perhaps hadoop would take some other action, such as deleting the file outright, or moving it to a corrupt directory, or marking it or keeping track of it somehow to note that there's un-fixable corruption in the filesystem? thus, the current behaviour seems to sweep the corruption under the rug and allows its continued existence, aside from notifying the specific client doing the read with an exception. if anyone has any information about this issue or how to work around it, please let me know. on the other hand, i tested that corrupting a block in a replication=3 file causes hadoop to re-replicate the block from another existing copy, which is good and is i what i expected. best, mike -- permanent contact information at http://mikerandrews.com
Re: comma delimited files
Sure. Put all your comma-delimited data into the output key as a Text object, and set the output value to the empty string. It'll dump the output key, as text, to the reducer output files. - Aaron On Thu, Mar 26, 2009 at 4:14 PM, nga pham nga.p...@gmail.com wrote: Hi all, Can Hadoop export into comma delimited files? Thank you, Nga P.
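A minimal old-API sketch of that idea, assuming a word-count-style reduce; the field layout is only an example:
public void reduce(Text key, Iterator<IntWritable> values,
    OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
  int sum = 0;
  while (values.hasNext()) {
    sum += values.next().get();
  }
  // Pack the whole comma-delimited record into the key and emit an empty value.
  output.collect(new Text(key.toString() + "," + sum), new Text(""));
}
Depending on the version, the default TextOutputFormat may still append its tab separator before the empty value; emitting NullWritable as the value avoids that.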
Re: Using HDFS to serve www requests
In general, Hadoop is unsuitable for the application you're suggesting. Systems like Fuse HDFS do exist, though they're not widely used. I don't know of anyone trying to connect Hadoop with Apache httpd. When you say that you have huge images, how big is huge? It might be useful if these images are 1 GB or larger. But in general, huge on Hadoop means 10s of GBs up to TBs. If you have a large number of moderately-sized files, you'll find that HDFS responds very poorly for your needs. It sounds like glusterfs is designed more for your needs. - Aaron On Thu, Mar 26, 2009 at 4:06 PM, phil cryer p...@cryer.us wrote: This is somewhat of a noob question I know, but after learning about Hadoop, testing it in a small cluster and running Map Reduce jobs on it, I'm still not sure if Hadoop is the right distributed file system to serve web requests. In other words, can, or is it right to, serve Images and data from HDFS using something like FUSE to mount a filesystem where Apache could serve images from it? We have huge images, thus the need for a distributed file system, and they go in, get stored with lots of metadata, and are redundant with Hadoop/HDFS - but is it the right way to serve web content? I looked at glusterfs before, they had an Apache and Lighttpd module which made it simple, does HDFS have something like this, do people just use a FUSE option as I described, or is this not a good use of Hadoop? Thanks P
Re: Sorting data numerically
Simplest possible solution: zero-pad your keys to ten places? - Aaron On Sat, Mar 21, 2009 at 11:40 PM, Akira Kitada akit...@gmail.com wrote: Hi, By default Hadoop does ASCII sort the mapper's output, not numeric sort. However, I often want the framework to sort records in numeric order. Can I make the framework to do numeric sort? (I use Hadoop Streaming) Thanks, Akira
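A tiny illustration of the padding; with Streaming you would do the same formatting in your mapper script before emitting the key:
String paddedKey = String.format("%010d", 42);   // "0000000042" -- sorts correctly as text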
Re: Reduce task going away for 10 seconds at a time
If you jstack the process in the middle of one of these pauses, can you see where it's sticking? - Aaron On Fri, Mar 13, 2009 at 6:51 AM, Doug Cook nab...@candiru.com wrote: Hi folks, I've been debugging a severe performance problems with a Hadoop-based application (a highly modified version of Nutch). I've recently upgraded to Hadoop 0.19.1 from a much, much older version, and a reduce that used to work just fine is now running orders of magnitude more slowly. From the logs I can see that progress of my reduce stops for periods that average almost exactly 10 seconds (with a very narrow distribution around 10 seconds), and it does so in various places in my code, but more or less in proportion to how much time I'd expect the task would normally spend in that particular place in the code, i.e. the behavior seems like my code is randomly being interrupted for 10 seconds at a time. I'm planning to keep digging, but thought that these symptoms might sound familiar to someone on this list. Ring any bells? Your help much appreciated. Thanks! Doug Cook -- View this message in context: http://www.nabble.com/Reduce-task-going-away-for-10-seconds-at-a-time-tp22496810p22496810.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
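For reference, one way to do that from a shell on the affected node; the pid is whatever jps reports for the stuck child task:
jps            # find the pid of the task's child JVM
jstack <pid>   # dump its thread stacks; run it a few times during a pause and compare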
Re: Running 0.19.2 branch in production before release
Right, there's no sense in freezing your Hadoop version forever :) But if you're an ops team tasked with keeping a production cluster running 24/7, running on 0.19 (or even more daringly, TRUNK) is not something that I would consider a Best Practice. Ideally you'll be able to carve out some spare capacity (maybe 3--5 nodes) to use as a staging cluster that runs on 0.19 or TRUNK that you can use to evaluate the next version. Then when you are convinced that it's stable, and your staging cluster passes your internal tests (e.g., running test versions of your critical nightly jobs successfully), you can move that to production. - Aaron On Thu, Mar 5, 2009 at 2:33 AM, Steve Loughran ste...@apache.org wrote: Aaron Kimball wrote: I recommend 0.18.3 for production use and avoid the 19 branch entirely. If your priority is stability, then stay a full minor version behind, not just a revision. Of course, if everyone stays that far behind, they don't get to find the bugs for other people. * If you play with the latest releases early, while they are in the beta phase -you will encounter the problems specific to your applications/datacentres, and get them fixed fast. * If you work with stuff further back you get stability, but not only are you behind on features, you can't be sure that all fixes that matter to you get pushed back. * If you plan on making changes, of adding features, get onto SVN_HEAD * If you want to catch changes being made that break your site, SVN_HEAD. Better yet, have a private Hudson server checking out SVN_HEAD hadoop *then* building and testing your app against it. Normally I work with stable releases of things I dont depend on, and SVN_HEAD of OSS stuff whose code I have any intent to change; there is a price -merge time, the odd change breaking your code- but you get to make changes that help you long term. Where Hadoop is different is that it is a filesystem, and you don't want to hit bugs that delete files that matter. I'm only bringing up transient clusters on VMs, pulling in data from elsewhere, so this isn't an issue. All that remains is changing APIs. -Steve
Re: Repartitioned Joins
Richa, Since the mappers run independently, you'd have a hard time determining whether a record in mapper A would be joined by a record in mapper B. The solution, as it were, would be to do this in two separate MapReduce passes: * Take an educated guess at which table is the smaller data set. * Run a MapReduce over this dataset, building up a bloom filter for the record ids. Set entries in the filter to 1 for each record id you see; leave the rest as 0. * The bloom filter now has 1 meaning maybe joinable and 0 meaning definitely not joinable. * Run a second MapReduce job over both datasets. Use the distributed cache to send the filter to all mappers. Mappers emit all records where filter[hash(record_id)] == 1. - Aaron On Wed, Mar 4, 2009 at 11:18 AM, Richa Khandelwal richa...@gmail.com wrote: Hi All, Does anyone know of tweaking in map-reduce joins that will optimize it further in terms of the moving only those tuples to reduce phase that join in the two tables? There are replicated joins and semi-join strategies but they are more of databases than map-reduce. Thanks, Richa Khandelwal University Of California, Santa Cruz. Ph:425-241-7763
Re: Throw an exception if the configure method fails
Try throwing RuntimeException, or any other unchecked exception (e.g., any descendant class of RuntimeException). - Aaron On Thu, Mar 5, 2009 at 4:24 PM, Saptarshi Guha saptarshi.g...@gmail.com wrote: Hello, I'm not that comfortable with Java, so here is my question. In the MapReduceBase class, I have implemented the configure method, which does not throw an exception. Suppose I detect an error in some options; I wish to raise an exception (in the configure method) - is there a way to do that? Is there a way to stop the job in case the configure method fails? Saptarshi Guha [1] My map extends MapReduceBase and my reduce extends MapReduceBase - two separate classes. Thank you
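A minimal sketch of that suggestion; the option name is made up for illustration:
public void configure(JobConf job) {
  String tval = job.get("tval");
  if (tval == null) {
    // Unchecked, so it compiles even though configure() declares no exceptions;
    // the task fails and the message shows up in the task logs.
    throw new IllegalArgumentException("required option 'tval' was not set");
  }
}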
Re: The cpu preemption between MPI and Hadoop programs on Same Cluster
Song, you should be able to use 'nice' to reprioritize the MPI task below that of your Hadoop jobs. - Aaron On Thu, Mar 5, 2009 at 8:26 PM, 柳松 lamfeel...@126.com wrote: Dear all: I run my Hadoop program with another MPI program on the same cluster. Here is the result of top:
PID   USER    PR NI VIRT  RES SHR  S %CPU %MEM TIME+     COMMAND
11750 qianglv 25  0 233m  99m 6100 R 99.7  2.5 116:05.59 rosetta.mpich
18094 cip     17  0 3136m 68m 15m  S  0.5  1.7   0:12.69 java
18244 cip     17  0 3142m 80m 15m  S  0.2  2.0   0:17.61 java
18367 cip     18  0 2169m 88m 15m  S  0.1  2.3   0:17.46 java
18012 cip     18  0 3141m 77m 15m  S  0.1  2.0   0:14.49 java
18584 cip     21  0 m     46m 15m  S  0.1  1.2   0:05.12 java
My Hadoop program can get no more than 1 percent of the CPU time in total, compared with the rosetta.mpich program's 99.7%. I'm sure my program is making progress, since the log files tell me the tasks are running normally. Someone told me it's the nature of Java programs (low CPU priority), especially compared with C programs. Is that true? Regards, Song Liu in Suzhou University.
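For reference, the kind of command Aaron is suggesting, using the pid from the top output above:
renice +10 -p 11750   # lower the MPI process's priority so the Hadoop child JVMs get scheduled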
Example of deploying jars through DistributedCache?
Hi all, I'm stumped as to how to use the distributed cache's classpath feature. I have a library of Java classes I'd like to distribute to jobs and use in my mapper; I figured the DCache's addFileToClassPath() method was the correct means, given the example at http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/filecache/DistributedCache.html. I've boiled it down to the following non-working example: in TestDriver.java: private void runJob() throws IOException { JobConf conf = new JobConf(getConf(), TestDriver.class); // do standard job configuration. FileInputFormat.addInputPath(conf, new Path(input)); FileOutputFormat.setOutputPath(conf, new Path(output)); conf.setMapperClass(TestMapper.class); conf.setNumReduceTasks(0); // load aaronTest2.jar into the dcache; this contains the class ValueProvider FileSystem fs = FileSystem.get(conf); fs.copyFromLocalFile(new Path(aaronTest2.jar), new Path(tmp/aaronTest2.jar)); DistributedCache.addFileToClassPath(new Path(tmp/aaronTest2.jar), conf); // run the job. JobClient.runJob(conf); } and then in TestMapper: public void map(LongWritable key, Text value, OutputCollectorLongWritable, Text output, Reporter reporter) throws IOException { try { ValueProvider vp = (ValueProvider) Class.forName(ValueProvider).newInstance(); Text val = vp.getValue(); output.collect(new LongWritable(1), val); } catch (ClassNotFoundException e) { throw new IOException(not found: + e.toString()); // newInstance() throws to here. } catch (Exception e) { throw new IOException(Exception: + e.toString()); } } The class ValueProvider is to be loaded from aaronTest2.jar. I can verify that this code works if I put ValueProvider into the main jar I deploy. I can verify that aaronTest2.jar makes it into the ${mapred.local.dir}/taskTracker/archive/ But when run with ValueProvider in aaronTest2.jar, the job fails with: $ bin/hadoop jar aaronTest1.jar TestDriver 09/03/01 22:36:03 INFO mapred.FileInputFormat: Total input paths to process : 10 09/03/01 22:36:03 INFO mapred.FileInputFormat: Total input paths to process : 10 09/03/01 22:36:04 INFO mapred.JobClient: Running job: job_200903012210_0005 09/03/01 22:36:05 INFO mapred.JobClient: map 0% reduce 0% 09/03/01 22:36:14 INFO mapred.JobClient: Task Id : attempt_200903012210_0005_m_00_0, Status : FAILED java.io.IOException: not found: java.lang.ClassNotFoundException: ValueProvider at TestMapper.map(Unknown Source) at TestMapper.map(Unknown Source) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227) at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207) Do I need to do something else (maybe in Mapper.configure()?) to actually classload the jar? The documentation makes me believe it should already be in the classpath by doing only what I've done above. I'm on Hadoop 0.18.3. Thanks, - Aaron
Re: RecordReader and non thread safe JNI libraries
It's situation (2). Each map task gets its own JVM instance; this has its own RecordReader and its own Mapper implementation. There's basically a loop in each task jvm that says: while (recordReader.hasNext()) { recordReader.getNext(k, v); myMapper.map(k, v, output, reporter); } If your mapper and the RR use the same library and tread on one another's state, you're going to have undefined results. - Aaron On Sun, Mar 1, 2009 at 8:33 PM, Saptarshi Guha saptarshi.g...@gmail.comwrote: Hello, I am quite confused and my email seems to prove it. My question is essentially, I need to use this non thread safe library in the Mapper, Reducer and RecordReader. assume, i do not create threads. Will I run into any thread safety issues? In a given JVM, the maps will run sequentially, so will the reduces, but will maps run alongside recorder reader? Hope this is clearer. Regards Saptarshi Guha On Sun, Mar 1, 2009 at 11:07 PM, Saptarshi Guha saptarshi.g...@gmail.com wrote: Hello, My RecordReader subclass reads from object X. To parse this object and emit records, i need the use of a C library and a JNI wrapper. public boolean next(LongWritable key, BytesWritable value) throws IOException { if (leftover == 0) return false; long wi = pos + split.getStart(); key.set(wi); value.readFields(X.at( wi); pos ++; leftover --; return true; } X.at uses the JNI lib to read a record number wi My question is who running this? 1) For a given job, is one instance of this running on each tasktracker? reading records and feeding to the mappers on its machine? Or, 2) as I have mapred.tasktracker.map.tasks.maximum == 7, does each jvm launched have one RecordReader running feeding records to the maps its jvm is running. If it's either (1) or (2), I guess I'm safe from threading issues. Please correct me if i'm totally wrong. Regards Saptarshi Guha
Re: GenericOptionsParser warning
You should put this stub code in your program as the means to start your MapReduce job: public class Foo extends Configured implements Tool { public int run(String [] args) throws IOException { JobConf conf = new JobConf(getConf(), Foo.class); // run the job here. return 0; } public static void main(String [] args) throws Exception { int ret = ToolRunner.run(new Foo(), args); // calls your run() method. System.exit(ret); } } On Wed, Feb 18, 2009 at 7:09 AM, Rasit OZDAS rasitoz...@gmail.com wrote: Hi, There is a JIRA issue about this problem, if I understand it correctly: https://issues.apache.org/jira/browse/HADOOP-3743 Strange, that I searched all source code, but there exists only this control in 2 places: if (!(job.getBoolean(mapred.used.genericoptionsparser, false))) { LOG.warn(Use GenericOptionsParser for parsing the arguments. + Applications should implement Tool for the same.); } Just an if block for logging, no extra controls. Am I missing something? If your class implements Tool, than there shouldn't be a warning. Cheers, Rasit 2009/2/18 Steve Loughran ste...@apache.org Sandhya E wrote: Hi All I prepare my JobConf object in a java class, by calling various set apis in JobConf object. When I submit the jobconf object using JobClient.runJob(conf), I'm seeing the warning: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same. From hadoop sources it looks like setting mapred.used.genericoptionsparser will prevent this warning. But if I set this flag to true, will it have some other side effects. Thanks Sandhya Seen this message too -and it annoys me; not tracked it down -- M. Raşit ÖZDAŞ
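Once the driver goes through ToolRunner as above, the generic options work on the command line; the jar name and arguments here are placeholders:
bin/hadoop jar foo.jar Foo -D mapred.reduce.tasks=10 input output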
Re: Finding small subset in very large dataset
I don't see why a HAR archive needs to be involved. You can use a MapFile to create a scannable index over a SequenceFile and do lookups that way. But if A is small enough to fit in RAM, then there is a much simpler way: Write it out to a file and disseminate to all mappers via the DistributedCache. They then reach read in the entire A set into a HashSet or other data structure during configure(), before they scan through their slices of B. They then emit only the B values which hit in A. This is called a map-side join. If you don't care about sorted ordering of your results, you can then disable the reducers entirely. Hive already supports this behavior; but you have to explicitly tell it to enable map-side joins for each query because only you know that one data set is small enough ahead of time. If your A set doesn't fit in RAM, you'll need to get more creative. One possibility is to do the same thing as above, but instead of reading all of A into memory, use a hash function to squash the keys from A into some bounded amount of RAM. For example, allocate yourself a 256 MB bitvector; for each key in A, set bitvector[hash(A_key) % len(bitvector)] = 1. Then for each B key in the mapper, if bitvector[hash(B_key) % len(bitvector)] == 1, then it may match an A key; if it's 0 then it definitely does not match an A key. For each potential match, send it to the reducer. Send all the A keys to the reducer as well, where the precise joining will occur. (Note: this is effectively the same thing as a Bloom Filter.) This will send much less data to each reducer and should see better throughput. - Aaron On Wed, Feb 11, 2009 at 4:07 PM, Amit Chandel amitchan...@gmail.com wrote: Are the keys in collection B unique? If so, I would like to try this approach: For each key, value of collection B, make a file out of it with file name given by MD5 hash of the key, and value being its content, and then store all these files into a HAR archive. The HAR archive will create an index for you over the keys. Now you can iterate over the collection A, get the MD5 hash of the key, and look up in the archive for the file (to get the value). On Wed, Feb 11, 2009 at 4:39 PM, Thibaut_ tbr...@blue.lu wrote: Hi, Let's say the smaller subset has name A. It is a relatively small collection 100 000 entries (could also be only 100), with nearly no payload as value. Collection B is a big collection with 10 000 000 entries (Each key of A also exists in the collection B), where the value for each key is relatively big ( 100 KB) For all the keys in A, I need to get the corresponding value from B and collect it in the output. - I can do this by reading in both files, and on the reduce step, do my computations and collect only those which are both in A and B. The map phase however will take very long as all the key/value pairs of collection B need to be sorted (and each key's value is 100 KB) at the end of the map phase, which is overkill if A is very small. What I would need is an option to somehow make the intersection first (Mapper only on keys, then a reduce functino based only on keys and not the corresponding values which collects the keys I want to take), and then running the map input and filtering the output collector or the input based on the results from the reduce phase. Or is there another faster way? Collection A could be so big that it doesn't fit into the memory. I could split collection A up into multiple smaller collections, but that would make it more complicated, so I want to evade that route. 
(This is similar to the approach I described above, just a manual approach) Thanks, Thibaut -- View this message in context: http://www.nabble.com/Finding-small-subset-in-very-large-dataset-tp21964853p21964853.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
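To make the map-side-join suggestion concrete, here is a minimal sketch against the 0.18-era API. The cached file name, the one-key-per-line format, and the Text/Text input types (e.g. from KeyValueTextInputFormat) are assumptions, not details from the thread; the driver is assumed to have called DistributedCache.addCacheFile() with the file holding A's keys.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class SmallSetJoinMapper extends MapReduceBase
    implements Mapper<Text, Text, Text, Text> {

  private final Set<String> keysOfA = new HashSet<String>();

  public void configure(JobConf conf) {
    try {
      // The driver cached a small file containing one key of A per line.
      Path[] cached = DistributedCache.getLocalCacheFiles(conf);
      BufferedReader in = new BufferedReader(new FileReader(cached[0].toString()));
      String line;
      while ((line = in.readLine()) != null) {
        keysOfA.add(line.trim());
      }
      in.close();
    } catch (IOException ioe) {
      throw new RuntimeException("Could not load the small set A", ioe);
    }
  }

  public void map(Text key, Text value, OutputCollector<Text, Text> output,
      Reporter reporter) throws IOException {
    // Emit only the B records whose key also appears in A.
    if (keysOfA.contains(key.toString())) {
      output.collect(key, value);
    }
  }
}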
Re: HDD benchmark/checking tool
Dmitry, Look into cluster/system monitoring tools: nagios and ganglia are two to start with. - Aaron On Tue, Feb 3, 2009 at 9:53 AM, Dmitry Pushkarev u...@stanford.edu wrote: Dear hadoop users, Recently I have had a number of drive failures that slowed down processes a lot until they were discovered. It is there any easy way or tool, to check HDD performance and see if there any IO errors? Currently I wrote a simple script that looks at /var/log/messages and greps everything abnormal for /dev/sdaX. But if you have better solution I'd appreciate if you share it. --- Dmitry Pushkarev +1-650-644-8988
Re: Hadoop's reduce tasks are freezes at 0%.
Alternatively, maybe all the TaskTracker nodes can contact the NameNode and JobTracker, but cannot communicate with one another due to firewall issues? - Aaron On Mon, Feb 2, 2009 at 7:54 PM, jason hadoop jason.had...@gmail.com wrote: A reduce stall at 0% implies that the map tasks are not outputting any records via the output collector. You need to look at the tasktracker and task logs on all of your slave machines to see if anything odd appears in the logs. On the tasktracker web interface detail screen for your job: Are all of the map tasks finished? Are any of the map tasks started? Are there any TaskTracker nodes to service your job? On Sun, Feb 1, 2009 at 11:41 PM, Kwang-Min Choi kmbest.c...@samsung.com wrote: I'm a newbie to Hadoop and I'm trying to follow the Hadoop Quick Guide from the Hadoop homepage, but there are some problems... Downloading and unzipping Hadoop is done, and ssh successfully operates without a passphrase. When I execute the grep example attached to Hadoop, the map task is OK (it reaches 100%), but the reduce task freezes at 0% without any error message. I've waited for more than 1 hour, but it is still frozen... The same job in standalone mode completes fine... I tried versions 0.18.3 and 0.17.2.1; both had the same problem. Could you help me solve this problem? Additionally, I'm working on cloud infrastructure from GoGrid (Redhat), so disk space health is OK, and I've installed JDK 1.6.11 for Linux successfully. - KKwams
Re: extra documentation on how to write your own partitioner class
Er? It seems to be using value.get(). That having been said, you should really partition based on the key, not on the value. (I am not sure why, exactly, the value is provided to the getPartition() method.) Moreover, I think the problem is that you are using division ( / ), not modulus ( % ). Your code simplifies to: (value.get() / T) / (T / numPartitions) = value.get() * numPartitions / T^2. The contract of getPartition() is that it returns a value in [0, numPartitions). The division operators are not guaranteed to return anything in this range, but (foo % numPartitions) will always do the right thing. So it's probably just assigning everything to reduce partition 0. (Alternatively, it could be that value * numPartitions < T^2 for any values of T you're testing with, which means that integer division will return 0.) - Aaron On Fri, Jan 30, 2009 at 3:43 PM, Sandy snickerdoodl...@gmail.com wrote: Hi James, Thank you very much! :-) -SM On Fri, Jan 30, 2009 at 4:17 PM, james warren ja...@rockyou.com wrote: Hello Sandy - Your partitioner isn't using any information from the key/value pair - it's only using the value T which is read once from the job configuration. getPartition() will always return the same value, so all of your data is being sent to one reducer. :P cheers, -James On Fri, Jan 30, 2009 at 1:32 PM, Sandy snickerdoodl...@gmail.com wrote: Hello, Could someone point me toward some more documentation on how to write one's own partitioner class? I am having quite a bit of trouble getting mine to work. So far, it looks something like this: public class myPartitioner extends MapReduceBase implements Partitioner<IntWritable, IntWritable> { private int T; public void configure(JobConf job) { super.configure(job); String myT = job.get("tval"); // this is user-defined T = Integer.parseInt(myT); } public int getPartition(IntWritable key, IntWritable value, int numReduceTasks) { int newT = (T / numReduceTasks); int id = (value.get() / T); return (int)(id / newT); } } In the run() function of my M/R program I just set it using: conf.setPartitionerClass(myPartitioner.class); Is there anything else I need to set in the run() function? The code compiles fine. When I run it, I know it is using the partitioner, since I get different output than if I just let it use HashPartitioner. However, it is not splitting between the reducers at all! If I set the number of reducers to 2, all the output shows up in part-0, while part-1 has nothing. I am having trouble debugging this since I don't know how I can observe the value of numReduceTasks (which I assume is being set by the system). Is this a proper assumption? If I try to insert any println() statements in the function, nothing is output to either my terminal or my log files. Could someone give me some general advice on how best to debug pieces of code like this?
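For reference, a minimal sketch of a partitioner that respects the getPartition() contract discussed above (the class name is made up, and it buckets on the key rather than the value; the & Integer.MAX_VALUE masking is the same trick Hadoop's HashPartitioner uses to avoid negative results):

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.Partitioner;

    public class ModuloPartitioner implements Partitioner<IntWritable, IntWritable> {

      public void configure(JobConf job) {
        // No per-job state is needed for this sketch.
      }

      public int getPartition(IntWritable key, IntWritable value, int numPartitions) {
        // Mask off the sign bit, then take the modulus so the result is
        // always in the range [0, numPartitions).
        return (key.get() & Integer.MAX_VALUE) % numPartitions;
      }
    }

On the debugging question: println() output from code running inside a task (including a partitioner) usually ends up in the per-attempt files under logs/userlogs on the tasktrackers (viewable from the web UI), not on the terminal you launched the job from, so checking there is typically the easiest way to see what values getPartition() is actually returning.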
Re: HADOOP-2536 supports Oracle too?
HADOOP-2536 works through the standard JDBC interface. Any database that exposes a JDBC driver will work. - Aaron On Tue, Feb 3, 2009 at 6:17 PM, Amandeep Khurana ama...@gmail.com wrote: Does the patch HADOOP-2536 support connecting to Oracle databases as well? Or is it just limited to MySQL? Amandeep Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz
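As a rough sketch of what that looks like with the DBInputFormat/DBConfiguration classes contributed in HADOOP-2536 (the connection URL, table, column names, and record class below are all made up, and the Oracle JDBC driver jar has to be made available to the tasks):

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.db.DBConfiguration;
    import org.apache.hadoop.mapred.lib.db.DBInputFormat;
    import org.apache.hadoop.mapred.lib.db.DBWritable;

    public class OracleImportSketch {

      // One row of a hypothetical "employees" table.
      public static class EmployeeRecord implements Writable, DBWritable {
        long id;
        String name;

        public void readFields(ResultSet rs) throws SQLException {
          id = rs.getLong(1);
          name = rs.getString(2);
        }
        public void write(PreparedStatement ps) throws SQLException {
          ps.setLong(1, id);
          ps.setString(2, name);
        }
        public void readFields(DataInput in) throws IOException {
          id = in.readLong();
          name = Text.readString(in);
        }
        public void write(DataOutput out) throws IOException {
          out.writeLong(id);
          Text.writeString(out, name);
        }
      }

      public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(OracleImportSketch.class);
        conf.setInputFormat(DBInputFormat.class);

        // Swap in any JDBC driver class and URL; these Oracle values are illustrative.
        DBConfiguration.configureDB(conf, "oracle.jdbc.driver.OracleDriver",
            "jdbc:oracle:thin:@dbhost:1521:ORCL", "dbuser", "dbpassword");

        // Read the id and name columns of "employees", ordered by id.
        DBInputFormat.setInput(conf, EmployeeRecord.class,
            "employees", null, "id", "id", "name");

        // Mapper, output format, and output path would be set as usual before:
        JobClient.runJob(conf);
      }
    }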
Re: sudden instability in 0.18.2
Hi David, If your tasks are failing only on the new nodes, it's likely that you're missing a library or something on those machines. See this Hadoop tutorial http://public.yahoo.com/gogate/hadoop-tutorial/html/module5.html about distributing debug scripts. These will allow you to capture stdout/stderr and the syslog from tasks that fail. - Aaron On Wed, Jan 28, 2009 at 9:40 AM, Sagar Naik sn...@attributor.com wrote: Please check which nodes have these failures. I guess the new tasktrackers/machines are not configured correctly. As a result, a map task will die there and the remaining map tasks will get sucked onto these machines. -Sagar David J. O'Dell wrote: We've been running 0.18.2 for over a month on an 8-node cluster. Last week we added 4 more nodes to the cluster and have experienced 2 failures of the tasktrackers since then. The namenodes are running fine, but all submitted jobs die with this error on the tasktrackers: 2009-01-28 08:07:55,556 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction: attempt_200901280756_0012_m_74_2 2009-01-28 08:07:55,682 WARN org.apache.hadoop.mapred.TaskRunner: attempt_200901280756_0012_m_74_2 Child Error java.io.IOException: Task process exit with nonzero status of 1. at org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:462) at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:403) I tried running the tasktrackers in debug mode but the entries above are all that show up in the logs. As of now my cluster is down.
Re: sudden instability in 0.18.2
Wow. How many subdirectories were there? How many jobs do you run a day? - Aaron On Wed, Jan 28, 2009 at 12:13 PM, David J. O'Dell dod...@videoegg.com wrote: It was failing on all the nodes, both new and old. The problem was that there were too many subdirectories under $HADOOP_HOME/logs/userlogs. The fix was just to delete the subdirs and change mapred.userlog.retain.hours from 24 hours (the default) to 2 hours. It would have been nice if there was an error message that pointed to this. Aaron Kimball wrote: Hi David, If your tasks are failing only on the new nodes, it's likely that you're missing a library or something on those machines. See this Hadoop tutorial http://public.yahoo.com/gogate/hadoop-tutorial/html/module5.html about distributing debug scripts. These will allow you to capture stdout/stderr and the syslog from tasks that fail. - Aaron On Wed, Jan 28, 2009 at 9:40 AM, Sagar Naik sn...@attributor.com wrote: Please check which nodes have these failures. I guess the new tasktrackers/machines are not configured correctly. As a result, a map task will die there and the remaining map tasks will get sucked onto these machines. -Sagar David J. O'Dell wrote: We've been running 0.18.2 for over a month on an 8-node cluster. Last week we added 4 more nodes to the cluster and have experienced 2 failures of the tasktrackers since then. The namenodes are running fine, but all submitted jobs die with this error on the tasktrackers: 2009-01-28 08:07:55,556 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction: attempt_200901280756_0012_m_74_2 2009-01-28 08:07:55,682 WARN org.apache.hadoop.mapred.TaskRunner: attempt_200901280756_0012_m_74_2 Child Error java.io.IOException: Task process exit with nonzero status of 1. at org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:462) at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:403) I tried running the tasktrackers in debug mode but the entries above are all that show up in the logs. As of now my cluster is down. -- David O'Dell Director, Operations e: dod...@videoegg.com t: (415) 738-5152 180 Townsend St., Third Floor San Francisco, CA 94107
Re: tools for scrubbing HDFS data nodes?
By scrub do you mean delete the blocks from the node? Read your conf/hadoop-site.xml file to determine where dfs.data.dir points, then for each directory in that list, just rm the directory. If you want to ensure that your data is preserved at the appropriate replication level on the rest of your cluster, you should use Hadoop's DataNode decommission feature to up-replicate the data before you blow a copy away. - Aaron On Wed, Jan 28, 2009 at 2:10 PM, Sriram Rao srirams...@gmail.com wrote: Hi, Is there a tool that one could run on a datanode to scrub all the blocks on that node? Sriram
Re: What happens in HDFS DataNode recovery?
Also, see the balancer tool that comes with Hadoop. This background process should be run periodically (every week or so?) to make sure that data is evenly distributed: http://hadoop.apache.org/core/docs/r0.19.0/hdfs_user_guide.html#Rebalancer - Aaron On Sat, Jan 24, 2009 at 7:40 PM, jason hadoop jason.had...@gmail.com wrote: The blocks will be invalidated on the returned-to-service datanode. If you want to save your namenode and network a lot of work, wipe the HDFS block storage directory before returning the datanode to service. dfs.data.dir names the directory; most likely the value is ${hadoop.tmp.dir}/dfs/data. Jason - Ex Attributor On Sat, Jan 24, 2009 at 6:19 PM, C G parallel...@yahoo.com wrote: Hi All: I elected to take a node out of one of our grids for service. Naturally HDFS recognized the loss of the DataNode and did the right stuff, fixing replication issues and ultimately delivering a clean file system. So now the node I removed is ready to go back into service. When I return it to service, a bunch of files will suddenly have a replication of 4 instead of 3. My questions: 1. Will HDFS delete a copy of the data to bring replication back to 3? 2. If (1) above is yes, will it remove the copy by deleting from other nodes, or will it remove files from the returned node, or both? The motivation for asking these questions is that I have a file system which is extremely unbalanced - we recently doubled the size of the grid with a few dozen terabytes already stored on the existing nodes. I am wondering if an easy way to restore some sense of balance is to cycle through the old nodes, removing each one from service for several hours and then returning it to service. Thoughts? Thanks in advance, C G
Re: Mapred job parallelism
Indeed, you will need to enable the Fair Scheduler or Capacity Scheduler (both of which are in 0.19) to do this. mapred.map.tasks is more a hint than anything else -- if you have more files to map than you set this value to, it will use more tasks than you configured the job to use. The newer schedulers will ensure that each job's many map tasks only use a portion of the available slots. - Aaron On Mon, Jan 26, 2009 at 1:43 PM, jason hadoop jason.had...@gmail.com wrote: I believe that the scheduler code in 0.19.0 has a framework for this, but I haven't dug into it in detail yet. http://hadoop.apache.org/core/docs/r0.19.0/capacity_scheduler.html From what I gather, you would set up 2 queues, each with guaranteed access to 1/2 of the cluster, and then submit your jobs to alternate queues. This is not ideal, as you have to balance which queue you submit jobs to in order to ensure that there is some depth. On Mon, Jan 26, 2009 at 1:30 PM, Sagar Naik sn...@attributor.com wrote: Hi Guys, I was trying to set up a cluster so that two jobs can run simultaneously. The conf: number of nodes: 4 (say), mapred.tasktracker.map.tasks.maximum=2, and in the JobClient, mapred.map.tasks=4 (# of nodes). I also have a condition that each job should have only one map task per node. In short, I created 8 map slots and set the number of mappers to 4, so now we have two jobs running simultaneously. However, I realized that if a tasktracker happens to die, I could potentially have 2 map tasks running on a node. Setting mapred.tasktracker.map.tasks.maximum=1 in the JobClient has no effect: it is a tasktracker property and can't be changed per job. Any ideas on how to have 2 jobs running simultaneously? -Sagar
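If it helps, picking the queue per job from the client side is just a configuration property under the 0.19 queue support; a rough sketch (the queue names here are assumptions and must match the queues configured on the cluster, e.g. in mapred.queue.names and the capacity scheduler's configuration):

    import java.io.IOException;

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class QueueSubmitSketch {
      // Submit a fully configured job to one of two hypothetical queues, so that
      // two concurrent jobs each run within their own queue's guaranteed capacity.
      public static void submit(JobConf conf, boolean useFirstQueue) throws IOException {
        conf.set("mapred.job.queue.name", useFirstQueue ? "queueA" : "queueB");
        JobClient.runJob(conf);
      }
    }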
Re: Distributed cache testing in local mode
Hi Bhupesh, I've noticed the same problem -- LocalJobRunner makes the DistributedCache effectively not work, so my code often winds up with two codepaths to retrieve the local data :\ You could try running in pseudo-distributed mode to test, though then you lose the ability to run a single-stepping debugger on the whole end-to-end process. - Aaron On Thu, Jan 22, 2009 at 11:29 AM, Bhupesh Bansal bban...@linkedin.com wrote: Hey folks, I am trying to use the DistributedCache in Hadoop jobs to pass around configuration files, external jars (job-specific), and some archive data. I want to test a job end-to-end in local mode, but I think the distributed cache files are localized in TaskTracker code, which is not called in local mode through LocalJobRunner. I can do some fairly simple workarounds for this, but I was just wondering if folks have more ideas about it. Thanks Bhupesh
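A rough sketch of the two-codepath workaround mentioned above (this is not Aaron's actual code; the fallback property name is made up and would be set by the driver alongside DistributedCache.addCacheFile()):

    import java.io.IOException;

    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;

    public class CacheFileLocator {
      public static Path locate(JobConf conf) throws IOException {
        // On a real cluster, cached files are localized by the TaskTracker.
        Path[] cached = DistributedCache.getLocalCacheFiles(conf);
        if (cached != null && cached.length > 0) {
          return cached[0];
        }
        // Under LocalJobRunner no localization happens, so fall back to a path
        // the driver stashed directly in the job configuration.
        String fallback = conf.get("my.cache.fallback.path");
        if (fallback == null) {
          throw new IOException("Cache file not available in either mode");
        }
        return new Path(fallback);
      }
    }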
Re: Hadoop 0.17.1 => EOFException reading FSEdits file, what causes this? how to prevent?
Does the file exist, or maybe was it deleted? Also, are the permissions on that directory set correctly, or could they have been changed out from under you by accident? - Aaron On Tue, Jan 13, 2009 at 9:53 AM, Joe Montanez jmonta...@veoh.com wrote: Hi: I'm using Hadoop 0.17.1 and I'm encountering an EOFException reading the FSEdits file. I don't have a clear understanding of what is causing this or how to prevent it. Has anyone seen this and can advise? Thanks in advance, Joe 2009-01-12 22:51:45,573 ERROR org.apache.hadoop.dfs.NameNode: java.io.EOFException at java.io.DataInputStream.readFully(DataInputStream.java:180) at org.apache.hadoop.io.UTF8.readFields(UTF8.java:106) at org.apache.hadoop.io.ArrayWritable.readFields(ArrayWritable.java:90) at org.apache.hadoop.dfs.FSEditLog.loadFSEdits(FSEditLog.java:599) at org.apache.hadoop.dfs.FSImage.loadFSEdits(FSImage.java:766) at org.apache.hadoop.dfs.FSImage.loadFSImage(FSImage.java:640) at org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:223) at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:80) at org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:274) at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:255) at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:133) at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:178) at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:164) at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:848) at org.apache.hadoop.dfs.NameNode.main(NameNode.java:857) 2009-01-12 22:51:45,574 INFO org.apache.hadoop.dfs.NameNode: SHUTDOWN_MSG:
Re: Map input records(on JobTracker website) increasing and decreasing
The actual number of input records is most likely steadily increasing. The counters on the web site are inaccurate until the job is complete; their values will fluctuate wildly. I'm not sure why this is. - Aaron On Mon, Jan 5, 2009 at 8:34 AM, Saptarshi Guha saptarshi.g...@gmail.com wrote: Hello, When I check the job tracker web page and look at the Map input records read, the number goes up to, say, 1.4MN, then drops to 410K, and then goes up again. The same happens with input/output bytes and output records. Why is this? Is there something wrong with the mapper code? In my map function, I assume I have received one line of input. The oscillatory behavior does not occur for tiny datasets, but for 1 GB of data (tiny for others) I see this happening. Thanks, Saptarshi -- Saptarshi Guha - saptarshi.g...@gmail.com