Re: Indexed Hashtables
Delip, So far we have had pretty good luck with memcached. We are building a Hadoop-based solution for data warehouse ETL on XML-based log files that represent click stream data on steroids. We process about 34 million records, or about 70 GB of data, a day. We have to process dimensional data in our warehouse and then load the surrogate pairs into memcached so we can traverse the XML files once again to perform the substitutions. We are using the memcached solution because it scales out just like Hadoop. We will have code that allows us to fall back to the DB if the memcached lookup fails, but that should not happen too often. Thanks. --sean Sean Shanny ssha...@tripadvisor.com On Jan 14, 2009, at 9:47 PM, Delip Rao wrote: Hi, I need to look up a large number of key/value pairs in my map(). Is there any indexed hashtable available as part of the Hadoop I/O API? I find HBase overkill for my application; something along the lines of HashStore (www.cellspark.com/hashstore.html) should be fine. Thanks, Delip
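A minimal sketch of the "memcached first, fall back to the DB" lookup Sean describes. The SurrogateCache and SurrogateDao interfaces are hypothetical placeholders, not part of Hadoop or of any memcached client; in practice they would wrap a real memcached client and a JDBC DAO.

public class SurrogateKeyResolver {

    /** Hypothetical cache interface; back it with a real memcached client. */
    public interface SurrogateCache {
        String get(String naturalKey);                 // returns null on a cache miss
        void put(String naturalKey, String surrogateKey);
    }

    /** Hypothetical DAO interface; back it with the warehouse database. */
    public interface SurrogateDao {
        String lookup(String naturalKey);              // authoritative source
    }

    private final SurrogateCache cache;
    private final SurrogateDao dao;

    public SurrogateKeyResolver(SurrogateCache cache, SurrogateDao dao) {
        this.cache = cache;
        this.dao = dao;
    }

    /** Resolve a natural key to its surrogate key, falling back to the DB on a miss. */
    public String resolve(String naturalKey) {
        String surrogate = cache.get(naturalKey);
        if (surrogate == null) {                       // should not happen too often
            surrogate = dao.lookup(naturalKey);
            if (surrogate != null) {
                cache.put(naturalKey, surrogate);      // repopulate the cache
            }
        }
        return surrogate;
    }
}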
Re: TestDFSIO delivers bad values of "throughput" and "average IO rate"
In TestDFSIO we want each task to create only one file. It is a one-to-one mapping from files to map tasks, and splits are defined so that each map gets only one file name, which it creates or reads. --Konstantin

tienduc_dinh wrote: I don't understand why the -nrFiles parameter of TestDFSIO should override mapred.map.tasks. nrFiles is the number of files which will be created, and mapred.map.tasks is the number of splits into which the input will be divided. Thanks

Konstantin Shvachko wrote: Hi tienduc_dinh, Just a bit of background, which should help to answer your questions. TestDFSIO mappers perform one operation (read or write) each, measure the time taken by the operation, and output the following three values (I am intentionally omitting some other output stuff):
- size(i)
- time(i)
- rate(i) = size(i) / time(i)
where i is the index of the map task, 0 <= i < N, and N is the "-nrFiles" value, which equals the number of maps. Then the reduce sums those values and writes them into "part-00000". That is, you get three fields in it:
size = size(0) + ... + size(N-1)
time = time(0) + ... + time(N-1)
rate = rate(0) + ... + rate(N-1)
Then we calculate
throughput = size / time
averageIORate = rate / N
So, answering your questions:
- There should be only one reduce task; otherwise you will have to manually sum the corresponding values in "part-00000" and "part-00001".
- The value of the ":rate" field after the reduce equals the sum of the individual rates of each operation. So if you want an average you should divide it by the number of tasks rather than multiply.
Now, in your case you create only one file ("-nrFiles 1"), which means you run only one map task. Setting "mapred.map.tasks" to 10 in hadoop-site.xml defines the default number of tasks per job. See here: http://hadoop.apache.org/core/docs/current/hadoop-default.html#mapred.map.tasks In the case of TestDFSIO it will be overridden by "-nrFiles". Hope this answers your questions. Thanks, --Konstantin

tienduc_dinh wrote: Hello, I'm now using hadoop-0.18.0 and testing it on a cluster with 1 master and 4 slaves. In hadoop-site.xml the value of "mapred.map.tasks" is 10. Because the values of "throughput" and "average IO rate" are similar, I just post the "throughput" values from running the same command 3 times -> hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 2048 -nrFiles 1
+ with "dfs.replication = 1" => 33,60 / 31,48 / 30,95
+ with "dfs.replication = 2" => 26,40 / 20,99 / 21,70
I find something strange while reading the source code.
- The value of mapred.reduce.tasks is always set to 1: job.setNumReduceTasks(1) in the function runIOTest() and reduceFile = new Path(WRITE_DIR, "part-00000") in analyzeResult(). So I think, if we properly had mapred.reduce.tasks = 2, we would have 2 paths on the file system, "part-00000" and "part-00001", e.g. /benchmarks/TestDFSIO/io_write/part-00000
- And I don't understand the line with "double med = rate / 1000 / tasks". Shouldn't it be "double med = rate * tasks / 1000"?
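A small sketch of the arithmetic Konstantin describes, using placeholder numbers; the variable names and units are illustrative rather than copied from the TestDFSIO source (the quoted source line additionally divides by 1000 for unit scaling).

public class DfsioMath {
    public static void main(String[] args) {
        int tasks = 4;                            // N, i.e. the -nrFiles value
        double[] size = {512, 512, 512, 512};     // size(i) written or read by each map
        double[] time = {20, 25, 18, 22};         // time(i) taken by each map

        double sizeSum = 0, timeSum = 0, rateSum = 0;
        for (int i = 0; i < tasks; i++) {
            sizeSum += size[i];
            timeSum += time[i];
            rateSum += size[i] / time[i];         // rate(i) = size(i) / time(i)
        }

        // The single reduce emits the three sums; the client then derives:
        double throughput = sizeSum / timeSum;    // overall throughput
        double averageIoRate = rateSum / tasks;   // mean of per-task rates (divide, not multiply)

        System.out.println("throughput      = " + throughput);
        System.out.println("average IO rate = " + averageIoRate);
    }
}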
Indexed Hashtables
Hi, I need to look up a large number of key/value pairs in my map(). Is there any indexed hashtable available as part of the Hadoop I/O API? I find HBase overkill for my application; something along the lines of HashStore (www.cellspark.com/hashstore.html) should be fine. Thanks, Delip
Re: Merging reducer outputs into a single part-00000 file
Owen and Rasit, Thank you for the responses. I've figured out that mapred.reduce.tasks was set to 1 in my hadoop-default.xml and I didn't override it in my hadoop-site.xml configuration file. Jim On Wed, Jan 14, 2009 at 11:23 AM, Owen O'Malley wrote: > On Jan 14, 2009, at 12:46 AM, Rasit OZDAS wrote: > > Jim, >> >> As far as I know, there is no operation done after Reducer. >> > > Correct, other than output promotion, which moves the output file to the > final filename. > > But if you are a little experienced, you already know these. >> Ordered list means one final file, or am I missing something? >> > > There is no value and a lot of cost associated with creating a single file > for the output. The question is how you want the keys divided between the > reduces (and therefore output files). The default partitioner hashes the key > and mods by the number of reduces, which "stripes" the keys across the > output files. You can use the mapred.lib.InputSampler to generate good > partition keys and mapred.lib.TotalOrderPartitioner to get completely sorted > output based on the partition keys. With the total order partitioner, each > reduce gets an increasing range of keys and thus has all of the nice > properties of a single file without the costs. > > -- Owen >
Re: RAID vs. JBOD
Hi, We at Yahoo did some Hadoop benchmarking experiments on clusters with JBOD and RAID0. We found that under heavy load (such as gridmix), the JBOD cluster performed better.
Gridmix tests:
Load: gridmix2
Cluster size: 190 nodes
Test results: RAID0: 75 minutes, JBOD: 67 minutes, difference: 10%
Tests on HDFS write performance: We ran map-only jobs writing data to DFS concurrently on different clusters. The overall DFS write throughput on the JBOD cluster was 30% (with a 58-node cluster) and 50% (with an 18-node cluster) better than on the RAID0 cluster, respectively. To understand why, we did some file-level benchmarking on both clusters. We found that the file write throughput on a JBOD machine is 30% higher than that on a comparable machine with RAID0. This performance difference may be explained by the fact that the throughputs of different disks can vary by 30% to 50%. With such variation, the overall throughput of a RAID0 system may be bottlenecked by the slowest disk. -- Runping
On 1/11/09 1:23 PM, "David B. Ritch" wrote: > How well does Hadoop handle multiple independent disks per node? > > I have a cluster with 4 identical disks per node. I plan to use one > disk for OS and temporary storage, and dedicate the other three to > HDFS. Our IT folks have some disagreement as to whether the three disks > should be striped, or treated by HDFS as three independent disks. Could > someone with more HDFS experience comment on the relative advantages and > disadvantages of each approach? > > Here are some of my thoughts. It's a bit easier to manage a 3-disk > striped partition, and we wouldn't have to worry about balancing files > between them. Single-file I/O should be considerably faster. On the > other hand, I would expect typical use to require multiple file reads > or writes simultaneously. I would expect Hadoop to be able to manage > reads/writes to/from the disks independently. Managing 3 streams to 3 > independent devices would likely result in less disk head movement, and > therefore better performance. I would expect Hadoop to be able to > balance load between the disks fairly well. Availability doesn't really > differentiate between the two approaches - if a single disk dies, the > striped array would go down, but all its data should be replicated on > another datanode anyway. And besides, I understand that the datanode will > shut down, even if only one of its 3 independent disks crashes. > > So - anyone want to agree or disagree with these thoughts? Anyone have > any other ideas, or - better - benchmarks and experience with layouts > like these two? > > Thanks! > > David
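For reference, the JBOD layout David describes maps onto the dfs.data.dir property, which takes a comma-separated list of directories that the DataNode spreads its blocks across. A minimal sketch follows; in a real cluster these values would normally be set in hadoop-site.xml rather than in code, and the mount points are placeholders.

import org.apache.hadoop.conf.Configuration;

/** Sketch only: dfs.data.dir and mapred.local.dir are normally set in
 *  hadoop-site.xml. One directory per physical disk lets HDFS treat the
 *  three disks independently instead of relying on a striped volume. */
public class JbodDataDirExample {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // One data directory per physical disk dedicated to HDFS (placeholder paths).
        conf.set("dfs.data.dir",
                 "/disk1/hdfs/data,/disk2/hdfs/data,/disk3/hdfs/data");
        // Intermediate map output can be spread the same way.
        conf.set("mapred.local.dir",
                 "/disk1/mapred/local,/disk2/mapred/local,/disk3/mapred/local");
        System.out.println("dfs.data.dir = " + conf.get("dfs.data.dir"));
    }
}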
Re: getting null from CompressionCodecFactory.getCodec(Path file)
> I got it. For some reason getDefaultExtension() returns ".lzo_deflate".
> Is that a bug? Shouldn't it be .lzo?

The .lzo suffix is reserved for lzop (LzopCodec). LzoCodec doesn't generate compatible output, hence "lzo_deflate". -C

> In the head revision I couldn't find it at all in
> http://svn.apache.org/repos/asf/hadoop/core/trunk/src/core/org/apache/hadoop/io/compress/
> There should be a class LzoCodec.java. Was that moved somewhere else?
> Gert
> Gert Pfeifer wrote: Arun C Murthy wrote: On Jan 13, 2009, at 7:29 AM, Gert Pfeifer wrote: Hi, I want to use an lzo file as input for a mapper. The record reader determines the codec using a CompressionCodecFactory, like this: (Hadoop version 0.19.0) http://hadoop.apache.org/core/docs/r0.19.0/native_libraries.html I should have mentioned that I have these native libs running: 2009-01-14 10:00:21,107 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library 2009-01-14 10:00:21,111 INFO org.apache.hadoop.io.compress.LzoCodec: Successfully loaded & initialized native-lzo library Is that what you tried to point out with this link? Gert hth, Arun compressionCodecs = new CompressionCodecFactory(job); System.out.println("Using codecFactory: "+compressionCodecs.toString()); final CompressionCodec codec = compressionCodecs.getCodec(file); System.out.println("Using codec: "+codec+" for file "+file.getName()); The output that I get is: Using codecFactory: { etalfed_ozl.: org.apache.hadoop.io.compress.LzoCodec } Using codec: null for file test.lzo Of course, the mapper does not work without a codec. What could be the problem? Thanks, Gert
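A small sketch of how the codec factory resolves extensions, and how an lzop codec could be registered so that a .lzo suffix resolves to a codec. Whether an LzopCodec class is available, and under which package, depends on the Hadoop/LZO build in use, so treat that class name as an assumption.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

/** Sketch: CompressionCodecFactory matches files purely by extension, using
 *  the codecs listed in io.compression.codecs. LzoCodec registers
 *  ".lzo_deflate", so "test.lzo" resolves to null unless a codec whose
 *  default extension is ".lzo" (an lzop codec) is also registered. The
 *  LzopCodec class name below is an assumption and depends on your build. */
public class CodecLookupExample {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("io.compression.codecs",
                 "org.apache.hadoop.io.compress.DefaultCodec,"
               + "org.apache.hadoop.io.compress.GzipCodec,"
               + "org.apache.hadoop.io.compress.LzoCodec,"
               + "org.apache.hadoop.io.compress.LzopCodec"); // assumption: present in this build

        CompressionCodecFactory factory = new CompressionCodecFactory(conf);
        CompressionCodec codec = factory.getCodec(new Path("test.lzo"));
        System.out.println("Codec for test.lzo: " + codec);
    }
}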
Re: General questions about Map-Reduce
I got it ... Thanks to all Cheers, Duc -- View this message in context: http://www.nabble.com/General-questions-about-Map-Reduce-tp21399361p21461628.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
Re: libhdfs append question
Tamas, There is a patch attached to the issue, which you should be able to apply to get O_APPEND: https://issues.apache.org/jira/browse/HADOOP-4494 Craig Tamás Szokol wrote: Hi! I'm using the latest stable 0.19.0 version of Hadoop. I'd like to try the new append functionality. Is it available from libhdfs? I didn't find it in the hdfs.h interface or in the hdfs.c implementation. I saw hdfsOpenFile's new O_APPEND flag in HADOOP-4494 (http://mail-archives.apache.org/mod_mbox/hadoop-core-dev/200810.mbox/%3c270645717.1224722204314.javamail.j...@brutus%3e), but I still don't find it in the latest release. Is it available as a patch, or maybe only in the svn repository? Could you please give me some pointers on how to use the append functionality from libhdfs? Thank you in advance! Cheers, Tamas
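For comparison, the operation that the O_APPEND flag from HADOOP-4494 exposes through libhdfs corresponds to FileSystem.append() on the Java side. A minimal sketch follows; the namenode URI and file path are placeholders, and 0.19-era append support was experimental and may need to be enabled on the cluster.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Sketch of appending to an existing HDFS file via the Java API; this is
 *  the operation the libhdfs O_APPEND patch exposes to C clients.
 *  The namenode URI and file path are placeholders. */
public class HdfsAppendExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://namenode:9000");  // placeholder
        FileSystem fs = FileSystem.get(conf);

        Path file = new Path("/tmp/append-demo.log");          // placeholder; must already exist
        FSDataOutputStream out = fs.append(file);
        out.writeBytes("one more line\n");
        out.close();
        fs.close();
    }
}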
Re: Merging reducer outputs into a single part-00000 file
On Jan 14, 2009, at 12:46 AM, Rasit OZDAS wrote:
> Jim,
> As far as I know, there is no operation done after Reducer.

Correct, other than output promotion, which moves the output file to the final filename.

> But if you are a little experienced, you already know these.
> Ordered list means one final file, or am I missing something?

There is no value and a lot of cost associated with creating a single file for the output. The question is how you want the keys divided between the reduces (and therefore output files). The default partitioner hashes the key and mods by the number of reduces, which "stripes" the keys across the output files. You can use the mapred.lib.InputSampler to generate good partition keys and mapred.lib.TotalOrderPartitioner to get completely sorted output based on the partition keys. With the total order partitioner, each reduce gets an increasing range of keys and thus has all of the nice properties of a single file without the costs. -- Owen
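A sketch of the setup Owen describes: sample the input to pick partition boundaries, then let TotalOrderPartitioner give each reduce an increasing key range. Paths, key/value types, sampler parameters and the mapper/reducer setup are placeholders, and exact signatures may differ slightly between releases.

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.InputSampler;
import org.apache.hadoop.mapred.lib.TotalOrderPartitioner;

/** Sketch of totally ordered output across multiple reduce files. */
public class TotalOrderExample {
    public static void main(String[] args) throws Exception {
        JobConf job = new JobConf(TotalOrderExample.class);
        job.setJobName("globally-sorted-output");
        job.setNumReduceTasks(8);                 // 8 ordered output files
        job.setOutputKeyClass(Text.class);
        // ... input/output paths, input format, mapper and reducer set up as usual ...

        // Write partition boundaries computed from a random sample of the input.
        Path partitionFile = new Path("/tmp/_partitions.lst");          // placeholder path
        TotalOrderPartitioner.setPartitionFile(job, partitionFile);
        InputSampler.Sampler<Text, Text> sampler =
            new InputSampler.RandomSampler<Text, Text>(0.01, 1000, 10); // freq, samples, splits
        InputSampler.writePartitionFile(job, sampler);

        // Each reduce now receives a contiguous, increasing range of keys.
        job.setPartitionerClass(TotalOrderPartitioner.class);
        JobClient.runJob(job);
    }
}

Note that InputSampler samples the job's input keys, so this setup assumes the map output keys follow roughly the same distribution as the input keys.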
Hello, world for Hadoop + Lucene
Howdy! Is there any sort of "Hello, world!" example for building a Lucene index with Hadoop? I am looking through the source in contrib/index and it is a bit beyond me at the moment. Alternatively, is there more documentation related to the contrib/index example code? There seems to be a lot of information out on the tubes for how to distribute indices and query them (e.g. Katta). Nutch obviously also comes up, but it is not clear to me how to come to grips with Nutch, and I'm not interested in web crawling. What I'm looking for is a simple example for the Hadoop/Lucene newbie where you can take a String or a Text object and index it as a document. If, understandably, such an example does not exist, any pointers from the experts would be appreciated. I don't care as much about real-world usage/performance as I do about pedagogical code which can serve as a base for learning, just to give me a toehold. Many thanks, John
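The non-distributed core of what John is after is quite small; a sketch against a Lucene 2.4-era API follows (class and method names may differ in other Lucene versions, and the index directory is a placeholder). The usual Hadoop twist is to build such an index in a task's local working directory and copy it into HDFS when the task finishes, which is roughly what contrib/index does.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

/** "Hello, world" for indexing plain strings with Lucene, written against a
 *  Lucene 2.4-era API (signatures may differ in other versions). This is the
 *  per-record core a mapper or reducer would call; in a Hadoop job the index
 *  directory would be task-local and copied to HDFS afterwards. */
public class LuceneHelloWorld {
    public static void main(String[] args) throws Exception {
        IndexWriter writer = new IndexWriter("/tmp/hello-index",   // placeholder path
                                             new StandardAnalyzer(),
                                             true /* create a new index */);
        Document doc = new Document();
        doc.add(new Field("id", "doc-1", Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new Field("content", "Hello, world from Hadoop and Lucene",
                          Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc);
        writer.optimize();
        writer.close();
    }
}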
Do the logging properties still work?
Hi all, In the hadoop-default.xml file, I found the properties hadoop.logfile.size and hadoop.logfile.count. I didn't change the default settings, but the log files still grew beyond the limit. Do these logging properties still work in Hadoop 0.19.0? btw: I found that the class org.apache.hadoop.util.LogFormatter, which used the property hadoop.logfile.size, went missing around Hadoop 0.9, after LogFormatter was replaced by log4j. -Wei
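Since LogFormatter was replaced by log4j, log rolling is governed by log4j appender settings rather than hadoop.logfile.size/count. A minimal sketch follows, assuming log4j 1.2; it is shown programmatically for illustration only, and the log file path and limits are placeholders. In a real Hadoop installation the equivalent settings live in conf/log4j.properties.

import java.io.IOException;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;
import org.apache.log4j.RollingFileAppender;

/** Sketch: size-based log rolling with log4j 1.2, the mechanism that took
 *  over from LogFormatter. Real installs configure this in log4j.properties. */
public class RollingLogExample {
    public static void main(String[] args) throws IOException {
        RollingFileAppender appender = new RollingFileAppender(
                new PatternLayout("%d{ISO8601} %p %c: %m%n"),
                "/tmp/hadoop-example.log");        // placeholder path
        appender.setMaxFileSize("256MB");          // analogous to hadoop.logfile.size
        appender.setMaxBackupIndex(10);            // analogous to hadoop.logfile.count
        Logger.getRootLogger().addAppender(appender);
        Logger.getLogger(RollingLogExample.class).info("logging with size-based rolling");
    }
}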
HOD: cluster allocation don't work
Hi, We use a Hadoop cluster with version 0.18.2. The static HDFS and mapred setup works fine. Sometimes TaskTracker child processes don't finish after job completion, but that is another problem. My present issue is HOD. I've set up Torque and HOD according to the online documentation. The Torque service runs on all cluster nodes and works fine. I want to use HOD with an external HDFS. I run 'hod allocate -d ~/hod-clusters/test -n 4' to allocate a mapred cluster. According to the log messages it finds the external HDFS but doesn't start the JobTracker and TaskTrackers. I would appreciate any pointers on this problem. Thanks, Thomas
[2009-01-14 11:35:15,885] DEBUG/10 torque:54 - qsub stdin: #!/bin/sh
[2009-01-14 11:35:15,891] DEBUG/10 torque:54 - qsub stdin: /opt/hadoop/hod-0.18.2/bin/ringmaster --hodring.tarball-retry-initial-time 1.0 --hodring.cmd-retry-initial-time 2.0 --hodring.log-destination-uri hdfs://hn01.berlin.semgine.com:3/hod/logs --hodring.log-dir /opt/hadoop/hod-0.18.2/logs --hodring.temp-dir /tmp/hod --hodring.userid hadoop --hodring.register --hodring.http-port-range 8000-9000 --hodring.java-home /usr/lib/jvm/java-1.6.0-sun-1.6.0.11 --hodring.tarball-retry-interval 3.0 --hodring.cmd-retry-interval 2.0 --hodring.mapred-system-dir-root /hod/mapredsystem --hodring.xrs-port-range 32768-65536 --hodring.debug 4 --hodring.pkgs /opt/hadoop/hadoop-0.18.2 --resource_manager.queue batch --resource_manager.env-vars "HOD_PYTHON_HOME=/usr/bin/python25" --resource_manager.id torque --resource_manager.batch-home /usr --gridservice-hdfs.fs_port 3 --gridservice-hdfs.host hn01.berlin.semgine.com --gridservice-hdfs.pkgs /opt/hadoop/hadoop-0.18.2 --gridservice-hdfs.info_port 30030 --gridservice-hdfs.external --ringmaster.http-port-range 8000-9000 --ringmaster.temp-dir /tmp/hod --ringmaster.userid hadoop --ringmaster.register --ringmaster.max-master-failures 2 --ringmaster.work-dirs /tmp/hod/1,/tmp/hod/2 --ringmaster.svcrgy-addr hn01.berlin.semgine.com:44223 --ringmaster.log-dir /opt/hadoop/hod-0.18.2/logs --ringmaster.max-connect 30 --ringmaster.xrs-port-range 32768-65536 --ringmaster.jt-poll-interval 120 --ringmaster.debug 4 --ringmaster.idleness-limit 3600 --gridservice-mapred.tracker_port 0 --gridservice-mapred.host localhost --gridservice-mapred.pkgs /opt/hadoop/hadoop-0.18.2 --gridservice-mapred.info_port 0
[2009-01-14 11:35:15,904] DEBUG/10 torque:76 - qsub jobid: 38.hn01.berlin.semgine.com
[2009-01-14 11:35:15,912] DEBUG/10 torque:87 - /usr/bin/qstat -f -1 38.hn01.berlin.semgine.com
[2009-01-14 11:35:15,955] DEBUG/10 hadoop:196 - job state R
[2009-01-14 11:35:15,959] INFO/20 hadoop:543 - Cluster Id 38.hn01.berlin.semgine.com
[2009-01-14 11:35:19,996] DEBUG/10 hadoop:547 - Ringmaster at : http://hn04.berlin.semgine.com:39581/
[2009-01-14 11:35:20,039] INFO/20 hadoop:556 - HDFS UI at http://hn01.berlin.semgine.com:30030
[2009-01-14 11:35:50,767] DEBUG/10 torque:87 - /usr/bin/qstat -f -1 38.hn01.berlin.semgine.com
[2009-01-14 11:35:50,810] DEBUG/10 hadoop:196 - job state R
[2009-01-14 11:36:21,644] DEBUG/10 torque:87 - /usr/bin/qstat -f -1 38.hn01.berlin.semgine.com
[2009-01-14 11:36:21,684] DEBUG/10 hadoop:196 - job state R
The ringmaster process logs the following messages.
[2009-01-14 11:35:21,614] DEBUG/10 torque:147 - pbsdsh command: /usr/bin/pbsdsh /opt/hadoop/hod-0.18.2/bin/hodring --hodring.log-dir /opt/hadoop/hod-0.18.2/logs --hodring.tarball-retry-initial-time 1.0 --hodring.cmd-retry-initial-time 2.0 --hodring.cmd-retry-interval 2.0 --hodring.service-id 38.hn01.berlin.semgine.com --hodring.temp-dir /tmp/hod --hodring.http-port-range 8000-9000 --hodring.userid hadoop --hodring.java-home /usr/lib/jvm/java-1.6.0-sun-1.6.0.11 --hodring.svcrgy-addr hn01.berlin.semgine.com:44223 --hodring.tarball-retry-interval 3.0 --hodring.log-destination-uri hdfs://hn01.berlin.semgine.com:3/hod/logs --hodring.mapred-system-dir-root /hod/mapredsystem --hodring.pkgs /opt/hadoop/hadoop-0.18.2 --hodring.debug 4 --hodring.ringmaster-xrs-addr hn04.berlin.semgine.com:39581 --hodring.xrs-port-range 32768-65536 --hodring.register
[2009-01-14 11:35:21,634] DEBUG/10 ringMaster:479 - getServiceAddr name: mapred
[2009-01-14 11:35:21,643] DEBUG/10 ringMaster:487 - getServiceAddr service:
[2009-01-14 11:35:21,653] DEBUG/10 ringMaster:504 - getServiceAddr addr mapred: not found
[2009-01-14 11:35:21,657] DEBUG/10 ringMaster:925 - Returned from runWorkers.
[2009-01-14 11:35:22,087] DEBUG/10 ringMaster:479 - getServiceAddr name: mapred
[2009-01-14 11:35:22,096] DEBUG/10 ringMaster:487 - getServiceAddr service:
[2009-01-14 11:35:22,104] DEBUG/10 ringMaster:504 - getServiceAddr addr mapred: not found
[2009-01-14 11:35:23,113] DEBUG/10 ringMaster:479 - getServiceAddr name: mapred
[2009-01-14 11:35:23,120] DEBUG/10 ringMaster:487 - getServiceAddr service:
[2009-01-14 11:35:23,127] DEBUG/10 ringMaster:504 - getServiceAddr addr mapred: not found
Re: Re: getting null from CompressionCodecFactory.getCodec(Path file)
LZO was removed due to license incompatibility: https://issues.apache.org/jira/browse/HADOOP-4874 Tom On Wed, Jan 14, 2009 at 11:18 AM, Gert Pfeifer wrote: > I got it. For some reason getDefaultExtension() returns ".lzo_deflate". > > Is that a bug? Shouldn't it be .lzo? > > In the head revision I couldn't find it at all in > http://svn.apache.org/repos/asf/hadoop/core/trunk/src/core/org/apache/hadoop/io/compress/ > > There should be a Class LzoCodec.java. Was that moved to somewhere else? > > Gert > > Gert Pfeifer wrote: >> Arun C Murthy wrote: >>> On Jan 13, 2009, at 7:29 AM, Gert Pfeifer wrote: >>> Hi, I want to use an lzo file as input for a mapper. The record reader determines the codec using a CompressionCodecFactory, like this: (Hadoop version 0.19.0) >>> http://hadoop.apache.org/core/docs/r0.19.0/native_libraries.html >> >> I should have mentioned that I have these native libs running: >> 2009-01-14 10:00:21,107 INFO org.apache.hadoop.util.NativeCodeLoader: >> Loaded the native-hadoop library >> 2009-01-14 10:00:21,111 INFO org.apache.hadoop.io.compress.LzoCodec: >> Successfully loaded & initialized native-lzo library >> >> Is that what you tried to point out with this link? >> >> Gert >> >>> hth, >>> Arun >>> compressionCodecs = new CompressionCodecFactory(job); System.out.println("Using codecFactory: "+compressionCodecs.toString()); final CompressionCodec codec = compressionCodecs.getCodec(file); System.out.println("Using codec: "+codec+" for file "+file.getName()); The output that I get is: Using codecFactory: { etalfed_ozl.: org.apache.hadoop.io.compress.LzoCodec } Using codec: null for file test.lzo Of course, the mapper does not work without codec. What could be the problem? Thanks, Gert >
Re: Re: getting null from CompressionCodecFactory.getCodec(Path file)
I got it. For some reason getDefaultExtension() returns ".lzo_deflate". Is that a bug? Shouldn't it be .lzo? In the head revision I couldn't find it at all in http://svn.apache.org/repos/asf/hadoop/core/trunk/src/core/org/apache/hadoop/io/compress/ There should be a Class LzoCodec.java. Was that moved to somewhere else? Gert Gert Pfeifer wrote: > Arun C Murthy wrote: >> On Jan 13, 2009, at 7:29 AM, Gert Pfeifer wrote: >> >>> Hi, >>> I want to use an lzo file as input for a mapper. The record reader >>> determines the codec using a CompressionCodecFactory, like this: >>> >>> (Hadoop version 0.19.0) >>> >> http://hadoop.apache.org/core/docs/r0.19.0/native_libraries.html > > I should have mentioned that I have these native libs running: > 2009-01-14 10:00:21,107 INFO org.apache.hadoop.util.NativeCodeLoader: > Loaded the native-hadoop library > 2009-01-14 10:00:21,111 INFO org.apache.hadoop.io.compress.LzoCodec: > Successfully loaded & initialized native-lzo library > > Is that what you tried to point out with this link? > > Gert > >> hth, >> Arun >> >>> compressionCodecs = new CompressionCodecFactory(job); >>> System.out.println("Using codecFactory: "+compressionCodecs.toString()); >>> final CompressionCodec codec = compressionCodecs.getCodec(file); >>> System.out.println("Using codec: "+codec+" for file "+file.getName()); >>> >>> >>> The output that I get is: >>> >>> Using codecFactory: { etalfed_ozl.: >>> org.apache.hadoop.io.compress.LzoCodec } >>> Using codec: null for file test.lzo >>> >>> Of course, the mapper does not work without codec. What could be the >>> problem? >>> >>> Thanks, >>> Gert
Re: Re: getting null from CompressionCodecFactory.getCodec(Path file)
Arun C Murthy wrote: > > On Jan 13, 2009, at 7:29 AM, Gert Pfeifer wrote: > >> Hi, >> I want to use an lzo file as input for a mapper. The record reader >> determines the codec using a CompressionCodecFactory, like this: >> >> (Hadoop version 0.19.0) >> > > http://hadoop.apache.org/core/docs/r0.19.0/native_libraries.html I should have mentioned that I have these native libs running: 2009-01-14 10:00:21,107 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library 2009-01-14 10:00:21,111 INFO org.apache.hadoop.io.compress.LzoCodec: Successfully loaded & initialized native-lzo library Is that what you tried to point out with this link? Gert > > hth, > Arun > >> compressionCodecs = new CompressionCodecFactory(job); >> System.out.println("Using codecFactory: "+compressionCodecs.toString()); >> final CompressionCodec codec = compressionCodecs.getCodec(file); >> System.out.println("Using codec: "+codec+" for file "+file.getName()); >> >> >> The output that I get is: >> >> Using codecFactory: { etalfed_ozl.: >> org.apache.hadoop.io.compress.LzoCodec } >> Using codec: null for file test.lzo >> >> Of course, the mapper does not work without codec. What could be the >> problem? >> >> Thanks, >> Gert
Re: Dynamic Node Removal and Addition
Hi Alyssa, http://markmail.org/message/jyo4wssouzlb4olm#query:%22Decommission%20of%20datanodes%22+page:1+mid:p2krkt6ebysrsrpl+state:results As pointed out there, decommissioning (removal) of datanodes was not an easy job as of version 0.12, and I strongly suspect it's still not easy. As far as I know, a node should be used both as a datanode and a tasktracker, so the performance loss will possibly be far greater than the performance gain of your design. My solution would be to keep using them as datanodes and change the TaskTracker code a little bit so that they won't be used for jobs. The code change should be easy, I assume. Hope this helps, Rasit 2009/1/12 Hargraves, Alyssa : > Hello everyone, > > I have a question and was hoping someone on the mailing list could offer some pointers. I'm working on a project with another student, and for part of this project we are trying to create something that will allow nodes to be added to and removed from the Hadoop cluster at will. The goal is to have the nodes run a program that gives the user the freedom to add or remove themselves from the cluster, to take advantage of a workstation when the user leaves (or if they'd like it running anyway while they're at the PC). This would be on Windows computers running various OS versions. > > From what we can find, Hadoop does not already support this feature, but it does seem to support dynamically adding and removing nodes in other ways. For example, to add a node, one would have to make sure Hadoop is set up on the PC along with Cygwin, Java, and ssh, but after that initial setup it's just a matter of adding the PC to the conf/slaves file, making sure the node is not listed in the exclude file, and running the start datanode and start tasktracker commands from the node you are adding (basically described in FAQ item 25). To remove a node, it seems to be just a matter of adding it to dfs.hosts.exclude and refreshing the list of nodes (described in Hadoop FAQ 17). > > Our question is whether or not a simple interface for this already exists, and whether or not anyone sees any potential flaws in how we are planning to accomplish these tasks. From our research we were not able to find anything that already exists for this purpose, but we find it surprising that an interface for this would not already exist. We welcome any comments, recommendations, and insights anyone might have for accomplishing this task. > > Thank you, > Alyssa Hargraves > Patrick Crane > WPI Class of 2009 -- M. Raşit ÖZDAŞ
Namenode freeze
Hi, A datanode goes down, and then it looks like the ReplicationMonitor tries to even out the replication. However, while doing so it holds the lock on FSNamesystem. With this lock held, other threads wait on the lock to respond; as a result, the namenode does not list directories and the web UI does not respond. I would appreciate any pointers on this problem. (Hadoop 0.18.1) -Sagar
Namenode freeze stack dump:
2009-01-14 00:57:02
Full thread dump Java HotSpot(TM) 64-Bit Server VM (10.0-b23 mixed mode):
"SocketListener0-4" prio=10 tid=0x2aac54008000 nid=0x644d in Object.wait() [0x4535a000..0x4535aa80]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x2aab6cb1dba0> (a org.mortbay.util.ThreadPool$PoolThread)
    at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:522)
    - locked <0x2aab6cb1dba0> (a org.mortbay.util.ThreadPool$PoolThread)
   Locked ownable synchronizers:
    - None
"SocketListener0-5" prio=10 tid=0x2aac54008c00 nid=0x63f1 in Object.wait() [0x4545b000..0x4545bb00]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x2aab6c2ea1a8> (a org.mortbay.util.ThreadPool$PoolThread)
    at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:522)
    - locked <0x2aab6c2ea1a8> (a org.mortbay.util.ThreadPool$PoolThread)
   Locked ownable synchronizers:
    - None
"Trash Emptier" daemon prio=10 tid=0x511ca400 nid=0x1fd waiting on condition [0x45259000..0x45259a00]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.fs.Trash$Emptier.run(Trash.java:219)
    at java.lang.Thread.run(Thread.java:619)
   Locked ownable synchronizers:
    - None
"org.apache.hadoop.dfs.dfsclient$leasechec...@767a9224" daemon prio=10 tid=0x51384400 nid=0x1fc sleeping [0x45158000..0x45158a80]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.dfs.DFSClient$LeaseChecker.run(DFSClient.java:792)
    at java.lang.Thread.run(Thread.java:619)
   Locked ownable synchronizers:
    - None
"IPC Server handler 44 on 54310" daemon prio=10 tid=0x2aac40183c00 nid=0x1f4 waiting for monitor entry [0x44f56000..0x44f56d80]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.hadoop.dfs.FSNamesystem.blockReportProcessed(FSNamesystem.java:1880)
    - waiting to lock <0x2aaab423a530> (a org.apache.hadoop.dfs.FSNamesystem)
    at org.apache.hadoop.dfs.FSNamesystem.handleHeartbeat(FSNamesystem.java:2127)
    at org.apache.hadoop.dfs.NameNode.sendHeartbeat(NameNode.java:602)
    at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
   Locked ownable synchronizers:
    - None
"IPC Server handler 43 on 54310" daemon prio=10 tid=0x2aac40182400 nid=0x1f3 waiting for monitor entry [0x44e55000..0x44e55a00]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.hadoop.dfs.FSNamesystem.startFileInternal(FSNamesystem.java:922)
    - waiting to lock <0x2aaab423a530> (a org.apache.hadoop.dfs.FSNamesystem)
    at org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:903)
    at org.apache.hadoop.dfs.NameNode.create(NameNode.java:284)
    at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
   Locked ownable synchronizers:
    - None
"IPC Server handler 42 on 54310" daemon prio=10 tid=0x2aac40181000 nid=0x1f2 waiting for monitor entry [0x44d54000..0x44d54a80]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.hadoop.dfs.FSNamesystem.blockReportProcessed(FSNamesystem.java:1880)
    - waiting to lock <0x2aaab423a530> (a org.apache.hadoop.dfs.FSNamesystem)
    at org.apache.hadoop.dfs.FSNamesystem.handleHeartbeat(FSNamesystem.java:2127)
    at org.apache.hadoop.dfs.NameNode.sendHeartbeat(NameNode.java:602)
    at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
   Locked ownable synchronizers:
    - None
"IPC Server hand
Re: Merging reducer outputs into a single part-00000 file
Jim, As far as I know, there is no operation done after the Reducer. At first look, the situation reminds me of the same key being used for all tasks. This can be the result of one of the following cases: the input format reads the same key for every task; the mapper collects every incoming key/value pair under the same key; or the reducer does the same. But if you are a little experienced, you already know these. An ordered list means one final file, or am I missing something? Hope this helps, Rasit 2009/1/11 Jim Twensky : > Hello, > > The original map-reduce paper states: "After successful completion, the > output of the map-reduce execution is available in the R output files (one > per reduce task, with file names as specified by the user)." However, when > using Hadoop's TextOutputFormat, all the reducer outputs are combined in a > single file called part-00000. I was wondering how and when this merging > process is done. When the reducer calls output.collect(key, value), is this > record written to a local temporary output file on the reducer's disk and > then these local files (a total of R) later merged into one single file > with a final thread, or is it directly written to the final output file > (part-00000)? I am asking this because I'd like to get an ordered sample of > the final output data, i.e. one record per every 1000 records or something > similar, and I don't want to run a serial process that iterates over the final > output file. > > Thanks, > Jim > -- M. Raşit ÖZDAŞ