map tasks fail to report status
Hi,

I have a map task that works most of the time but fails on some data. I keep getting these exceptions:

Task attempt_200811031947_0003_m_95_0 failed to report status for 600 seconds. Killing!

I noticed that the tasks that fail have a lot of these at the end of the syslogs:

2008-11-03 21:05:52,745 INFO org.apache.hadoop.mapred.Merger: Merging 41 sorted segments
2008-11-03 21:05:52,746 INFO org.apache.hadoop.mapred.Merger: Merging 5 intermediate segments out of a total of 41
2008-11-03 21:05:53,016 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 37
2008-11-03 21:05:53,147 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 28
2008-11-03 21:05:53,329 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 19
2008-11-03 21:05:53,525 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 10 segments left of total size: 7866139 bytes
2008-11-03 21:05:53,848 INFO org.apache.hadoop.mapred.MapTask: Index: (2465254733, 7866121, 7866121)
2008-11-03 21:05:53,900 INFO org.apache.hadoop.mapred.Merger: Merging 41 sorted segments
2008-11-03 21:05:53,900 INFO org.apache.hadoop.mapred.Merger: Merging 5 intermediate segments out of a total of 41
2008-11-03 21:05:53,963 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 37
2008-11-03 21:05:53,976 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 28
2008-11-03 21:05:53,996 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 19
2008-11-03 21:05:54,013 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 10 segments left of total size: 4290611 bytes
...
Sure, the ones that succeed have them too, but the number of segments is always significantly lower:

2008-11-03 20:42:38,214 INFO org.apache.hadoop.mapred.MapTask: Index: (125745724, 351203, 351203)
2008-11-03 20:42:38,221 INFO org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
2008-11-03 20:42:38,221 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 345895 bytes
2008-11-03 20:42:38,226 INFO org.apache.hadoop.mapred.MapTask: Index: (126096927, 345893, 345893)
2008-11-03 20:42:38,232 INFO org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
2008-11-03 20:42:38,232 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 364718 bytes
2008-11-03 20:42:38,237 INFO org.apache.hadoop.mapred.MapTask: Index: (126442820, 364716, 364716)
2008-11-03 20:42:38,241 INFO org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
2008-11-03 20:42:38,241 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 440435 bytes
2008-11-03 20:42:38,247 INFO org.apache.hadoop.mapred.MapTask: Index: (126807536, 440433, 440433)

I don't get any exceptions besides the timeouts, because the tasks don't report their status. So, my questions are:
- What exactly is the Merger? Why is it only merging at the end of the tasks? Why does it seem to merge the same data several times?
- Can it really be causing the problem, or should I look somewhere else (there's no exception, after all)?

It's most probably in my code, but I don't see any exception, so it's kind of hard to tell what's happening.

Thanks in advance,
Sebastien
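For what it's worth, the pass sizes in those logs are consistent with a multi-pass merge capped at io.sort.factor (default 10) segments per pass, with a smaller first pass so that every later pass is full. This is a plain-Java sketch of that arithmetic -- my reading of the log output, not code from the Hadoop source:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch (not Hadoop code): how a factor-limited multi-pass merge
// chooses pass sizes, reproducing the counts in the log above.
public class MergePasses {
    // First pass merges just enough segments so that every later
    // pass can merge a full 'factor' segments.
    static int firstPassSize(int n, int factor) {
        if (n <= factor) return n;
        int mod = (n - 1) % (factor - 1);
        return mod == 0 ? factor : mod + 1;
    }

    static List<Integer> passes(int segments, int factor) {
        List<Integer> sizes = new ArrayList<>();
        int n = segments;
        while (n > factor) {
            int take = firstPassSize(n, factor);
            sizes.add(take);
            n = n - take + 1;   // merged segments collapse into one
        }
        sizes.add(n);           // the final merge-pass
        return sizes;
    }

    public static void main(String[] args) {
        // 41 segments, factor 10, matching the log: "Merging 5 ... of 41",
        // "10 ... of 37", "10 ... of 28", "10 ... of 19",
        // "last merge-pass, with 10 segments left"
        System.out.println(passes(41, 10));  // [5, 10, 10, 10, 10]
    }
}
```

Failing tasks merge many more segments (41 vs. 2), so they spend far longer in this phase without calling progress, which is one way to hit the 600-second timeout.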
Interaction between JobTracker / TaskTracker
Hi,

As a relatively new user of Hadoop, I am trying to construct an architectural picture of how Hadoop processes interact when performing a Map/Reduce job. I haven't found much documentation on it. Can someone suggest any links or documentation that describe this level of detail?

Here I am making some guesses based on what I've seen from the API, the admin interface, and what the simplest possible implementation could be. I haven't looked at the source code, and I know my guesses are probably wrong because they assume a simple, unsophisticated implementation. I am trying to lay the groundwork for an expert who is familiar with the underlying implementation to correct me. So here is my guess ... I'd appreciate it if anyone could correct me with actual implementation knowledge.

1. To start the map/reduce cluster, run the "start-mapred.sh" script, which, based on the "hadoop-site.xml" file, starts (via SSH) one "JobTracker" process and multiple "TaskTracker" processes across multiple machines. (Of course, here I assume HDFS is already started.)

2. Now the client program submits a Map/Reduce job by creating a "JobConf" object and invoking "JobClient.runJob(jobConf)". This API uploads the Mapper and Reducer implementation classes, as well as the job config, to the JobTracker daemon. (At this moment, I expect the directory of the input path is frozen. In other words, no new files can be created in or removed from the input path.)

3. After assigning a unique job id, the JobTracker looks at "jobConf.inputPath" to determine how many TaskTrackers are needed for the mapper phase, based on the number of files in the input path as well as how busy the existing TaskTrackers are. Then it selects a number of "Map-Phase-TaskTrackers" that are relatively idle as well as physically close to the HDFS nodes that host a copy of the data (e.g. on the same machine or the same rack).

4. Based on some scheduling algorithm, the JobTracker determines multiple files (from the input path) to assign to each selected Map-Phase-TaskTracker.
It sends each TaskTracker the job id, the mapper implementation class bytecode, and the names of the assigned input files.

5. Each Map-Phase-TaskTracker process spawns multiple threads, one for each assigned input file (so there is a 1-to-1 correspondence between threads and files). The TaskTracker also monitors the progress of each thread associated with this specific job id. The map phase is now started ...
a. Each thread reads the assigned input file sequentially, one record at a time, using the "InputFormat" class specified in the jobConf.
b. For each record it reads, it invokes the uploaded Mapper implementation class's map() method. Whenever the "output.collect(key, value)" method is called, a record is added to an in-memory mapResultHashtable. This step is repeated until the whole input file is consumed (EOF is reached).
c. Then the thread inspects the mapResultHashtable. For each key, it invokes the uploaded combiner class's reduce() method. Whenever the "output.collect(key, value)" method is called, a record is added to an in-memory combineResultHashtable. Then the thread persists the combineResultHashtable to a local file (not HDFS). Finally, the thread quits.

6. When all the threads associated with the job id quit, the Map-Phase-TaskTracker sends a "map_complete" notification to the JobTracker.

7. When the JobTracker receives all the "map_complete" notifications, it knows the map phase is completed. Now it is time to start the "reduce" phase.

8. The JobTracker looks at "jobConf.numReduceTask" to determine how many Reduce-Phase-TaskTrackers are needed. It also selects (randomly) those Reduce-Phase-TaskTrackers. To each of them, the JobTracker sends the job id as well as the reducer implementation class bytecode.

9. Now, to each of the previous Map-Phase-TaskTrackers, the JobTracker sends a "reduce_ready" message, as well as an array of addresses of the Reduce-Phase-TaskTrackers.
Each Map-Phase-TaskTracker then starts a thread.

10. The thread iterates over the persisted combineResultHashtable. For each key, it invokes the partitioner class's getPartition() function to determine the partition number and thus the address of the Reduce-Phase-TaskTracker, then opens a network connection to that Reduce-Phase-TaskTracker and passes along the key and values.

11. At each Reduce-Phase-TaskTracker, for each new key received, it spawns a new thread. This thread invokes the Reducer.reduce() method (so there is a 1-to-1 correspondence between threads and unique keys). The Reduce-Phase-TaskTracker also monitors the progress of each thread associated with this specific job id.
a. Within the reduce() method, whenever the "output.collect(key, value)" method is called, a record is written using the "OutputFormat" class specified in the jobConf.
b.
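For step 10, the default partitioner is, in broad strokes, just a hash modulo the reduce count. Here is a minimal stand-in in plain Java (not the actual Hadoop class) showing why every record with a given key reaches the same reducer:

```java
// A stand-in for a default hash partitioner (step 10 above):
// equal keys always land in the same partition, which is what
// guarantees that one reducer sees all values for a key.
public class PartitionDemo {
    static int getPartition(String key, int numReduceTasks) {
        // mask off the sign bit so the result is non-negative
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int p1 = getPartition("apple", 4);
        int p2 = getPartition("apple", 4);
        System.out.println(p1 == p2);  // true: same key, same partition
    }
}
```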
Re: How to cache a folder in HDFS?
On 1-Nov-08, at 11:57 AM, lamfeeling wrote:

Hi all! I have a problem here. In my program, my code reads some config files in a folder, but it always fails on Hadoop and says "Can not find the file...". I looked up the reference, and it told me to cache the files in HDFS instead of reading them from the local filesystem. Now I can cache a file, but I really don't know how to cache a folder. Do I need to cache the files one by one?

I am able to cache a directory the same way I cache files; I just give the HDFS directory name as I would a file name. The link created points to the directory.

Karl Anderson
[EMAIL PROTECTED]
http://monkey.org/~kra
Re: Passing Constants from One Job to the Next
The Mapper and Reducer interfaces both provide a method 'void configure(JobConf conf) throws IOException'; if you extend MapReduceBase, this will provide a dummy implementation of configure(). You can add your own implementation; it will be called before the first call to map() or reduce(). You can read your initialization data at this time.

- Aaron

On Thu, Oct 30, 2008 at 4:02 PM, Erik Holstad <[EMAIL PROTECTED]> wrote:
> Hi!
> Is there a way of using the value read in the configure() in the Map or Reduce phase?
>
> Erik
>
> On Thu, Oct 23, 2008 at 2:40 AM, Aaron Kimball <[EMAIL PROTECTED]> wrote:
>
>> See Configuration.setInt() in the API. (JobConf inherits from Configuration.) You can read it back in the configure() method of your mappers/reducers.
>> - Aaron
>>
>> On Wed, Oct 22, 2008 at 3:03 PM, Yih Sun Khoo <[EMAIL PROTECTED]> wrote:
>>
>>> Are you saying that I can pass, say, a single integer constant with either of these three: JobConf? An HDFS file? DistributedCache?
>>> Or are you asking if I can pass given the context of: JobConf? An HDFS file? DistributedCache?
>>> I'm thinking of how to pass a single int from one JobConf to the next.
>>>
>>> On Wed, Oct 22, 2008 at 2:57 PM, Arun C Murthy <[EMAIL PROTECTED]> wrote:
>>>
>>>> On Oct 22, 2008, at 2:52 PM, Yih Sun Khoo wrote:
>>>>
>>>>> I'd like to hear some good ways of passing constants from one job to the next.
>>>>
>>>> Unless I'm missing something: JobConf? An HDFS file? DistributedCache?
>>>> Arun
>>>>
>>>>> These are some ways that I can think of:
>>>>> 1) The obvious solution is to carry the constant as part of your value from one job to the next, but that would mean every value would hold that constant
>>>>> 2) Use the reporter as a hack so that you can set the status message and then get the status message back when you need the constant
>>>>>
>>>>> Any other ideas? (Also please do not include code)
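The Configuration.setInt()/getInt() round trip Aaron describes boils down to string-typed storage in the job's key/value config. A minimal stand-in (plain Java, not the real Hadoop Configuration class) to illustrate the mechanism:

```java
import java.util.Properties;

// Minimal stand-in for Configuration.setInt()/getInt() (not the real
// Hadoop class): values are stored as strings in the job conf by the
// driver and parsed back inside the task's configure() method.
public class ConfDemo {
    private final Properties props = new Properties();

    public void setInt(String name, int value) {
        props.setProperty(name, Integer.toString(value));
    }

    public int getInt(String name, int defaultValue) {
        String v = props.getProperty(name);
        return v == null ? defaultValue : Integer.parseInt(v);
    }

    public static void main(String[] args) {
        ConfDemo conf = new ConfDemo();
        conf.setInt("my.job.threshold", 42);   // driver side, before submission
        // ... job submitted; the mapper's configure(conf) then runs:
        int threshold = conf.getInt("my.job.threshold", 0);
        System.out.println(threshold);  // 42
    }
}
```

The property name "my.job.threshold" is made up for the example; any key not already used by Hadoop works.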
Re: Can anyone recommend me a inter-language data file format?
I've been using protocol buffers to serialize the data and then encoding them in base64 so that I can treat them like text. This obviously isn't optimal, but I'm assuming this is only a short-term solution which won't be necessary once non-Java clients become first-class citizens of the Hadoop world.

Chris

On Mon, Nov 3, 2008 at 2:24 PM, Pete Wyckoff <[EMAIL PROTECTED]> wrote:
>
> Protocol buffers, thrift?
>
> On 11/3/08 4:07 AM, "Steve Loughran" <[EMAIL PROTECTED]> wrote:
>
> Zhou, Yunqing wrote:
>> embedded database cannot handle large-scale data, not very efficient
>> I have about 1 billion records.
>> these records should be passed through some modules.
>> I mean a data exchange format similar to XML but more flexible and efficient.
>
> JSON
> CSV
> erlang-style records (name,value,value,value)
> RDF-triples in non-XML representations
>
> For all of these, you need to test with data that includes things like high unicode characters, single and double quotes, to see how well they get handled.
>
> you can actually append with XML by not having opening/closing tags, just stream out the entries to the tail of the file
> ...
>
> To read this in an XML parser, include it inside another XML file:
>
> <?xml version="1.0"?>
> <!DOCTYPE log [
> <!ENTITY log SYSTEM "log.xml">
> ]>
> <log>
> &log;
> </log>
>
> I've done this for very big files, as long as you aren't trying to load it in-memory to a DOM, things should work
>
> --
> Steve Loughran http://www.1060.org/blogxter/publish/5
> Author: Ant in Action http://antbook.org/
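Chris's base64 trick can be sketched in a few lines. java.util.Base64 here is a modern-JDK stand-in (in 2008 you would likely use something like commons-codec), and the byte array stands in for a serialized protocol buffer message:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch of the base64 wrapping described above: serialize a record
// to bytes (a stand-in here for a protocol buffer message), then
// encode so each record becomes a single line of safe ASCII text.
public class Base64Record {
    static String encode(byte[] serialized) {
        return Base64.getEncoder().encodeToString(serialized);
    }

    static byte[] decode(String line) {
        return Base64.getDecoder().decode(line);
    }

    public static void main(String[] args) {
        byte[] record = "id=7\tname=zhou".getBytes(StandardCharsets.UTF_8);
        String line = encode(record);   // contains no tabs/newlines, safe as text
        byte[] back = decode(line);
        System.out.println(new String(back, StandardCharsets.UTF_8));
    }
}
```

The cost is roughly a 4/3 size blow-up plus encode/decode CPU, which is the "isn't optimal" part.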
Re: hadoop 0.18.1 x-trace
Hi Veiko,

Right now the patches represent an instrumentation API for the RPC layer (the X-Trace implementation is not currently part of the patch -- I'm hoping to submit it as a contrib/ project). I'll be talking at the Hadoop Camp later this week about X-Trace and Hadoop. There is much to do in terms of building UIs, analysis tools, and trace storage/query interfaces. So stay tuned (and if you would be willing to talk more about your anticipated uses, please let me know; I'd be very interested in talking with you).

Thanks,
George

On Nov 3, 2008, at 11:32 AM, Michael Bieniosek wrote:

Try applying the last one only. Let us know if it works!

-Michael

On 11/3/08 6:23 AM, "Veiko Schnabel" <[EMAIL PROTECTED]> wrote:

Dear Hadoop Users and Developers,

I have a requirement to monitor the Hadoop cluster using X-Trace. I found these patches on http://issues.apache.org/jira/browse/HADOOP-4049, but when I try to integrate them with 0.18.1, I cannot build Hadoop anymore. First of all, the patch order is not clear to me. Can anyone explain which patches I really need and the order in which to apply them?

thanks

Veiko

--
Veiko Schnabel
System-Administration
optivo GmbH
Stralauer Allee 2
10245 Berlin
Tel.: +49 30 41724241
Fax: +49 30 41724239
Email: mailto:[EMAIL PROTECTED]
Website: http://www.optivo.de
Handelsregister: HRB Berlin 88738
Geschäftsführer: Peter Romianowski, Ulf Richter
Re: hadoop 0.18.1 x-trace
Try applying the last one only. Let us know if it works!

-Michael

On 11/3/08 6:23 AM, "Veiko Schnabel" <[EMAIL PROTECTED]> wrote:

Dear Hadoop Users and Developers,

I have a requirement to monitor the Hadoop cluster using X-Trace. I found these patches on http://issues.apache.org/jira/browse/HADOOP-4049, but when I try to integrate them with 0.18.1, I cannot build Hadoop anymore. First of all, the patch order is not clear to me. Can anyone explain which patches I really need and the order in which to apply them?

thanks

Veiko

--
Veiko Schnabel
System-Administration
optivo GmbH
Stralauer Allee 2
10245 Berlin
Tel.: +49 30 41724241
Fax: +49 30 41724239
Email: mailto:[EMAIL PROTECTED]
Website: http://www.optivo.de
Handelsregister: HRB Berlin 88738
Geschäftsführer: Peter Romianowski, Ulf Richter
Re: Can anyone recommend me a inter-language data file format?
Protocol buffers, thrift?

On 11/3/08 4:07 AM, "Steve Loughran" <[EMAIL PROTECTED]> wrote:

Zhou, Yunqing wrote:
> embedded database cannot handle large-scale data, not very efficient
> I have about 1 billion records.
> these records should be passed through some modules.
> I mean a data exchange format similar to XML but more flexible and efficient.

JSON
CSV
erlang-style records (name,value,value,value)
RDF-triples in non-XML representations

For all of these, you need to test with data that includes things like high unicode characters, single and double quotes, to see how well they get handled.

you can actually append with XML by not having opening/closing tags, just stream out the entries to the tail of the file
...

To read this in an XML parser, include it inside another XML file:

<?xml version="1.0"?>
<!DOCTYPE log [
<!ENTITY log SYSTEM "log.xml">
]>
<log>
&log;
</log>

I've done this for very big files, as long as you aren't trying to load it in-memory to a DOM, things should work

--
Steve Loughran http://www.1060.org/blogxter/publish/5
Author: Ant in Action http://antbook.org/
Re: Status FUSE-Support of HDFS
+1, but while Hadoop itself currently deals well with such directories, fuse-dfs will basically lock up on them - this is because "ls --color=blah" causes a stat on every file in a directory. There is a JIRA open for this, but it is a pretty rare case, although it has happened to me at facebook.

-- pete

> It's good for a portable application to keep the # of files/directory low by having two levels of directory for storing files - just use a hash operation to determine which dir to store a specific file in.

On 11/3/08 4:00 AM, "Steve Loughran" <[EMAIL PROTECTED]> wrote:

Pete Wyckoff wrote:
> It has come a long way since 0.18 and facebook keeps our (0.17) dfs mounted via fuse and uses that for some operations.
>
> There have recently been some problems with fuse-dfs when used in a multithreaded environment, but those have been fixed in 0.18.2 and 0.19. (do not use 0.18 or 0.18.1)
>
> The current (known) issues are:
> 2. When directories have 10s of thousands of files, performance can be very poor.

I've known other filesystems to top out at 64k-1 files per directory, even if they don't slow down.

It's good for a portable application to keep the # of files/directory low by having two levels of directory for storing files - just use a hash operation to determine which dir to store a specific file in.
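Steve's two-level hashed-directory scheme might look like the sketch below; the 256x256 bucket split is an arbitrary choice for the example, not something Hadoop prescribes:

```java
// Sketch of the two-level directory scheme suggested above: derive
// both directory levels from a hash of the file name so any single
// directory stays small. The 8-bit split (256 x 256 buckets) is an
// arbitrary choice for illustration.
public class HashedDirs {
    static String pathFor(String fileName) {
        int h = fileName.hashCode();
        int top = (h >>> 8) & 0xFF;   // first-level dir, 0..255
        int sub = h & 0xFF;           // second-level dir, 0..255
        return String.format("%02x/%02x/%s", top, sub, fileName);
    }

    public static void main(String[] args) {
        // the same name always maps to the same directory, so lookups
        // need no index - just recompute the hash
        System.out.println(HashedDirs.pathFor("part-00042"));
        System.out.println(HashedDirs.pathFor("part-00042")
                .equals(HashedDirs.pathFor("part-00042")));  // true
    }
}
```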
Re: SecondaryNameNode on separate machine
You can either do what you just described with dfs.name.dir = dirX, or you can start the name-node with the -importCheckpoint option, which automates copying the image files from the secondary to the primary. See here:

http://hadoop.apache.org/core/docs/current/commands_manual.html#namenode
http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html#Secondary+NameNode
http://issues.apache.org/jira/browse/HADOOP-2585#action_12584755

--Konstantin

Tomislav Poljak wrote:

Hi,

Thank you all for your time and your answers! Now SecondaryNameNode connects to the NameNode (after I configured dfs.http.address to the NN's http server -> NN hostname on port 50070) and creates (transfers) edits and fsimage from the NameNode.

Can you explain to me in a little more detail how NameNode failover should work now? For example, SecondaryNameNode now stores fsimage and edits to (SNN's) dirX, and let's say the NameNode goes down (disk becomes unreadable). Now I create/dedicate a new machine for the NameNode (also change DNS to point to this new NameNode machine as the nameNode host) and take the data dirX from the SNN and copy it to the new NameNode. How do I configure the new NameNode to use the data from dirX (do I configure dfs.name.dir to point to dirX and start the new NameNode)?

Thanks,
Tomislav

On Fri, 2008-10-31 at 11:38 -0700, Konstantin Shvachko wrote:

True, dfs.http.address is the NN Web UI address. This is where the NN http server runs. Besides the Web UI there is also a servlet running on that server which is used to transfer image and edits from the NN to the secondary using http get. So the SNN uses both addresses, fs.default.name and dfs.http.address.

When the SNN finishes the checkpoint, the primary needs to transfer the resulting image back. This is done via the http server running on the SNN.
Answering Tomislav's question: The difference between fs.default.name and dfs.http.address is that fs.default.name is the name-node's RPC address, where clients and data-nodes connect, while dfs.http.address is the NN's http server address, where our browsers connect, but it is also used for transferring image and edits files.

--Konstantin

Otis Gospodnetic wrote:

Konstantin & Co, please correct me if I'm wrong, but looking at hadoop-default.xml makes me think that dfs.http.address is only the URL for the NN *Web UI*. In other words, this is where we people go to look at the NN. The secondary NN must then be using only the primary NN URL specified in fs.default.name. This URL looks like hdfs://name-node-hostname-here/. Something in Hadoop then knows the exact port for the primary NN based on the URI scheme (e.g. "hdfs://") in this URL. Is this correct?

Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message
From: Tomislav Poljak <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Thursday, October 30, 2008 1:52:18 PM
Subject: Re: SecondaryNameNode on separate machine

Hi,

can you, please, explain the difference between fs.default.name and dfs.http.address (like how and when the SecondaryNameNode uses fs.default.name, and how/when dfs.http.address)? I have set them both to the same (namenode's) hostname:port. Is this correct (or does dfs.http.address need some other port)?

Thanks,
Tomislav

On Wed, 2008-10-29 at 16:10 -0700, Konstantin Shvachko wrote:

The SecondaryNameNode uses the http protocol to transfer the image and the edits from the primary name-node and vice versa. So the secondary does not access local files on the primary directly. The primary NN should know the secondary's http address. And the secondary NN needs to know both fs.default.name and dfs.http.address of the primary.

In general we usually create one configuration file hadoop-site.xml and copy it to all other machines.
So you don't need to set up different values for all servers.

Regards,
--Konstantin

Tomislav Poljak wrote:

Hi,

I'm not clear on how the SecondaryNameNode communicates with the NameNode (if deployed on a separate machine). Does the SecondaryNameNode use a direct connection (over some port and protocol), or is it enough for the SecondaryNameNode to have access to the data which the NameNode writes locally to disk?

Tomislav

On Wed, 2008-10-29 at 09:08 -0400, Jean-Daniel Cryans wrote:

I think a lot of the confusion comes from this thread:
http://www.nabble.com/NameNode-failover-procedure-td11711842.html

Particularly because the wiki was updated with wrong information, not maliciously I'm sure. This information is now gone for good.

Otis, your solution is pretty much like the one given by Dhruba Borthakur and augmented by Konstantin Shvachko later in the thread, but I never did it myself. One thing should be clear though: the NN is and will remain a SPOF (just like HBase's Master) as long as a distributed manager service (like ZooKeeper) is not plugged into Hadoop to help with failover.

J-D

On Wed, Oct 29, 2008 at 2:12 AM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:

Hi,

So what is the "recipe" for avoiding NN SPOF using only what comes with H
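For reference, a sketch of the hadoop-site.xml properties this thread keeps coming back to; the hostnames, ports, and paths are placeholders to adapt to your cluster:

```xml
<!-- hadoop-site.xml sketch; values below are placeholders. -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode-host:9000</value>   <!-- NN RPC address (clients, datanodes) -->
</property>
<property>
  <name>dfs.http.address</name>
  <value>namenode-host:50070</value>         <!-- NN http server; SNN fetches image/edits here -->
</property>
<property>
  <name>dfs.name.dir</name>
  <value>/path/to/dirX</value>               <!-- where the NN keeps fsimage and edits -->
</property>
```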
Re: hadoop mapside joins
There is an example in src/examples/.../examples/Join.java

-C

On Nov 3, 2008, at 3:13 AM, meda vijendharreddy wrote:

Hi,

As my dataset is quite large I wanted to use map-side joins. I got the basic idea from http://issues.apache.org/jira/browse/HADOOP-2085

Can anybody please provide me with working code?

Thanks in Advance
Vijen

Connect with friends all over the world. Get Yahoo! India Messenger at http://in.messenger.yahoo.com/?wm=n/
Re: Question on opening file info from namenode in DFSClient
In the current code, details about block locations of a file are cached on the client when the file is opened. This cache remains with the client until the file is closed. If the same file is re-opened by the same DFSClient, it re-contacts the namenode and refetches the block locations. This works ok for most map-reduce apps because it is rare that the same DFSClient re-opens the same file again. Can you please explain your use-case?

thanks,
dhruba

On Sun, Nov 2, 2008 at 10:57 PM, Taeho Kang <[EMAIL PROTECTED]> wrote:
> Dear Hadoop Users and Developers,
>
> I was wondering if there's a plan to add a "file info cache" in DFSClient?
>
> It could eliminate the network travelling cost of contacting the Namenode, and I think it would greatly improve the DFSClient's performance. The code I was looking at was this:
>
> --- DFSClient.java
>
>   /**
>    * Grab the open-file info from namenode
>    */
>   synchronized void openInfo() throws IOException {
>     /* Maybe, we could add a file info cache here! */
>     LocatedBlocks newInfo = callGetBlockLocations(src, 0, prefetchSize);
>     if (newInfo == null) {
>       throw new IOException("Cannot open filename " + src);
>     }
>     if (locatedBlocks != null) {
>       Iterator<LocatedBlock> oldIter = locatedBlocks.getLocatedBlocks().iterator();
>       Iterator<LocatedBlock> newIter = newInfo.getLocatedBlocks().iterator();
>       while (oldIter.hasNext() && newIter.hasNext()) {
>         if (!oldIter.next().getBlock().equals(newIter.next().getBlock())) {
>           throw new IOException("Blocklist for " + src + " has changed!");
>         }
>       }
>     }
>     this.locatedBlocks = newInfo;
>     this.currentNode = null;
>   }
> ---
>
> Does anybody have an opinion on this matter?
>
> Thank you in advance,
>
> Taeho
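To make the proposal concrete, here is a toy version of such a cache in plain Java (not DFSClient code): it memoizes the namenode answer per path, which is exactly where the invalidation question arises (when does a cached blocklist go stale?):

```java
import java.util.HashMap;
import java.util.Map;

// A toy version of the proposed "file info cache": remember the block
// locations fetched for a path and skip the namenode round trip on
// re-open. Invalidation is the hard part this thread is circling --
// a stale entry would hide the "Blocklist ... has changed!" case.
public class LocatedBlocksCache {
    private final Map<String, String> cache = new HashMap<>();
    int namenodeCalls = 0;

    // stand-in for callGetBlockLocations(src, ...); the String result
    // stands in for a LocatedBlocks object
    private String fetchFromNamenode(String src) {
        namenodeCalls++;
        return "blocks-of-" + src;
    }

    String openInfo(String src) {
        return cache.computeIfAbsent(src, this::fetchFromNamenode);
    }

    public static void main(String[] args) {
        LocatedBlocksCache c = new LocatedBlocksCache();
        c.openInfo("/user/taeho/data");
        c.openInfo("/user/taeho/data");   // second open served from cache
        System.out.println(c.namenodeCalls);  // 1
    }
}
```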
Re: Any Way to Skip Mapping?
I need the Reduce to sort so I can merge the records and output them in sorted order. I do not need to join any data, just merge rows together, so I do not think the join will be any help. I am storing the data with a sorted map as the value, and on the merge I need to take all the rows that have the same key, merge all the sorted maps together, and output one row that has all the data for that key - something like what HBase is doing, but without the in-memory indexes.

Maybe it will become an option later down the road to skip the maps and let the reduce shuffle directly from the InputSplits.

Billy

"Owen O'Malley" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]

If you don't need a sort, which is what it sounds like, Hadoop supports that by turning off the reduce. That is done by setting the number of reduces to 0. This typically is much faster than if you need the sort. It also sounds like you may need/want the library that does map-side joins. http://tinyurl.com/43j5pp

-- Owen
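Billy's per-key merge can be sketched in plain Java: the reducer receives every sorted map emitted for one key and folds them into a single row (the column names below are made up for the example):

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

// Sketch of the reduce-side merge described above: all rows for one
// key arrive at a single reducer, each carrying a sorted map, and the
// reducer folds them into one combined row. Column names are made up.
public class RowMerge {
    static TreeMap<String, String> mergeRows(List<TreeMap<String, String>> rows) {
        TreeMap<String, String> merged = new TreeMap<>();
        for (TreeMap<String, String> row : rows) {
            merged.putAll(row);   // later rows win on duplicate columns
        }
        return merged;
    }

    public static void main(String[] args) {
        TreeMap<String, String> a = new TreeMap<>();
        a.put("col1", "x");
        TreeMap<String, String> b = new TreeMap<>();
        b.put("col2", "y");
        System.out.println(mergeRows(Arrays.asList(a, b)));  // {col1=x, col2=y}
    }
}
```

Because the framework already sorts and groups by key before reduce(), the merge itself needs no index, which is the point Billy is making about HBase.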
Re: Question regarding reduce tasks
I believe so, yes: but note that the individual reducer task needs to finish, not just the processing of a given key/value pair.

Miles

2008/11/3 Ryan LeCompte <[EMAIL PROTECTED]>:
> What happens when the reducer task gets invoked more than once? My
> guess is once a reducer task finishes writing the data for a
> particular key to HDFS, it won't somehow get re-executed again for the
> same key, right?
>
> On Mon, Nov 3, 2008 at 11:28 AM, Miles Osborne <[EMAIL PROTECTED]> wrote:
>> you can't guarantee that a reducer (or mapper for that matter) will be
>> executed exactly once unless you turn off speculative execution. but,
>> a distinct key gets sent to a single reducer, so yes, only one reducer
>> will see a particular key + associated values
>>
>> Miles
>>
>> 2008/11/3 Ryan LeCompte <[EMAIL PROTECTED]>:
>>> Hello,
>>>
>>> Is it safe to assume that only one reduce task will ever operate on
>>> values for a particular key? Or is it possible that more than one
>>> reduce task can work on values for the same key? The reason I ask is
>>> because I want to ensure that a piece of code that I write at the end
>>> of my reducer method will only ever be executed once after all values
>>> for a particular key are aggregated/summed.
>>>
>>> Thanks,
>>> Ryan
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
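The "preemptive scheduling" knob Miles mentions is Hadoop's speculative execution. A hadoop-site.xml sketch for disabling it (property names as found in 0.18-era hadoop-default.xml; worth double-checking against your version):

```xml
<!-- Sketch: disable speculative execution so each task runs at most
     one attempt, barring failures. Verify these property names against
     the hadoop-default.xml shipped with your release. -->
<property>
  <name>mapred.map.tasks.speculative.execution</name>
  <value>false</value>
</property>
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>false</value>
</property>
```

Note that even with this off, a failed attempt is still retried, so side effects in reduce() should be idempotent.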
Re: Status FUSE-Support of HDFS
Reads are 20-30% slower.
Writes are 33% slower before https://issues.apache.org/jira/browse/HADOOP-3805 - you need a kernel > 2.6.26-rc* to test 3805, which I don't have :(

These #s are with hadoop 0.17 and the 0.18.2 version of fuse-dfs.

-- pete

On 11/2/08 6:23 AM, "Robert Krüger" <[EMAIL PROTECTED]> wrote:

Hi Pete,

thanks for the info. That helps a lot. We will probably test it for our use cases then. Did you benchmark throughput when reading/writing files through fuse-dfs and compare it to command line tool or API access? Is there a notable difference?

Thanks again,

Robert

Pete Wyckoff wrote:
> It has come a long way since 0.18 and facebook keeps our (0.17) dfs mounted via fuse and uses that for some operations.
>
> There have recently been some problems with fuse-dfs when used in a multithreaded environment, but those have been fixed in 0.18.2 and 0.19. (do not use 0.18 or 0.18.1)
>
> The current (known) issues are:
> 1. Wrong semantics when copying over an existing file - namely it does a delete and then re-creates the file, so ownership/permissions may end up wrong. There is a patch for this.
> 2. When directories have 10s of thousands of files, performance can be very poor.
> 3. Posix truncate is supported only for truncating to 0 size, since hdfs doesn't support truncate.
> 4. Appends are not supported - this is a libhdfs problem and there is a patch for it.
>
> It is still a pre-1.0 product for sure, but it has been pretty stable for us.
>
> -- pete
>
> On 10/31/08 9:08 AM, "Robert Krüger" <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> could anyone tell me what the current status of FUSE support for HDFS is? Is this something that can be expected to be usable in a few weeks/months in a production environment? We have been really happy/successful with HDFS in our production system.
> However, some software we use in our application simply requires an OS-level file system, which currently requires us to do a lot of copying between HDFS and a regular file system for the processes that require that software. FUSE support would really eliminate that one disadvantage we have with HDFS. We wouldn't even require the performance to be outstanding, because just by eliminating the copy step we would greatly increase the throughput of those processes.
>
> Thanks for sharing any thoughts on this.
>
> Regards,
>
> Robert
Re: Question regarding reduce tasks
What happens when the reducer task gets invoked more than once? My guess is once a reducer task finishes writing the data for a particular key to HDFS, it won't somehow get re-executed again for the same key right? On Mon, Nov 3, 2008 at 11:28 AM, Miles Osborne <[EMAIL PROTECTED]> wrote: > you can't guarantee that a reducer (or mapper for that matter) will be > executed exactly once unless you turn-off preemptive scheduling. but, > a distinct key gets sent to a single reducer, so yes, only one reducer > will see a particulat key + associated values > > Miles > > 2008/11/3 Ryan LeCompte <[EMAIL PROTECTED]>: >> Hello, >> >> Is it safe to assume that only one reduce task will ever operate on >> values for a particular key? Or is it possible that more than one >> reduce task can work on values for the same key? The reason I ask is >> because I want to ensure that a piece of code that I write at the end >> of my reducer method will only ever be executed once after all values >> for a particular key are aggregated/summed. >> >> Thanks, >> Ryan >> > > > > -- > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. >
Re: Question regarding reduce tasks
you can't guarantee that a reducer (or mapper for that matter) will be executed exactly once unless you turn off speculative execution. but, a distinct key gets sent to a single reducer, so yes, only one reducer will see a particular key + associated values

Miles

2008/11/3 Ryan LeCompte <[EMAIL PROTECTED]>:
> Hello,
>
> Is it safe to assume that only one reduce task will ever operate on
> values for a particular key? Or is it possible that more than one
> reduce task can work on values for the same key? The reason I ask is
> because I want to ensure that a piece of code that I write at the end
> of my reducer method will only ever be executed once after all values
> for a particular key are aggregated/summed.
>
> Thanks,
> Ryan

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
Question regarding reduce tasks
Hello, Is it safe to assume that only one reduce task will ever operate on values for a particular key? Or is it possible that more than one reduce task can work on values for the same key? The reason I ask is because I want to ensure that a piece of code that I write at the end of my reducer method will only ever be executed once after all values for a particular key are aggregated/summed. Thanks, Ryan
hadoop 0.18.1 x-trace
Dear Hadoop Users and Developers,

I have a requirement to monitor the Hadoop cluster using X-Trace. I found these patches on http://issues.apache.org/jira/browse/HADOOP-4049, but when I try to integrate them with 0.18.1 I can no longer build Hadoop. First of all, the patch order is not clear to me. Can anyone explain which patches I really need, and the order in which to apply them?

Thanks,
Veiko

--
Veiko Schnabel
System-Administration
optivo GmbH
Stralauer Allee 2
10245 Berlin
Tel.: +49 30 41724241
Fax: +49 30 41724239
Email: mailto:[EMAIL PROTECTED]
Website: http://www.optivo.de
Handelsregister: HRB Berlin 88738
Geschäftsführer: Peter Romianowski, Ulf Richter
Nutch/Hadoop: Crawl is crashing
Hi,

I started an internet crawl of 30 million pages in a single segment. The crawl crashed with the following exception:

java.lang.ArrayIndexOutOfBoundsException: 17
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:540)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:607)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:193)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)

Any idea why this is happening and what the solution would be? I am using hadoop 0.15.3 and nutch 1.0.

Regards,
Ilay
Re: SecondaryNameNode on separate machine
Hi,

Thank you all for your time and your answers! Now SecondaryNameNode connects to the NameNode (after I configured dfs.http.address to the NN's http server -> NN hostname on port 50070) and creates (transfers) edits and fsimage from the NameNode.

Can you explain a little more how NameNode failover should work now? For example, SecondaryNameNode now stores fsimage and edits to (SNN's) dirX, and let's say the NameNode goes down (disk becomes unreadable). Now I create/dedicate a new machine for the NameNode (and change DNS to point to this new machine as the NameNode host) and copy the data in dirX from the SNN to the new NameNode. How do I configure the new NameNode to use the data from dirX (do I configure dfs.name.dir to point to dirX and start the new NameNode)?

Thanks,
Tomislav

On Fri, 2008-10-31 at 11:38 -0700, Konstantin Shvachko wrote:
> True, dfs.http.address is the NN Web UI address.
> This is where the NN http server runs. Besides the Web UI there is also
> a servlet running on that server which is used to transfer image
> and edits from the NN to the secondary using http get.
> So the SNN uses both addresses, fs.default.name and dfs.http.address.
>
> When the SNN finishes the checkpoint, the primary needs to transfer the
> resulting image back. This is done via the http server running on the SNN.
>
> Answering Tomislav's question:
> The difference between fs.default.name and dfs.http.address is that
> fs.default.name is the name-node's RPC address, where clients and
> data-nodes connect, while dfs.http.address is the NN's http server
> address where our browsers connect, but it is also used for
> transferring image and edits files.
>
> --Konstantin
>
> Otis Gospodnetic wrote:
> > Konstantin & Co, please correct me if I'm wrong, but looking at
> > hadoop-default.xml makes me think that dfs.http.address is only the URL for
> > the NN *Web UI*. In other words, this is where people go to look at the NN.
> >
> > The secondary NN must then be using only the primary NN URL specified in
> > fs.default.name. This URL looks like hdfs://name-node-hostname-here/.
> > Something in Hadoop then knows the exact port for the primary NN based on
> > the URI scheme (e.g. "hdfs://") in this URL.
> >
> > Is this correct?
> >
> > Thanks,
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> > ----- Original Message ----
> >> From: Tomislav Poljak <[EMAIL PROTECTED]>
> >> To: core-user@hadoop.apache.org
> >> Sent: Thursday, October 30, 2008 1:52:18 PM
> >> Subject: Re: SecondaryNameNode on separate machine
> >>
> >> Hi,
> >> can you, please, explain the difference between fs.default.name and
> >> dfs.http.address (like how and when the SecondaryNameNode uses
> >> fs.default.name, and how/when dfs.http.address). I have set them both to
> >> the same (namenode's) hostname:port. Is this correct (or does
> >> dfs.http.address need some other port)?
> >>
> >> Thanks,
> >> Tomislav
> >>
> >> On Wed, 2008-10-29 at 16:10 -0700, Konstantin Shvachko wrote:
> >>> SecondaryNameNode uses the http protocol to transfer the image and the
> >>> edits from the primary name-node and vice versa.
> >>> So the secondary does not access local files on the primary directly.
> >>> The primary NN should know the secondary's http address.
> >>> And the secondary NN needs to know both fs.default.name and
> >>> dfs.http.address of the primary.
> >>> In general we usually create one configuration file hadoop-site.xml
> >>> and copy it to all other machines. So you don't need to set up different
> >>> values for all servers.
> >>>
> >>> Regards,
> >>> --Konstantin
> >>>
> >>> Tomislav Poljak wrote:
> >>>> Hi,
> >>>> I'm not clear on how the SecondaryNameNode communicates with the
> >>>> NameNode (if deployed on a separate machine). Does the SecondaryNameNode
> >>>> use a direct connection (over some port and protocol), or is it enough
> >>>> for the SecondaryNameNode to have access to the data which the NameNode
> >>>> writes locally on disk?
> >>>>
> >>>> Tomislav
> >>>>
> >>>> On Wed, 2008-10-29 at 09:08 -0400, Jean-Daniel Cryans wrote:
> >>>>> I think a lot of the confusion comes from this thread:
> >>>>> http://www.nabble.com/NameNode-failover-procedure-td11711842.html
> >>>>>
> >>>>> Particularly because the wiki was updated with wrong information, not
> >>>>> maliciously I'm sure. This information is now gone for good.
> >>>>>
> >>>>> Otis, your solution is pretty much like the one given by Dhruba
> >>>>> Borthakur and augmented by Konstantin Shvachko later in the thread,
> >>>>> but I never did it myself.
> >>>>>
> >>>>> One thing should be clear though: the NN is and will remain a SPOF
> >>>>> (just like HBase's Master) as long as a distributed manager service
> >>>>> (like ZooKeeper) is not plugged into Hadoop to help with failover.
> >>>>>
> >>>>> J-D
> >>>>>
> >>>>> On Wed, Oct 29, 2008 at 2:12 AM, Otis Gospodnetic <
> >>>>> [EMAIL PROTECTED]> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>> So what is the "r
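For the recovery scenario Tomislav describes, the relevant 0.18-era configuration keys are fs.default.name, dfs.http.address, and dfs.name.dir, all set in hadoop-site.xml. A minimal sketch for the replacement NameNode is below; the hostnames, ports, and the dirX path are made-up placeholders, and whether this alone suffices for your failover should be verified against your own setup.

```xml
<!-- Hypothetical hadoop-site.xml fragment for a replacement NameNode:
     dfs.name.dir points at the checkpoint directory copied from the SNN. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://new-namenode-host:9000</value>
  </property>
  <property>
    <name>dfs.http.address</name>
    <value>new-namenode-host:50070</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/path/to/dirX</value>
  </property>
</configuration>
```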
Re: Question on opening file info from namenode in DFSClient
Consider the case of a file getting removed and recreated with the same name while there is cached info about the file (and a job is running) in the DFSClient (mapper/reducer).

- Mridul

Taeho Kang wrote:

Dear Hadoop Users and Developers,

I was wondering if there's a plan to add a "file info cache" in DFSClient? It could eliminate the network cost of contacting the Namenode, and I think it would greatly improve the DFSClient's performance. The code I was looking at was this:

--- DFSClient.java

  /**
   * Grab the open-file info from namenode
   */
  synchronized void openInfo() throws IOException {
    /* Maybe we could add a file info cache here! */
    LocatedBlocks newInfo = callGetBlockLocations(src, 0, prefetchSize);
    if (newInfo == null) {
      throw new IOException("Cannot open filename " + src);
    }
    if (locatedBlocks != null) {
      Iterator<LocatedBlock> oldIter = locatedBlocks.getLocatedBlocks().iterator();
      Iterator<LocatedBlock> newIter = newInfo.getLocatedBlocks().iterator();
      while (oldIter.hasNext() && newIter.hasNext()) {
        if (!oldIter.next().getBlock().equals(newIter.next().getBlock())) {
          throw new IOException("Blocklist for " + src + " has changed!");
        }
      }
    }
    this.locatedBlocks = newInfo;
    this.currentNode = null;
  }

---

Does anybody have an opinion on this matter?

Thank you in advance,
Taeho
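To make Mridul's staleness concern concrete, here is a hedged sketch of what a TTL-bounded file-info cache around such a lookup might look like. This is not the real DFSClient API: the class, method names, and the string stand-in for LocatedBlocks are all invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch (NOT the real DFSClient): a TTL-bounded cache of
// per-file block info. Mridul's objection applies directly: if a file is
// deleted and recreated under the same name within the TTL, readers get
// stale block locations until the entry expires or is invalidated.
public class Main {
    static final long TTL_MS = 5_000;

    static class Entry {
        final String blocks;
        final long fetchedAt;
        Entry(String blocks, long fetchedAt) { this.blocks = blocks; this.fetchedAt = fetchedAt; }
    }

    private final Map<String, Entry> cache = new HashMap<>();

    // Stand-in for callGetBlockLocations(); counts calls so hits are observable.
    int namenodeCalls = 0;
    String fetchFromNameNode(String src) { namenodeCalls++; return "blocks-of:" + src; }

    public synchronized String openInfo(String src, long nowMillis) {
        Entry e = cache.get(src);
        if (e != null && nowMillis - e.fetchedAt < TTL_MS) return e.blocks; // cache hit
        String fresh = fetchFromNameNode(src);
        cache.put(src, new Entry(fresh, nowMillis));
        return fresh;
    }

    public static void main(String[] args) {
        Main c = new Main();
        c.openInfo("/f", 0);      // miss: first namenode call
        c.openInfo("/f", 1000);   // hit: within TTL, no call
        c.openInfo("/f", 10_000); // expired: second namenode call
        if (c.namenodeCalls != 2) throw new AssertionError();
        System.out.println("namenode calls: " + c.namenodeCalls);
    }
}
```

The sketch shows the trade-off: the TTL bounds how long a reader can see locations for a deleted-and-recreated file, but any nonzero TTL admits the inconsistency Mridul points out.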
hadoop mapside joins
Hi,

As my dataset is quite large, I wanted to use map-side joins. I got the basic idea from http://issues.apache.org/jira/browse/HADOOP-2085. Can anybody please point me to working code?

Thanks in advance,
Vijen
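The core idea behind HADOOP-2085's map-side join is that when both inputs are already sorted on the join key (and partitioned identically), a mapper can pair matching records in a single forward pass, with no shuffle. The sketch below shows that sorted-merge pass in plain Java; it is an illustration of the technique only, not the CompositeInputFormat API, and it assumes each key appears at most once per side.

```java
import java.util.ArrayList;
import java.util.List;

// Sorted-merge join: two inputs sorted on the join key are walked with two
// cursors; matching keys are emitted joined, the smaller key is advanced.
// This single-pass pairing is why map-side joins need pre-sorted inputs.
public class Main {
    // Each record is {key, value}; both lists must be sorted by key, and each
    // key is assumed unique per side (a simplification for this sketch).
    public static List<String> mergeJoin(List<String[]> left, List<String[]> right) {
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < left.size() && j < right.size()) {
            int cmp = left.get(i)[0].compareTo(right.get(j)[0]);
            if (cmp < 0) i++;                 // left key smaller: advance left
            else if (cmp > 0) j++;            // right key smaller: advance right
            else {                            // match: emit joined record
                out.add(left.get(i)[0] + ":" + left.get(i)[1] + "," + right.get(j)[1]);
                i++; j++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> a = List.of(new String[]{"k1", "a1"}, new String[]{"k2", "a2"});
        List<String[]> b = List.of(new String[]{"k2", "b2"}, new String[]{"k3", "b3"});
        System.out.println(mergeJoin(a, b)); // prints [k2:a2,b2]
    }
}
```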
Re: Can anyone recommend me a inter-language data file format?
Zhou, Yunqing wrote:
> An embedded database cannot handle large-scale data; it's not very
> efficient. I have about 1 billion records. These records should be passed
> through some modules. I mean a data exchange format similar to XML but
> more flexible and efficient.

- JSON
- CSV
- erlang-style records (name,value,value,value)
- RDF-triples in non-XML representations

For all of these, you need to test with data that includes things like high unicode characters and single and double quotes, to see how well they get handled.

You can actually append with XML by not having opening/closing tags: just stream the entries to the tail of the file. To read this in an XML parser, include it inside another XML file via an external entity, something like this (reconstructed; the archive stripped the original markup, and the file name here is a placeholder):

  <!DOCTYPE log [
    <!ENTITY log SYSTEM "entries.xml">
  ]>
  <log>&log;</log>

I've done this for very big files; as long as you aren't trying to load it in-memory into a DOM, things should work.

--
Steve Loughran  http://www.1060.org/blogxter/publish/5
Author: Ant in Action http://antbook.org/
Re: Status FUSE-Support of HDFS
Pete Wyckoff wrote:
> It has come a long way since 0.18, and Facebook keeps our (0.17) dfs
> mounted via fuse and uses that for some operations. There have recently
> been some problems with fuse-dfs when used in a multithreaded
> environment, but those have been fixed in 0.18.2 and 0.19. (Do not use
> 0.18 or 0.18.1.) The current (known) issues are:
> 2. When directories have 10s of thousands of files, performance can be
> very poor.

I've known other filesystems to top out at 64k-1 files per directory, even if they don't slow down. It's good for a portable application to keep the number of files per directory low by having two levels of directory for storing files: just use a hash operation to determine which directory to store a specific file in.
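The two-level hashed-directory scheme described above can be sketched as follows; the 256x256 layout and naming are an assumption for illustration, not any filesystem's convention.

```java
// Sketch of a two-level hashed directory layout: derive two path components
// from a hash of the file name so that no single directory accumulates tens
// of thousands of entries. With 256 x 256 buckets, a million files average
// about 15 entries per leaf directory.
public class Main {
    public static String shardedPath(String fileName) {
        int h = fileName.hashCode();
        // Two bytes of the hash pick the two directory levels.
        return String.format("%02x/%02x/%s", (h >>> 8) & 0xff, h & 0xff, fileName);
    }

    public static void main(String[] args) {
        // The same name always maps to the same bucket, so lookups need no index.
        System.out.println(shardedPath("part-00000"));
    }
}
```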