Sun JVM 1.6.0u18

2010-02-15 Thread Todd Lipcon
Hey all, Just a note that you should avoid upgrading your clusters to 1.6.0u18. We've seen a lot of segfaults or bus errors on the DN when running with this JVM - Stack found the same thing on one of his clusters as well. We've found 1.6.0u16 to be very stable. -Todd

Re: Problem with large .lzo files

2010-02-14 Thread Todd Lipcon
Hi Steve, On Sun, Feb 14, 2010 at 12:11 PM, Steve Kuo kuosen...@gmail.com wrote: I am running a hadoop job that combines daily results with results from previous days. The reduce output is lzo compressed and growing daily in size. - DistributedLzoIndexer is used to index lzo files to

Re: Why is the default HEARTBEAT_INTERVAL value 3?

2010-02-09 Thread Todd Lipcon
On a small cluster, I'm of the opinion that a value less than 3 would actually be useful in reducing job startup time a little bit. https://issues.apache.org/jira/browse/MAPREDUCE-1266 The issue got stalled a bit. If you want it, pipe up on the JIRA :) Especially if you have hard data indicating

Re: using multiple disks for HDFS

2010-02-09 Thread Todd Lipcon
Hi Vasilis, Two things: 1) You're missing a matching } in your hadoop.tmp.dir setting 2) When you use ${hadoop.tmp.dir}/dfs/data, it does a literal string interpolation. Thus, it's not adding dfs/data to each of the hadoop.tmp.dir directories, but rather just the last one. I'd recommend setting
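To spread data across several disks explicitly (rather than relying on ${hadoop.tmp.dir} interpolation), dfs.data.dir can list each directory directly. A minimal hdfs-site.xml sketch; the paths are placeholders, not values from the original thread:

    <property>
      <name>dfs.data.dir</name>
      <!-- one directory per physical disk; paths are illustrative -->
      <value>/disk1/dfs/data,/disk2/dfs/data,/disk3/dfs/data</value>
    </property>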

Re: Why is the default HEARTBEAT_INTERVAL value 3?

2010-02-09 Thread Todd Lipcon
On Tue, Feb 9, 2010 at 11:37 AM, Edward Capriolo edlinuxg...@gmail.com wrote: With the setting of 5 each tasktracker checks into the jobtracker every 5 seconds. The concept is that with enough TaskTrackers , say a 1000 node cluster 1000/5= 200 will be checking in at and given second. Actually

Re: EOFException and BadLink, but file descriptors number is ok?

2010-02-05 Thread Todd Lipcon
Yes, you're likely to see an error in the DN log. Do you see anything about max number of xceivers? -Todd On Thu, Feb 4, 2010 at 11:42 PM, Meng Mao meng...@gmail.com wrote: not sure what else I could be checking to see where the problem lies. Should I be looking in the datanode logs? I looked
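The xceiver ceiling Todd refers to is a DataNode setting; a typical bump in hdfs-site.xml looks roughly like the following (4096 is a commonly used value, not a figure from the original mail):

    <property>
      <!-- note the historical misspelling of the property name -->
      <name>dfs.datanode.max.xcievers</name>
      <value>4096</value>
    </property>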

Re: Is there Synthetic Load generator for Hadoop-0.18.3

2010-02-05 Thread Todd Lipcon
Why not try downloading the load gen source from 0.19 or 0.20, compiling against 0.18, and seeing if there are any errors? If there are, fix them. The APIs haven't changed much in a long time, it shouldn't be a lot of work. -Todd On Fri, Feb 5, 2010 at 5:50 AM, Ashish Pareek pareek...@gmail.com

Re: Using two threads to read data from disk and send out in DataXceiver ?

2010-02-05 Thread Todd Lipcon
Hi Martin, Not sure what you mean - why would it be faster to split it into two threads? Keep in mind that there is a TCP send buffer so if the client is reading faster than the disk, the server's sends won't block anyway. -Todd On Fri, Feb 5, 2010 at 8:03 AM, Martin Mituzas

Re: What framework Hadoop uses for daemonizing?

2010-02-04 Thread Todd Lipcon
Hi Stas, Hadoop doesn't daemonize itself. The shell scripts use nohup and a lot of bash code to achieve a similar idea. -Todd On Thu, Feb 4, 2010 at 1:03 PM, Stas Oskin stas.os...@gmail.com wrote: Hi. Just wondering - does anyone know what framework Hadoop uses for daemonizing? Any chance

Re: What framework Hadoop uses for daemonizing?

2010-02-04 Thread Todd Lipcon
On Thu, Feb 4, 2010 at 1:21 PM, Stas Oskin stas.os...@gmail.com wrote: Hi Todd. Hadoop doesn't daemonize itself. The shell scripts use nohup and a lot of bash code to achieve a similar idea. Was there any design decision behind this approach? It long predates my involvement in the

Re: setup cluster with cloudera repo

2010-02-03 Thread Todd Lipcon
Hi Jim, Sorry about the broken links. We just launched a new website a couple days ago and a few of the pages are still in transition. This link should help you get started: http://archive.cloudera.com/docs/cdh2-pseudo-distributed.html Thanks -Todd On Wed, Feb 3, 2010 at 11:07 AM, Jim Kusznir

Re: ClassCastException in lzo indexer

2010-02-02 Thread Todd Lipcon
Hi Vasilis, Did you make sure to ant clean before rebuilding hadoop-lzo if you updated the code? Also, can you paste your configuration for io.compression.codecs ? Thanks -Todd On Tue, Feb 2, 2010 at 9:09 AM, Vasilis Liaskovitis vlias...@gmail.comwrote: Hi, I am trying to use hadoop-0.20.1
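For reference, an io.compression.codecs entry that registers the LZO codecs typically looks something like this, assuming the kevinweil hadoop-lzo build; this is a sketch, not the poster's actual configuration:

    <property>
      <name>io.compression.codecs</name>
      <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
    </property>
    <property>
      <name>io.compression.codec.lzo.class</name>
      <value>com.hadoop.compression.lzo.LzoCodec</value>
    </property>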

Re: Exception: Can not get the relative path...

2010-01-17 Thread Todd Lipcon
Hi Bradford, My guess is that somewhere it's using hdfs://localhost:8020/ and elsewhere using hdfs://localhost/ (and letting the default port get used). It's certainly a bug that this causes a problem, but if you want to work around it I'd just make sure you've used the fully specified
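One way to use the fully specified URI everywhere is to pin the default filesystem, port included, in core-site.xml; a sketch that simply mirrors the host and port mentioned above:

    <property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:8020</value>
    </property>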

Re: Hadoop and X11 related error

2010-01-17 Thread Todd Lipcon
On Sun, Jan 17, 2010 at 6:19 PM, Tarandeep Singh tarand...@gmail.comwrote: On Sun, Jan 17, 2010 at 1:57 PM, Vladimir Klimontovich klimontov...@gmail.com wrote: Maybe, hadoop running MR jobs using different user? For example, if you followed installation instructions from official site

Re: isSplitable() deprecated

2010-01-15 Thread Todd Lipcon
... [exec] configure: error: C compiler cannot create executables On Fri, Jan 15, 2010 at 12:09 PM, Todd Lipcon t...@cloudera.com wrote: Hi Ted, Did you also install liblzo-devel? Here are the packages I install for LZO: lzo-2.02-2.el5.1 lzo-devel-2.02-2.el5.1 -Todd On Tue

Re: isSplitable() deprecated

2010-01-15 Thread Todd Lipcon
create executables See `config.log' for more details. On Fri, Jan 15, 2010 at 1:17 PM, Todd Lipcon t...@cloudera.com wrote: Are you starting from a clean tarball of the lzo stuff? Can you make sure your /tmp/ partition isn't full? There should be a config.log file hanging around somewhere

Re: Mapreduce and Exclude

2010-01-14 Thread Todd Lipcon
Hi David, The ability to administratively blacklist TaskTrackers wasn't added until 0.21, which is not yet released. In general, there isn't usually a great reason to do this -- you can take down a tasktracker by just stopping it. The purpose of the HDFS-side operation is to decommission a node

Re: isSplitable() deprecated

2010-01-12 Thread Todd Lipcon
returned: 2 Has anybody seen the above ? Thanks On Mon, Jan 11, 2010 at 3:34 PM, Todd Lipcon t...@cloudera.com wrote: Hi Ted, You need to install liblzo from EPEL: http://fr.rpmfind.net/linux/RPM/Extras_Packages_for_Enterprise_Linux.html -Todd On Mon, Jan 11, 2010 at 3:21 PM

Re: isSplitable() deprecated

2010-01-12 Thread Todd Lipcon
[exec] See `config.log' for more details. BUILD FAILED /home/rialto/kevinweil-hadoop-lzo-916aeae/build.xml:243: exec returned: 77 On Tue, Jan 12, 2010 at 10:25 AM, Todd Lipcon t...@cloudera.com wrote: Hi Ted, Please make sure you have version 2.02 of liblzo installed. There's

Re: isSplitable() deprecated

2010-01-11 Thread Todd Lipcon
Hi Ted, You need to install liblzo from EPEL: http://fr.rpmfind.net/linux/RPM/Extras_Packages_for_Enterprise_Linux.html -Todd On Mon, Jan 11, 2010 at 3:21 PM, Ted Yu yuzhih...@gmail.com wrote: Can someone tell me how I can install liblzo ? [r...@tyu-linux lzo-2.03]# uname -a Linux

Re: I am not able to run this program in distributed mode but I am able to run it in pseudo distributed mode

2010-01-02 Thread Todd Lipcon
http://catb.org/~esr/faqs/smart-questions.html#id383250 Please go through the above explanation of how to ask questions on a mailing list, and repost your question. Thanks in advance. On Sat, Jan 2, 2010 at 8:35 AM, Ravi ravindra.babu.rav...@gmail.com wrote: import java.lang.Integer; import

Re: large reducer output with same key

2009-12-31 Thread Todd Lipcon
Hi Himanshu, Sounds like your mapred.local.dir doesn't have enough space. My guess is that you've configured it somewhere inside /tmp/. Instead you should spread it across all of your local physical disks by comma-separating the directories in the configuration. Something like: property
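The property example is cut off in the archive preview; a plausible shape, with placeholder directories, would be:

    <property>
      <name>mapred.local.dir</name>
      <value>/disk1/mapred/local,/disk2/mapred/local,/disk3/mapred/local</value>
    </property>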

Re: debian package of hadoop

2009-12-30 Thread Todd Lipcon
Hi Thomas (and Debian developers), My responses inline below: On Wed, Dec 30, 2009 at 10:53 AM, Thomas Koch tho...@koch.ro wrote: Hi, today I tried to run the cloudera debian dist on a 4 machine cluster. I still have some itches, see my list below. Some of them may require a fix in the

Re: Text coding

2009-12-28 Thread Todd Lipcon
Furthermore, Text is meant for use when you have a UTF8-encoded string. Creating a Text object from a byte array that is not proper UTF-8 is likely to result in some kind of exception or data mangling. You should use BytesWritable for this purpose -Todd 2009/12/28 Edward Capriolo
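A small illustration of the distinction; the class and the sample bytes are invented for the example:

    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.Text;

    public class WritableChoice {
      public static void main(String[] args) {
        byte[] raw = new byte[] { (byte) 0xFE, (byte) 0xFF, 0x00 };  // not valid UTF-8
        BytesWritable safe = new BytesWritable(raw);  // preserves the bytes exactly
        Text risky = new Text(raw);  // Text assumes UTF-8; converting this to a String can mangle it
        System.out.println(safe.getLength() + " bytes stored safely");
      }
    }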

Re: Secondary NameNodes or NFS exports?

2009-12-24 Thread Todd Lipcon
How long does the checkpoint take? It seems possible to me that if the 2NN checkpoint takes longer than the interval, it's possible that multiple checkpoints will overlap and might trigger this. (this is conjecture, so definitely worth testing) -Todd On Wed, Dec 23, 2009 at 6:38 PM, Jason Venner
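The interval in question is the 2NN checkpoint period, fs.checkpoint.period (in seconds); the default shown here is for reference only:

    <property>
      <name>fs.checkpoint.period</name>
      <value>3600</value>
    </property>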

Re: sharing variables across chained jobs

2009-12-23 Thread Todd Lipcon
On Wed, Dec 23, 2009 at 6:55 AM, Jason Venner jason.had...@gmail.comwrote: If your jobs are launched by separate jvm instances, the only real persistence framework you have is hdfs. You have two basic choices: 1. Write summary data to a persistent store, an hdfs file being a simple

Re: io.sort.mb configuration?

2009-12-23 Thread Todd Lipcon
Hi Mark, For what it's worth, you're unlikely to see a big difference in performance unless you cut the number of spills all the way down to one, or change io.sort.factor enough to alter the merge behavior. The difference between 3 spills and 5 spills is not huge, in my experience, since you're still writing the same amount of

Re: MapFileoutput Format: keys out of order when emitting in reduce (Hadoop 0.20)

2009-12-23 Thread Todd Lipcon
On Wed, Dec 23, 2009 at 12:46 PM, Saptarshi Guha saptarshi.g...@gmail.comwrote: Hello, I re-wrote MapFileOutputFormat for use with Hadoop 0.20.1 and have a question. Suppose my Map sends key-value pairs to the reducers. In my reducer, for a given key value, i emit key1,value1, key2,value2,

Re: Why I can only run 2 map/reduce task at a time?

2009-12-21 Thread Todd Lipcon
Hi Starry, The assignmultiple feature in the Fair Scheduler fixes this issue after MAPREDUCE-706. This is not in any current releases, but we will be rolling it into our next major release of Cloudera's distribution. -Todd P.S. Please don't cross-post questions to 4 lists. It's not necessary

Re: What's HDFS's balancer criteria?

2009-12-18 Thread Todd Lipcon
Hi Jeff, You can tune dfs.datanode.balance.bandwidthPerSec in order to change the speed at which the balancer moves data around. It defaults to 1MB/sec so as to avoid using significant cluster resources, but you could certainly bump it up for an individual balancer run if you need it to go
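Using the property name from the mail, a bump for an individual balancer run might look like this in hdfs-site.xml; the property takes bytes per second, and 10 MB/s here is only an illustrative value:

    <property>
      <name>dfs.datanode.balance.bandwidthPerSec</name>
      <value>10485760</value>
    </property>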

Re: why does not BlockSender use BufferedInputStream?

2009-12-17 Thread Todd Lipcon
The DataNode actually uses the sendfile call to do the block data transfers (through FileChannel.transferTo). The FileInputStream is just used in order to get to the FileChannel object. -Todd On Thu, Dec 17, 2009 at 6:40 PM, Martin Mituzas xietao1...@hotmail.comwrote: I checked the DataNode

Re: map reduce to achieve cartessian product

2009-12-16 Thread Todd Lipcon
Hi Eguzki, Is one of the tables vastly smaller than the other? If one is small enough to fit in RAM, you can do this like so: 1. Add the small file to the DistributedCache 2. In the configure() method of the mapper, read the entire file into an ArrayList or somesuch in RAM 3. Set the input path
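A compact sketch of the approach Todd outlines, written against the old mapred API; the class and field names are invented for illustration, and the small table is assumed to fit comfortably in RAM:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.*;

    public class CrossJoinMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

      private final List<String> smallTable = new ArrayList<String>();

      public void configure(JobConf job) {
        try {
          // Step 2: load the DistributedCache copy of the small table into memory.
          Path[] cached = DistributedCache.getLocalCacheFiles(job);
          BufferedReader in = new BufferedReader(new FileReader(cached[0].toString()));
          String line;
          while ((line = in.readLine()) != null) {
            smallTable.add(line);
          }
          in.close();
        } catch (IOException e) {
          throw new RuntimeException("Failed to read cached small table", e);
        }
      }

      public void map(LongWritable offset, Text bigRow,
                      OutputCollector<Text, Text> out, Reporter reporter) throws IOException {
        // Step 3: the job's input path is the large table; pair every big row
        // with every small row to produce the cartesian product.
        for (String smallRow : smallTable) {
          out.collect(new Text(bigRow), new Text(smallRow));
        }
      }
    }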

Re: error message when executing SecondaryNameNode

2009-12-16 Thread Todd Lipcon
Hi Fu-Ming, Looks similar to this bug: http://issues.apache.org/jira/browse/HDFS-686 Does this problem persist, or was it a one time occurrence? -Todd On Tue, Dec 15, 2009 at 5:42 PM, Fu-Ming Tsai sary...@gmail.com wrote: Hello, all, I tried to execute 2 SecondaryNamenode in my env.

Re: addChild NullPointerException when starting namenode and reading edits file

2009-12-16 Thread Todd Lipcon
Hi Erik, A few things to try: - does this FS store sensitive data or would it be possible to bzip2 the files and upload them somewhere? - can you add logging to the replay of FSEditLog so as to be aware of what byte offset is causing the issue? - DO take a backup of all of the state,

Re: map reduce to achieve cartessian product

2009-12-16 Thread Todd Lipcon
in the scalability of our system. I will test the upper limits. Thanks a lot. Eguzki Todd Lipcon wrote: Hi Eguzki, Is one of the tables vastly smaller than the other? If one is small enough to fit in RAM, you can do this like so: 1. Add the small file to the DistributedCache 2. In the configure

Re: Re: Re: map output not euqal to reduce input

2009-12-10 Thread Todd Lipcon
On Thu, Dec 10, 2009 at 1:15 PM, Gang Luo lgpub...@yahoo.com.cn wrote: Hi Todd, I didn't change the partitioner, just use the default one. Will the default partitioner cause the loss of records? -Gang Do the maps output data nondeterministically? Did you experience any task failures in

Re: hadoop idle time on terasort

2009-12-09 Thread Todd Lipcon
As always, Scott provided lots of great advice below. One note to be aware of: The fair scheduler assignmultiple feature in 0.20 doesn't do quite what you think it might. It gives the ability to assign one map and one reduce per TT heartbeat, but doesn't assign multiple map tasks in a single

Re: Hadoop dfs usage and actual size discrepancy

2009-12-09 Thread Todd Lipcon
Hi Nick, My guess is that the tmp/ directories of the DNs were rather full. I've occasionally seen this on clusters where writes have been failing. There should be some kind of thread which garbage collects partial blocks from the DN's tmp dirs, but it's not implemented, as far as I know. This

Re: some current features in hadoop

2009-12-08 Thread Todd Lipcon
is temporary, at least relative to me. -Original Message- From: Todd Lipcon [mailto:t...@cloudera.com] Sent: Tuesday, December 08, 2009 12:48 PM To: common-user@hadoop.apache.org Subject: Re: some current features in hadoop On Mon, Dec 7, 2009 at 10:58 PM, Krishna Kumar krishna.ku

Re: some current features in hadoop

2009-12-07 Thread Todd Lipcon
On Mon, Dec 7, 2009 at 10:58 PM, Krishna Kumar krishna.ku...@nechclst.inwrote: Dear All, Can anybody please let me know about some of the current features of hadoop on which development work is going on / or planning to go in future, like : 1. Record append Not implemented and

Re: whoami can't be executed.

2009-12-06 Thread Todd Lipcon
Your mail came through empty - not sure if you meant to attach something, but if you did it didn't appear. Try pastebin.org On Mon, Dec 7, 2009 at 12:58 AM, pavel kolodin pavelkolodinhad...@gmail.com wrote: On Mon, 07 Dec 2009 04:54:16 -, Todd Lipcon t...@cloudera.com wrote: grep clone

Re: Trouble with tutorial

2009-12-03 Thread Todd Lipcon
either. Anyone seen something similar / possible fixes? Thanks again - Mikko - Original Message - *From:* Todd Lipcon t...@cloudera.com *To:* common-user@hadoop.apache.org ; Mikko Lahtimikko.la...@pp1.inet.fi *Sent:* Wednesday, December 02, 2009 9:03 PM *Subject:* Re: Trouble

Re: 0.20 ConcurrentModificationException

2009-12-02 Thread Todd Lipcon
Certainly looks like HADOOP-6269 to me. Can you try Cloudera's distribution? This patch is included. -Todd On Wed, Dec 2, 2009 at 4:23 AM, Arv Mistry a...@kindsight.net wrote: Hi, I've recently upgraded hadoop to 0.20 and am seeing this concurrent mod exception on startup which I never got

Re: 0.20 ConcurrentModificationException

2009-12-02 Thread Todd Lipcon
that the problem sticks around, please report back or file a JIRA. Thanks -Todd Cheers Arv -Original Message- From: Todd Lipcon [mailto:t...@cloudera.com] Sent: December 2, 2009 11:39 AM To: common-user@hadoop.apache.org Subject: Re: 0.20 ConcurrentModificationException Certainly looks like

Re: hadoop idle time on terasort

2009-12-02 Thread Todd Lipcon
Hi Vasilis, This is seen reasonably often, and could be partly due to missed configuration changes. A few things to check: - Did you increase the number of tasks per node from the default? If you have a reasonable number of disks/cores, you're going to want to run a lot more than 2 map and 2

Re: fair scheduler preemptions timeout difficulties

2009-12-02 Thread Todd Lipcon
No problem :) Also worth noting for anyone listening in that this feature is not in 0.20.1 - it's been backported into CDH. It will arrive in 0.21. Thanks -Todd On Wed, Dec 2, 2009 at 4:55 PM, james warren ja...@rockyou.com wrote: Todd from Cloudera solved this for me on their company's forum.

Re: Fair Scheduler config issues

2009-12-02 Thread Todd Lipcon
Hi Derek, You should set poolnameproperty to pool.name, not ${pool.name} That should fix your issues. -Todd On Wed, Dec 2, 2009 at 7:46 PM, Derek Brown de...@media6degrees.com wrote: I'm using Cloudera's distribution of 0.20.1, but this seems like a general question to I'm posting here.
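In mapred-site.xml terms, the fix Todd describes amounts to something like this (sketch only; the Fair Scheduler allocation file itself is unchanged), so that a job then picks its pool with a plain jobconf property such as -Dpool.name=research:

    <property>
      <name>mapred.fairscheduler.poolnameproperty</name>
      <value>pool.name</value>
    </property>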

Re: Web Interface Not Working

2009-12-01 Thread Todd Lipcon
Hi Mark, Both web UIs went down but jobs can still be submitted, etc? This seems like a problem external to Hadoop, since the JT (port 50030) and NN (port 50070) are entirely separate processes. If they both became inaccessible at the same time, perhaps a firewall rule was added that

Re: fair scheduler making jobs fail?

2009-11-30 Thread Todd Lipcon
Any errors in your jobtracker log? Usually you'll see something there if the scheduler fails to start. What errors are the jobs failing with? -Todd On Mon, Nov 30, 2009 at 3:49 PM, Mike Kendall mkend...@justin.tv wrote: no dice... and the default configuration from

Re: fair scheduler making jobs fail?

2009-11-30 Thread Todd Lipcon
where? copy paste the startup sequence and the job submission logs from the JT log? You gotta provide some details here :) -Todd On Mon, Nov 30, 2009 at 3:53 PM, Mike Kendall mkend...@justin.tv wrote: java runtime error, exit code 1.. On Mon, Nov 30, 2009 at 3:52 PM, Todd Lipcon t

Re: Identifying lines in map()

2009-11-29 Thread Todd Lipcon
line. So, give the input: Amy Sue Fred John Jack Joe Sue John Alice Bob Fred Sue John The output should be: Sue John because Sue and John appear on every line. I don't know Sue and John in advance. Thanks, Jim Todd Lipcon wrote: Hi James, Something like the following pseudocode
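Todd's pseudocode is truncated in the archive; one hedged reconstruction of the idea (not necessarily his original) is:

    map(line):
      for each distinct name on the line: emit(name, 1)
    reduce(name, counts):
      if sum(counts) == TOTAL_LINES: emit(name)
    # TOTAL_LINES is the total number of input lines, known up front
    # or gathered in a first pass (e.g. via a counter).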

Re: why does not hdfs read ahead ?

2009-11-24 Thread Todd Lipcon
Also, keep in mind that, when you open a block for reading, the DN immediately starts writing the entire block (assuming it's requested via the xceiver protocol) - it's TCP backpressure on the send window that does flow control there. So, although it's not explicitly reading ahead, most of the

Re: why does not hdfs read ahead ?

2009-11-24 Thread Todd Lipcon
On Tue, Nov 24, 2009 at 10:33 AM, Brian Bockelman bbock...@cse.unl.eduwrote: On Nov 24, 2009, at 12:06 PM, Todd Lipcon wrote: Also, keep in mind that, when you open a block for reading, the DN immediately starts writing the entire block (assuming it's requested via the xceiver protocol

Re: why does not hdfs read ahead ?

2009-11-24 Thread Todd Lipcon
On Tue, Nov 24, 2009 at 10:35 AM, Raghu Angadi ang...@gmail.com wrote: Sequential read is the simplest case and it is pretty hard to improve upon the current raw performance (HDFS client does take more CPU than one might expect, Todd implemented an improvement for CPU consumed). Just to

Re: Job.setJarByClass in Hadoop 0.20.1

2009-11-23 Thread Todd Lipcon
Hi Mike, I haven't seen that problem. There is one patch in the Cloudera distribution that does modify the behavior of that method, though. Would you mind trying this on the stock Apache 0.20.1 release? I see no reason to believe this is the issue, since hundreds of other people are using our

Re: Job.setJarByClass in Hadoop 0.20.1

2009-11-23 Thread Todd Lipcon
fine. I just tried an Apache release. I copied the example source file of WordCount.java to my jar and submitted the job. It worked with Apache's release, but failed with CloudEra's release. But my code failed for both of the versions. From: Todd Lipcon t

Re: How to handle imbalanced data in hadoop ?

2009-11-23 Thread Todd Lipcon
, and these reducers get over in (1 min 30 sec on avg). Pankil On Tue, Nov 17, 2009 at 5:07 PM, Todd Lipcon t...@cloudera.com wrote: On Tue, Nov 17, 2009 at 1:54 PM, Pankil Doshi forpan...@gmail.com wrote: With respect to Imbalanced data, Can anyone guide me how sorting takes

Re: How to handle imbalanced data in hadoop ?

2009-11-23 Thread Todd Lipcon
()) { iCount++; - sValue += values.next().toString() + '\t'; + sb.append(values.next().toString()).append('\t'); } Hope that helps, Pankil. -Todd On Mon, Nov 23, 2009 at 9:32 PM, Todd Lipcon t...@cloudera.com wrote: Interesting. I

Re: Cloudera 18.3 splits bz2 inputs

2009-11-17 Thread Todd Lipcon
On Tue, Nov 17, 2009 at 7:52 AM, Edward Capriolo edlinuxg...@gmail.comwrote: Todd, I think this is very important. From the grid in Hadoop: The Definitive Guide (p. 78), it appears that bzip2 and zip are the only formats that are splittable. As a result bzip2 would be my format of choice to

Re: Cloudera 18.3 splits bz2 inputs

2009-11-17 Thread Todd Lipcon
/kevinweil/hadoop-lzo http://github.com/kevinweil/hadoop-lzoMD On Tue, Nov 17, 2009 at 8:08 AM, Todd Lipcon t...@cloudera.com wrote: On Tue, Nov 17, 2009 at 7:52 AM, Edward Capriolo edlinuxg...@gmail.com wrote: Todd, I think this is very important. From the grid on Hadoop

Re: Cloudera 18.3 splits bz2 inputs

2009-11-17 Thread Todd Lipcon
Coincidentally, we *just* posted a blog entry about this, courtesy of Kevin Weil from Twitter: http://www.cloudera.com/blog/2009/11/17/hadoop-at-twitter-part-1-splittable-lzo-compression/ -Todd On Tue, Nov 17, 2009 at 8:36 AM, Todd Lipcon t...@cloudera.com wrote: On Tue, Nov 17, 2009 at 8:33

Re: How to handle imbalanced data in hadoop ?

2009-11-17 Thread Todd Lipcon
On Tue, Nov 17, 2009 at 1:54 PM, Pankil Doshi forpan...@gmail.com wrote: With respect to Imbalanced data, Can anyone guide me how sorting takes place in Hadoop after Map phase. I did some experiments and found that if there are two reducers which have same number of keys to sort and one

Re: client.Client: failed to interact with node......ERROR

2009-11-16 Thread Todd Lipcon
Message- From: Todd Lipcon [mailto:t...@cloudera.com] Sent: Monday, November 16, 2009 9:02 PM To: common-user@hadoop.apache.org Subject: Re: client.Client: failed to interact with node..ERROR Hi Yair, This looks like a katta-specific problem. Please direct your question to the katta

Re: hadoop versions

2009-11-16 Thread Todd Lipcon
recommend moving back to 0.18.3 at this point for new development. For new projects, unless you have need for the absolute most stable release, I'd recommend 0.20.1, which is being used successfully in production by many organizations. -Todd On Sun, Nov 15, 2009 at 8:47 PM, Todd Lipcon t

Re: Cloudera 18.3 splits bz2 inputs

2009-11-16 Thread Todd Lipcon
Hi Usman/Mike, This feature is slated for 0.21 (not 0.20.1) We have not backported it into Cloudera's release of 0.20.1, though we'll certainly consider doing so if there appears to be demand for it in the community. Anecdotally we've seen that not too many people are using bzip2 since the CPU

Re: hadoop versions

2009-11-15 Thread Todd Lipcon
Hi Mark, The simple answer is yes, to be safest, they should match. In truth, the answer is a bit more complex. Since Java is dynamically linked (classloaded) at runtime, as long as the method signatures and class names you're using in your code haven't changed between versions, your jar

Re: hadoop versions

2009-11-15 Thread Todd Lipcon
, right? Thank you, Mark On Sun, Nov 15, 2009 at 3:54 PM, Todd Lipcon t...@cloudera.com wrote: Hi Mark, The simple answer is yes, to be safest, they should match. In truth, the answer is a bit more complex. Since Java is dynamically linked (classloaded) at runtime, as long as the method

Re: NameNode/DataNode JobTracker/TaskTracker

2009-11-10 Thread Todd Lipcon
individually and start the daemons by hand. I do not recommend this last option :) sorry for all of the seemingly basic questions, but want to get it right the first time:) Sure thing- we're here to help. -Todd On Nov 9, 2009, at 1:11 PM, Todd Lipcon wrote: On Mon, Nov 9, 2009 at 7:20 AM

Re: NameNode/DataNode JobTracker/TaskTracker

2009-11-09 Thread Todd Lipcon
On Mon, Nov 9, 2009 at 7:20 AM, John Martyniak j...@beforedawnsolutions.com wrote: Can the NameNode/DataNode JobTracker/TaskTracker run on a server that isn't part of the cluster meaning I would like to run it on a machine that wouldn't participate in the processing of data, and wouldn't

Re: Using jobtracker api for developing UI in hadoop 0.19.1

2009-11-09 Thread Todd Lipcon
, 2009 at 12:15 PM, Todd Lipcon t...@cloudera.com wrote: Hi, No, there is currently no public-facing API for tracking job status aside from parsing those JSPs (or adding your own) -Todd On Mon, Nov 9, 2009 at 5:28 AM, pnd prafulla.daw...@gmail.com wrote: Hi i am new to hadoop

Re: Problem while starting hadoop

2009-11-05 Thread Todd Lipcon
Stand-alone mode does not use any daemons - it's just a way of running mapreduce in a single process for quick testing and getting started. You need pseudo-distributed mode if you want to run the daemons. -Todd On Thu, Nov 5, 2009 at 12:06 AM, Mohan Agarwal mohan.agarwa...@gmail.comwrote: Hi ,

Re: Passing Properties With Whitespace To Streaming

2009-10-28 Thread Todd Lipcon
Hi Brian, Any chance you are using the Cloudera distribution? We did accidentally ship a bug like this which will be ameliorated in our next release. The temporary workarounds are: a) edit /usr/bin/hadoop and change the $* to a $@ (including the quotes!) or b) use

Re: openssh - can't achieve passphraseless ssh

2009-10-22 Thread Todd Lipcon
Hi Dennis, This is normal to see pseudodistributed much slower than standalone. Most of the extra overhead you're seeing is due to the heartbeat mechanism of Hadoop. When the JobTracker has tasks that need to be run, it does not actually push them to the TaskTrackers. Instead, the TaskTrackers

Re: Advantages of moving from 0.18.3 to 0.20.1?

2009-10-20 Thread Todd Lipcon
Hi John, You can see a short slide deck from the July HUG here that includes some info about what's new in 0.19 and 0.20: http://cloudera-todd.s3.amazonaws.com/hug-20090917.pdf -Todd On Tue, Oct 20, 2009 at 1:44 AM, John Clarke clarke...@gmail.com wrote: Hi, I currently have an app written

Re: hive-0.4.0 build

2009-10-19 Thread Todd Lipcon
Hi Schubert, Regarding the Hive problem - you should be able to build against 0.20.0 and then run against 0.20.1. Within a release (0.20.*) the APIs are completely compatible, so you can compile any external application against one and run against another. Regarding the format of the md5 file,

Re: Hardware performance from HADOOP cluster

2009-10-16 Thread Todd Lipcon
On Fri, Oct 16, 2009 at 4:01 AM, tim robertson timrobertson...@gmail.comwrote: Hi all, Adding the following to core-site.xml, mapred-site.xml and hdfs-site.xml (based on Cloudera guidelines: http://tinyurl.com/ykupczu) io.sort.factor: 15 (mapred-site.xml) io.sort.mb: 150
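The settings quoted from Tim's mail would appear in mapred-site.xml roughly as follows; the values are the ones he lists, shown here only for layout:

    <property>
      <name>io.sort.factor</name>
      <value>15</value>
    </property>
    <property>
      <name>io.sort.mb</name>
      <value>150</value>
    </property>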

Re: Error register getProtocolVersion

2009-10-15 Thread Todd Lipcon
Hi Tim, jps should show it running if it's hung. In that case, see if you can get a stack trace using jstack pid. Paste that trace and we may be able to figure out where it's hung up. -Todd On Thu, Oct 15, 2009 at 12:12 AM, tim robertson timrobertson...@gmail.com wrote: Thanks for the info
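The two commands Todd mentions, in the order he suggests; the pid is whatever jps reports for the hung daemon:

    jps                        # lists Hadoop daemon JVMs and their pids
    jstack <pid> > stack.txt   # dumps all thread stacks for that pid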

Re: NullPointer on starting NameNode

2009-10-15 Thread Todd Lipcon
page: http://getsatisfaction.com/cloudera/products/cloudera_cloudera_s_distribution_for_hadoop (we don't want to confuse people on this list if anything is specific to our distro) -Todd On Wed, 14 Oct 2009 10:02:22 -0700, Todd Lipcon t...@cloudera.com wrote: Hi Bryn, Just to let you know

Re: NullPointer on starting NameNode

2009-10-14 Thread Todd Lipcon
Hi Bryn, Just to let you know, we've queued the patch Hairong mentioned for the next update to our distribution, due out around the end of this month. Thanks! -Todd On Wed, Oct 14, 2009 at 9:15 AM, Bryn Divey b...@bengueladev.com wrote: Hi all, I'm getting the following on initializing my

Re: Outputting extended ascii characters in Hadoop?

2009-10-12 Thread Todd Lipcon
with extended ASCII delimiters. Following your answer, however, I will try to use low-value ASCII, like 9 or 11, unless someone has a better suggestion. Thank you, Mark On Fri, Oct 9, 2009 at 6:49 PM, Todd Lipcon t...@cloudera.com wrote: Hi Mark, If you're using TextOutputFormat, it assumes

Slightly OT: Interesting article about ECC RAM (study by Google)

2009-10-09 Thread Todd Lipcon
As this topic comes up reasonably often on the list, I thought others might be interested in this: http://arstechnica.com/business/news/2009/10/dram-study-turns-assumptions-about-errors-upside-down.ars?utm_source=rss&utm_medium=rss&utm_campaign=rss Basically, the takeaway is that RAM errors are

Re: Outputting extended ascii characters in Hadoop?

2009-10-09 Thread Todd Lipcon
Hi Mark, If you're using TextOutputFormat, it assumes you're dealing in UTF8. Decimal 254 wouldn't be valid as a standalone character in UTF8 encoding. If you're dealing with binary (ie non-textual) data, you shouldn't use TextOutputFormat. -Todd On Fri, Oct 9, 2009 at 3:09 PM, Mark Kerzner

Re: detecting stalled daemons?

2009-10-08 Thread Todd Lipcon
Hi James, This doesn't quite answer your original question, but if you want to help track down these kinds of bugs, you should grab a stack trace next time this happens. You can do this either using jstack from the command line, by visiting /stacks on the HTTP interface, or by sending the process

Re: NameNode high availability

2009-10-01 Thread Todd Lipcon
On Thu, Oct 1, 2009 at 10:53 AM, Stas Oskin stas.os...@gmail.com wrote: Hi I'm looking into Name Node high availability, but so far found only an approach using DRBD. I tried to make it work using Xen over DRBD, but it didn't quite work - in fact I received a very valuable experience of

Re: Running Hadoop on cluster with NFS booted systems

2009-09-29 Thread Todd Lipcon
to me at this point. Any help deciphering it would be greatly appreciated. I have also now disabled the IB interface on my 2 test systems, unfortunately that had no impact. -Nick Todd Lipcon wrote: Hi Nick, Figure out the pid of the DataNode process using either jps or straight ps auxw

Re: Advice on new Datacenter Hadoop Cluster?

2009-09-29 Thread Todd Lipcon
Hi Kevin, Less than $1k/box is unrealistic and won't be your best price/performance. Most people building new clusters at this point seem to be leaning towards dual quad core Nehalem with 4x1TB 7200RPM SATA and at least 8G RAM. You're better off starting with a small cluster of these nicer

Re: Is it OK to run with no secondary namenode?

2009-09-28 Thread Todd Lipcon
Hi Mayuran, Yes, you need to run a secondary namenode. The secondary namenode is *not* a backup mechanism. It is an important part of the HDFS metadata system, and is responsible for periodically checkpointing the filesystem namespace into a single file. Without the secondary namenode running,

Re: Where are temp files stored?

2009-09-28 Thread Todd Lipcon
On Sun, Sep 27, 2009 at 7:39 PM, Starry SHI starr...@gmail.com wrote: Hi Dave. Thank you for your reply! I have checked {dfs.data.dir}/tmp, the tmp files are there while the job is running. However, it seems that the tmp files on each node are the same. That is to say, the whole HDFS is

Re: Is it OK to run with no secondary namenode?

2009-09-28 Thread Todd Lipcon
On Mon, Sep 28, 2009 at 10:44 AM, Mayuran Yogarajah mayuran.yogara...@casalemedia.com wrote: Hey Todd, Note that you do not need to run the 2NN on a separate machine *if* you have enough RAM for two entire copies of your filesystem namespace. For small clusters you should be fine to run

Re: Can we configure two or more datanode under pseudo-distributed mode?

2009-09-25 Thread Todd Lipcon
Hi Huang, Boris's answer should work fine. If it would be useful for you to have a single command line tool to start up a pseudo-distributed cluster for testing, please comment on this JIRA: http://issues.apache.org/jira/browse/MAPREDUCE-987 -Todd On Fri, Sep 25, 2009 at 10:19 AM, Boris

Re: Task process exit with nonzero status of 1

2009-09-24 Thread Todd Lipcon
Hi Marc, Exit status 1 usually means some kind of controlled exit by the mapreduce child task. Things like JVM crashes usually are indicated by other exit codes (134 seems to be the code most commonly reported). If you look at the stderr and stdout from your task (in the userlogs/ directory on

Re: Task process exit with nonzero status of 1

2009-09-24 Thread Todd Lipcon
18:16:43,092 INFO org.apache.hadoop.mapred.TaskTracker: addFreeSlot : current free slots : 2 2009-09-24 18:17:07,057 INFO org.apache.hadoop.mapred.TaskTracker: Received 'KillJobAction' for job: job_200909221656_0006 -Original Message- From: Todd Lipcon [mailto:t...@cloudera.com] Sent

Re: Can not stop hadoop cluster ?

2009-09-21 Thread Todd Lipcon
On Mon, Sep 21, 2009 at 2:57 AM, Steve Loughran ste...@apache.org wrote: Jeff Zhang wrote: My cluster has running for several months. Nice. Is this a bug of hadoop? I think hadoop is supposed to run for long time. I'm doing work in HDFS-326 on making it easier to start/stop the

Re: HADOOP-4539 question

2009-09-21 Thread Todd Lipcon
On Mon, Sep 21, 2009 at 7:50 AM, Edward Capriolo edlinuxg...@gmail.comwrote: Storing the only copy of the NN data into NFS would make the NFS server an SPOF, and you still need to solve the problems of @Steve correct. It is hair splitting but Stas asked if there was an approach that did

Re: Stretched HDFS cluster

2009-09-16 Thread Todd Lipcon
Hi Gregory, This is way outside the design parameters of HDFS. It may work, but you are very likely to run into issues, and I don't think anyone would recommend this as a solution. More reasonable would be a HDFS cluster spanning two datacenters within the same metro area (1-2ms latency), but

Re: Hadoop 0.20.1 (a big secret?)

2009-09-15 Thread Todd Lipcon
Hi Stephen, There was an email thread about two weeks ago announcing the release candidate and subsequent release. Regarding Cloudera's version, the most recent one available for download is based on 0.20.0, but we've included many of the bug fixes that are in 0.20.1. We're currently working on

Re: Multiple disks for DFS

2009-09-15 Thread Todd Lipcon
Hi Stas, Look at the LocalDirAllocator class. It's not really meant for public consumption, so its API might break in future releases, but that's the class that the hadoop daemons use for round robin behavior. -Todd On Tue, Sep 15, 2009 at 4:51 AM, Stas Oskin stas.os...@gmail.com wrote: Hi.

Re: Hadoop 0.20.1 (a big secret?)

2009-09-15 Thread Todd Lipcon
On Tue, Sep 15, 2009 at 9:20 AM, CubicDesign cubicdes...@gmail.com wrote: We're currently working on updating our patch series to include the entirety of 0.20.1 and should have that out next week. Hi. Do you think that 0.20.1 will be the next stable version (like 0.18.3?) We will

Re: Hadoop installation problems - help

2009-09-14 Thread Todd Lipcon
are specified with full name. Do you have any idea about why this happens? Again, thanks for your help 2009/9/14 Todd Lipcon t...@cloudera.com That's not an error - that just means that the daemon thread is waiting for a connection (IO event) The logs in $HADOOP_HOME/log/ are entirely empty

Re: Hadoop on EC2 - public AMIs in hadoop-images

2009-09-07 Thread Todd Lipcon
Hi, The EC2 scripts will boot Cloudera's distribution for Hadoop. Currently they boot our distribution of 0.18.3, but 0.20 support should be ready pretty soon now. Here's a list of what patches are in our newest 0.18.3 distribution:
