Re: Optimized Hadoop

2012-02-17 Thread Todd Lipcon
ainly add the following two features:
>> 1) shuffle avoidance
>> 2) work pool
>>
>> On Fri, Feb 17, 2012 at 3:27 AM, Todd Lipcon wrote:
>>>
>>> Hey Schubert,
>>>
>>> Looking at the code on github, it looks like your rewritten shuff

Re: Optimized Hadoop

2012-02-16 Thread Todd Lipcon
nue to do more improvements, thanks
> for your help.
>
> On Thu, Feb 16, 2012 at 11:01 PM, Anty wrote:
>>
>> Hi Guys,
>>    We just delivered an optimized Hadoop; if you are interested, please
>> refer to https://github.com/hanborq/hadoop
>>
>> --
>> Best Regards
>> Anty Rao

--
Todd Lipcon
Software Engineer, Cloudera

Re: Performance of direct vs indirect shuffling

2011-12-20 Thread Todd Lipcon
't think they're
> practical and that direct shuffling is superior.
>
> Anyone have any thoughts here?

--
Todd Lipcon
Software Engineer, Cloudera

Re: Tasktracker Task Attempts Stuck (mapreduce.task.timeout not working)

2011-12-19 Thread Todd Lipcon
fine for many jobs and then just all of a sudden some tasks
>> will stall and stick at 0.0%. There are no visible errors in the log
>> outputs, although nothing will move forward nor will it release the mappers
>> for any other jobs to use until the stalled job is killed. It seems that
>> the default 'mapreduce.task.timeout' just isn't working for some reason.
>>
>> Has anyone come across anything similar to this? I can provide more
>> details/data as needed.
>>
>> John Miller | Sr. Linux Systems Administrator
>> <http://mybuys.com/>
>> 530 E. Liberty St.
>> Ann Arbor, MI 48104
>> Direct: 734.922.7007
>> http://mybuys.com/

--
Todd Lipcon
Software Engineer, Cloudera

Re: -libjars?

2011-08-16 Thread Todd Lipcon
> the root cause of this.
> So my question is:
> with the -libjars option, the configuration file is already on the
> classpath, so why can't the classloader find the configuration file,
> when the JVM classloader CAN find the jar shipped with the -libjars option?
>
> Any help will be appreciated.

Re: Job works only when TaskTracker & JobTracker on the same machine

2011-05-17 Thread Todd Lipcon
&attemptid=attempt_201104201820_0007_m_02_0&filter=stdout
> 11/04/21 12:21:02 WARN mapred.JobClient: Error reading task output http://
> cene.ro.schlund.net:50060/tasklog?plaintext=true&attemptid=attempt_201104201820_0007_m_02_0&filter=stderr
> 11/04/21 12:21:02 INFO mapred.JobCl

Re: Problems with LinuxTaskController, LocalJobRunner, and localRunner directory

2011-05-06 Thread Todd Lipcon
the job. The next time I restart the
> daemons, the task tracker will fail because it can't rename
> /var/lib/hadoop-0.20/cache/pseudo/localRunner.
>
> Does anybody have suggestions how to fix this?
>
> Thanks
> Jeremy

--
Todd Lipcon
Software Engineer, Cloudera

Re: "Reduce input groups" vs "Reduce input records"

2011-03-25 Thread Todd Lipcon
t records=1820763
> 10/07/11 22:25:48 INFO mapred.JobClient:     Map output records=1734298
> 10/07/11 22:25:48 INFO mapred.JobClient:     Reduce input records=41788
> 10/07/11 22:25:48 INFO simple.DemoWordCount: Job Finished in 5.345 seconds
>
> --
> Pedro

--
Todd Lipcon
Software Engineer, Cloudera
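A note for readers comparing these counters: "Reduce input records" counts every key/value pair delivered to the reduce side, while "Reduce input groups" counts distinct keys, i.e. the number of reduce() calls. A self-contained toy simulation of the distinction (plain Java, not Hadoop code; names are invented for illustration):

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class CounterDemo {
    // Returns {reduceInputRecords, reduceInputGroups} for a simulated
    // shuffle: records = all key/value pairs, groups = distinct keys.
    static int[] reduceCounters(List<String> keys) {
        Set<String> groups = new LinkedHashSet<>(keys); // one group per distinct key
        return new int[]{keys.size(), groups.size()};
    }

    public static void main(String[] args) {
        // Simulated map output keys arriving at the reduce side.
        int[] c = reduceCounters(Arrays.asList("cat", "dog", "cat", "cat"));
        System.out.println("Reduce input records = " + c[0]); // 4
        System.out.println("Reduce input groups  = " + c[1]); // 2
    }
}
```

With a combiner or secondary-sort grouping comparator in play, the group count can diverge further from the raw key count, which is why the two counters are reported separately.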

Re: Which version to choose

2010-12-22 Thread Todd Lipcon
r are you keeping up
> with new releases instead? Are there advantages to running Cloudera's
> packages instead of the Apache releases (besides that it is slightly easier
> to install)?

This is an ASF list, so I won't address this here. I'll let other users
answer this question if they so choose. Feel free to redirect this question
to the cdh-user mailing list and some Cloudera employees can help you out.

Thanks
-Todd

--
Todd Lipcon
Software Engineer, Cloudera

Re: -libjars?

2010-12-10 Thread Todd Lipcon
; public int run(String[] args) throws Exception {
> :
> :
>               }
>
> public static void main(String[] args) throws Exception {
>     int res = ToolRunner.run(new Configuration(), new FindSchoolsJob(), args);
>     System.exit(res);
> }
> }
>
> Any hint would be highly appreciated.
> Thank You!
> ~V

--
Todd Lipcon
Software Engineer, Cloudera

Re: How to modify task assignment algorithm?

2010-10-14 Thread Todd Lipcon
>>>> On Fri, Oct 8, 2010 at 2:57 AM, Shen LI wrote:
>>>> > Hi,
>>>> > How can I modify the task assignment strategy in hadoop which is used to
>>>> > assign tasks to different worker nodes? (Not the job scheduler)
>>>> > Big thanks,
>>>> > Shen
>>>>
>>>> --
>>>> Best Regards
>>>>
>>>> Jeff Zhang

--
Todd Lipcon
Software Engineer, Cloudera

Re: [ANN] lzo indexing

2010-08-31 Thread Todd Lipcon
://github.com/tcurdt/lzo-index
>
> It's not much tested yet and I am sure it will still need some work,
> but I thought I'd just announce it and maybe get some more testers
> and maybe some feedback.
>
> cheers
> --
> Torsten

--
Todd Lipcon
Software Engineer, Cloudera

Re: Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.

2010-07-08 Thread Todd Lipcon
r the last week. I was running 0.20.1
>>>> first and then upgraded to 0.20.2, but results have been exactly the same.
>>>>
>>>>> can you look at the counters for the two jobs using the JobTracker web
>>>>> UI - things like map records, bytes etc and see if there is a
>>>>> noticeable difference?
>>>>
>>>> Ok, so here is the first job using write.set(value.toString()); having
>>>> *no* errors:
>>>> http://pastebin.com/xvy0iGwL
>>>>
>>>> And here is the second job using
>>>> write.set(value.toString().substring(0, 10)); that fails:
>>>> http://pastebin.com/uGw6yNqv
>>>>
>>>> And here is even another where I used a longer, and therefore unique,
>>>> string, by write.set(value.toString().substring(0, 20)); This makes every
>>>> line unique, similar to the first job.
>>>> Still fails.
>>>> http://pastebin.com/GdQ1rp8i
>>>>
>>>>> Also, are the two programs being run against
>>>>> the exact same input data?
>>>>
>>>> Yes, exactly the same input: a single csv file with 23K lines.
>>>> Using a shorter string leads to more like keys and therefore more
>>>> combining/reducing, but going by the above it seems to fail whether the
>>>> substring/key is entirely unique (23000 combine output records) or
>>>> mostly the same (9 combine output records).
>>>>
>>>>> Also, since the cluster size is small, you could also look at the
>>>>> tasktracker logs on the machines where the maps have run to see if
>>>>> there are any failures when the reduce attempts start failing.
>>>>
>>>> Here is the TT log from the last failed job. I do not see anything
>>>> besides the shuffle failure, but there
>>>> may be something I am overlooking or simply do not understand.
>>>> http://pastebin.com/DKFTyGXg
>>>>
>>>> Thanks again!
>>>>
>>>>> Thanks
>>>>> Hemanth

--
Todd Lipcon
Software Engineer, Cloudera

Re: TotalOrderPartitioner for the new API?

2010-07-05 Thread Todd Lipcon
Over in HBase, we include a backported TotalOrderPartitioner for the new
API. See:
http://github.com/apache/hbase/tree/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/hadoopbackport/

-Todd

> thanks in advance,
> Juber

--
Todd Lipcon
Software Engineer, Cloudera
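For context on what the backported class provides: a total-order partitioner routes each key to a reducer by comparing it against a sorted list of split points, so reducer i receives only the i-th key range and the concatenated reducer outputs are globally sorted. A simplified, self-contained sketch of that routing logic (an illustration, not the actual Hadoop/HBase implementation):

```java
import java.util.Arrays;

public class TotalOrderSketch {
    private final String[] splitPoints; // sorted; numReducers - 1 entries

    TotalOrderSketch(String[] splitPoints) {
        this.splitPoints = splitPoints.clone();
        Arrays.sort(this.splitPoints);
    }

    // Reducer index for a key: the number of split points <= key.
    int getPartition(String key) {
        int pos = Arrays.binarySearch(splitPoints, key);
        // binarySearch returns (-(insertion point) - 1) when the key is absent.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        // Two split points => three reducers covering [min,"g"), ["g","p"), ["p",max).
        TotalOrderSketch p = new TotalOrderSketch(new String[]{"g", "p"});
        System.out.println(p.getPartition("apple")); // 0
        System.out.println(p.getPartition("mango")); // 1
        System.out.println(p.getPartition("zebra")); // 2
    }
}
```

The real class additionally reads its split points from a sequence file produced by sampling the input, which is the part that makes it practical on large data.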

Re: Do we shoot ourselves by using all task slots?

2010-05-28 Thread Todd Lipcon
, the ops team suspected (but never did the homework to verify) that
> our grids and their usage were so massive that data locality rarely happened,
> especially for "popular" data. I can't help but wonder if the situation
> would have been better if we had kept x% (say .005%?) of the grid
> free based upon the speculation above.
>
> Thoughts?

--
Todd Lipcon
Software Engineer, Cloudera

Re: Query over efficient utilization of cluster using fair scheduling

2010-01-15 Thread Todd Lipcon
This way, we can avoid stalling other users' jobs and also
> efficiently utilize the cluster. Kindly clarify.
>
> Thanks
> Pallavi
>
> Todd Lipcon wrote:
>
> Hi Pallavi,
>
> This doesn't sound right. Can you visit
> http://jobtracker:50030/scheduler?a

Re: Query over efficient utilization of cluster using fair scheduling

2010-01-14 Thread Todd Lipcon
Hi Pallavi, This doesn't sound right. Can you visit http://jobtracker:50030/scheduler?advanced and maybe send a screenshot? And also upload the allocations.xml file you're using? It sounds like you've managed to set either userMaxJobsDefault or maxRunningJobs for that user to 1. -Todd On Thu, J

Re: compiling hadoop?

2009-12-10 Thread Todd Lipcon
Hi Calvin, In order to compile hadoop, you need a net connection to download its dependencies. One thing you can do is to compile it on another machine, and then rsync your ~/.ivy2/cache directory from that machine to the machine you'll primarily be developing on. Hope that helps -Todd On Thu,

Re: Maps getting stuck at 100%

2009-11-23 Thread Todd Lipcon
Hi Himanshu, The map progress percentage is calculated based on the input read, rather than the processing actually done. So, if you're doing a lot of work in your mapper, or reading ahead of what you've processed, you'll see this behavior reasonably often. It also can show up sometimes in streami
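A minimal sketch of the point above, as a simplification rather than Hadoop's actual code: the reported map progress is a function of bytes consumed from the input split, not records processed, so heavy per-record work can leave the percentage pinned at 100% while the task is still busy.

```java
public class MapProgress {
    // Progress as the UI conceptually derives it: fraction of the input
    // split consumed, clamped to [0, 1]. Work done per record plays no
    // part in the number.
    static float progress(long bytesRead, long splitLength) {
        if (splitLength <= 0) return 1.0f;
        return Math.min(1.0f, (float) bytesRead / splitLength);
    }

    public static void main(String[] args) {
        // The reader has consumed the whole split, but expensive
        // per-record processing may still be running: 100% either way.
        System.out.println(progress(1_048_576, 1_048_576)); // 1.0
        System.out.println(progress(524_288, 1_048_576));   // 0.5
    }
}
```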

Re: DefaultCodec vs. LZO

2009-10-26 Thread Todd Lipcon
source trees that are supposed to have their own debian/ directory are
so-called "Debian native" packages, where the version number and the Debian
release are one and the same (e.g. Debian system scripts, etc.)

-Todd

> On Oct 23, 2009, at 2:17 PM, Todd Lipcon wrote:
>
> I would, e

Re: Map output compression leads to JVM crash (0.20.0)

2009-10-26 Thread Todd Lipcon
What Linux distro are you running? It seems vaguely possible that you're using some incompatible library versions compared to what everyone else has tested libhadoop with. -Todd On Sun, Oct 25, 2009 at 8:36 PM, Ed Mazur wrote: > I'm having problems on 0.20.0 when map output compression is enabl

Re: DefaultCodec vs. LZO

2009-10-23 Thread Todd Lipcon
I would, except currently the github project seems to have some bug fixes not in the google code one. I wanted to keep it external so it could apply to either one. What's the story on merging the two projects? (Kevin's and the G Code one) -Todd On Fri, Oct 23, 2009 at 2:10 PM, Owen O'Malley wrot

Re: command to reload log4j settings for a JobTracker?

2009-10-20 Thread Todd Lipcon
Hey Zheng,

I don't know that there's a command to reload the properties file, but you
can use the "hadoop daemonlog" command to change an individual logger at
runtime, e.g.:

t...@todd-laptop:~$ hadoop-0.20 daemonlog -getlevel localhost:50030 org.apache.hadoop
Connecting to http://localhost:50030/l

Re: Heap size on Jobs

2009-10-19 Thread Todd Lipcon
Hi Geoff, The heap size configuration for your child processes is controlled by the mapred.child.java.opts configuration parameter, which is separate from the HADOOP_HEAPSIZE setting. HADOOP_HEAPSIZE is used to control how much heap the daemons themselves get, whereas mapred.child.java.opts control
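For reference, the two knobs live in different places: HADOOP_HEAPSIZE is set in hadoop-env.sh and sizes the daemon JVMs, while the child task heap is set per job via mapred.child.java.opts, e.g. in mapred-site.xml or on the JobConf. The value below is illustrative, not a recommendation:

```xml
<!-- mapred-site.xml: JVM options for each map/reduce child task -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>
```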

Reduce input records >> Map output records

2009-08-13 Thread Todd Lipcon
Hey all, Has anyone seen behavior where the number of reduce input records is significantly larger than the number of map output records? There's no combiner involved in the job at hand, and it's not particularly large (250GB in, about the same output). The numbers on one example job are: 2,202,29
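Background for the question above: in a job with no combiner, each map output record is fetched by exactly one reducer, so "Reduce input records" is normally expected to equal "Map output records". A toy shuffle simulation of that invariant (plain Java, not Hadoop code; the hash routing mirrors the idiom of a default hash partitioner):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShuffleInvariant {
    // Simulate a combiner-less shuffle: each record is routed to exactly one
    // reducer, so the total reduce-side input equals the map output size.
    static int reduceInputRecords(List<String> mapOutputRecords, int numReducers) {
        List<List<String>> reducerInputs = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            reducerInputs.add(new ArrayList<>());
        }
        for (String rec : mapOutputRecords) {
            // Non-negative hash routing, hash-partitioner style.
            int partition = (rec.hashCode() & Integer.MAX_VALUE) % numReducers;
            reducerInputs.get(partition).add(rec);
        }
        return reducerInputs.stream().mapToInt(List::size).sum();
    }

    public static void main(String[] args) {
        List<String> mapOutput = Arrays.asList("cat", "dog", "cat", "fish");
        // 4 map output records in, 4 reduce input records out.
        System.out.println(reduceInputRecords(mapOutput, 3)); // 4
    }
}
```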

Re: how to read replicated blocks with hdfs api?

2009-08-06 Thread Todd Lipcon
On Thu, Aug 6, 2009 at 1:20 PM, Harold Valdivia Garcia
<harold.valdi...@upr.edu> wrote:

> Hi... I was reading the HDFS code, and I can't find a way to read the
> replicated blocks of a block-file.
>
> DFS.getFileBlockLocations returns all blocks of a file
> File = block-a, block-b, ... block-n.

Re: Task process exit with nonzero status of 134

2009-07-28 Thread Todd Lipcon
it code 134 was finally gone.
>
> On that webpage, someone posted in the comments a code snippet to reproduce
> the JVM crash. I have not yet confirmed whether it was reported to Sun as
> well.
>
> Cheers,
> Chris
>
> Todd Lipcon schrieb:
>
>> Hi Christi

Re: Task process exit with nonzero status of 134

2009-07-21 Thread Todd Lipcon
Hi Christian,

Generally, along with a nonzero exit code you should see something in the
stderr for that attempt. If you look on the TaskTracker inside
logs/userlogs/attempt_/stderr, do you see anything useful?

If it's a segfault or a Linux OOM kill, you should also see something in your system's ke