Slow generation of blockReport at DataNode

2011-08-06 Thread Joe Stein
out having to upgrade. If anyone has hit this and worked around it, or has any ideas, it would be greatly appreciated. I have already tried running find ./ from cron against my HDFS directories, but no go. Thanks in advance. /* Joe Stein http://www.linkedin.com/in/charmalloc Twitter: @allthingshadoop */

Re: Which release to use?

2011-07-18 Thread Joe Stein
Arun, Thanks for the update. Again, I hate to have to play the part of captain obvious. Glad to hear the same contiguous mantra for this next release. I think sometimes the plebeians ( of which I am one ) need that affirmation. One love, Apache Hadoop! /* Joe Stein http

Re: Which release to use?

2011-07-18 Thread Joe Stein
"state the obvious" paper on my back but kinda feel it had to be said. One love, Apache Hadoop! /* Joe Stein http://www.medialets.com Twitter: @allthingshadoop */ On Jul 18, 2011, at 9:51 PM, Michael Segel wrote: > > > >> Date: Mon, 18 Jul 2011 18:19:38 -0700 >> Su

Using df instead of du to calculate datanode space

2011-05-20 Thread Joe Stein
works well for others too. /* Joe Stein http://www.twitter.com/allthingshadoop */

Looking for guests again for the All Things Hadoop Podcast

2011-03-09 Thread Joe Stein
by the Hadoop and related items. The general purpose is to share with the community what is going on, for both new and existing folks, and I hope this has been helpful for folks in general. Let me know; I would like to get some guests lined up again before starting things back up. /* Joe Stein http

Re: Too many fetch-failures

2010-09-27 Thread Joe Stein
contact the datanodes it is trying to get data from. /* Joe Stein, 973-944-0094 http://www.medialets.com Twitter: @allthingshadoop */ On Sep 27, 2010, at 2:50 PM, Pramy Bhats wrote: > Hello, > > I am trying to run a bigram count on a 12-node cluster setup. For an input > file of 135 sp

Re: Best practices - Large Hadoop Cluster

2010-08-11 Thread Joe Stein
Not sure if this was mentioned already, but Adobe open-sourced their Puppet implementation ( http://github.com/hstack/puppet ) and published a nice post about it: http://hstack.org/hstack-automated-deployment-using-puppet/ /* Joe Stein http://www.linkedin.com/in/charmalloc Twitter: @allthingshadoop */ On

Re: what affects number of reducers launched by hadoop?

2010-07-29 Thread Joe Stein
e on this (and everything else) too. Here are a couple of tips & tricks you might find helpful for your first cluster: http://allthingshadoop.com/2010/04/28/map-reduce-tips-tricks-your-first-real-cluster/ On Thu, Jul 29, 2010 at 6:31 AM, Abhinay Mehta wrote: > Which configuration key controls

Re: what affects number of reducers launched by hadoop?

2010-07-28 Thread Joe Stein
gramatically to 10 > job.setNumReduceTasks(10); > the number of "reduce > reduce" reducers increases to 10 and the > performance of application increases as well (the number of reducers > never exceeds). > > Can someone explain such behavior? > > Thanks in Advance

Re: Text files vs. SequenceFiles

2010-07-02 Thread Joe Stein
We have a custom SequenceFileLoader so we can still use Pig against our SequenceFiles. It is worth the little bit of engineering effort to save space. /* Joe Stein http://www.linkedin.com/in/charmalloc Twitter: @allthingshadoop */ On Fri, Jul 2, 2010 at 6:14 PM, Alex Loddengaard wrote: