On Fri, Jun 19, 2009 at 10:37 PM, Allen Wittenauer <a...@yahoo-inc.com> wrote:

> On 6/19/09 3:49 AM, "Harish Mallipeddi" <harish.mallipe...@gmail.com> wrote:
> > Why do you want to do this in the first place? It seems like you want
> > cluster1 to be a plain HDFS cluster and cluster2 to be a mapred cluster.
> > Doing something like that will be disastrous - Hadoop is all about sending
> > computation closer to your data. If you don't want that, you need not even
> > use hadoop.
>
>     Given some of the limitations with HDFS (quota operability, security), I
> can easily see why it would be desirable to have static data coming from one
> grid while doing computation/intermediate outputs/real output to another.
>
>    Using performance as your sole metric of viability is a bigger disaster
> waiting to happen.  "Sure, we crashed the file system, but look how fast it
> went down in flames!"
>
>
Well, apart from doing a distcp between the two clusters periodically, I don't
see how this can be done in a way that would yield acceptable performance.
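
A minimal sketch of what such a periodic distcp could look like, driven from
Python. The namenode addresses, the port, and the /data/static path are
placeholders invented for illustration, not details from this thread, and the
helper names are hypothetical. It assumes the `hadoop` launcher is on the PATH
of the machine that runs it.

```python
import subprocess


def build_distcp_command(src_namenode, dst_namenode, path, update=True):
    """Build the argv for copying `path` from one HDFS cluster to another.

    `src_namenode` and `dst_namenode` are "host:port" strings (placeholders
    here). With update=True the -update option is passed, so files already
    present and unchanged on the destination are skipped.
    """
    cmd = ["hadoop", "distcp"]
    if update:
        cmd.append("-update")
    cmd.append(f"hdfs://{src_namenode}{path}")
    cmd.append(f"hdfs://{dst_namenode}{path}")
    return cmd


def sync_once(src_namenode, dst_namenode, path):
    """Run a single distcp pass and return the process exit code."""
    cmd = build_distcp_command(src_namenode, dst_namenode, path)
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical cluster addresses; substitute the real namenodes.
    code = sync_once("nn1.example.com:8020", "nn2.example.com:8020", "/data/static")
    raise SystemExit(code)
```

Running the script from cron (or any scheduler) would give the "periodically"
part; each run is a full MapReduce job on whichever cluster launches it, which
is where the performance cost mentioned above comes from.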

-- 
Harish Mallipeddi
http://blog.poundbang.in
