On Mon, Oct 21, 2013 at 2:00 PM, Jonathan Hodges <[email protected]> wrote:

> I am a Blur newbie and just heard about it and the recent 0.2 release from
> the latest Hadoop Weekly email.  I checked out the site and there is great
> documentation for getting started, but it didn't cover a couple of
> questions I have.
>
> We are running our Hadoop clusters as a mix of persistent and transient EMR
> clusters in Amazon.  It is running Hadoop 1.0.3.  We are also using S3
> instead of HDFS to store our data.
>
> So does anyone have experience running Blur in an Amazon environment?  Does
> S3 vs HDFS present any problems to the Blur architecture?
>

I have run Blur in EC2 against an installed HDFS, but not against S3.  That
said, the only HDFS feature Blur uses that S3 cannot support (that I know
of) is sync, which I use for the write ahead log (WAL).  However, with
https://issues.apache.org/jira/browse/BLUR-42 we could have a pluggable
write ahead log.  Also, if you do not use near real time updates then the
WAL is not used at all; you would just use MapReduce for indexing.
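
To make the sync requirement concrete, here is a rough sketch of what a
pluggable write ahead log could look like.  It is not Blur's API and not the
design in BLUR-42: the names WriteAheadLog, SyncingFileLog and NoOpLog are
made up for illustration, and a local file with FileChannel.force stands in
for a store that supports sync the way HDFS does.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical interface, for illustration only (not part of Blur).
interface WriteAheadLog extends AutoCloseable {
    // Append one record; return only once the record is durable.
    void append(String record) throws IOException;

    @Override
    void close() throws IOException;
}

// Local-file stand-in for a store with a sync operation.
final class SyncingFileLog implements WriteAheadLog {
    private final FileChannel channel;

    SyncingFileLog(Path path) throws IOException {
        channel = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
    }

    @Override
    public void append(String record) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(
                (record + "\n").getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        // The "sync" step: force data and metadata to the storage device
        // before acknowledging the write.
        channel.force(true);
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}

// No-op log for indexes that are only built in bulk (e.g. by MapReduce),
// where no near real time updates need to be recovered.
final class NoOpLog implements WriteAheadLog {
    @Override
    public void append(String record) {
        // Intentionally empty: nothing is logged.
    }

    @Override
    public void close() {
        // Nothing to release.
    }
}
```

The point of the interface is that the durability step lives behind append,
so a store without sync would need a different implementation rather than
changes to the callers.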

Aaron


>
> Thanks in advance!
> -Jonathan
>
