On Mon, Oct 21, 2013 at 2:00 PM, Jonathan Hodges <[email protected]> wrote:
> I am a Blur newbie and just heard about it and the recent 0.2 release from
> the latest Hadoop Weekly email. I checked out the site and there is a
> great documentation for getting started, but didn't mention a couple
> questions I have.
>
> We are running our Hadoop clusters as a mix of persistent and transient EMR
> clusters in Amazon. It is running Hadoop 1.0.3. We are also using S3
> instead of HDFS to store our data.
>
> So does anyone have experience running Blur in an Amazon environment? Does
> S3 vs HDFS present any problems to the Blur architecture?
>

I have run Blur in EC2 with an installed HDFS, not against S3. That being
said, the only feature of HDFS that Blur uses and that S3 cannot support
(that I know of) is the sync feature I use for the write-ahead log (WAL).
However, with https://issues.apache.org/jira/browse/BLUR-42 we could have a
pluggable write-ahead log. Also, if you do not use near-real-time updates,
the WAL is not used at all; you would just use MapReduce for indexing.

Aaron

>
> Thanks in advance!
> -Jonathan
>
