With respect to EMR, you can run HBase fairly easily. You can't run MapR with HBase on EMR, so stick with Amazon's release.
And you can run it, but you will want to know your tuning parameters up front when you instantiate it.

Sent from a remote device. Please excuse any typos...

Mike Segel

On May 8, 2013, at 9:04 PM, Andrew Purtell <apurt...@apache.org> wrote:

> M7 is not Apache HBase, or any HBase. It is a proprietary NoSQL datastore
> with (I gather) an Apache HBase compatible Java API.
>
> As for running HBase on EC2, we recently discussed some particulars, see
> the latter part of this thread: http://search-hadoop.com/m/rI1HpK90gu where
> I hijack it. I wouldn't recommend launching HBase as part of an EMR flow
> unless you want to use it only for temporary random access storage, in
> which case use m2.2xlarge/m2.4xlarge instance types. Otherwise, set up a
> dedicated HBase-backed storage service on high I/O instance types. The
> fundamental issue is that I/O performance on the EC2 platform is fair to poor.
>
> I have also noticed a large difference in baseline block device latency
> between an old Amazon Linux AMI (< 2013) and the latest AMIs from this year.
> Use the new ones, they cut the latency long tail in half. There were some
> significant kernel level improvements, I gather.
>
>
> On Wed, May 8, 2013 at 10:42 AM, Marcos Luis Ortiz Valmaseda <
> marcosluis2...@gmail.com> wrote:
>
>> I think that when you talk about RMap, you are referring to
>> MapR's distribution.
>> I think that MapR's team released a very good version of its Hadoop
>> distribution focused on HBase called M7. You can see its overview here:
>> http://www.mapr.com/products/mapr-editions/m7-edition
>>
>> But this release was under beta testing, and I see that it's not included
>> in the Amazon Marketplace yet:
>>
>> https://aws.amazon.com/marketplace/seller-profile?id=802b0a25-877e-4b57-9007-a3fd284815a5
>>
>>
>> 2013/5/7 Pal Konyves <paul.kony...@gmail.com>
>>
>>> Hi,
>>>
>>> Has anyone got some recommendations about running HBase on EC2? I am
>>> testing it, and so far I am very disappointed with it. I did not change
>>> anything about the default 'Amazon distribution' installation. It has one
>>> MasterNode and two slave nodes, and write performance is around 2500 small
>>> rows per sec at most, but I expected it to be way better. Oh, and this is
>>> with batch put operations with autocommit turned off, where each batch
>>> contains about 500-1000 rows... When I do it with autocommit, it does not
>>> even reach 1000 rows per sec.
>>>
>>> All nodes were m1.large instances.
>>>
>>> Any experiences, suggestions? Is it worth trying the RMap distribution
>>> instead of the Amazon one?
>>>
>>> Thanks,
>>> Pal
>>
>>
>> --
>> Marcos Ortiz Valmaseda
>> Product Manager at PDVSA
>> http://about.me/marcosortiz
>
>
> --
> Best regards,
>
> - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
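
As a rough illustration of the batching Pal describes (puts collected client-side and sent 500-1000 rows at a time instead of one per call), here is a minimal Python sketch of write buffering. It does not use the HBase client at all: `BufferedWriter`, `send_batch`, and `count_round_trips` are hypothetical names made up for this example, and `send_batch` only stands in for whatever call ships a list of puts to the server in one round trip.

```python
from typing import Callable, Iterable, List, Tuple

# (row key, value); a hypothetical row shape used only for illustration.
Row = Tuple[bytes, bytes]


class BufferedWriter:
    """Collects rows client-side and hands them to `send_batch` in groups.

    `send_batch` is a stand-in for a call that ships many puts in one
    round trip. It is an assumption of this sketch, not an HBase API.
    """

    def __init__(self, send_batch: Callable[[List[Row]], None],
                 batch_size: int = 500) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._send_batch = send_batch
        self._batch_size = batch_size
        self._buffer: List[Row] = []

    def put(self, row: Row) -> None:
        # Buffer the row; only talk to the "server" once the batch is full.
        self._buffer.append(row)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        # Send whatever is buffered, including a final partial batch.
        if self._buffer:
            self._send_batch(self._buffer)
            self._buffer = []


def count_round_trips(rows: Iterable[Row], batch_size: int) -> int:
    """Return how many times `send_batch` is called for the given rows."""
    batch_sizes: List[int] = []
    writer = BufferedWriter(lambda batch: batch_sizes.append(len(batch)),
                            batch_size)
    for row in rows:
        writer.put(row)
    writer.flush()  # do not forget the trailing partial batch
    return len(batch_sizes)


if __name__ == "__main__":
    rows = [(("row-%d" % i).encode(), b"v") for i in range(2500)]
    print("batch_size=500:", count_round_trips(rows, 500), "round trips")
    print("batch_size=1:  ", count_round_trips(rows, 1), "round trips")
```

With a batch size of 1 the sketch behaves like the autocommit case: one call per row. Larger batches reduce the number of calls, but they do not change the underlying disk and network limits discussed earlier in the thread.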