Hi,

Yes, the performance hit is normal. It looks like you're seeing network latency on disk I/O. It could also be a tuning issue (differences in configuration...).
Not sure how much. CPU difference will impact performance, while disk I/O will really kill you.

Sent from a remote device. Please excuse any typos...

Mike Segel

On Dec 30, 2011, at 11:33 AM, Mark Kerzner <mark.kerz...@shmsoft.com> wrote:

> Thank you, Bryan,
>
> that is very important and clear some cloudiness in my mind.
>
> Sincerely,
> Mark
>
> On Fri, Dec 30, 2011 at 10:54 AM, Bryan Beaudreault <bbeaudrea...@hubspot.com> wrote:
>
>> We have also seen this in our testing, though we focused mainly on MR more
>> than HBase.
>>
>> Keep in mind that EC2 Compute Units are defined as follows:
>>
>>> The amount of CPU that is allocated to a particular instance is expressed
>>> in terms of these EC2 Compute Units. We use several benchmarks and tests to
>>> manage the consistency and predictability of the performance of an EC2
>>> Compute Unit. One EC2 Compute Unit provides the equivalent CPU capacity of
>>> a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.
>>
>> This does not even account for CPU contention that Amandeep mentioned,
>> which we have noticed at times as well. Also, c1.mediums have a I/O
>> Performance rating of "Moderate." I think this mainly refers to ethernet
>> speed, but it could refer to disk speed as well.
>>
>> If your local workstation is a reasonably modern system, it is very
>> possible for you to see much better performance locally. The difference
>> between 2.5 1.0 GHz 2007 processors (2.5 compute units) and a modern i5,
>> i7, or equivalent is huge not just in speed and number of cores, but
>> architecture, cache, etc. In terms of HBase write speed, if you are
>> running on an SSD this could cause a substantial gap as well.
>>
>> On Fri, Dec 30, 2011 at 12:38 AM, Amandeep Khurana <ama...@gmail.com> wrote:
>>
>>> Is your client program running on the same node? Given that c1.mediums are
>>> on shared hosts, your neighbor might be overloading his VM, causing yours
>>> to starve.
>>>
>>> On Fri, Dec 30, 2011 at 9:50 AM, Mark Kerzner <markkerz...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am running a small program to load about 1 million rows into HBase. It
>>>> takes 200 seconds on my dev machine, and 800 seconds on a c1.medium EC2
>>>> machine. Both are running the same version of Ubuntu and the same version
>>>> of HBase. Everything is local on one machine in both cases.
>>>>
>>>> What could the difference between the two environments be? I did notice
>>>> that my local machine has higher CPU loads:
>>>>
>>>> hbase 64%
>>>> java (my app) 38%
>>>> hdfs 20%
>>>>
>>>> whereas the EC2 machine
>>>> hbase 47%
>>>> java (my app) 23%
>>>> hdfs 14%
>>>>
>>>> Sincerely,
>>>> Mark
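
One rough way to test the disk I/O explanation given in this thread is to run the same small write probe on both machines and compare the results with the load throughput from the original post (1 million rows in 200 s versus 800 s). The sketch below is only an illustration: the function names are hypothetical, it does not use HBase or HDFS at all, and a single sequential write with fsync is only a crude stand-in for what a real HBase load does to the disk.

```python
import os
import tempfile
import time


def rows_per_second(rows, seconds):
    """Average load throughput in rows per second."""
    return rows / seconds


def sequential_write_mb_per_s(total_mb=64, chunk_mb=1, directory=None):
    """Time a sequential write of about total_mb megabytes, including fsync.

    Hypothetical helper for a rough comparison only: it ignores filesystem
    layout, caching, and load from other tenants on a shared host.
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    n_chunks = total_mb // chunk_mb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # force data to disk before stopping the clock
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return (n_chunks * chunk_mb) / elapsed


if __name__ == "__main__":
    # Figures from the original post: 1 million rows, 200 s locally, 800 s on EC2.
    local = rows_per_second(1_000_000, 200)
    ec2 = rows_per_second(1_000_000, 800)
    print(f"local: {local:.0f} rows/s, EC2: {ec2:.0f} rows/s, ratio: {local / ec2:.1f}x")
    print(f"sequential write: {sequential_write_mb_per_s():.1f} MB/s")
```

The first line of output works out to 5000 rows/s locally against 1250 rows/s on EC2, a 4x gap. If the write figure differs between the two machines by a similar factor, that would be consistent with disk I/O being the main cause; if it does not, CPU or contention from neighbors becomes the more likely suspect.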