[GitHub] [pulsar] ashwallace commented on issue #7058: Pulsar on EBS having poor performance

GitBox Wed, 27 May 2020 20:02:23 -0700


ashwallace commented on issue #7058:
URL: https://github.com/apache/pulsar/issues/7058#issuecomment-635066612



   > @ckdarby wrote:
   >
   > @ashwallace But that doesn't explain why when on the pod itself using `dd` I'm able to access the EBS at 200 mbyte/s.
   >
   > I will change the cluster to one that has throughput at a higher baseline and check results.
   
   @ckdarby Many EC2 instance types and EBS volume types have a burst allowance
(approx. 30 minutes a day; the timing and availability of bursting depend on
your usage pattern). `dd` has only a single job to do on disk, so you can see
it exceeding the baseline (bursting) without any tweaking. Pulsar is a bit more
sophisticated, so it will never look like `dd` performance.
   
   In your graphs, the bookie is also often exceeding the baseline (bursting),
with an observed peak of 150 MB/s. Your workload's storage utilization appears
pretty consistent though, so it is likely that you are consuming burst capacity
almost as soon as it becomes available, and therefore not sustaining those
higher throughputs. These are promising indicators that you may benefit from an
EC2 instance with a higher baseline.
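
   To illustrate why a steady workload cannot hold its burst peak, here is a
minimal sketch of a credit bucket. It is a simplified model and not AWS's
actual accounting; the function name, the parameters and the numbers in the
usage note are all hypothetical.

```python
def simulate_burst(baseline, burst, demand, credit_seconds, duration):
    """Return the throughput delivered in each second.

    All rates share one unit (e.g. MB/s). Simplified, assumed model:
    the bucket holds enough credits to run at `burst` for
    `credit_seconds`, and it only refills while demand is below baseline.
    """
    capacity = (burst - baseline) * credit_seconds
    credits = capacity  # start with a full bucket
    delivered = []
    for _ in range(duration):
        wanted = min(demand, burst)  # never faster than the burst ceiling
        if wanted > baseline:
            # Spend credits on the part of the demand above baseline.
            extra = min(wanted - baseline, credits)
            credits -= extra
            delivered.append(baseline + extra)
        else:
            # Below baseline: serve everything and bank the unused headroom.
            credits = min(capacity, credits + (baseline - wanted))
            delivered.append(wanted)
    return delivered
```

   For example, `simulate_burst(baseline=50, burst=200, demand=200,
credit_seconds=10, duration=20)` delivers 200 for the first 10 seconds and then
drops to 50 for the rest, because constant demand never lets the bucket refill.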
   
   Side notes: Generally, no application will attain the performance that a
benchmarking tool would, simply because apps have more things to do, and the
unit of data that apps operate on may not align with the peak testing
scenarios. Database servers get close, but I wouldn't expect pub-sub messaging
to. For this reason you should not over-focus on attaining these peaks, and
throughput shouldn't be your only performance goal. Generally speaking, smaller
IO sizes have lower (better) response time but lower (worse) throughput, while
larger IO sizes have higher (better) throughput but higher (worse) response
time. Which is best for your actual production needs? Don't forget the
application metrics, i.e. how many messages/sec do you actually need on your
worst day? Work backwards from there.
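
   As a rough sketch of working backwards from the application metric, the
arithmetic could look like the following. The function names and every input
value are hypothetical, and the model deliberately ignores journal writes,
batching, compression and read traffic, so treat the result as a first
estimate only.

```python
def required_disk_mb_s(messages_per_sec, avg_message_bytes, write_copies, bookie_count):
    """Estimate the write throughput each bookie's disk must sustain, in MB/s.

    Assumes every message is stored `write_copies` times and that the
    load is spread evenly across `bookie_count` bookies.
    """
    total_bytes_per_sec = messages_per_sec * avg_message_bytes * write_copies
    return total_bytes_per_sec / bookie_count / 1_000_000


def throughput_mb_s(iops, io_size_bytes):
    """Throughput implied by an IOPS figure at a given IO size, in MB/s."""
    return iops * io_size_bytes / 1_000_000


# Hypothetical worst day: 50,000 msgs/sec of 1,000 bytes, 2 copies, 4 bookies.
needed = required_disk_mb_s(50_000, 1_000, 2, 4)

# The same IOPS budget moves far more data with larger IOs.
small_io = throughput_mb_s(3_000, 16 * 1024)   # 16 KiB IOs
large_io = throughput_mb_s(3_000, 256 * 1024)  # 256 KiB IOs

print(f"needed per bookie: {needed:.1f} MB/s")
print(f"3000 IOPS at 16 KiB: {small_io:.1f} MB/s, at 256 KiB: {large_io:.1f} MB/s")
```

   Comparing the per-bookie figure with the instance's baseline (not its burst
peak) shows whether the workload can be sustained all day.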
   
   Interestingly enough, the Pulsar documentation itself recommends i3 instance
types, which have their own NVMe disks and will go well beyond EBS performance:
https://pulsar.apache.org/docs/v2.0.1-incubating/deployment/aws-cluster/
   
   I hope this helps you find the best instance size/instance count/cost ratio
for your needs. Keen to hear your next findings.
   
   
   
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
