> On Oct 11, 2013, at 2:52 PM, John-Mark Gurney <j...@funkthat.com> wrote:
> 
> Maksim Yevmenkin wrote this message on Fri, Oct 11, 2013 at 11:17 -0700:
>> i would like to submit the attached bioq patch for review and
>> comments. this is a proof of concept. it helps smooth disk read
>> service times and appears to eliminate outliers. please see the
>> attached pictures (about a week's worth of data)
>> 
>> - c034 "control" unmodified system
>> - c044 patched system
> 
> Can you describe how you got this data?  Were you using the gstat
> code or some other code?

Yes, it's basically gstat data. 

> Also, was your control system w/ the patch, but w/ the sysctl set to
> zero to possibly eliminate any code alignment issues?

Both systems use the same code base and build. The patched system has the patch
included; the "control" system does not. I can rerun my tests with the sysctl set
to zero and use that as the "control". So, the answer to your question is "no".

>> graphs show max/avg disk read service times for both systems across 36
>> spinning drives. both systems are relatively busy serving production
>> traffic (about 10 Gbps at peak). grey shaded areas on the graphs
>> represent times when the systems are refreshing their content, i.e. the
>> disks are both reading and writing at the same time.
> 
> Can you describe why you think this change makes an improvement?  Unless
> you're running 10k or 15k RPM drives, 128 seems like a large number.. as
> that's about half the number of IOPS that a normal HD handles in a second..

Our (Netflix) load is basically random disk io. We have tweaked the system to 
ensure that our io path is "wide" enough, i.e. we read 1 MB per disk io for the 
majority of the requests. However, the offsets we read from are all over the place. 
It appears that we are getting into a situation where reads at larger offsets are 
being delayed because reads at smaller offsets keep "jumping" ahead of them. 
Forcing a bioq insert tail operation, which effectively moves the insertion point 
forward, seems to help us avoid this situation. And, no, we don't use 10k or 15k 
RPM drives, just regular enterprise 7200 RPM SATA drives. 
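
To illustrate the idea, here is a rough userland sketch (illustrative only, not 
the actual patch; the ioq/ioq_insert names, the elevator key, and the offsets in 
main() are made up for the example). It does a one-way elevator style sorted 
insert, similar in spirit to bioq_disksort(), plus a batch counter that forces a 
plain tail insert after "batchsize" sorted inserts and moves the insertion point 
there:

/*
 * Illustrative userland sketch, not the actual kernel patch.
 * One-way elevator (C-LOOK style) sorted insert plus a batch counter
 * that forces a tail insert after `batchsize' sorted inserts and moves
 * the insertion point to that request.
 */
#include <stdio.h>
#include <sys/queue.h>

struct req {
	long long		offset;		/* starting offset of the read */
	TAILQ_ENTRY(req)	link;
};

struct ioq {
	TAILQ_HEAD(, req)	 queue;
	struct req		*insert_point;	/* sorted inserts go after this */
	long long		 last_offset;	/* offset of last dispatched request */
	int			 batched;	/* sorted inserts since last tail insert */
	int			 batchsize;	/* 0 = always sort (stock behaviour) */
};

static void
ioq_init(struct ioq *q, int batchsize)
{
	TAILQ_INIT(&q->queue);
	q->insert_point = NULL;
	q->last_offset = 0;
	q->batched = 0;
	q->batchsize = batchsize;
}

/* One-way elevator key: offsets behind the current head position sort last. */
static long long
ioq_key(const struct ioq *q, long long offset)
{
	if (offset >= q->last_offset)
		return (offset - q->last_offset);
	return (offset + (1LL << 62));
}

/* Forced tail insert: nothing queued so far can be pushed back any more. */
static void
ioq_insert_tail(struct ioq *q, struct req *r)
{
	TAILQ_INSERT_TAIL(&q->queue, r, link);
	q->insert_point = r;
	q->batched = 0;
}

static void
ioq_insert(struct ioq *q, struct req *r)
{
	struct req *cur;

	if (q->batchsize > 0 && q->batched >= q->batchsize) {
		ioq_insert_tail(q, r);
		return;
	}
	q->batched++;
	/* Sorted insert by elevator key, but never ahead of the insertion point. */
	cur = (q->insert_point != NULL) ?
	    TAILQ_NEXT(q->insert_point, link) : TAILQ_FIRST(&q->queue);
	for (; cur != NULL; cur = TAILQ_NEXT(cur, link))
		if (ioq_key(q, r->offset) < ioq_key(q, cur->offset))
			break;
	if (cur != NULL)
		TAILQ_INSERT_BEFORE(cur, r, link);
	else
		TAILQ_INSERT_TAIL(&q->queue, r, link);
}

int
main(void)
{
	static struct req reqs[] = {
		{ .offset = 900 }, { .offset = 100 }, { .offset = 500 },
		{ .offset = 200 }, { .offset = 300 },
	};
	struct ioq q;
	struct req *r;
	size_t i;

	ioq_init(&q, 0);	/* batchsize 0: pure elevator sort */
	for (i = 0; i < sizeof(reqs) / sizeof(reqs[0]); i++)
		ioq_insert(&q, &reqs[i]);
	TAILQ_FOREACH(r, &q.queue, link)
		printf("%lld ", r->offset);	/* prints: 100 200 300 500 900 */
	printf("\n");
	return (0);
}

With batchsize set to 0 the sketch degenerates to a plain sorted insert; with a 
positive value, no queued request can have more than batchsize later arrivals 
inserted ahead of it.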

> I assume you must be regularly seeing queue depths of 128+ for this
> code to make a difference, do you see that w/ gstat?

No, we don't see large (128+) queue sizes in the gstat data. The way I see it, we 
don't have to have a deep queue here. We could just have a steady stream of io 
requests where new, smaller offsets consistently "jump" ahead of older, larger 
offsets. In fact, the gstat data show a shallow queue of 5 or fewer items.
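
To make that concrete, here is a tiny self-contained model of exactly this 
situation (the queue depth, the 128, and the arrival pattern are made up for 
illustration; this is not our production data or the kernel code). The queue 
never grows past a handful of entries, yet with a pure sorted insert a single 
higher-offset read behind a steady stream of lower-offset reads never reaches 
the head, while a batch limit bounds its wait:

#include <stdio.h>

/*
 * Toy model: each time a request is dispatched from the head of the queue,
 * a new lower-offset request arrives.  With batchsize == 0 (pure sorting)
 * the newcomer always sorts ahead of the one higher-offset read; with
 * batchsize > 0, after that many sorted inserts newcomers go to the tail,
 * behind it.  Returns how many dispatches the higher-offset read waits,
 * or -1 if it is still queued after max_ticks dispatches.
 */
static int
waits_for(int depth, int batchsize, int max_ticks)
{
	int ahead = depth - 1;	/* requests queued ahead of the big read */
	int batched = 0;	/* sorted inserts since the forced tail insert */
	int dispatched = 0;

	while (ahead > 0) {
		if (dispatched >= max_ticks)
			return (-1);	/* still queued: effectively starved */
		ahead--;		/* head of the queue goes to the disk */
		dispatched++;
		if (batchsize == 0 || batched < batchsize) {
			ahead++;	/* newcomer sorts ahead of the big read */
			batched++;
		}
		/* else: newcomer is tail-inserted behind the big read */
	}
	return (dispatched);
}

int
main(void)
{
	int stock, limited;

	stock = waits_for(5, 0, 1000000);
	limited = waits_for(5, 128, 1000000);
	if (stock < 0)
		printf("pure sorting:    still queued after 1000000 dispatches\n");
	else
		printf("pure sorting:    served after %d dispatches\n", stock);
	printf("batchsize = 128: served after %d dispatches\n", limited);
	return (0);
}

In this toy model the higher-offset read is served after 132 dispatches with the 
batch limit, but is still queued after a million dispatches with pure sorting, 
even though only four requests are ever ahead of it.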

> Also, do you see a similar throughput of the system?

Yes, we see almost identical throughput from both systems. I have not pushed the 
system to its limit yet, but having a much smoother disk read service time is 
important for us because we use it as one of the components of our system health 
metrics. We also need to ensure that a disk io request is actually dispatched to 
the disk in a timely manner. 

Thanks
Max
