Thanks for your reply Alastair. Definitely interested in thinking about your 
suggestions - some responses below that will hopefully help clarify:

> The first thing to state is that you *can’t* write code of this type with the 
> attitude that “dropping frames is not an option”.  Fundamentally, the problem 
> you have is that if you generate video data faster than it can be saved to 
> disk, there is only so much video data you can buffer up before you start 
> swapping, and if you swap you will be dead in the water --- it will kill 
> performance to the extent that you will not be able to save data as quickly 
> as you could before and the result will be catastrophic, with far more frames 
> dropped than if you simply accepted that there was the possibility the 
> machine you were on was not fast enough and would have to occasionally drop a 
> frame.

I should clarify exactly what I mean here. Under normal circumstances I know 
from measurements that the I/O can keep up with the maximum rate at which 
frames come in. I very rarely see any backlog reported at all, though I might 
occasionally see a transient glitch (if CPU load momentarily spikes) of the 
order of a 10 MB backlog that is soon cleared. With that as the status quo, 
and 8 GB of RAM available, something has gone badly, badly wrong if we enter 
VM swap chaos. 

When I say "dropping frames is not an option", what I mean is that a single 
lost frame will be fairly catastrophic for the scientific experiment, and so my 
priorities, in order, are: (1) ensure the machine specs leave plenty of 
headroom above my actual requirements; (2) do anything relatively simple I can 
to ensure my code is efficient and marks threads/operations/etc. as high or 
low priority where possible; (3) identify things the user should avoid doing 
(which looks like it includes transferring data off the machine while a 
recording session is in progress - hence this email thread!); (4) not worry too 
much about what to do once we *have* already ended up with a catastrophic 
backlog (i.e. whether to drop frames or do something else), because at that 
point we have failed, in the sense that the scientific experiment will 
basically need to be re-run.

I should also clarify that (in spite of my other email thread running in 
parallel to this) I am not doing any complex encoding of the data being 
streamed to disk - these are just basic TIFF images and metadata. The encoding 
I referred to in my other thread is optional offline processing of 
previously-recorded data.

> The right way to approach this type of real time encoding problem is as 
> follows:
> 
> 1. Use statically allocated buffers (or dynamically allocated once at encoder 
> or program startup).  DO NOT dynamically allocate buffers as you generate 
> data.
> 
> 2. Knowing the rate at which you generate video data, decide on the maximum 
> write latency you need to be able to tolerate.  This (plus a bit as you need 
> some free to encode into) will tell you the total size of buffer(s) you need.

OK.

> 3. *Either*
> 
>   (i)  Allocate a ring buffer of the size required, then interleave encoding 
> and issuing I/O requests.  You should keep track of where the 
> as-yet-unwritten data starts in your buffer, so you know when your encoder is 
> about to hit that point.  Or
> 
>   (ii) Allocate a ring *of* fixed size buffers totalling the size required; 
> start encoding into the first one, then when finished, issue an I/O request 
> for that buffer and continue encoding into the next one.  You should keep 
> track of which buffers are in use, so you can detect when you run out.
> 
> 4. When issuing I/O requests, DO NOT use blocking I/O from the encoder 
> thread.  You want to be able to continue to fetch video from your camera and 
> generate data *while* I/O takes place.  GCD is a good option here, or you 
> could use a separate I/O thread with a semaphore, or any other asynchronous 
> I/O mechanism (e.g. POSIX aio, libuv and so on).
> 
> 5. If you find yourself running out of buffers, drop frames until buffer 
> space is available, and display the number of frame drops to the user.  This 
> is *much* better than attempting to use dynamic buffers and then ending up 
> swapping, which is I think what’s happening to you (having read your later 
> e-mails).

I am making good use of GCD here (and like it very much!). There are quite a 
few queues involved, and one is a dedicated disk-writing queue. The main 
CPU-intensive work going on in parallel with this is some realtime image 
analysis, but this is running on a concurrent queue.

Hopefully my detail above explains why I really do not want to drop frames 
and/or use a ring buffer. Effectively I have a buffer pool, but if I exhaust 
the buffer pool then (a) something is going badly wrong, and (b) I prefer to 
expand the buffer pool as a last-ditch attempt to cope with the backlog rather 
than terminating the experiment right then and there.

> Without knowing exactly how much video data you’re generating and what 
> encoder you’re using (if any), it’s difficult to be any more specific, but 
> hopefully this gives you some useful pointers.

As I say, there is no encoding going on in this particular workflow. Absolute 
maximum data rates are of the order of 50 MB/s, but (and this is a non-optimal 
point, though one I would prefer to stick with) the data is split out into a 
sequence of separate files, some of which are as small as ~100 kB in size.

Thanks very much for all your comments
Jonny
_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
