Thank you very much for the valuable information.

I will try the 1st and 3rd options, because the 2nd option would not be easy to apply without heavily refactoring the current code.

Thank you again,
Regards,
JongAm Park

On Aug 2, 2008, at 7:51 AM, James Bucanek wrote:

JongAm Park wrote (Friday, August 1, 2008 7:27 PM -0700):
The function writes about 19,200 bytes per call; I'm writing XDCAM 35 or 50 video data. What is curious is that saving a file in QuickTime movie format from Final Cut Pro itself takes about 20 seconds for 1 minute of XDCAM 35 source, but doing the same through the FCP plug-in whose source code I work with takes about 35 seconds. There is a callback function that FCP calls, and I even tried converting it to a multithreaded version, but it is still much slower than FCP's own scheme.

I measured the performance and found that most of the time was spent in the FSWriteFork() function. Other parts should probably be streamlined as well, but making the file write faster would have the biggest impact.

Writing 19,200 bytes per call is going to add a significant amount of overhead. FSWriteFork() can be blindingly fast, but degrades when given small and/or misaligned chunks of data to deal with.

In my experience, the keys to making FSWriteFork() fly are:

- Align the data to be written in memory to page boundaries (man valloc).
- Write data in multiples of page size blocks (man getpagesize)
- Turn off caching (see Cache Constants of FSWriteFork)

This lets FSWriteFork avoid copying your data into an intermediate buffer that then gets written to disk in pieces. If all of the above prerequisites are met, the file system can perform a DMA transfer directly from your address space to the disk controller.
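For example, a minimal (untested) sketch of that combination might look something like this, assuming forkRef came from an earlier FSOpenFork() call and WriteAlignedChunk() is just an illustrative name:

#include <CoreServices/CoreServices.h>   /* FSWriteFork, noCacheMask, etc. */
#include <stdlib.h>                      /* valloc, free */
#include <string.h>                      /* memcpy, memset */
#include <unistd.h>                      /* getpagesize */

/* Write one chunk through a page-aligned buffer, rounded up to a
   page multiple, with caching turned off. */
static OSErr WriteAlignedChunk(SInt16 forkRef, const void *data, ByteCount dataSize)
{
    size_t    pageSize = (size_t)getpagesize();
    ByteCount request  = (ByteCount)(((dataSize + pageSize - 1) / pageSize) * pageSize);
    void     *buffer   = valloc(request);           /* page-aligned allocation */
    ByteCount actual   = 0;
    OSErr     err;

    if (buffer == NULL)
        return memFullErr;

    memcpy(buffer, data, dataSize);                  /* better: build the data directly
                                                        in the aligned buffer */
    memset((char *)buffer + dataSize, 0,
           request - dataSize);                      /* zero-pad to a page multiple */

    err = FSWriteFork(forkRef,
                      fsAtMark + noCacheMask,        /* write at the current mark, bypass cache */
                      0,                             /* offset (ignored for fsAtMark) */
                      request,
                      buffer,
                      &actual);

    free(buffer);
    return err;
}

In practice you'd want to assemble the video data directly in the valloc'd buffer so there's no memcpy at all, and if the final chunk gets padded you can trim the fork back to its real length afterward with FSSetForkSize.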

In my application, I was primarily interested in increasing read performance. Aligning the data buffers and reading large blocks (2MB) at a time more than doubled the read performance while simultaneously dropping CPU overhead.

Ultimately, I ended up creating a separate thread to read the data from the file into a set of circular buffers, while a second thread processed the data that had already been read. Between the faster I/O and the reduced CPU overhead (which could then be utilized by the thread processing the data) I was able to improve the performance of my application by 350%.
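In rough outline, that scheme looked something like the sketch below (simplified and untested; ProcessChunk(), StreamFork(), and the slot count are placeholders for whatever your plug-in actually needs to do):

#include <CoreServices/CoreServices.h>   /* FSReadFork, noCacheMask, etc. */
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>                      /* valloc, free */

#define kSlotCount 4                     /* buffers in the ring */
#define kChunkSize (2UL * 1024 * 1024)   /* 2 MB, a page-multiple read size */

/* One slot of the ring: a page-aligned buffer plus its fill state */
typedef struct {
    void      *data;
    ByteCount  length;
    bool       filled;
} RingSlot;

static RingSlot        gRing[kSlotCount];
static pthread_mutex_t gLock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  gCond = PTHREAD_COND_INITIALIZER;
static bool            gDone = false;

extern void ProcessChunk(const void *data, ByteCount length);   /* the real work */

/* Reader thread: fill free slots in order with uncached 2 MB reads */
static void *ReaderThread(void *arg)
{
    SInt16 forkRef = *(SInt16 *)arg;
    int    slot    = 0;
    OSErr  err     = noErr;

    while (err == noErr) {
        pthread_mutex_lock(&gLock);
        while (gRing[slot].filled)                   /* wait for the processor to free it */
            pthread_cond_wait(&gCond, &gLock);
        pthread_mutex_unlock(&gLock);

        err = FSReadFork(forkRef, fsAtMark + noCacheMask, 0,
                         kChunkSize, gRing[slot].data, &gRing[slot].length);

        pthread_mutex_lock(&gLock);
        if (gRing[slot].length > 0)
            gRing[slot].filled = true;               /* hand the slot to the processor */
        if (err != noErr)                            /* eofErr or a real error: stop */
            gDone = true;
        pthread_cond_broadcast(&gCond);
        pthread_mutex_unlock(&gLock);

        slot = (slot + 1) % kSlotCount;
    }
    return NULL;
}

/* Processor thread: consume filled slots in order until the reader is done */
static void *ProcessorThread(void *unused)
{
    int slot = 0;

    for (;;) {
        pthread_mutex_lock(&gLock);
        while (!gRing[slot].filled && !gDone)
            pthread_cond_wait(&gCond, &gLock);
        if (!gRing[slot].filled) {                   /* done and nothing left to drain */
            pthread_mutex_unlock(&gLock);
            break;
        }
        pthread_mutex_unlock(&gLock);

        ProcessChunk(gRing[slot].data, gRing[slot].length);

        pthread_mutex_lock(&gLock);
        gRing[slot].filled = false;                  /* give the slot back to the reader */
        pthread_cond_broadcast(&gCond);
        pthread_mutex_unlock(&gLock);

        slot = (slot + 1) % kSlotCount;
    }
    return NULL;
}

/* Allocate the ring, run both threads, and wait for them to finish */
static void StreamFork(SInt16 forkRef)
{
    pthread_t reader, processor;
    int       i;

    for (i = 0; i < kSlotCount; i++)
        gRing[i].data = valloc(kChunkSize);          /* page-aligned buffers */

    pthread_create(&reader, NULL, ReaderThread, &forkRef);
    pthread_create(&processor, NULL, ProcessorThread, NULL);
    pthread_join(reader, NULL);
    pthread_join(processor, NULL);

    for (i = 0; i < kSlotCount; i++)
        free(gRing[i].data);
}

The mutex/condition pair is just the simplest way to hand slots back and forth; the point is that the reader is never stalled by processing, and the processor only waits when the ring is actually empty.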
--
James Bucanek

