One way to speed it up is to do as much work as possible in parallel. One
approach, and this is just off the top of my head, is:

1. Create an NSOperationQueue and add a single operation to that queue to
manage the entire process. (This is because some parts of the process are
synchronous and might take a while, and you don’t want to block the UI thread.)

2. The manager operation creates a second, worker NSOperationQueue. Each
operation added to that queue processes a single image file (the contents of
your `for` loop).

3. The manager operation adds operations to the worker queue to process a
reasonable chunk of the files (10? 50?) and then waits for those operations to
complete. (NSOperationQueue provides waitUntilAllOperationsAreFinished for
this.) It then repeats until all the image files have been processed.

4. As each chunk completes, the manager operation can report status to the UI
thread via a notification or some other means.
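
The four steps above could be sketched roughly as follows. This is untested
against your project and assumes ARC. ProcessFilesInChunks, FileHandler and
ProgressHandler are names I made up for illustration, the handler block stands
in for the body of your `for` loop, and progress is reported with a block on
the main queue instead of a notification.

```objc
#import <Foundation/Foundation.h>

// Hypothetical callback standing in for the body of the original `for` loop.
typedef void (^FileHandler)(NSString *filename, NSUInteger index);

// Hypothetical progress callback, invoked on the main queue after each chunk.
typedef void (^ProgressHandler)(NSUInteger filesDone, NSUInteger filesTotal);

// Returns the manager queue so the caller can keep it alive or wait on it.
static NSOperationQueue *ProcessFilesInChunks(NSArray<NSString *> *imagefiles,
                                              NSUInteger chunkSize,
                                              FileHandler handler,
                                              ProgressHandler progress)
{
    if (chunkSize == 0) chunkSize = 1;   // avoid an endless loop below

    // Step 1: a queue with a single operation that manages the whole process.
    NSOperationQueue *managerQueue = [[NSOperationQueue alloc] init];
    managerQueue.maxConcurrentOperationCount = 1;

    [managerQueue addOperationWithBlock:^{
        // Step 2: a worker queue whose operations each handle one file.
        NSOperationQueue *workerQueue = [[NSOperationQueue alloc] init];
        NSUInteger total = imagefiles.count;

        // Step 3: enqueue one chunk, wait for it, then move to the next.
        for (NSUInteger start = 0; start < total; start += chunkSize) {
            NSUInteger end = MIN(start + chunkSize, total);
            for (NSUInteger i = start; i < end; i++) {
                NSString *filename = imagefiles[i];
                [workerQueue addOperationWithBlock:^{
                    handler(filename, i);
                }];
            }
            // Blocks only this manager operation, not the UI thread.
            [workerQueue waitUntilAllOperationsAreFinished];

            // Step 4: report status on the main (UI) thread.
            [[NSOperationQueue mainQueue] addOperationWithBlock:^{
                progress(end, total);
            }];
        }
    }];
    return managerQueue;
}
```

The index passed to the handler is the position of the filename in imagefiles,
which is what you need in order to store each result in a fixed slot.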

Unlike your synchronous implementation, below, the order of updates to the
dates_and_times array is indeterminate. A way to fix it is to pre-populate the
array with as many placeholder items (NSDate.distantPast?) as there are entries
in imagefiles and then store each iso_date at the same index as its
corresponding filename. Another benefit is that the array is filled to its
final size once, at the beginning, rather than being resized periodically (with
the existing contents copied) as items are added.
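
A minimal sketch of the placeholder idea, assuming ARC. MakePlaceholderDates is
a helper name I made up; it is not part of Foundation.

```objc
#import <Foundation/Foundation.h>

// Builds an array with one placeholder per file so that each result can later
// be stored at the index of its filename, whatever order the workers finish in.
static NSMutableArray<NSDate *> *MakePlaceholderDates(NSUInteger count)
{
    NSMutableArray<NSDate *> *dates = [NSMutableArray arrayWithCapacity:count];
    NSDate *placeholder = [NSDate distantPast];
    for (NSUInteger i = 0; i < count; i++) {
        [dates addObject:placeholder];
    }
    return dates;
}
```

You would call it as MakePlaceholderDates([imagefiles count]). Once all the
work is done, any entry still equal to NSDate.distantPast belongs to a file for
which no date could be parsed.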

And since all these operations run on different threads, you need to protect
access to your dates_and_times array, because modifying it is not thread-safe.
One quick way is to create an NSLock and lock it around the array update:

if (iso_date) {   // storing nil in an NSMutableArray raises an exception
    [theLock lock];
    dates_and_times[index] = iso_date;
    [theLock unlock];
}

Anyway, another way to look at the process.

Steve


> On Aug 14, 2022, at 2:22 PM, Gabriel Zachmann via Cocoa-dev 
> <cocoa-dev@lists.apple.com> wrote:
> 
> I would like to collect the date/time stored in an EXIF tag in a bunch of 
> images.
> 
> I thought I could do so with the following procedure
> (some details and error checking omitted for sake of clarity):
> 
> 
>    NSMutableArray * dates_and_times = [NSMutableArray arrayWithCapacity: [imagefiles count]];
>    CFDictionaryRef exif_dict;
>    CFStringRef dateref = NULL;
>    for ( NSString* filename in imagefiles )
>    {
>        // escapes any chars that are not allowed in URLs (space, &, etc.)
>        NSURL * imgurl = [NSURL fileURLWithPath: filename isDirectory: NO];
>        CGImageSourceRef image = CGImageSourceCreateWithURL( (__bridge CFURLRef) imgurl, NULL );
>        CFDictionaryRef fileProps = CGImageSourceCopyPropertiesAtIndex( image, 0, NULL );
>        bool success = CFDictionaryGetValueIfPresent( fileProps, kCGImagePropertyExifDictionary, (const void **) & exif_dict );
>        success = CFDictionaryGetValueIfPresent( exif_dict, kCGImagePropertyExifDateTimeDigitized, (const void **) & dateref );
>        NSString * date_str = [[NSString alloc] initWithString: (__bridge NSString * _Nonnull)( dateref ) ];
>        NSDate * iso_date = [isoDateFormatter_ dateFromString: date_str];
>        if ( iso_date )
>             [dates_and_times addObject: iso_date ];
>        CFRelease( fileProps );
>    }
> 
> 
> But, I get the impression, this code actually loads each and every image.
> On my Macbook, it takes 3m30s for 250k images (130GB).
> 
> So, the big question is: can it be done faster?
> 
> I know the EXIF tags are part of the image file, but I was hoping it might be 
> possible to load only those EXIF dictionaries.
> Or are the CGImage functions above already clever enough to implement this 
> idea?
> 
> 
> Best regards, Gab.
