Re: Retrieving the EXIF date/time from 250k images

2022-08-14 Thread Allyn Bauer via Cocoa-dev
Sorry if there are duplicate emails.

I think an important question here is: what exactly are you trying to do?
Do you really expect to process these images repeatedly? 3.5 minutes for a
quarter million pictures isn't terrible when you already have a working
solution. If you have a specific task in mind, it would help to know what
level of performance would be acceptable.

___

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)

Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com

Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com

This email sent to arch...@mail-archive.com


Re: Retrieving the EXIF date/time from 250k images

2022-08-14 Thread Gary L. Wade via Cocoa-dev
I noticed you release fileProps but not the image source; I don’t know
whether that’s one of the details you left out for clarity.  Also, although
the initWithString: call on a CFStringRef may essentially be a no-op
(depending on factors like mutability), you can simply cast dateref and pass
it directly to dateFromString:.
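
A minimal sketch of the two points above, assuming ARC and assuming the
formatter is an NSDateFormatter already configured for the stored date
layout. The helper name ExifDateForFile is hypothetical and not from the
original post; the Image I/O and Core Foundation calls are the ones the
original code already uses.

```objc
#import <Foundation/Foundation.h>
#import <ImageIO/ImageIO.h>

// Hypothetical helper: returns the parsed DateTimeDigitized value of one
// image file, or nil if the file, its EXIF dictionary, or the tag is missing.
static NSDate *ExifDateForFile( NSString *filename, NSDateFormatter *formatter )
{
    NSURL *imgurl = [NSURL fileURLWithPath: filename isDirectory: NO];
    CGImageSourceRef image = CGImageSourceCreateWithURL( (__bridge CFURLRef) imgurl, NULL );
    if ( image == NULL )
        return nil;

    NSDate *date = nil;
    CFDictionaryRef fileProps = CGImageSourceCopyPropertiesAtIndex( image, 0, NULL );
    if ( fileProps != NULL )
    {
        CFDictionaryRef exif_dict = NULL;
        CFStringRef dateref = NULL;
        if ( CFDictionaryGetValueIfPresent( fileProps, kCGImagePropertyExifDictionary,
                                            (const void **) &exif_dict )
             && CFDictionaryGetValueIfPresent( exif_dict, kCGImagePropertyExifDateTimeDigitized,
                                               (const void **) &dateref ) )
        {
            // Toll-free bridging: cast the CFStringRef, no intermediate copy.
            date = [formatter dateFromString: (__bridge NSString *) dateref];
        }
        CFRelease( fileProps );   // balances CGImageSourceCopyPropertiesAtIndex
    }
    CFRelease( image );           // balances CGImageSourceCreateWithURL
    return date;
}
```

The values obtained with CFDictionaryGetValueIfPresent follow the Get rule,
so only the two objects returned by Create/Copy calls are released here.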

One thing I’d suggest is to do the work for each image asynchronously on a
background queue.  Have that block (essentially all of your for-loop code)
report its completion asynchronously, for example by posting a notification
on the original queue, along with the result you care about: the parsed date
associated with the particular file.  Let the original queue decide how to
store each parsed date; a dictionary keyed by filename with the date as the
value would probably be best.  To limit memory pressure, create your
background queue as concurrent, with its autorelease frequency set to
workItem.  To know when everything is done, you could use a DispatchGroup to
track the blocks, and you could pass back NSNull or nil as the parsed result
if the date could not be parsed.
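
The suggestion above could be sketched roughly as follows. This is only an
illustration under stated assumptions: CollectExifDates, ExifDateReader, and
the queue label are hypothetical names, originalQueue is assumed to be a
serial queue (for example the main queue), and readDate stands in for the
body of the original for-loop and is assumed to be safe to call from several
threads at once.

```objc
#import <Foundation/Foundation.h>

// Hypothetical per-file worker, e.g. the body of the original for-loop.
typedef NSDate * (^ExifDateReader)( NSString *filename );

static void CollectExifDates( NSArray<NSString *> *imagefiles,
                              ExifDateReader readDate,
                              dispatch_queue_t originalQueue,   // assumed serial
                              void (^completion)( NSDictionary<NSString *, id> *datesByFile ) )
{
    // Concurrent queue that drains its autorelease pool after every block.
    dispatch_queue_attr_t attr = dispatch_queue_attr_make_with_autorelease_frequency(
        DISPATCH_QUEUE_CONCURRENT, DISPATCH_AUTORELEASE_FREQUENCY_WORK_ITEM );
    dispatch_queue_t workers = dispatch_queue_create( "exif.workers", attr );
    dispatch_group_t group = dispatch_group_create();

    NSMutableDictionary<NSString *, id> *datesByFile =
        [NSMutableDictionary dictionaryWithCapacity: [imagefiles count]];

    for ( NSString *filename in imagefiles )
    {
        dispatch_group_async( group, workers, ^{
            id value = readDate( filename );
            if ( value == nil )
                value = [NSNull null];      // marks "could not be parsed"

            // Only the original (serial) queue touches the dictionary.
            dispatch_group_async( group, originalQueue, ^{
                datesByFile[filename] = value;
            });
        });
    }

    // Runs once every worker block and every store block has finished.
    dispatch_group_notify( group, originalQueue, ^{
        completion( [datesByFile copy] );
    });
}
```

Submitting one block per file keeps the sketch short, but blocks that wait
on disk I/O can cause the system to create many threads; bounding the number
of blocks in flight is deliberately left out here.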

Of course, how much this helps will depend on whether your file system is
local rather than network-based, whether it’s on an SSD or a hard disk, and
other physical system factors.
--
Gary




Retrieving the EXIF date/time from 250k images

2022-08-14 Thread Gabriel Zachmann via Cocoa-dev
I would like to collect the date/time stored in an EXIF tag in a bunch of 
images.

I thought I could do so with the following procedure
(some details and error checking omitted for the sake of clarity):


    NSMutableArray * dates_and_times = [NSMutableArray arrayWithCapacity: [imagefiles count]];
    CFDictionaryRef exif_dict;
    CFStringRef dateref = NULL;
    for ( NSString* filename in imagefiles )
    {
        NSURL * imgurl = [NSURL fileURLWithPath: filename isDirectory: NO];  // escapes any chars that are not allowed in URLs (space, &, etc.)
        CGImageSourceRef image = CGImageSourceCreateWithURL( (__bridge CFURLRef) imgurl, NULL );
        CFDictionaryRef fileProps = CGImageSourceCopyPropertiesAtIndex( image, 0, NULL );
        bool success = CFDictionaryGetValueIfPresent( fileProps, kCGImagePropertyExifDictionary, (const void **) & exif_dict );
        success = CFDictionaryGetValueIfPresent( exif_dict, kCGImagePropertyExifDateTimeDigitized, (const void **) & dateref );
        NSString * date_str = [[NSString alloc] initWithString: (__bridge NSString * _Nonnull)( dateref ) ];
        NSDate * iso_date = [isoDateFormatter_ dateFromString: date_str];
        if ( iso_date )
            [dates_and_times addObject: iso_date ];
        CFRelease( fileProps );
    }


But I get the impression that this code actually loads each and every image.
On my MacBook, it takes 3m30s for 250k images (130 GB).

So, the big question is: can it be done faster?

I know the EXIF tags are part of the image file, but I was hoping it might be
possible to load only the EXIF dictionaries.
Or are the CGImage functions above already clever enough to do this?


Best regards, Gab.



