Kyle,
I think we need to break down your question...
It seems that the thematic folders and the file names may be adequate
sources of descriptive tags to start with. Perhaps you could try to
identify patterns to extract information for tags (e.g., "hall",
"committee", "holiday"). You could traverse the file system and
use the Google Data API for Picasa to do an initial upload with tags
generated from those folder and file name patterns. You will need
*some* kind of user input to get to more detail.
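
As a rough illustration of the folder and file name idea, here is a minimal
Python sketch. It only derives candidate tags and does not upload anything.
The keyword list, the image extensions, and the function names
(tags_for_path, collect_tags) are all hypothetical, so adjust them to
whatever actually appears in the units' folders.

```python
import os
import re

# Hypothetical keyword list; replace with terms that occur in your folders.
KEYWORDS = ("hall", "committee", "holiday")
# Assumed set of image extensions worth tagging.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def tags_for_path(rel_path, keywords=KEYWORDS):
    """Return the sorted keywords found in the folder and file names."""
    stem, _ext = os.path.splitext(rel_path)
    # Split on anything that is not a letter or digit, so path separators,
    # underscores, hyphens, and spaces all act as word boundaries.
    tokens = {t for t in re.split(r"[^a-z0-9]+", stem.lower()) if t}
    return sorted(k for k in keywords if k in tokens)


def collect_tags(root):
    """Walk root and map each image's relative path to its candidate tags."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                result[rel] = tags_for_path(rel)
    return result
```

Calling collect_tags on the top-level photo folder gives a dictionary you
could review before any upload. Matching is on whole words only, so "halls"
or "FinanceCommittee" would be missed; loosening that is a judgment call
that depends on how consistent the naming is.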
So, are you asking whether it's possible for software to generate
metadata for what is pictured in a photograph without user input?
Are you intending for facial recognition software to identify who is in
a picture or just that there are people in it?
I think you're on the right track by using Picasa. I think it may be
easier for people in those departmental units to help tag people, and
add descriptions through that interface. Then you can write some kind
of script that slurps them into your repository.
Are you using the Google Data API for Picasa?
https://developers.google.com/picasa-web/
For batching jobs, you might look at GoogleCL:
http://code.google.com/p/googlecl/wiki/ExampleScripts#Picasa
If you don't want to or are not allowed to upload all your photos to the
web first, you will probably have to look at Pyfaces
[http://code.google.com/p/pyfaces/] or Open Source Computer Vision
[http://opencv.willowgarage.com/wiki/FaceDetection], but they smell of
pain to me.
-Shaun
On 11/20/12 2:54 PM, Kyle Banerjee wrote:
I am in the process of examining how photo collections maintained by campus
units can be incorporated into the library's repository. In all cases that
I've had to deal with so far, they're just using the file system -- i.e.
traversing folders that arrange images thematically to file names that
indicate the content.
Each of these collections contains many thousands of images. This means
that it's a hassle for them to find images, but also that there's no way
library staff alone will be able to handle all the metadata creation.
I'd like to use something slick like picasa to help them out (facial
recognition is an especially big deal for us). But I'm finding the metadata
to be both minimalist and clunky to work with so I wanted a reality check
to see if I'm not doing this the dumb way. Things I've noticed:
1. Picasa appears to store info in xmp rather than exif which is great
given the limitations of exif. However, I haven't yet found a way to use
more than a couple fields. The caption shows up in a description field, and
the tags show up in subjects. But aside from that, I'm at a loss as to how to
populate other DC fields through the interface.
2. Facial recognition metadata doesn't show up in xmp at all. However, I
can get that by parsing .picasa.ini and contacts.xml (clunky, but doable).
I'm kind of tempted to tell people to go into albums and batch tag the
people albums since it's going to be fun explaining how to locate these
hidden files.
My real question is whether anyone has come up with a really good way to
assign metadata to thousands of photos, preferably in batch fashion? Thanks,
kyle
--
Shaun D. Ellis
Digital Library Interface Developer
Firestone Library, Princeton University
voice: 609.258.1698 | sha...@princeton.edu