Re: [CODE4LIB] TIFF Metadata to XML?
On Jul 19, 2011, at 11:03 AM, Joe Hourcle wrote: On Jul 19, 2011, at 10:34 AM, Stern, Randall wrote: Also, see FITS (http://code.google.com/p/fits/) FITS is an open source java toolset we wrote that wraps JHOVE, ExifTool, and several other format analysis tools and produces a single XML output stream. It also includes a crosswalk to MIX XML as an optional output. Really? You named a tool that deals with image data 'FITS' ? You do realize there's actually a 30+ year old image standard called FITS: http://fits.gsfc.nasa.gov/ (which has its own metadata standard, just to make things even more interesting) This appears to be a known issue in their tracker: http://code.google.com/p/fits/issues/detail?id=10. Medium priority. Also, it's worth comparing FITS's output to the output of exiftool -X. IIRC FITS uses a low level of verbosity with their FITS integration, though this may not be noticeable with some formats. Dave Rice avpreserve.com
Re: [CODE4LIB] TIFF Metadata to XML?
Try exiftool with the -X flag to get RDF XML output. Dave Rice avpreserve.com On Jul 18, 2011, at 9:18 AM, Edward M. Corrado wrote: Hello All, Before I re-invent the wheel or try many different programs, does anyone have a suggestion on a good way to extract embedded Metadata added by cameras and (more importantly) photo-editing programs such as Photoshop from TIFF files and save it as XML? I have 60k photos that have metadata including keywords, descriptions, creator, and other fields embedded in them and I need to extract the metadata so I can load them into our digital archive. Right now, after looking at a few tools and having done a number of Google searches, I haven't found anything that seems to do what I want. As of now I am leaning towards extracting the metadata using exiv2 and creating a script (shell, perl, whatever) to put the fields I need into a pseudo-Dublin Core XML format. I say pseudo because I have a few fields that are not Dublin Core. I am assuming there is a better way. (Although part of me thinks it might be easier to do that than exporting to XML and using XSLT to transform the file since I might need to do a lot of cleanup of the data regardless.) Anyway, before I go any further, does anyone have any thoughts/ideas/suggestions? Edward
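For what it's worth, the extract-then-map approach Edward describes could be sketched in Python roughly as follows. This is only a sketch under assumptions: it uses exiftool's JSON output (-j with -G group prefixes) rather than exiv2 or -X, it assumes exiftool is on the PATH, and the FIELD_MAP tag names and the "record" wrapper element are illustrative guesses that would need checking against what the TIFFs actually contain.

```python
import json
import subprocess
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Assumed mapping from exiftool tag names (as printed with -G) to
# Dublin Core element names. Adjust after inspecting real files.
FIELD_MAP = {
    "XMP:Title": "title",
    "XMP:Creator": "creator",
    "XMP:Description": "description",
    "XMP:Subject": "subject",
}


def read_metadata(path):
    """Run exiftool (assumed to be on PATH) and return its JSON record for one file."""
    result = subprocess.run(
        ["exiftool", "-j", "-G", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)[0]


def to_pseudo_dc(record):
    """Build a pseudo-Dublin Core XML element from one metadata record (a dict)."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # hypothetical wrapper element
    for tag, name in FIELD_MAP.items():
        value = record.get(tag)
        if value is None:
            continue
        # List-type tags such as keywords may arrive as a list; emit one element each.
        values = value if isinstance(value, list) else [value]
        for item in values:
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = str(item)
    return root
```

Calling to_pseudo_dc(read_metadata("photo.tif")) and serializing with ET.tostring would give one record per image; any cleanup of the values could happen in the loop before the element is written.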
Re: [CODE4LIB] Jpeg2000 and XMP metadata
Hi Joel, On Mar 23, 2011, at 9:45 AM, Richard, Joel M wrote: Morning, all! I thought I'd crowdsource this question. 8+ hours of beating up on this and I haven't found a good solution. We have some software that processes the scanned pages of a book. They come to me as TIFF and I am converting to JP2 in order to upload to the Internet Archive. The trouble is that I can't find a reliable piece of code or a process to add XMP metadata to the JP2. (FWIW, we're using the Jasper library) - exiftool doesn't seem to be working either. exiftool works for me. Can you send the command you're testing? If I run: exiftool -tagsfromfile source.tiff output.jp2 then I do get the XMP copied from the tiff to the jp2. Best Regards Dave Rice avpreserve.com
Re: [CODE4LIB] Jpeg2000 and XMP metadata
On Mar 23, 2011, at 10:26 AM, Dave Rice wrote: Hi Joel, On Mar 23, 2011, at 9:45 AM, Richard, Joel M wrote: Morning, all! I thought I'd crowdsource this question. 8+ hours of beating up on this and I haven't found a good solution. We have some software that processes the scanned pages of a book. They come to me as TIFF and I am converting to JP2 in order to upload to the Internet Archive. The trouble is that I can't find a reliable piece of code or a process to add XMP metadata to the JP2. (FWIW, we're using the Jasper library) - exiftool doesn't seem to be working either. exiftool works for me. Can you send the command you're testing? If I run: exiftool -tagsfromfile source.tiff output.jp2 then I do get the XMP copied from the tiff to the jp2. Although a problem with this approach is that now an XMP that describes a tiff is embedded in a jp2. Perhaps you could parse the source XMP for selects and then use exiftool to write relevant tags to the output file. Best Regards Dave Rice avpreserve.com
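A rough Python sketch of the "copy only the relevant tags" idea above: it builds an exiftool command line that names the tags to copy instead of copying everything. The tag names in the usage note (XMP-dc:Title and so on) are only examples, and the function name is made up for illustration; which tags are actually relevant depends on the source TIFFs.

```python
import subprocess


def build_copy_command(source, target, tags):
    """Return an exiftool argument list that copies only the named tags
    from source to target using -tagsfromfile."""
    if not tags:
        # Without explicit tags, -tagsfromfile copies everything it can,
        # which is the behaviour this sketch is trying to avoid.
        raise ValueError("at least one tag name is required")
    cmd = ["exiftool", "-tagsfromfile", source]
    cmd += [f"-{tag}" for tag in tags]
    cmd.append(target)
    return cmd


def copy_selected_tags(source, target, tags):
    """Run the command; assumes exiftool is installed and on PATH."""
    subprocess.run(build_copy_command(source, target, tags), check=True)
```

For example, build_copy_command("source.tiff", "output.jp2", ["XMP-dc:Title", "XMP-dc:Creator"]) yields the equivalent of: exiftool -tagsfromfile source.tiff -XMP-dc:Title -XMP-dc:Creator output.jp2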
Re: [CODE4LIB] Jpeg2000 and XMP metadata
Hi Joel, On Mar 23, 2011, at 10:57 AM, Richard, Joel M wrote: Dave, Thanks for the response... I tried this and it sort of works with a warning about IPTC, but that's an effect of the data in the TIFF. Here's some results of my experimentation and an example of what I've tried with exiftool. exiftool -xmp test.tif -b > xmp.xml exiftool '-xmp=xmp.xml' test.jp2 No namespace for XMP Warning: Can't write XMP:XMP (namespace unknown) - test.jp2 0 image files updated 1 image files unchanged I must be doing something wrong, but I can't see anything obviously wrong in the XMP file. Or perhaps my technique is simply invalid. I'll admit I'm a newbie when it comes to exiftool. Try: exiftool -tagsfromfile xmp.xml test.jp2 instead of exiftool '-xmp=xmp.xml' test.jp2 Dave Rice avpreserve.com --Joel On Mar 23, 2011, at 10:31 AM, Dave Rice wrote: On Mar 23, 2011, at 10:26 AM, Dave Rice wrote: Hi Joel, On Mar 23, 2011, at 9:45 AM, Richard, Joel M wrote: Morning, all! I thought I'd crowdsource this question. 8+ hours of beating up on this and I haven't found a good solution. We have some software that processes the scanned pages of a book. They come to me as TIFF and I am converting to JP2 in order to upload to the Internet Archive. The trouble is that I can't find a reliable piece of code or a process to add XMP metadata to the JP2. (FWIW, we're using the Jasper library) - exiftool doesn't seem to be working either. exiftool works for me. Can you send the command you're testing? If I run: exiftool -tagsfromfile source.tiff output.jp2 then I do get the XMP copied from the tiff to the jp2. Although a problem with this approach is that now an XMP that describes a tiff is embedded in a jp2. Perhaps you could parse the source XMP for selects and then use exiftool to write relevant tags to the output file. Best Regards Dave Rice avpreserve.com
Re: [CODE4LIB] Looking for a Word to EAD converter
Hi Daniel, Word and EAD aren't really equivalent concepts. The sample Word document that you posted looks like it follows some defined set of rules for its structure. Potentially, if you had a lot of these documents, one could write a script to convert the Word documents to raw text files (assuming the formatting doesn't provide any semantic meaning), then use a text parser to isolate the various discrete expressions from each Word document and map them into an EAD structure. This may be tricky if the Word documents don't all follow the same rules, and also if a Word document does not provide enough data to meet the minimal requirements of an EAD expression. David Rice AudioVisual Preservation Solutions 350 7th Avenue, Suite 1603 New York, NY 10001 ph: 212-564-2140 cell: 347-213-3517 www.avpreserve.com On 10/7/10 1:43 PM, Ethan Gruber wrote: Hi Daniel, I don't see how this will be possible. A program can't make semantically appropriate decisions for mapping prose to EAD tags. You'll just have to go with the copy-paste method in something like oXygen. Ethan Gruber On Thu, Oct 7, 2010 at 1:36 PM, Cornwall, Daniel D (EED) daniel.cornw...@alaska.gov wrote: Hi All, While I think what I'm looking for doesn't exist, I wanted to ask some experts before making confident assertions. Our institution has a lot of finding aids for photo and manuscript collections in MS Word Format. They have pretty standard subheadings. An example can be found at http://www.library.state.ak.us/hist/hist_docs/finding_aids/MS220.doc . I've had inquiries about getting these Word finding aids converted to EAD (Encoded Archival Description) through some sort of converter. I haven't been able to locate any such program, but maybe that's a reflection on my searching skills. 
There are a number of programs to create EAD finding aids from scratch and I've recommended acquiring one of these programs and getting staff to rekey/copy paste from Word into the EAD finding aid program. Staff are not willing to do this at least until I can demonstrate that there is no automated way to convert our finding aids. Of course, if there is a converter, so much the better. Thanks in advance for any enlightenment you can give me. - Daniel === Daniel Cornwall Head of Technical and Imaging Services Division of Libraries, Archives and Museums PO Box 110571 Juneau, AK 99811-0571 Phone (907) 465-6332 Fax (907) 465-2665 E-Mail: dan.cornw...@alaska.gov See Division resources at http://lam.alaska.gov/ . Any opinions expressed in this e-mail are mine alone and not those of my employer unless explicitly stated.
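The text-parsing route David outlines at the top of this message could look something like the Python sketch below, applied to a finding aid that has already been converted from Word to plain text. Everything specific here is an assumption: the subheading names in HEADING_TO_EAD are hypothetical (the real ones would come from the actual finding aids), and the output is only an archdesc fragment, not a complete or validated EAD document.

```python
import xml.etree.ElementTree as ET

# Hypothetical subheadings mapped to EAD element names.
# Replace with the headings the real finding aids actually use.
HEADING_TO_EAD = {
    "Title": "unittitle",
    "Scope and Contents Note": "scopecontent",
    "Biographical Note": "bioghist",
}


def split_sections(text):
    """Group non-blank lines under the most recent recognised heading.
    Lines before the first recognised heading are ignored."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        heading = stripped.rstrip(":")
        if heading in HEADING_TO_EAD:
            current = heading
            sections[current] = []
        elif stripped and current is not None:
            sections[current].append(stripped)
    return sections


def sections_to_ead(sections):
    """Map parsed sections into an EAD <archdesc> fragment."""
    archdesc = ET.Element("archdesc", level="collection")
    did = ET.SubElement(archdesc, "did")
    for heading, lines in sections.items():
        tag = HEADING_TO_EAD[heading]
        body = " ".join(lines)
        if tag == "unittitle":
            # Descriptive identification fields live inside <did>.
            ET.SubElement(did, tag).text = body
        else:
            # Narrative notes are wrapped in a paragraph element.
            ET.SubElement(ET.SubElement(archdesc, tag), "p").text = body
    return archdesc
```

As both replies point out, this only works as far as the documents share the same subheadings; anything that falls outside the recognised headings is silently dropped here and would still need manual review.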