ID:               27238
User updated by:  philip at nancarrow dot net
Reported By:      philip at nancarrow dot net
Status:           Open
Bug Type:         Feature/Change Request
Operating System: Windows and Linux
PHP Version:      4.3.4
New Comment:

Pierre,

OK sure, I've put two JPEGs that include IIM record 1 at:

http://www.nancarrow.net/download/testpic1_latin1.jpg [Latin1 encoded English]
and
http://www.nancarrow.net/download/testpic2_utf8.jpg [UTF8 encoded Chinese]

The IPTC/NAA (aka "IIM") spec is freely downloadable from
http://www.iptc.org/download/download.php?fn=IIMV4.1.pdf
and it details all records, including record 1.

Appendix C lists the currently defined character sets; the one in use is specified in dataset 1:90. Note the strange IPTC terminology - an "octet" is a byte, and "octet 2/5" means 0x25 (high nibble 2, low nibble 5).

The character set sequence starts with ESC, so where the spec says ISO-8859-1 is "intermediate character 2/12 to 2/15" followed by "octet 4/1", this would be something like ESC,0x2F,0x41 or "ESC/A". Similarly, UTF8 is ESC,2/5,4/7 (that is, ESC,0x25,0x47) or "ESC%G". Where the spec says "intermediate character 2/12 to 2/15", most creators writing the file use the end character, i.e. 2/15 in this case.

I'm not sure that PHP really needs to know about the encoding, does it? Since strings are just byte sequences in PHP, I guess it's down to the application to do the appropriate encoding/decoding... as long as it has access to the character set, of course!

Thanks
Philip


Previous Comments:
------------------------------------------------------------------------

[2004-02-13 09:29:23] [EMAIL PROTECTED]

> I can provide you with JPEG files containing IIM record 1
> if required; they're quite common in the news industry.

Please do :) If you can, provide a URL with some images with the required fields and a txt file for the expected result.

Note that I never read the charset part in any docs about the IPTC standard. Do you have a link that describes it?

pierre

------------------------------------------------------------------------

[2004-02-13 06:27:10] philip at nancarrow dot net

Description:
------------
The iptcparse() function (GD extension) only returns IPTC/NAA records 2 and upward, skipping past record 1.
This appears to be by design, but it means that the returned data is incomplete; for example, the "destination" dataset 1:05 is missing.

Worse than this is the fact that "coded character set" (1:90) is missing, and without this value the encoding of the data is unknown (for example, if 1:90 specifies ESC,%,G the data is UTF8 encoded). I assume that the current implementation is defaulting to ASCII or Latin1 encoding.

I can provide you with JPEG files containing IIM record 1 if required; they're quite common in the news industry.

Thank you

------------------------------------------------------------------------

--
Edit this bug report at http://bugs.php.net/?id=27238&edit=1
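
To make the request concrete, here is a rough sketch of what an application could do if record 1 were exposed. It is written in Python purely for illustration and is not how iptcparse() works. It assumes the raw IIM block has already been pulled out of the JPEG (that step is not shown), that every dataset uses the standard 5-octet header (tag marker 0x1C, record number, dataset number, 2-octet big-endian length), and that Latin1 is an acceptable fallback when 1:90 is absent or unrecognised. The names parse_datasets, charset_from_1_90 and decode_text are made up for this example.

```python
import struct
from typing import Dict, List, Optional, Tuple

ESC = 0x1B
TAG_MARKER = 0x1C  # every IIM dataset begins with this octet


def parse_datasets(block: bytes) -> Dict[Tuple[int, int], List[bytes]]:
    """Collect raw values keyed by (record, dataset), including record 1."""
    found: Dict[Tuple[int, int], List[bytes]] = {}
    pos = 0
    while pos + 5 <= len(block) and block[pos] == TAG_MARKER:
        record, dataset, length = struct.unpack_from(">BBH", block, pos + 1)
        if length & 0x8000:
            break  # extended dataset: deliberately not handled in this sketch
        start = pos + 5
        found.setdefault((record, dataset), []).append(block[start:start + length])
        pos = start + length
    return found


def charset_from_1_90(value: bytes) -> Optional[str]:
    """Map the escape sequence in dataset 1:90 to a Python codec name."""
    if value == bytes([ESC, 0x25, 0x47]):  # ESC % G
        return "utf-8"
    # ESC, intermediate 2/12..2/15, final 4/1, as described in the comment above
    if len(value) == 3 and value[0] == ESC and 0x2C <= value[1] <= 0x2F and value[2] == 0x41:
        return "latin-1"
    return None  # unknown or unsupported sequence


def decode_text(block: bytes, record: int, dataset: int,
                default: str = "latin-1") -> List[str]:
    """Decode every value of record:dataset using the charset declared in 1:90."""
    datasets = parse_datasets(block)
    declared = datasets.get((1, 90))
    codec = charset_from_1_90(declared[0]) if declared else None
    # Falling back to `default` is an assumption of this sketch, not spec behaviour.
    return [raw.decode(codec or default, errors="replace")
            for raw in datasets.get((record, dataset), [])]
```

With a block containing 1:90 set to ESC%G, decode_text(block, 2, 120) would return the caption as UTF8-decoded text; without 1:90 the same call can only guess. That is the gap this report describes.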