Sure! That would also give even more scrutiny to the code. I'm not 100%
sure this is totally correct, but I got wonderful help from Phil Harvey
(ExifTool) to get the charset/encoding correct.
So I'm pretty confident. How do I contribute?
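
For the user guide, the encoding step could be sketched roughly as follows,
using only the JDK so it can be checked without Sanselan. This is a minimal
sketch, not a definitive implementation: the class and method names
(UserCommentPayload, encode, decode) are made up for illustration, and it
assumes the text should be UTF-16 in the same byte order as the file, as in
the snippet quoted below.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class; not part of Sanselan or Commons Imaging.
class UserCommentPayload {
    // 8-byte character-code prefix "UNICODE\0", same bytes as in the thread.
    private static final byte[] UNICODE_MARKER =
            { 0x55, 0x4E, 0x49, 0x43, 0x4F, 0x44, 0x45, 0x00 };

    // Assumption: the UTF-16 flavour follows the byte order of the file.
    private static Charset utf16(boolean littleEndian) {
        return littleEndian ? StandardCharsets.UTF_16LE : StandardCharsets.UTF_16BE;
    }

    /** Builds prefix + UTF-16 text, ready to be stored as an UNDEFINED field. */
    public static byte[] encode(String text, boolean littleEndian) {
        byte[] comment = text.getBytes(utf16(littleEndian));
        // copyOf pads with zeros up to the full length; the text is copied in next.
        byte[] out = Arrays.copyOf(UNICODE_MARKER, UNICODE_MARKER.length + comment.length);
        System.arraycopy(comment, 0, out, UNICODE_MARKER.length, comment.length);
        return out;
    }

    /** Inverse of encode, only used here as a round-trip check. */
    public static String decode(byte[] payload, boolean littleEndian) {
        int offset = UNICODE_MARKER.length;
        return new String(payload, offset, payload.length - offset, utf16(littleEndian));
    }

    public static void main(String[] args) {
        String text = "\u00E6\u00F8\u00E5"; // "æøå", escaped to avoid source-encoding issues
        byte[] payload = encode(text, true);
        System.out.println("payload length: " + payload.length);
        System.out.println("round trip ok: " + text.equals(decode(payload, true)));
    }
}
```

The resulting byte array is what would be handed to TiffOutputField with
FIELD_TYPE_UNDEFINED, as in the quoted code.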

Btw, you wouldn't happen to know anything about IPTC and XMP, would you? It
seems the EXIF tags I'm writing (UserComment and ImageDescription) are not
enough for the comment to appear as a caption in image viewer software
(like Picasa etc). I was wondering (hoping) Sanselan could write the
following tags:

IPTC:Caption-Abstract
and
XMP:Description


Joakim
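
PS: Regarding the garbled "쎦쎸쎥" output and the US-ASCII observation quoted
further down, here is a small JDK-only sketch of two ways an encoding
mismatch can show up. It is only a guess at what happened in my case, not a
confirmed diagnosis: it assumes the text was stored as UTF-8 bytes and then
read back as big-endian UTF-16. The class and method names are made up for
illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it does not touch any image file.
class EncodingMismatchDemo {
    /** Encodes with US-ASCII; characters outside ASCII are replaced by '?'. */
    public static String roundTripAscii(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    /** Encodes as UTF-8, then (wrongly) decodes the same bytes as UTF-16BE. */
    public static String utf8ReadAsUtf16Be(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String text = "\u00E6\u00F8\u00E5"; // "æøå"
        System.out.println(roundTripAscii(text));
        // Print code points instead of glyphs to stay independent of the console charset.
        for (char c : utf8ReadAsUtf16Be(text).toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

Each of æ, ø and å takes two bytes in UTF-8 (C3 A6, C3 B8, C3 A5), and each
byte pair read as one UTF-16BE unit gives U+C3A6, U+C3B8 and U+C3A5, which
lie in the Hangul range and appear to match what Windows Explorer showed.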

On 1 June 2016 at 14:55, Benedikt Ritter <[email protected]> wrote:

> Hello Joakim,
>
> glad you found out what to do. This would make for a good addition to the
> user guide. Would you like to contribute your findings?
>
> Benedikt
>
> Joakim Knudsen <[email protected]> wrote on Tue, 31 May 2016 at
> 19:21:
>
> > Btw, ENCODING_UTF16 is just a String = "UTF-16LE" (Little Endian)
> >
> > On 31 May 2016 at 19:20, Joakim Knudsen <[email protected]> wrote:
> >
> > > Following a post on the User-Commons-Apache log (from 2012), I ended up
> > > with the following code which seems to work.
> > > It writes proper Unicode, which I can read back successfully using
> > > ExifTool. I also see the comment nicely in Windows Explorer, and under
> > > File > Properties.
> > > Note I changed the field type from ASCII to FIELD_TYPE_UNDEFINED,
> > > otherwise (with ASCII) it did not work. At least Windows couldn't make
> > > sense of the EXIF data.
> > >
> > > // http://osdir.com/ml/user-commons-apache/2012-03/msg00046.html
> > > byte[] unicodeMarker = new byte[]{ 0x55, 0x4E, 0x49, 0x43, 0x4F, 0x44,
> > >         0x45, 0x00 };
> > > // OR UTF-16BE if the file is big-endian!
> > > byte[] comment = textToSet.getBytes(ENCODING_UTF16);
> > > byte[] bytesComment = new byte[unicodeMarker.length + comment.length];
> > > System.arraycopy(unicodeMarker, 0, bytesComment, 0,
> > >         unicodeMarker.length);
> > > System.arraycopy(comment, 0, bytesComment, unicodeMarker.length,
> > >         comment.length);
> > >
> > > TiffOutputField exif_comment = new TiffOutputField(
> > >         TiffConstants.EXIF_TAG_USER_COMMENT,
> > >         TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> > >         bytesComment.length, bytesComment);
> > >
> > >
> > > I can now write UserComment: "æøå" without problems :)
> > >
> > >
> > >
> > > - Joakim
> > >
> > >
> > > On 31 May 2016 at 17:39, Benedikt Ritter <[email protected]> wrote:
> > >
> > >> Hello Joakim,
> > >>
> > >> Joakim Knudsen <[email protected]> wrote on Sat, 28 May 2016 at
> > >> 21:10:
> > >>
> > >> > Hi Benedikt, and thanks for replying!
> > >> >
> > >> > So, if FieldType is unused, maybe the alternative, simpler constructor
> > >> > is more appropriate/correct to use?
> > >> >
> > >> > // try using the approach given in the example (modified from the GPS
> > >> > // tag):
> > >> > TiffOutputField exif_comment = TiffOutputField.create(
> > >> >         TiffConstants.EXIF_TAG_USER_COMMENT,
> > >> >         outputSet.byteOrder, textToSet);
> > >> >
> > >> > However, now Sanselan throws an ImageWriteException:
> > >> > org.apache.sanselan.ImageWriteException: Tag has unexpected data type.
> > >> >
> > >> > So are you 100% sure field type should not be set (to ASCII)?
> > >> >
> > >>
> > >> No, I'm just saying that it uses a hard coded encoding anyway :-)
> > >>
> > >>
> > >> >
> > >> > Next, you're saying the string to set (textToSet) is converted
> > >> > internally to byte array, using US-ASCII encoding.
> > >> > If I try writing "æøåæøå" to a file, I get "쎦쎸쎥쎦쎸쎥" when I copy the
> > >> > JPEG out and check Properties in Windows Explorer.
> > >> > If I write only ASCII characters, e.g. "Test", then that comes through
> > >> > just fine.
> > >> >
> > >> > In summary, here is the code that works for me (except non-ASCII
> > >> > characters):
> > >> >
> > >> >
> > >> > // http://mail-archives.apache.org/mod_mbox/commons-user/201203.mbox/%3CCAJm2B-mYCXYKuyu=Hs8UAZCpw-B=kwuz4gszfuobvzwun0l...@mail.gmail.com%3E
> > >> > byte b[] = ExifTagConstants.EXIF_TAG_USER_COMMENT.encodeValue(
> > >> >         TiffFieldTypeConstants.FIELD_TYPE_ASCII,
> > >> >         textToSet, outputSet.byteOrder);
> > >> >
> > >> > // constructor arguments: taginfo tag fieldtype count bytes
> > >> > TiffOutputField exif_comment = new TiffOutputField(
> > >> >         TiffConstants.EXIF_TAG_USER_COMMENT.tag,
> > >> >         TiffConstants.EXIF_TAG_USER_COMMENT,
> > >> >         TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> > >> >         b.length, b);
> > >> >
> > >>
> > >> The provided links indicate to me that it is possible to write
> > >> non-ASCII characters. Are you sure your code looks like what Damjan
> > >> suggested?
> > >>
> > >> Benedikt
> > >>
> > >>
> > >> >
> > >> >
> > >> >
> > >> > Joakim
> > >> >
> > >> >
> > >> >
> > >> > On 22 May 2016 at 15:29, Benedikt Ritter <[email protected]> wrote:
> > >> >
> > >> > > Hello Joakim
> > >> > >
> > >> > > Joakim Knudsen <[email protected]> wrote on Sat, 21 May 2016 at
> > >> > > 19:29:
> > >> > >
> > >> > > > Hi List!
> > >> > > >
> > >> > > > I'm working on an Android app, where I want to read and write "EXIF
> > >> > > > tags" to JPEG files on the device. Sanselan 0.97 seems to work
> > >> > > > perfectly, although it's a bit complicated to work with EXIF
> > >> > > > tags/directories.
> > >> > > >
> > >> > > > The specific tags I'm interested in are EXIF_TAG_USER_COMMENT and
> > >> > > > EXIF_TAG_IMAGE_DESCRIPTION.
> > >> > > > According to the documentation I could find, UserComment is of field
> > >> > > > type "undefined", whereas ImageDescription is of field type ASCII.
> > >> > > >
> > >> > > >
> > >> > > > http://www.awaresystems.be/imaging/tiff/tifftags/privateifd/exif/usercomment.html
> > >> > > > http://www.awaresystems.be/imaging/tiff/tifftags/imagedescription.html
> > >> > > >
> > >> > > > What's the proper way of creating those tags, wrt. charset etc? I
> > >> > > > want as wide as possible character support (æøå etc).
> > >> > > >
> > >> > > > I find different discussions online, with different advice. Seems two
> > >> > > > constructors are going around, where the simpler one does not deal
> > >> > > > with charset/encoding at all. This one uses the .create method:
> > >> > > >
> > >> > > > String textToSet = "Some Text æøå";
> > >> > > >
> > >> > > > TiffOutputField exif_comment = TiffOutputField.create(
> > >> > > >                 TiffConstants.EXIF_TAG_USER_COMMENT,
> > >> > > >                 outputSet.byteOrder, textToSet);
> > >> > > >
> > >> > > >
> > >> > > > while this one uses the standard constructor:
> > >> > > >
> > >> > > > byte b[] = ExifTagConstants.EXIF_TAG_USER_COMMENT.encodeValue(
> > >> > > >         TiffFieldTypeConstants.FIELD_TYPE_ASCII,
> > >> > > >         textToSet, outputSet.byteOrder
> > >> > > > );
> > >> > > >
> > >> > > > // constructor arguments: taginfo tag fieldtype count bytes
> > >> > > > TiffOutputField exif_comment2 = new TiffOutputField(
> > >> > > >         TiffConstants.EXIF_TAG_USER_COMMENT.tag,
> > >> > > >         TiffConstants.EXIF_TAG_USER_COMMENT,
> > >> > > >         TiffFieldTypeConstants.FIELD_TYPE_UNDEFINED,
> > >> > > >         b.length, b);
> > >> > > >
> > >> > > > In this last one, the string to set has been converted to a byte
> > >> > > > array first. But can/should I set the encoding anywhere?
> > >> > > >
> > >> > > > Is the field type even ASCII? This information seems to indicate
> > >> > > > it's not ASCII...
> > >> > > >
> > >> > > >
> > >> > > > http://www.awaresystems.be/imaging/tiff/tifftags/privateifd/exif/usercomment.html
> > >> > > >
> > >> > > >
> > >> > > > Need some help here, as you can see, to get this right. The second
> > >> > > > approach above does seem to work in my app, but I'd like to be sure
> > >> > > > I'm not somehow messing up the JPEGs on the device.
> > >> > > >
> > >> > >
> > >> > > I've looked at the code of
> > >> > > org.apache.commons.imaging.formats.tiff.taginfos.TagInfoGpsText
> > >> > > (ExifTagConstants.EXIF_TAG_USER_COMMENT is an instance of
> > >> > > TagInfoGpsText). Here are my observations:
> > >> > >
> > >> > > - The FieldType parameter, which you have set to
> > >> > > TiffFieldTypeConstants.FIELD_TYPE_ASCII, is never used in the
> > >> > > implementation of encodeValue(FieldType, Object, ByteOrder)
> > >> > > - When converting the input String to a byte array,
> > >> > > String.getBytes(String charsetName) is used
> > >> > > - For charsetName, "US-ASCII" is always used (it cannot be configured
> > >> > > by the user)
> > >> > >
> > >> > > So my guess is that the code will not handle characters outside the
> > >> > > US-ASCII charset correctly.
> > >> > >
> > >> > > Benedikt
> > >> > >
> > >> > >
> > >> > > >
> > >> > > >
> > >> > > >
> > >> > > > Joakim
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> > >
> >
>
