The Unicode specs say that a file 'may' start with a BOM (U+FEFF). The reader of the bytes can then look at how the BOM is encoded and pick the correct encoding: UTF-8, UTF-16 (LE/BE), or UTF-32 (LE/BE). If the file does start with a BOM, the reader must remove it so that it doesn't end up in the decoded text.
A BOM anywhere else in the data stream is left alone. However, lovely Java doesn't do this correctly. Its UTF-8 decoder does *not* remove the leading BOM, so it comes through as a U+FEFF character at the start of the decoded text. Only the other encodings strip it. The bug about this is at http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4508058

I'm sending this to the list because UTF-8 is the only sensible encoding to use nowadays, and this might crop up here. I don't really have a fix yet. I'm going to have to deal with this in webslinger, so I'll develop a change there, and then alter the ofbiz code with the same kind of logic.
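
In the meantime, one possible workaround is to strip the BOM by hand before the bytes reach the decoder. The following is only a rough sketch of that idea, not the webslinger or ofbiz change. The class name Utf8BomSkipper and the method openUtf8 are made up for illustration, and it assumes the caller already knows the stream is UTF-8.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class Utf8BomSkipper {
    // U+FEFF encoded as UTF-8 is the byte sequence EF BB BF.
    private static final int[] UTF8_BOM = {0xEF, 0xBB, 0xBF};

    private Utf8BomSkipper() {}

    // Returns a UTF-8 Reader over 'in', dropping a leading BOM if present.
    static Reader openUtf8(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, UTF8_BOM.length);
        byte[] head = new byte[UTF8_BOM.length];

        // Read up to three bytes; a single read() may return fewer.
        int count = 0;
        while (count < head.length) {
            int n = pushback.read(head, count, head.length - count);
            if (n < 0) {
                break; // end of stream before three bytes were available
            }
            count += n;
        }

        // It is a BOM only if all three bytes are there and match.
        boolean hasBom = count == UTF8_BOM.length;
        for (int i = 0; hasBom && i < UTF8_BOM.length; i++) {
            hasBom = (head[i] & 0xFF) == UTF8_BOM[i];
        }

        // Not a BOM: push the bytes back so the decoder still sees them.
        if (!hasBom && count > 0) {
            pushback.unread(head, 0, count);
        }
        return new InputStreamReader(pushback, StandardCharsets.UTF_8);
    }
}
```

Only the very start of the stream is inspected, so a U+FEFF anywhere later in the data is passed through untouched, which matches the rule above. Working on the raw bytes rather than on the decoded characters keeps the check independent of what the decoder does with the BOM.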