About a week ago I emailed about a problem I was having with XCC and large files. I got some very good advice: use session.insertContent to get the file in. I've finished that conversion, but I'm now dealing with problems that resulted from the change.
What I'm looking at right now is a UTF-16 file that begins with a two-byte BOM (byte order mark), which I have learned actually matters: it tells any string parser/consumer what order the bytes in each pair are in. I wrote some code that strips out the BOM, but that seems to break the encoding altogether. I also put in code to set the encoding to UTF16 in the ContentCreateOptions.

Without stripping the BOM, I get this:

Invalid root text "ÿþ" at [uri] line 1

To deal with UTF-16, don't you *need* the BOM? What am I missing here?

FYI, the first line of the file looks like:

<?xml version="1.0" encoding="UTF-16" standalone="yes"?>

So it's clearly UTF-16. There is some leeway in how I create the Content object to feed to insertContent. Currently I'm treating it as a byte[], but I could do a string conversion etc. if that's what I need to do.

Any help is appreciated.

--
Josh Warner-Burke
42SIX Solutions
(m): 410-493-4362
(e): jwbu...@42six.com
http://www.42six.com
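For reference, here is a minimal sketch of the BOM handling discussed above, in plain Java with no XCC calls. It is not the stripping code mentioned in this message, and the class and method names (Utf16Bom, sniff, decode) are hypothetical. It assumes the usual convention that a leading FF FE means little-endian and FE FF means big-endian, so the byte order has to be read before those two bytes are dropped. If the BOM is removed first, a decoder that is only told "UTF-16" has to guess the order, and a wrong guess garbles every character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class Utf16Bom {
    private Utf16Bom() {}

    /** Returns the byte order named by a leading UTF-16 BOM, or null if there is none. */
    static Charset sniff(byte[] raw) {
        if (raw.length < 2) {
            return null;
        }
        int b0 = raw[0] & 0xFF;
        int b1 = raw[1] & 0xFF;
        if (b0 == 0xFF && b1 == 0xFE) {
            return StandardCharsets.UTF_16LE; // shows up as "ÿþ" when misread as Latin-1
        }
        if (b0 == 0xFE && b1 == 0xFF) {
            return StandardCharsets.UTF_16BE;
        }
        return null;
    }

    /** Decodes UTF-16 bytes, dropping the BOM only after its byte order has been used. */
    static String decode(byte[] raw) {
        Charset order = sniff(raw);
        if (order == null) {
            // Assumption: with no BOM, fall back to big-endian, which is also
            // what Java's own "UTF-16" decoder does.
            return new String(raw, StandardCharsets.UTF_16BE);
        }
        // Skip the two BOM bytes and decode the rest in the order they named.
        return new String(raw, 2, raw.length - 2, order);
    }
}
```

Calling Utf16Bom.decode(bytes) on the file's byte[] yields a String that no longer depends on byte order. Whether handing a String rather than a byte[] to insertContent avoids the "Invalid root text" error is untested here.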