Hey guys,

I've been beating my head against this absolutely infuriating bug for
the last 48 hours, so I thought I'd finally throw in the towel and try
asking here before I throw my laptop out the window.

I'm trying to parse the response XML from a call I made to AWS
SimpleDB. The response is coming back on the wire just fine; for
example, it may look like:

<?xml version="1.0" encoding="utf-8"?>
<ListDomainsResponse xmlns="http://sdb.amazonaws.com/doc/2009-04-15/">
  <ListDomainsResult>
    <DomainName>Audio</DomainName>
    <DomainName>Course</DomainName>
    <DomainName>DocumentContents</DomainName>
    <DomainName>LectureSet</DomainName>
    <DomainName>MetaData</DomainName>
    <DomainName>Professors</DomainName>
    <DomainName>Tag</DomainName>
  </ListDomainsResult>
  <ResponseMetadata>
    <RequestId>42330b4a-e134-6aec-e62a-5869ac2b4575</RequestId>
    <BoxUsage>0.0000071759</BoxUsage>
  </ResponseMetadata>
</ListDomainsResponse>

I pass this XML to a parser with

XMLEventReader eventReader =
    xmlInputFactory.createXMLEventReader(response.getContent());

and then call eventReader.nextEvent() repeatedly to pull out the data
I want.
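
To make the parsing step concrete, here is a minimal sketch of the kind of loop involved. It is not the AWS SDK's unmarshalling code: the class DomainNameSketch and the method readDomainNames are hypothetical names made up for this example, and it assumes the response body is available as an InputStream.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper for illustration only; not part of the AWS SDK.
final class DomainNameSketch {

    // Walks the StAX event stream and collects the text of every
    // <DomainName> element, ignoring the namespace prefix.
    static List<String> readDomainNames(InputStream content) throws XMLStreamException {
        XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
        XMLEventReader eventReader = xmlInputFactory.createXMLEventReader(content);
        List<String> names = new ArrayList<>();
        try {
            while (eventReader.hasNext()) {
                XMLEvent event = eventReader.nextEvent();
                if (event.isStartElement()
                        && "DomainName".equals(
                                event.asStartElement().getName().getLocalPart())) {
                    // The last event was a START_ELEMENT, so the element text can be read.
                    names.add(eventReader.getElementText());
                }
            }
        } finally {
            eventReader.close();
        }
        return names;
    }
}
```

A "Content is not allowed in prolog" error would surface from the nextEvent() call in a loop like this, before any element is seen.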

Here's the bizarre part -- it works great on the local development
server. The response comes in, I parse it, everyone's happy. But when
I deploy the code to Google App Engine, the outgoing request still
works and the response XML looks 100% identical and correct to me, yet
the response fails to parse with the following exception:

com.amazonaws.http.HttpClient handleResponse: Unable to unmarshall response (ParseError at [row,col]:[1,1]
Message: Content is not allowed in prolog.): <?xml version="1.0" encoding="utf-8"?>
<ListDomainsResponse xmlns="http://sdb.amazonaws.com/doc/2009-04-15/">
  <ListDomainsResult>
    <DomainName>Audio</DomainName>
    <DomainName>Course</DomainName>
    <DomainName>DocumentContents</DomainName>
    <DomainName>LectureSet</DomainName>
    <DomainName>MetaData</DomainName>
    <DomainName>Professors</DomainName>
    <DomainName>Tag</DomainName>
  </ListDomainsResult>
  <ResponseMetadata>
    <RequestId>42330b4a-e134-6aec-e62a-5869ac2b4575</RequestId>
    <BoxUsage>0.0000071759</BoxUsage>
  </ResponseMetadata>
</ListDomainsResponse>
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]
Message: Content is not allowed in prolog.
        at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(Unknown Source)
        at com.sun.xml.internal.stream.XMLEventReaderImpl.nextEvent(Unknown Source)
        at com.amazonaws.transform.StaxUnmarshallerContext.nextEvent(StaxUnmarshallerContext.java:153)
        ... (rest of lines omitted)

I have double, triple, quadruple checked this XML for 'invisible
characters', characters that aren't valid UTF-8, and so on. I looked
at it byte by byte in an array for a byte order mark or something of
that nature. Nothing; it passes every validation test I could throw at
it. Even stranger, the same failure happens with a Saxon-based parser
as well, but ONLY on GAE. It always works fine in my local environment.
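
A byte-level check of this kind can be sketched as follows. This is only an illustration, assuming the response body is small enough to read fully into memory; PrologProbe and its methods are hypothetical names, not part of any library. It dumps the first bytes as hex and tests for a UTF-8 byte order mark (EF BB BF), since a parse error at [1,1] typically means the parser saw something other than '<' at the very start of the stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical diagnostic helper for illustration only.
final class PrologProbe {

    // Copies the whole stream into a byte array so it can be inspected
    // and later re-parsed from a ByteArrayInputStream.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // Formats up to 'count' leading bytes as space-separated upper-case hex.
    static String hexPrefix(byte[] data, int count) {
        StringBuilder sb = new StringBuilder();
        int limit = Math.min(count, data.length);
        for (int i = 0; i < limit; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    // True if the data begins with the UTF-8 byte order mark EF BB BF.
    static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }
}
```

Logging hexPrefix(bytes, 16) on both the local server and GAE would let the two environments be compared directly; a clean response should begin with 3C 3F 78 6D 6C, the bytes for "<?xml".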

It makes it very hard to trace the code for problems when I can only
run the debugger on an environment that works perfectly (I haven't
found any good way to remotely debug on GAE). Nevertheless, using the
primitive means I have, I've tried a million approaches including:

* XML with and without the prolog
* With and without newlines
* With and without the "encoding=" attribute in the prolog
* Both newline styles
* With and without the chunking information present in the HTTP stream

And I've tried most of these in multiple combinations where it made
sense they would interact -- nothing! I'm at my wit's end. Has anyone
seen an issue like this before that can hopefully shed some light on
it?

Thanks!
-Adrian
