Hi Dennis,
The way I have done it is as you suggested, via the XMLStreamReader. I find
the XML element that I am after, assume that the next CDATA event is the
mimeContent, and pipe it to a temp file; then I base64-decode it and return
an InputStream for the email to the application that requires it.
I've just tested it with a 200Meg email from a remote Exchange box, and it
worked, no probs... ahh, finally.
Here is some sample code (to give back to the community ;)
XMLStreamReader xsr = firstElement.getXMLStreamReaderWithoutCaching();
...
events: for (int eventType = xsr.getEventType();; eventType = xsr.next()) {
    switch (eventType) {
    case XMLStreamReader.START_ELEMENT: {
        QName elementName = xsr.getName();
        logger.debug("START_ELEMENT: " + elementName);
        if (mimeContentName.equals(elementName)) {
            logger.info("Setting mimeContent to TRUE. Next CDATA will be"
                    + " mimecontent, saved to [" + tempFilePath + "]");
            mimecontent = true;
        } else {
            logger.debug("Setting mimeContent to false.");
            mimecontent = false;
        }
        depth++;
        break;
    }
    case XMLStreamReader.CDATA: {
        if (mimecontent) {
            int length = xsr.getTextLength();
            char[] target = new char[length];
            // Only copy the chars associated with this event.
            System.arraycopy(xsr.getTextCharacters(), xsr.getTextStart(),
                    target, 0, length);
            theWriter.write(target, 0, length);
        } else {
            // We don't care about other character data.
            logger.debug("Ignoring CDATA...");
        }
        break;
    }
    case XMLStreamReader.END_DOCUMENT:
        break events;
    }
}
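For the decode step, here's a rough sketch of how the temp file of base64 text can be turned back into an InputStream without ever holding the whole email in memory. It assumes Java 8+'s java.util.Base64 (Base64.Decoder.wrap() decodes as you read; back in the day commons-codec's Base64InputStream was the equivalent); openDecoded and the file names are just illustrative:

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

public class DecodeTempFile {
    // Hypothetical helper: once the CDATA text has been piped to tempFile,
    // hand back a stream that decodes on the fly instead of in memory.
    static InputStream openDecoded(Path tempFile) throws IOException {
        InputStream raw = new BufferedInputStream(Files.newInputStream(tempFile));
        // The MIME decoder ignores the line breaks inside the base64 text
        return Base64.getMimeDecoder().wrap(raw);
    }

    public static void main(String[] args) throws Exception {
        // Simulate the temp file the parser loop would have written
        Path tmp = Files.createTempFile("mimeContent", ".b64");
        Files.write(tmp, Base64.getMimeEncoder()
                .encode("From: Lee\r\n\r\nhello".getBytes("UTF-8")));

        StringBuilder decoded = new StringBuilder();
        try (Reader r = new InputStreamReader(openDecoded(tmp), "UTF-8")) {
            int c;
            while ((c = r.read()) >= 0) {
                decoded.append((char) c);
            }
        }
        System.out.println(decoded); // original message text, decoded as a stream
        Files.delete(tmp);
    }
}
```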
P.S. If I had more time, I would love to look at ADB or JiBX, but for now
there is no chance; a lesson learned.
On Tue, May 19, 2009 at 3:29 PM, Dennis Sosnoski <[email protected]> wrote:
Ah, I hadn't noticed/remembered that this dealt with a single huge base64
string. That's just plain bad design for the web service... but no surprise
if Exchange is involved.
No, you can't handle this cleanly with JiBX or any other data binding tool
I'm aware of. The only way you *could* handle it would be by getting an
XMLStreamReader and reading the data directly. The XMLStreamReader interface
provides next() and getTextCharacters() methods which in theory could be
used to get this text a chunk at a time. Assuming the parser involved
actually implements these as intended, you can get a block of text at a time
and run the base64 decoding on that block, then move on to the next block.
A lot of work to implement, though.
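The chunked read described here follows the loop pattern given in the XMLStreamReader javadoc for the four-argument getTextCharacters(); whether it truly avoids buffering the whole value depends on the parser implementation. A minimal self-contained sketch against the JDK's built-in StAX parser (which, note, may report CDATA sections as CHARACTERS events):

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class ChunkedRead {
    public static void main(String[] args) throws Exception {
        String xml = "<m><![CDATA[SGVsbG8sIHdvcmxkIQ==]]></m>";
        XMLInputFactory f = XMLInputFactory.newInstance();
        // Ask the parser not to coalesce adjacent text into one big event
        f.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
        XMLStreamReader xsr = f.createXMLStreamReader(new StringReader(xml));

        StringBuilder text = new StringBuilder();
        char[] buf = new char[8]; // deliberately tiny, to force several chunks
        while (xsr.hasNext()) {
            int event = xsr.next();
            if (event == XMLStreamConstants.CDATA
                    || event == XMLStreamConstants.CHARACTERS) {
                // Chunked copy loop, as recommended in the StAX javadoc
                for (int sourceStart = 0; ; sourceStart += buf.length) {
                    int n = xsr.getTextCharacters(sourceStart, buf, 0, buf.length);
                    text.append(buf, 0, n); // feed a decoder here instead
                    if (n < buf.length) {
                        break;
                    }
                }
            }
        }
        System.out.println(text); // the base64 text, gathered chunk by chunk
    }
}
```

In the real case each chunk would go to an incremental base64 decoder rather than a StringBuilder.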
Andrew, you might want to try switching to ADB or JiBX anyway, at least to
try for an easier solution than working directly with the XMLStreamReader.
Either of these should use less memory than XMLBeans for the same data, so
as long as your text strings are only very large and not unbounded these
might let you squeak by.
- Dennis
Andreas Veithen wrote:
Dennis,
I think that the Web service is actually Microsoft Exchange, so changing it
is not an option. Does JiBX support caching a base64 value (which is not
represented using MTOM) on disk instead of in memory?
Andrew,
You might still want to give MTOM a try. Many servers switch to MTOM if the
request is sent as MTOM. I can't believe that Microsoft Exchange can only
return inline base64.
Regards,
Andreas
On Mon, May 18, 2009 at 13:22, Dennis Sosnoski <[email protected]> wrote:
Hi Andrew,
Your best starting point is to get rid of XMLBeans. XMLBeans always stores
raw XML in memory, so there's no way to avoid the memory issues with large
messages.
ADB would avoid some of the overhead, in that it would convert the XML
message to an object graph, which would typically be a factor of 2-5x
smaller than the raw XML data (depending on the type of data in your
message).
JiBX would do at least as well as ADB in terms of the reduced-size object
graph. If you have some way of breaking up the response data into more
easily-digestible chunks, JiBX would also allow you to do piece-meal
processing of the message (by creating a fake collection, for instance,
where your data objects expose an addXXX() method which just writes the
object being added to some backing store rather than adding it to an
in-memory collection). Piece-meal processing is the only way you can
dramatically decrease your memory usage and handle messages of effectively
unlimited size without a problem.
- Dennis
--
Dennis M. Sosnoski
SOA and Web Services in Java
Axis2 Training and Consulting
http://www.sosnoski.com - http://www.sosnoski.co.nz
Seattle, WA +1-425-939-0576 - Wellington, NZ +64-4-298-6117
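The "fake collection" idea Dennis describes above could look something like this; the class and method names are illustrative, not JiBX API, and the StringWriter stands in for whatever backing store (e.g. a temp file) you'd really use:

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.AbstractList;

// A "fake collection": the binder calls add() once per unmarshalled item,
// and each item is streamed to a backing store instead of kept on the heap.
public class StreamingItemList extends AbstractList<String> {
    private final Writer backingStore;
    private int count;

    public StreamingItemList(Writer backingStore) {
        this.backingStore = backingStore;
    }

    @Override
    public boolean add(String item) {
        try {
            backingStore.write(item);
            backingStore.write('\n');
            count++;
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The binder only ever adds; random access is never needed.
    @Override
    public String get(int index) {
        throw new UnsupportedOperationException();
    }

    @Override
    public int size() {
        return count;
    }

    public static void main(String[] args) {
        StringWriter store = new StringWriter(); // stand-in for a temp file
        StreamingItemList items = new StreamingItemList(store);
        items.add("first item");
        items.add("second item");
        System.out.println(store); // each item went straight to the store
    }
}
```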
Andrew Bruno wrote:
Mr J,
But its just one data element that I need, so there is
no concept many
rows, etc.
As a matter of fact, I am breaking up the call in 2
calls, one for all the
metadata, and one for the actual mime content.
i.e.
<m:ResponseMessages>
  <m:GetItemResponseMessage ResponseClass="Success">
    <m:ResponseCode>NoError</m:ResponseCode>
    <m:Items>
      <t:Message>
        <t:MimeContent CharacterSet="UTF-8">RnJvbTogTGVlIF..... large
        mime base64 encoded email......... 120Meg++ of encoded
        data</t:MimeContent>
        ....
I only ever request one message at a time. It's just that when the
MimeContent is greater than 10Meg, an OutOfMemoryError occurs.
It appears to be because the parser loads all the content into memory.
In theory, this element can have data of any size coming back.
On Mon, May 18, 2009 at 6:54 PM, J. Hondius <[email protected]> wrote:
I'd think about a different design of the webservice calls. You should try
to avoid really big results. Split into more calls.
Something like:
One to get an overview list
Another to get details
Or:
One call to get the size, like the SQL count() does, combined with
parameters on your call to limit the result: like start_at,
number_of_results
my 2c
Andrew Bruno wrote:
Hello all,
I was wondering how some of you may be dealing with web service calls that
result in extremely large data responses?
I have been struggling in trying to change the way the parsing of the XML
response works, as I am getting out of memory errors:
java.lang.OutOfMemoryError: Java heap space
        at org.apache.xmlbeans.impl.store.CharUtil.allocate(CharUtil.java:397)
        at org.apache.xmlbeans.impl.store.CharUtil.saveChars(CharUtil.java:506)
        at org.apache.xmlbeans.impl.store.CharUtil.saveChars(CharUtil.java:419)
        at org.apache.xmlbeans.impl.store.CharUtil.saveChars(CharUtil.java:489)
        at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.text(Cur.java:2911)
        at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.stripText(Cur.java:3113)
        at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.text(Cur.java:3126)
        at org.apache.xmlbeans.impl.store.Locale.loadXMLStreamReader(Locale.java:1154)
        at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:843)
        at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:826)
        at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:231)
        .....
Is there a way to change the parser to use a temp file rather than trying
to buffer the response in memory?
Should I be directing this question to the developers list?
Or should I be thinking differently about solving this problem?
Please, any ideas :(
Thank you,
Andrew