Under "Factors affecting memory usage" there is this text:
When processing an "add" command for a document, the standard XML update
handler has two limitations:
• All of the document's fields must simultaneously fit into memory.
(Technically, it's actually the sum of min(<the actual field value's length>,
maxFieldLength). As such, adjusting maxFieldLength may be of some help.)
    • (I'm assuming that fields are truncated to maxFieldLength
      before being added to the relevant document object. If that's not true,
      then maxFieldLength won't help here. --ChrisHarris)
• Each individual <field>...</field> tag in the input XML must fit into
memory, regardless of maxFieldLength.
Bullet 1 contradicts bullet 2, at least the way I read it.
Looking at the tokenizer that applies the maxFieldLength cutoff, it works on a
stream. That implies that the first bullet is correct, and that the entire XML
document doesn't need to fit into memory. Or are we trying to say that, in
order to parse the incoming XML document, the entire document must fit into
memory, and that only after that does the tokenizer kick in, so that
min(<the actual field value's length>, maxFieldLength) applies to each field?
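
To make the stream reading concrete, here is a rough sketch in Python. It is
not Solr or Lucene code: the names stream_tokens and truncated_tokens are made
up for illustration, the whitespace tokenizer is a stand-in, and I am assuming
that maxFieldLength counts tokens rather than characters. It only shows that a
cutoff applied to a stream can stop reading once the limit is reached.

```python
import io
from itertools import islice


def stream_tokens(reader, chunk_size=4096):
    """Yield whitespace-separated tokens from a text stream, chunk by chunk."""
    tail = ""  # possibly incomplete token carried over from the previous chunk
    while True:
        chunk = reader.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        parts = buf.split()
        # If the buffer does not end in whitespace, its last token may continue
        # in the next chunk, so hold it back instead of yielding it now.
        if parts and not buf[-1].isspace():
            tail = parts.pop()
        else:
            tail = ""
        yield from parts
    if tail:
        yield tail


def truncated_tokens(reader, max_field_length, chunk_size=4096):
    """Return at most max_field_length tokens (hypothetical cutoff helper).

    islice stops pulling from the generator once the limit is reached, so no
    further chunks are read from the stream after that point.
    """
    return list(islice(stream_tokens(reader, chunk_size), max_field_length))


# A large field value: 100,000 tokens.
text = " ".join(f"w{i}" for i in range(100_000))
reader = io.StringIO(text)
tokens = truncated_tokens(reader, max_field_length=3, chunk_size=16)
print(tokens)
print(reader.tell(), "of", len(text), "characters read")
```

In this sketch the StringIO still holds the whole value in memory, so it only
stands in for a real stream such as a file or socket. If the XML parser hands
the tokenizer a fully materialized string for each <field>, which is what the
second bullet seems to describe, then the cutoff saves indexing work but not
the memory needed to parse the field in the first place.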
Eric
-----------------------------------------------------
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 |
http://www.opensourceconnections.com
Co-Author: Solr 1.4 Enterprise Search Server available from
http://www.packtpub.com/solr-1-4-enterprise-search-server
Free/Busy: http://tinyurl.com/eric-cal