Hi Fayyaz,
I recommend using SAX or, perhaps, a custom parser for large XML files. It
should be faster than using Digester. The main difference between these XML
parsers is that Digester needs to load the entire XML document into memory when
it creates its objects, whereas you can parse the doc
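A minimal sketch of the streaming approach described above, using the SAX parser bundled with the JDK (the class and method names below are my own illustration, not from this thread). SAX fires callbacks as it reads, so the whole document never has to be held in memory at once:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxTextExtractor {

    // Collects all character data from an XML document in a single
    // streaming pass; element start/end events could be handled the
    // same way by overriding startElement/endElement.
    public static String extractText(String xml) throws Exception {
        StringBuilder sb = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                sb.append(ch, start, length);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(
            extractText("<doc><title>Hello</title><body>World</body></doc>"));
    }
}
```

For a real file you would pass a `FileInputStream` instead of a `StringReader`; the handler sees the content incrementally either way.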
Hi,
I have the same problem.
This is useful when you try to extract the contexts (the terms before and
after) of a given term, for example.
I found a solution, but it performs badly: to retrieve those contexts you
have to re-tokenize the documents containing the given term (e.g.
"socc
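The re-tokenizing approach mentioned above can be sketched as follows. This is a hypothetical helper of my own (simple whitespace tokenization standing in for a real analyzer); it shows why the approach is slow: every matching document's full text must be split again at query time just to find the window around each hit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ContextExtractor {

    // Returns a window of `width` tokens on each side of every
    // occurrence of `term` in `doc`. Tokenization here is naive
    // whitespace splitting; a real system would reuse the same
    // analyzer that built the index.
    public static List<String> contexts(String doc, String term, int width) {
        String[] tokens = doc.toLowerCase().split("\\s+");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(term)) {
                int from = Math.max(0, i - width);
                int to = Math.min(tokens.length, i + width + 1);
                out.add(String.join(" ",
                        Arrays.copyOfRange(tokens, from, to)));
            }
        }
        return out;
    }
}
```

The cost is linear in the length of every matching document, which is exactly the performance problem described above.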
ontent of your document. If you
really want to index the whole XML file, just read the file using java.io;
anyway, I would not suggest doing that at all.
best regards
simon
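Reading the whole file with java.io, as suggested above, is a one-liner on modern JDKs (this is my own minimal sketch; `Files.readString` requires Java 11+):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class WholeFileReader {

    // Loads an entire file into one String. Acceptable for a 6MB
    // document, but it holds the whole content in memory, which is
    // exactly what the streaming (SAX) approach avoids.
    public static String read(Path path) throws IOException {
        return Files.readString(path);
    }
}
```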
>
> Thanks...
> Catalin Mititelu wrote:
Yes. The default maximum number of tokens indexed per field is 10,000.
Look here
http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexWriter.html#DEFAULT_MAX_FIELD_LENGTH
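To illustrate what that limit means, here is a stdlib-only sketch of my own mimicking the truncation behavior: tokens past the limit are simply never indexed, so terms beyond it are not searchable. (In Lucene itself the limit is `IndexWriter.DEFAULT_MAX_FIELD_LENGTH` and can be raised via `IndexWriter.setMaxFieldLength(int)`.)

```java
import java.util.Arrays;
import java.util.List;

public class FieldTruncation {

    // Illustrative only: keeps at most `maxFieldLength` tokens,
    // discarding the rest, as an indexer with a field-length cap would.
    public static List<String> indexedTokens(String text, int maxFieldLength) {
        String[] tokens = text.split("\\s+");
        int n = Math.min(tokens.length, maxFieldLength);
        return Arrays.asList(tokens).subList(0, n);
    }
}
```

With a 6MB document, it is quite plausible to exceed 10,000 tokens, which would explain content from the end of the file not showing up in searches.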
aslam bari <[EMAIL PROTECTED]> wrote: Dear all,
I am trying to index an XML file which is 6MB in size. Does Lucene support t