Re: Problem Indexing Large Document Field
Yeap, that was the problem... I just needed to increase the maxFieldLength number.

Thanks...

On May 26, 2004, at 5:56 PM, [EMAIL PROTECTED] wrote:

http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html#DEFAULT_MAX_FIELD_LENGTH

maxFieldLength

public int maxFieldLength

The maximum number of terms that will be indexed for a single field in a document. This limits the amount of memory required for indexing, so that collections with very large files will not crash the indexing process by running out of memory.

Note that this effectively truncates large documents, excluding from the index terms that occur further in the document. If you know your source documents are large, be sure to set this value high enough to accommodate the expected size. If you set it to Integer.MAX_VALUE, then the only limit is your memory, but you should anticipate an OutOfMemoryError.

By default, no more than 10,000 terms will be indexed for a field.

-----Original Message-----
From: Gilberto Rodriguez [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, May 26, 2004 4:04 PM
To: [EMAIL PROTECTED]
Subject: Problem Indexing Large Document Field

I am trying to index a field in a Lucene document with about 90,000 characters. The problem is that it only indexes part of the document. It seems to only index about 65,000 characters. So, if I search on terms that are at the beginning of the text, the search works, but it fails for terms that are at the end of the document.

Is there a limitation on how many characters can be stored in a document field?

Any help would be appreciated, thanks

Gilberto Rodriguez
Software Engineer
370 CenterPointe Circle, Suite 1178
Altamonte Springs, FL 32701-3451
407.339.1177 (Ext.112) • phone
407.339.6704 • fax
[EMAIL PROTECTED] • email
www.conviveon.com • web

This e-mail contains legally privileged and confidential information intended only for the individual or entity named within the message. If the reader of this message is not the intended recipient, or the agent responsible to deliver it to the intended recipient, the recipient is hereby notified that any review, dissemination, distribution or copying of this communication is prohibited. If this communication was received in error, please notify me by reply e-mail and delete the original message.

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Gilberto Rodriguez
Software Engineer
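A minimal sketch of the truncation that the quoted Javadoc describes, written in plain Java with no Lucene dependency. It is a simplified model and not Lucene's implementation: the whitespace tokenizer, the class FieldLimitDemo, and the methods indexTerms and buildDocument are hypothetical names made up for illustration. It shows why a search on terms near the end of a long field fails under a 10,000-term cap and succeeds once the cap is raised. Note that the limit counts terms, not characters.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified model of a per-field term limit; NOT Lucene's actual code.
class FieldLimitDemo {

    // Splits the text on whitespace and "indexes" at most maxFieldLength
    // terms, silently dropping every term after the limit.
    static Set<String> indexTerms(String text, int maxFieldLength) {
        Set<String> indexed = new HashSet<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return indexed;
        }
        int count = 0;
        for (String term : trimmed.split("\\s+")) {
            if (count >= maxFieldLength) {
                break; // the remaining terms never reach the index
            }
            indexed.add(term);
            count++;
        }
        return indexed;
    }

    // Builds a hypothetical document: one marker term, then fillerTerms
    // copies of a filler term, then a second marker term at the very end.
    static String buildDocument(int fillerTerms) {
        StringBuilder sb = new StringBuilder("firstterm");
        for (int i = 0; i < fillerTerms; i++) {
            sb.append(" filler");
        }
        sb.append(" lastterm");
        return sb.toString();
    }

    public static void main(String[] args) {
        String doc = buildDocument(15000); // 15,002 terms in total

        Set<String> limited = indexTerms(doc, 10000);  // default quoted in the thread
        Set<String> raised = indexTerms(doc, 100000);  // a higher, hypothetical setting

        System.out.println("limit 10000:  firstterm=" + limited.contains("firstterm")
                + " lastterm=" + limited.contains("lastterm"));
        System.out.println("limit 100000: firstterm=" + raised.contains("firstterm")
                + " lastterm=" + raised.contains("lastterm"));
    }
}
```

With the 10,000-term cap the marker at the start of the field is found and the one at the end is not; with the higher cap both are found. In the Lucene release discussed in this thread, the Javadoc quoted above declares maxFieldLength as a public int on IndexWriter, so raising the limit would presumably be a plain assignment on the writer instance before adding documents. Later releases may expose this differently, so check the API docs for the version in use.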
Re: Problem Indexing Large Document Field
Thanks, James... That solved the problem.

On May 26, 2004, at 4:15 PM, James Dunn wrote:

Gilberto,

Look at the IndexWriter class. It has a property, maxFieldLength, which you can set to determine the max number of characters to be stored in the index.

http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html

Jim

--- Gilberto Rodriguez <[EMAIL PROTECTED]> wrote:

I am trying to index a field in a Lucene document with about 90,000 characters. The problem is that it only indexes part of the document. It seems to only index about 65,000 characters. So, if I search on terms that are at the beginning of the text, the search works, but it fails for terms that are at the end of the document.

Is there a limitation on how many characters can be stored in a document field?

Any help would be appreciated, thanks

Gilberto Rodriguez
Software Engineer
Problem Indexing Large Document Field
I am trying to index a field in a Lucene document with about 90,000 characters. The problem is that it only indexes part of the document. It seems to only index about 65,000 characters. So, if I search on terms that are at the beginning of the text, the search works, but it fails for terms that are at the end of the document.

Is there a limitation on how many characters can be stored in a document field?

Any help would be appreciated, thanks

Gilberto Rodriguez
Software Engineer