Pasha Bizhan wrote:
Also, 230 KB is not equal to 20,000. Try setting writer.maxFieldLength to
250000.
maxFieldLength's unit is a token, not a character.
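To make the token-versus-character point concrete, here is a rough sketch in plain Java. It does not use Lucene at all: the class name TokenLimitSketch, its helper methods, and the synthetic text are made up for illustration, and the whitespace split is only an assumption standing in for whatever analyzer is actually configured (a real analyzer will usually produce a different token count). It just shows that a text of roughly 240,000 characters can hold far more than 20,000 tokens, so a cap of 20,000 would cut it short, while 250000 or Integer.MAX_VALUE would not.

```java
import java.util.StringTokenizer;

public class TokenLimitSketch {

    // Counts whitespace-separated tokens. A real analyzer splits text
    // differently, so treat this as a rough estimate only.
    public static int countTokens(String text) {
        return new StringTokenizer(text).countTokens();
    }

    // Number of tokens that would survive a per-field cap of maxTokens.
    public static int tokensKept(String text, int maxTokens) {
        return Math.min(countTokens(text), maxTokens);
    }

    // Builds a synthetic text: n copies of a word separated by single spaces.
    public static String repeatWord(String word, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(word);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the converted Word document:
        // 40,000 five-letter words, about 240,000 characters in total.
        String text = repeatWord("lorem", 40000);
        System.out.println("characters: " + text.length());
        System.out.println("tokens:     " + countTokens(text));

        // Compare the caps mentioned in this thread.
        for (int cap : new int[] {20000, 250000, Integer.MAX_VALUE}) {
            System.out.println("cap " + cap + " keeps "
                    + tokensKept(text, cap) + " tokens");
        }
    }
}
```

Running the same kind of count over the real extracted text should give a ballpark figure for how high the limit needs to be, keeping in mind that the analyzer's own tokenization is what counts in the end.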
--
Best regards,
Andrzej Bialecki
Thanks Andrzej and Pasha for your prompt replies and suggestions.
I will try everything you have suggested and report back on the findings!
regards
-pedja
Pasha Bizhan said the following on 2/25/2005 6:32 PM:
Hi,
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> Or perhaps someone can enlighten me on how to use Luke to find out if the
> whole document was indexed or not.
Luke can help you answer the question: does my index contain the
correct data?
Follow these steps:
- run Luke
[EMAIL PROTECTED] wrote:
Does anyone else have any ideas why the whole documents wouldn't be indexed,
as described below?
Or perhaps someone can enlighten me on how to use Luke to find out if
the whole document was indexed or not.
I have not used Luke in this capacity before, so I am not sure what to do
or look for.
thanks
-pedja
Hi Otis
Thanks for the reply, what exactly should I be looking for with Luke?
What would setting the max value to maxInteger do? Is this some
arbitrary value or...?
-pedja
Otis Gospodnetic said the following on 2/24/2005 2:24 PM:
Use Luke to peek in your index and find out what really got indexed.
You could also try the extreme case and set that max value to
Integer.MAX_VALUE.
Otis
--- "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> Hi everyone
>
> I'm having a bizarre problem with a few of the documents here that do
>
Hi everyone
I'm having a bizarre problem with a few of the documents here that do
not seem to get indexed entirely.
I use textmining WordExtractor to convert M$ Word to plain text and then
index that text.
For example one document which is about 230KB in size when converted to
plain text, when