Re: Untokenized URL

Shai Erera Fri, 04 Jul 2008 23:58:36 -0700

Hi

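Regarding the contentLength, when you add it to the document, do you
*store* it as well (i.e., passing Store.YES or Store.COMPRESS)? A field that
is only indexed, not stored, comes back as null when you read it from the
retrieved document.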


Regarding the URL, how do you add it to the document? For example, if you do
doc.add(new Field("url", "http://www.cnn.com", Store.NO,
Index.UN_TOKENIZED)), it would create a single term like "url:http://www.cnn.com"
without breaking it into its parts. Is that what you're looking for?
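
To make the two questions above concrete, here is a toy model in plain Java.
It is not Lucene code: ToyDocument and its add/get/terms methods are made up
for illustration, and the tokenizer is just a lowercase split on
non-alphanumeric characters. In this sketch, tokenize=false stands in for
Index.UN_TOKENIZED and store=true stands in for Store.YES.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only (hypothetical class, NOT part of Lucene).
class ToyDocument {
    // field name -> terms a term enumeration would see for that field
    private final Map<String, List<String>> indexedTerms = new HashMap<>();
    // field name -> original value, kept only when the field is stored
    private final Map<String, String> storedValues = new HashMap<>();

    void add(String name, String value, boolean store, boolean tokenize) {
        List<String> terms = new ArrayList<>();
        if (tokenize) {
            // Assumed, simplistic analyzer: lowercase, split on non-alphanumerics.
            for (String part : value.toLowerCase().split("[^a-z0-9]+")) {
                if (!part.isEmpty()) {
                    terms.add(part);
                }
            }
        } else {
            // Untokenized: the whole value becomes exactly one term.
            terms.add(value);
        }
        indexedTerms.put(name, terms);
        if (store) {
            storedValues.put(name, value);
        }
    }

    // Returns the stored value, or null if the field was indexed but not stored.
    String get(String name) {
        return storedValues.get(name);
    }

    List<String> terms(String name) {
        return indexedTerms.get(name);
    }
}

class ToyDocumentDemo {
    public static void main(String[] args) {
        ToyDocument doc = new ToyDocument();
        doc.add("url", "http://www.cnn.com", false, false);          // untokenized, not stored
        doc.add("urlTokenized", "http://www.cnn.com", false, true);  // tokenized, not stored
        doc.add("contentLength", "1234", true, false);               // untokenized, stored

        System.out.println(doc.terms("url"));           // [http://www.cnn.com]
        System.out.println(doc.terms("urlTokenized"));  // [http, www, cnn, com]
        System.out.println(doc.get("contentLength"));   // 1234
        System.out.println(doc.get("url"));             // null
    }
}
```

The last line is the case I suspect you are hitting with contentLength: the
terms are in the index (so a tool like Luke can show them), but nothing was
stored, so reading the value back gives null.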

Shai

On Fri, Jul 4, 2008 at 11:19 AM, blazingwolf7 <[EMAIL PROTECTED]>
wrote:

>
> Hi,
>
> I am currently working on retrieving url and contentLength of each document
> found during the search. I want to retrieve it during the calculation of
> score so that I can influence the score in some other way.
>
> I used the methods from TermDocs and TermEnum to get the information.
> However, the url I retrieve, as is known by most, is tokenized. It is broken
> down into several parts and I will have to rejoin them. Can anyone help me
> with this? I am stuck here wondering how to get back the whole url without
> using a Reader.
>
> Also, I tried to retrieve the contentLength, but the results returned are null.
> Why is that? I opened the index using Luke and the contentLength is there,
> but when I try to get it this way, the result is null.
>
> Can anyone help me with both of these problems? Any help will be
> appreciated. Thanks
> --
> View this message in context:
> http://www.nabble.com/Untokenized-URL-tp18275048p18275048.html
> Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
>


-- 
Regards,

Shai Erera
