Re: Normalization of Documents

2002-04-16 Thread Joshua O'Madadhain
from Bernhard Messer: > > Let me know if you find that idea interesting, I would like to work on > > that topic. Yup, me too. This is germane to my research as well. Joshua [EMAIL PROTECTED] Per Obscurius...www.ics.uci.edu/~jmadden Joshua Madden: Information Scientist, Musician, Philoso

Re: Normalization of Documents

2002-04-15 Thread Melissa Mifsud
> Let me know if you find that idea interesting, I would like to work on > that topic. Seeing as I brought the topic up... I'm interested!! I've been doing a lot of research for my University thesis on IR and the type of information that can be gathered from individual documents themselves and th

RE: Normalization of Documents

2002-04-15 Thread Halácsy Péter
> -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] > Sent: Thursday, April 11, 2002 9:19 PM > To: [EMAIL PROTECTED] > Subject: RE: Normalization of Documents > > > > From: Halácsy Péter > > > > option 2 (I think bet

RE: Normalization of Documents

2002-04-14 Thread apache
> From: Halácsy Péter
>
> What I would like:
>
> score_d = sum_t( tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t ) * p_value_d
>
> where:
> p_value_d : predefined value of document calculated at indexing time (0 < p_value_d <= 1)
>
> in the API:
> option 1:
> writer = new IndexWriter(..
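
The formula quoted above can be read as an ordinary tf-idf sum scaled by a per-document prior. The following is only a minimal sketch of that arithmetic in plain Java, not Lucene code and not the API proposed in the thread: the class name `PriorWeightedScore`, the method `score`, and the use of maps keyed by term are all assumptions made for illustration, and `norm_d_t` is simplified to one normalization value per term for a single document.

```java
import java.util.Map;

// Hypothetical illustration only; none of these names come from Lucene.
class PriorWeightedScore {

    /**
     * Computes
     *   score_d = sum_t( tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t ) * p_value_d
     * for one document. Terms missing from the document contribute nothing.
     */
    static double score(Map<String, Double> tfQuery,
                        Map<String, Double> tfDoc,
                        Map<String, Double> idf,
                        Map<String, Double> normDocTerm,
                        double normQuery,
                        double pValueDoc) {
        // The quoted message constrains the prior to 0 < p_value_d <= 1.
        if (pValueDoc <= 0.0 || pValueDoc > 1.0) {
            throw new IllegalArgumentException("p_value_d must be in (0, 1]");
        }
        double sum = 0.0;
        for (Map.Entry<String, Double> entry : tfQuery.entrySet()) {
            String term = entry.getKey();
            Double tfD = tfDoc.get(term);
            if (tfD == null) {
                continue; // query term does not occur in this document
            }
            double idfT = idf.getOrDefault(term, 0.0);
            // Assumption: an absent normalization value means "no normalization".
            double normDT = normDocTerm.getOrDefault(term, 1.0);
            double queryWeight = entry.getValue() * idfT / normQuery;
            double docWeight = tfD * idfT / normDT;
            sum += queryWeight * docWeight;
        }
        // The prior is fixed at indexing time and only rescales the sum.
        return sum * pValueDoc;
    }
}
```

Because `p_value_d` multiplies the whole sum, it rescales a document's score without changing which terms matched; a document with `p_value_d = 1` scores exactly as it would without the prior.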

RE: Normalization of Documents

2002-04-13 Thread Otis Gospodnetic
> > Let me know if you find that idea interesting, I would like to work on that topic. > I find it very interesting. Me too. Otis

RE: Normalization of Documents

2002-04-13 Thread Halácsy Péter
> Therefore we would need an interface where we could change the lucene document boost factor during runtime. For example, a document's ranking could be based on: > links pointing to that document (like Google) > last modification date, > size of the document, > ter

Re: Normalization of Documents

2002-04-13 Thread Peter Carlson
Hi Bernhard, I think this is a very interesting issue. I think that changing the scoring algorithm is one part of it, the other is to get the information from the Document to use in the ranking. Since this is an expensive operation, there will have to be an alternative approach. Do you have any

RE: Normalization of Documents

2002-04-11 Thread Halácsy Péter
> -Original Message- > From: Peter Carlson [mailto:[EMAIL PROTECTED]] > Sent: Thursday, April 11, 2002 4:35 PM > To: Lucene Users List > Subject: Re: Normalization of Documents > > > Hi, > > These types of questions/discussions should be on the users

Re: Normalization of Documents

2002-04-11 Thread Peter Carlson
y). In my project I should score documents on > their length and their age (more recent document is more valuable and very old > documents are as valuable as very new in my archive). > > peter > >> -Original Message- >> From: Peter Carlson [mailto:[EMAIL P