Re: Normalization of Documents

2002-04-15 Thread Melissa Mifsud
> Let me know if you find that idea interesting, I would like to work on > that topic. Seeing as I brought the topic up... I'm interested!! I've been doing a lot of research for my University thesis on IR and the type of information that can be gathered from individual documents themselves and th

"Match All Words" Query

2002-04-06 Thread Melissa Mifsud
Hi! I've been going round in circles trying to come up with a query that will return documents which contain ALL the query terms. This should be easy; however, I would like the words to span ANY of the fields of the documents. If the BooleanQuery(ies) do actually follow boolean logic, then I sh
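
The requirement described above (every query term must appear, but in any field) amounts to an AND over terms of an OR over fields. A minimal sketch of that logic in plain Java follows. It does not use the Lucene API; the class MatchAllWords and its methods are hypothetical names chosen for illustration, and the tokenization is deliberately naive.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: it only illustrates the boolean logic.
final class MatchAllWords {

    // True when every query term occurs in at least one field of the document.
    // 'fields' maps a field name to that field's text.
    static boolean matchesAll(Map<String, String> fields, List<String> terms) {
        for (String term : terms) {              // AND over terms
            boolean found = false;
            for (String text : fields.values()) { // OR over fields
                if (containsWord(text, term)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }

    // Case-insensitive whole-word test; splits on runs of non-word characters.
    static boolean containsWord(String text, String term) {
        String wanted = term.toLowerCase(Locale.ROOT);
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (token.equals(wanted)) {
                return true;
            }
        }
        return false;
    }
}
```

In query terms this corresponds to one required clause per term, where each clause is itself a disjunction of that term over the fields of interest.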

Boolean Queries

2002-04-05 Thread Melissa Mifsud
I've been experimenting with boolean queries to find out their real meaning. If you want to submit a query and would like the returned hits to be documents in which ALL the query terms appear, is it necessary to construct a boolean query adding a clause for each term in the query: booleanQuery.

Document Scoring

2002-04-04 Thread Melissa Mifsud
Hi, I've been going through the source code, attempting to find the exact point in time where the score for each document is calculated and the methods that do this. I've ended up very confused! Methods such as IndexReader.docFreq(Term t) which are then used by Query.scorer(...) are declared a

Re: Lucene-created files

2002-03-07 Thread Melissa Mifsud
't help you with that but I remember seeing previous messages regarding their contents. You might want to take a look at the archives (http://www.mail-archive.com/lucene-dev@jakarta.apache.org/) or, better yet, the source code ;-) Regards, --Daniel > -Original Message- > From: Mel

What type of indexer is Lucene? Question reworded.

2002-03-07 Thread Melissa Mifsud
Hi again! I should really reword my question as follows: On which criteria are relevant documents chosen given a particular query and, once retrieved, how are these documents ranked? The techniques by which this is done will then determine what type of IR model Lucene implements. Thanks agai

Lucene-created files

2002-03-06 Thread Melissa Mifsud
Hi, Does anyone know the significance of the files that are generated by Lucene? I know they are essentially the term index, however I need to have a full understanding of them. Also, they look encrypted... can anyone confirm this? Melissa

Indexing HTML with Lucene

2002-03-05 Thread Melissa Mifsud
Hi, Is it necessary to strip the HTML tags from HTML documents BEFORE telling Lucene to index them? Does Lucene do this or will it index the tags too?! Melissa
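
If the tags should not end up in the index, one option is to strip them before handing the text to the indexer. The sketch below is a deliberately naive, regex-based approach in plain Java. The class HtmlText and the method stripTags are hypothetical names, not part of Lucene, and the code does not handle script or style bodies, comments, or character entities; a real HTML parser would be more robust.

```java
// Hypothetical helper, not part of Lucene: a naive tag stripper for illustration.
final class HtmlText {

    // Replaces anything that looks like a tag with a space, then collapses whitespace.
    static String stripTags(String html) {
        String noTags = html.replaceAll("<[^>]*>", " ");
        return noTags.replaceAll("\\s+", " ").trim();
    }
}
```

The returned string can then be indexed as the document's text content in place of the raw HTML.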

What type of indexer is Lucene?

2002-03-05 Thread Melissa Mifsud
Hi! Can anyone tell me what kind of indexer Lucene is? Statistical, Probabilistic, Boolean, Extended Boolean? I can't seem to find the answer in any documentation or article and it's really important that I know the type before I use Lucene in my application! Thanks! Melissa