Thanks for taking time to answer me. The problem is that I'm not allowed to 
post code due to a confidentiality contract that I was required to sign. I'll 
try to see if I can get a special permission to post code since I'm wasting so 
much time trying to find the answer to this.

I tried looking for each time the query is touched and numbers are still 
present in the query. I don't know if it's the analyzer, but if it was, 
woundl't the numbers be cut out of the index completely ? As I said in my 1st 
post, they are "findable" with Lukeall. If I read right, the FrenchAnalyzer 
included in lucene is supposed to be based on StandardAnalyzer so I really fail 
to see what is going wrong. Might it be the fact that the tokenizer used is 
Stringtokenizer and not Tokenstream ? The numbers are tokenized, and in the 
returned query they are present....

I really don't know where they get zapped out of existence... 

Thanks again for helping.

Xavier Tô
Bacc. en Informatique et Génie Logiciel
[EMAIL PROTECTED]
(450)434-8905

--------------------------------------------------------------------------------------

Hard to tell without seeing any code.  Perhaps numbers are being removed from 
the query string
during search.
Make sure the same or at least "compatible" Analyzer is used during both 
indexing and querying.
Grab the code from Lucene in Action .... hm, lucenebook.com may be down at the 
moment, but
that's where you can get the code normally.  The code includes some classes 
that let you run
a query string through a set of Analyzers and see how each of them behaves and 
what it does
to a query.

Otis

----- Original Message ----
From: "To, Xavier" <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Wednesday, January 31, 2007 12:21:27 AM
Subject: Problem with a search engine


Hi, I recently started an internship and I've been asked to fix their
search engine so numbers are searched. At first, I thought it was the
Analyzer that wasn't working right, but we're using StandardAnalyzer and
the numbers are indexed (I checked with Lukeall). Then I thought they
are not tokenized during the search, but they are. They just seem to be
ignored for some reason. Did anyone experienced something similar ? If
so, how can I fix this ? It's probably something that would jump in my
face if it was alive, but I just can't see it. Can anyone help me ? It
would be very much appreciated.


Xavier T�
Stagiaire
D�veloppement - Maintenance & �volution 
AXA Canada Tech
2020, rue University, bureau 700
Montr�al(Qu�bec)H3A 2A5
T�l. :  (514) 282-6817, poste 2224
T�l�c. :  (514) 282-6017
Courriel : [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>
  _____  

"Ce message est confidentiel, � l'usage exclusif du destinataire
ci-dessus et son contenu ne repr�sente en aucun cas un engagement de la
part de AXA, sauf en cas de stipulation expresse et par �crit de la part
de AXA. Toute publication, utilisation ou diffusion, m�me partielle,
doit �tre autoris�e pr�alablement. Si vous n'�tes pas destinataire de ce
message, merci d'en avertir imm�diatement l'exp�diteur."

"This e-mail message is confidential, for the exclusive use of the
addressee and its contents shall not constitute a commitment by AXA,
except as otherwise specifically provided in writing by AXA. Any
unauthorized disclosure, use or dissemination, either whole or partial,
is prohibited. If you are not the intended recipient of the message,
please notify the sender immediately."

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to