Thanks for taking time to answer me. The problem is that I'm not allowed to post code due to a confidentiality contract that I was required to sign. I'll try to see if I can get a special permission to post code since I'm wasting so much time trying to find the answer to this.
I tried looking for each time the query is touched and numbers are still present in the query. I don't know if it's the analyzer, but if it was, woundl't the numbers be cut out of the index completely ? As I said in my 1st post, they are "findable" with Lukeall. If I read right, the FrenchAnalyzer included in lucene is supposed to be based on StandardAnalyzer so I really fail to see what is going wrong. Might it be the fact that the tokenizer used is Stringtokenizer and not Tokenstream ? The numbers are tokenized, and in the returned query they are present.... I really don't know where they get zapped out of existence... Thanks again for helping. Xavier Tô Bacc. en Informatique et Génie Logiciel [EMAIL PROTECTED] (450)434-8905 -------------------------------------------------------------------------------------- Hard to tell without seeing any code. Perhaps numbers are being removed from the query string during search. Make sure the same or at least "compatible" Analyzer is used during both indexing and querying. Grab the code from Lucene in Action .... hm, lucenebook.com may be down at the moment, but that's where you can get the code normally. The code includes some classes that let you run a query string through a set of Analyzers and see how each of them behaves and what it does to a query. Otis ----- Original Message ---- From: "To, Xavier" <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Wednesday, January 31, 2007 12:21:27 AM Subject: Problem with a search engine Hi, I recently started an internship and I've been asked to fix their search engine so numbers are searched. At first, I thought it was the Analyzer that wasn't working right, but we're using StandardAnalyzer and the numbers are indexed (I checked with Lukeall). Then I thought they are not tokenized during the search, but they are. They just seem to be ignored for some reason. Did anyone experienced something similar ? If so, how can I fix this ? It's probably something that would jump in my face if it was alive, but I just can't see it. Can anyone help me ? It would be very much appreciated. Xavier T� Stagiaire D�veloppement - Maintenance & �volution AXA Canada Tech 2020, rue University, bureau 700 Montr�al(Qu�bec)H3A 2A5 T�l. : (514) 282-6817, poste 2224 T�l�c. : (514) 282-6017 Courriel : [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]> _____ "Ce message est confidentiel, � l'usage exclusif du destinataire ci-dessus et son contenu ne repr�sente en aucun cas un engagement de la part de AXA, sauf en cas de stipulation expresse et par �crit de la part de AXA. Toute publication, utilisation ou diffusion, m�me partielle, doit �tre autoris�e pr�alablement. Si vous n'�tes pas destinataire de ce message, merci d'en avertir imm�diatement l'exp�diteur." "This e-mail message is confidential, for the exclusive use of the addressee and its contents shall not constitute a commitment by AXA, except as otherwise specifically provided in writing by AXA. Any unauthorized disclosure, use or dissemination, either whole or partial, is prohibited. If you are not the intended recipient of the message, please notify the sender immediately." --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]