Re: [Dspace-tech] Fuzzy search in DSpace
Dear Jayan,

Thank you very much for this interesting reference on Lucene search in DSpace! I forwarded it to my "power users".

Just one point: you can configure DSpace to use either an implicit OR or an implicit AND between search terms. OR is the default.

When no patch is applied to add sorting to DSpace, the search results are implicitly sorted from the most relevant (matching the most search terms, with the highest relative frequency) to the least relevant (for instance, a single occurrence of one of the search terms). This works nicely with implicit OR and is useful for many applications. When AND is chosen as the implicit operator, relevance sorting is less relevant (!) and sorting by date is often preferred.

Have a nice day!

Christophe

Christophe Dupriez
DESTIN inc. SSEB
rue des Palais 44, boîte 1, B-1030 Bruxelles, Belgique
http://www.destin.be

___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
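Christophe's point about implicit OR versus implicit AND can be illustrated with a toy ranking. This is only a sketch: the scoring (number of matched terms first, then summed relative frequency) is a simplification chosen for illustration and is not Lucene's actual scoring formula, and `DOCS`, `score` and `search` are hypothetical names.

```python
from collections import Counter

# Hypothetical mini-collection: item name -> indexed text.
DOCS = {
    "item-a": "fuzzy search in lucene",
    "item-b": "lucene index tuning",
    "item-c": "dspace statistics",
}


def score(words, terms):
    """Toy relevance: (number of query terms matched, summed relative frequency)."""
    counts = Counter(words)
    matched = sum(1 for t in terms if counts[t] > 0)
    frequency = sum(counts[t] / len(words) for t in terms)
    return (matched, frequency)


def search(docs, query, operator="OR"):
    """Return item names ranked from most to least relevant."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = []
    for name, text in docs.items():
        words = text.lower().split()
        present = [t in words for t in terms]
        # AND keeps only items containing every term; OR keeps items with any term.
        keep = all(present) if operator == "AND" else any(present)
        if keep:
            hits.append((score(words, terms), name))
    # Highest score first; ties broken by name for a stable order.
    hits.sort(key=lambda hit: (-hit[0][0], -hit[0][1], hit[1]))
    return [name for _, name in hits]


print(search(DOCS, "fuzzy lucene", "OR"))
print(search(DOCS, "fuzzy lucene", "AND"))
```

With OR, items matching only some of the terms still appear, so the relevance order carries information. With AND, every hit already contains all the terms, which is one reason sorting by date can be more useful there.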
Re: [Dspace-tech] Fuzzy search in DSpace
Hello,

It looks quite nice to experiment with different options.

http://drtc.isibang.ac.in:8080/jspui/handle/1849/244

The link refers to an interesting write-up.

Cheers!
Jayan

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Vlastimil Krejcir
Sent: Thursday, May 15, 2008 10:49 PM
To: [EMAIL PROTECTED]; dspace-tech@lists.sourceforge.net; [EMAIL PROTECTED]
Subject: [Dspace-tech] Fuzzy search in DSpace
[Dspace-tech] Fuzzy search in DSpace
Hi all,

maybe I've just discovered something that is already well known throughout the DSpace community. I'm not sure whether everybody knows that Lucene (and therefore DSpace) has fuzzy search. In my opinion this feature is not promoted enough (or not promoted at all).

You can use fuzzy search by appending "~" to a query term. For example, we have an item about the movie Spiderman. The query "spiterman" doesn't give us any results, whereas "spiterman~" gives us the right item about the movie (and maybe more items, depending on the fuzzy search settings).

This can also be used for the thing I personally call "cutted of diacritics search", because it also works for words with diacritics: "krejcir~" gives all items where I'm the author, even if only my surname with diacritics ("Krejčíř") is stored. It's not exact, because it also returns results which have nothing in common with me. On the other hand, why not use it?

For details you can consult the Lucene documentation.

Hope this post helps.

Vlastik

Vlastimil Krejčíř
Library and Information Centre, Institute of Computer Science
Masaryk University in Brno, Czech Republic
Email: krejcir (at) ics (dot) muni (dot) cz
Phone: +420 549 49 3872
ICQ: 163963217
Jabber: [EMAIL PROTECTED]
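Lucene's fuzzy operator is based on Levenshtein edit distance, and the two examples in the message above can be reproduced with a small sketch. What follows is an illustration, not Lucene's code: the similarity formula (1 minus distance divided by the shorter word's length), the 0.5 threshold and the lowercasing step are assumptions about how an installation of that era might behave, and `INDEXED_TERMS`, `similarity` and `fuzzy_matches` are hypothetical names.

```python
import unicodedata


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


def normalise(word: str) -> str:
    """Lowercase and compose accents so 'č' counts as one character (assumed analyzer behaviour)."""
    return unicodedata.normalize("NFC", word).lower()


def similarity(query: str, term: str) -> float:
    """Assumed similarity: 1 - distance / length of the shorter word."""
    q, t = normalise(query), normalise(term)
    shortest = min(len(q), len(t))
    if shortest == 0:
        return 0.0
    return 1.0 - edit_distance(q, t) / shortest


def fuzzy_matches(query, terms, min_similarity=0.5):
    """Return the indexed terms whose similarity to the query exceeds the threshold."""
    return [t for t in terms if similarity(query, t) > min_similarity]


# Hypothetical index contents.
INDEXED_TERMS = ["Spiderman", "Krejčíř", "Kremlin", "Repository"]

print(edit_distance("spiterman", "spiderman"))   # one substitution: t -> d
print(fuzzy_matches("spiterman", INDEXED_TERMS))
print(fuzzy_matches("krejcir", INDEXED_TERMS))
```

Under these assumptions "krejcir" is three substitutions away from "krejčíř" (c, i and r each lose their diacritic), which is close enough on a seven-letter word to match. The same tolerance is why unrelated terms of similar length and spelling can also slip through.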