Re: [Dspace-tech] Apostrophes in searches
Just replying to my own email to record the answer.

DSpace 1.8 included an upgrade to Lucene 3.3.0. With 3.3, Lucene seems to have changed the default Analyzer so that it is no longer an English-language analyser. From the javadocs...

"* ClassicAnalyzer was named StandardAnalyzer in Lucene versions prior to 3.1.
 * As of 3.1, {@link StandardAnalyzer} implements Unicode text segmentation,
 * as specified by UAX#29."

I think the old behaviour can be reinstated by changing dspace.cfg to have...

search.analyzer = org.apache.lucene.analysis.standard.ClassicAnalyzer

Cheers.

On 01/11/12 16:11, TAYLOR Robin wrote:
> Hi all,
>
> I've just been comparing a search at DSpace version 1.6 with a search at
> 1.8 and notice that at 1.8 an apostrophe is treated as a token
> delimiter, so a search term of "O'Connor" is split into "O" and
> "Connor", whereas at 1.6 it was treated as one token. I presume it was a
> conscious change made at some point and I was just wondering when and
> where (in terms of the source code). It's not a problem for me, I just
> need to be able to provide an explanation to the repository administrator.
>
> Thanks, Robin.

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
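For anyone explaining the symptom to a repository administrator, the difference described in the original question can be sketched with two toy tokenizers. This is only an illustration of "apostrophe as a token delimiter" versus "apostrophe kept inside the token"; the function names and regular expressions are made up for this example and are not Lucene's or DSpace's actual tokenizer code.

```python
import re


def tokenize_splitting_apostrophes(text):
    """Toy tokenizer (hypothetical): every run of letters or digits is a
    token, so an apostrophe acts as a delimiter."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def tokenize_keeping_apostrophes(text):
    """Toy tokenizer (hypothetical): an apostrophe between two runs of
    letters or digits stays inside the token."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*", text)]


query = "O'Connor"
print(tokenize_splitting_apostrophes(query))  # ['o', 'connor']
print(tokenize_keeping_apostrophes(query))    # ["o'connor"]
```

With the first behaviour a query for O'Connor is effectively a search for the two terms "o" and "connor"; with the second it is a search for the single term "o'connor". Whichever analyzer is configured is normally used at both index time and query time, so changing search.analyzer would presumably also require rebuilding the search index.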
Re: [Dspace-tech] Apostrophes in searches
On Thu, Nov 1, 2012 at 6:01 PM, helix84 wrote:
> One quickly forgets changes for the better, but remembers changes for
> the best. Hmm, I should start writing fortune cookies.

s/best/worse/

Regards,
~~helix84
Re: [Dspace-tech] Apostrophes in searches
On Thu, Nov 1, 2012 at 5:37 PM, Robin Taylor wrote:
> Actually it's the old Lucene search I'm looking at, but I suspect you could
> be right in that it may well be the underlying Lucene code that changed. I'm
> just assuming that someone has already noticed this difference and can point
> me in the right direction.

One quickly forgets changes for the better, but remembers changes for the best. Hmm, I should start writing fortune cookies.

So DSpace 1.6 shipped with Lucene 2.3.0, and 1.8 with 3.3.0.

1) Look at the changes in Lucene:
http://lucene.apache.org/core/old_versioned_docs/versions/3_3_0/changes/Changes.html

2) Look at the Lucene-related changes in DSpace:
git diff dspace-1_6_x dspace-1_8_x dspace-api/src/main/java/org/dspace/search/

Good luck :)

Regards,
~~helix84
Re: [Dspace-tech] Apostrophes in searches
Hi helix84,

Actually it's the old Lucene search I'm looking at, but I suspect you could be right in that it may well be the underlying Lucene code that changed. I'm just assuming that someone has already noticed this difference and can point me in the right direction.

Cheers, Robin.

On 01/11/12 16:20, helix84 wrote:
> I think it could be a change in Solr's behavior, not necessarily in
> our configuration, see:
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.StandardTokenizerFactory
>
> What Solr version did we use in 1.6?
>
> Regards,
> ~~helix84
Re: [Dspace-tech] Apostrophes in searches
I think it could be a change in Solr's behavior, not necessarily in our configuration, see:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.StandardTokenizerFactory

What Solr version did we use in 1.6?

Regards,
~~helix84