Re: [Dspace-tech] Apostrophes in searches

2012-11-05 Thread Robin Taylor
Just replying to my own email to record the answer.

DSpace 1.8 included an upgrade to Lucene 3.3.0. By that version Lucene 
seems to have changed the default Analyzer so that it is no longer an 
English-language analyzer. From the ClassicAnalyzer javadocs...

  * ClassicAnalyzer was named StandardAnalyzer in Lucene versions prior to 3.1.
  * As of 3.1, {@link StandardAnalyzer} implements Unicode text segmentation,
  * as specified by UAX#29.

I think the old behaviour can be reinstated by changing dspace.cfg to 
have...

search.analyzer = org.apache.lucene.analysis.standard.ClassicAnalyzer
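
For anyone who needs to explain the symptom to an administrator, here is a toy illustration in plain Java of the two behaviours described in the original question below (apostrophe kept inside a token versus apostrophe treated as a delimiter). It is only a sketch: the class ApostropheTokens and the method tokenize are made up for this example, they are not Lucene or DSpace code, and they do not reproduce what ClassicAnalyzer or StandardAnalyzer actually do internally.

```java
import java.util.ArrayList;
import java.util.List;

public class ApostropheTokens {

    // Hypothetical helper, NOT a Lucene class. Splits text into lower-cased
    // tokens of letters and digits. When keepInnerApostrophe is true, an
    // apostrophe that sits between two letter/digit characters stays in the
    // token; otherwise it ends the token like any other punctuation.
    static List<String> tokenize(String text, boolean keepInnerApostrophe) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean innerApostrophe = keepInnerApostrophe
                    && c == '\''
                    && current.length() > 0
                    && i + 1 < text.length()
                    && Character.isLetterOrDigit(text.charAt(i + 1));
            if (Character.isLetterOrDigit(c) || innerApostrophe) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // Any other character closes the token being built.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Behaviour reported for 1.6: one token.
        System.out.println(tokenize("O'Connor", true));   // [o'connor]
        // Behaviour reported for 1.8: two tokens.
        System.out.println(tokenize("O'Connor", false));  // [o, connor]
    }
}
```

The practical consequence is that an index built one way and queried the other way will not match on such names, so a reindex is probably needed after changing the analyzer.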

Cheers.


On 01/11/12 16:11, TAYLOR Robin wrote:
 Hi all,

 I've just been comparing a search at DSpace version 1.6 with a search at
 1.8 and notice that at 1.8 an apostrophe is treated as a token
 delimiter, so a search term of O'Connor is split into O and
 Connor, whereas at 1.6 it was treated as one token. I presume it was a
 conscious change made at some point and I was just wondering when and
 where (in terms of the source code). It's not a problem for me; I just
 need to be able to provide an explanation to the repository administrator.

 Thanks, Robin.



___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech


Re: [Dspace-tech] Apostrophes in searches

2012-11-01 Thread helix84
I think it could be a change in Solr's behavior, not necessarily in
our configuration, see:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.StandardTokenizerFactory

What Solr version did we use in 1.6?

Regards,
~~helix84



Re: [Dspace-tech] Apostrophes in searches

2012-11-01 Thread helix84
On Thu, Nov 1, 2012 at 5:37 PM, Robin Taylor robin.tay...@ed.ac.uk wrote:
 Actually it's the old Lucene search I'm looking at, but I suspect you could
 be right in that it may well be the underlying Lucene code that changed. I'm
 just assuming that someone has already noticed this difference and can point
 me in the right direction.

One quickly forgets changes for the better, but remembers changes for
the best. Hmm, I should start writing fortune cookies.

So in DSpace 1.6 there was Lucene 2.3.0; in 1.8 there was 3.3.0.

1) look at the changes in Lucene:
http://lucene.apache.org/core/old_versioned_docs/versions/3_3_0/changes/Changes.html

2) look at Lucene changes in DSpace:
git diff dspace-1_6_x dspace-1_8_x dspace-api/src/main/java/org/dspace/search/

Good luck :)

Regards,
~~helix84



Re: [Dspace-tech] Apostrophes in searches

2012-11-01 Thread helix84
On Thu, Nov 1, 2012 at 6:01 PM, helix84 heli...@centrum.sk wrote:
 One quickly forgets changes for the better, but remembers changes for
 the best. Hmm, I should start writing fortune cookies.

s/best/worse/

Regards,
~~helix84
