Re: [Dspace-tech] Apostrophes in searches

2012-11-05 Thread Robin Taylor
Just replying to my own email to record the answer.

DSpace 1.8 included an upgrade to Lucene 3.3.0. By that release Lucene
seems to have changed its default analyzer so that it is no longer an
English-language analyzer. From the javadocs:

"ClassicAnalyzer was named StandardAnalyzer in Lucene versions prior to
3.1. As of 3.1, StandardAnalyzer implements Unicode text segmentation,
as specified by UAX#29."

I think the old behaviour can be reinstated by changing dspace.cfg to 
have...

search.analyzer = org.apache.lucene.analysis.standard.ClassicAnalyzer
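
For anyone checking which analyzer a given dspace.cfg would select, here
is a minimal sketch in plain Java. It assumes dspace.cfg can be parsed as
a standard Java properties file. The class name AnalyzerConfigSketch, the
helper resolveAnalyzerClassName and the fallback value are hypothetical
names chosen for illustration; this is not DSpace's actual loading code,
and it only resolves the class name without touching Lucene.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class AnalyzerConfigSketch {
    // Property key taken from the dspace.cfg line quoted above.
    static final String KEY = "search.analyzer";

    // Assumption: the analyzer used when the key is absent. Check your own
    // dspace.cfg for the real default before relying on this value.
    static final String ASSUMED_FALLBACK = "org.dspace.search.DSAnalyzer";

    // Parses properties-style text and returns the configured analyzer
    // class name, or the assumed fallback when the key is not set.
    static String resolveAnalyzerClassName(String cfgText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(cfgText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Properties.load keeps trailing whitespace in values, so trim it.
        return props.getProperty(KEY, ASSUMED_FALLBACK).trim();
    }

    public static void main(String[] args) {
        String cfg =
            "search.analyzer = org.apache.lucene.analysis.standard.ClassicAnalyzer\n";
        System.out.println(resolveAnalyzerClassName(cfg));
    }
}
```

The class name is resolved as a string only, so the sketch runs without
Lucene on the classpath. After changing the analyzer, the existing index
will probably need rebuilding so that indexed and queried terms are
tokenized the same way.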

Cheers.


On 01/11/12 16:11, TAYLOR Robin wrote:
> Hi all,
>
> I've just been comparing a search at DSpace version 1.6 with a search at
> 1.8 and notice that at 1.8 an apostrophe is treated as a token
> delimiter, so a search term of "O'Connor" is split into "O" and
> "Connor", whereas at 1.6 it was treated as one token. I presume it was a
> conscious change made at some point and I was just wondering when and
> where (in terms of the source code). It's not a problem for me; I just
> need to be able to provide an explanation to the repository administrator.
>
> Thanks, Robin.
>


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech


Re: [Dspace-tech] Apostrophes in searches

2012-11-01 Thread helix84
On Thu, Nov 1, 2012 at 6:01 PM, helix84 wrote:
> One quickly forgets changes for the better, but remembers changes for
> the best. Hmm, I should start writing fortune cookies.

s/best/worse/

Regards,
~~helix84



Re: [Dspace-tech] Apostrophes in searches

2012-11-01 Thread helix84
On Thu, Nov 1, 2012 at 5:37 PM, Robin Taylor wrote:
> Actually it's the old Lucene search I'm looking at, but I suspect you could
> be right in that it may well be the underlying Lucene code that changed. I'm
> just assuming that someone has already noticed this difference and can point
> me in the right direction.

One quickly forgets changes for the better, but remembers changes for
the best. Hmm, I should start writing fortune cookies.

So DSpace 1.6 had Lucene 2.3.0 and 1.8 had 3.3.0.

1) look at the changes in Lucene:
http://lucene.apache.org/core/old_versioned_docs/versions/3_3_0/changes/Changes.html

2) look at Lucene changes in DSpace:
git diff dspace-1_6_x dspace-1_8_x dspace-api/src/main/java/org/dspace/search/

Good luck :)

Regards,
~~helix84



Re: [Dspace-tech] Apostrophes in searches

2012-11-01 Thread Robin Taylor
Hi helix84,

Actually it's the old Lucene search I'm looking at, but I suspect you
could be right in that it may well be the underlying Lucene code that
changed. I'm just assuming that someone has already noticed this
difference and can point me in the right direction.

Cheers, Robin.


On 01/11/12 16:20, helix84 wrote:
> I think it could be a change in Solr's behavior, not necessarily in
> our configuration, see:
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.StandardTokenizerFactory
>
> What Solr version did we use in 1.6?
>
> Regards,
> ~~helix84




Re: [Dspace-tech] Apostrophes in searches

2012-11-01 Thread helix84
I think it could be a change in Solr's behavior, not necessarily in
our configuration, see:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.StandardTokenizerFactory

What Solr version did we use in 1.6?

Regards,
~~helix84
