Hi all,

Jabok Library currently uses Evergreen 2.8.2. We have successfully changed the charset to utf-8 for both <client> and <yazgfs> (in the configuration files mentioned at http://docs.evergreen-ils.org/2.1/html/Z3950serversupport.html), so Z39.50 clients now receive records with the correct diacritics.

However, one related problem persists: Z39.50 queries only work when no diacritics are used. E.g., results are returned when we submit the query "matousek" (an author's surname), but no results are reported when the correct form "matoušek" is used.

We have tried the following but to no avail:

1) add element client_query_charset to gfs (according to http://www.indexdata.com/yaz/doc/server.vhosts.html) but it was an unknown element;

2) delete the second mention of "encoding="utf-8"" from /xsl/MARC21slim2SRWDC.xsl and restart the open-ils.supercat service, hoping this would have a similar effect to the same change in the MODS stylesheets, which resolved our Zotero encoding problems (see https://bugs.launchpad.net/evergreen/+bug/1442276).

We have also tried further query testing in yaz-client. In this case, some interesting things happened:

When yaz-client was used for a generic query "find matoušek" (i.e., with diacritics), the answer was 34 hits:

Z> find matoušek
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 34, setno 1
records returned: 0
Elapsed: 0.681894

However, when searching specifically by author (use attribute 1=1003, again with diacritics), the answer was zero hits:

Z> find @attr 1=1003 @attr 2=3 "matoušek"
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 0, setno 12
records returned: 0
Elapsed: 0.117265

When diacritics were omitted, we got 34 hits again:

Z> find @attr 1=1003 @attr 2=3 "matousek"
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 34, setno 13
records returned: 0
Elapsed: 0.637897
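
For anyone who wants to reproduce the comparison, here is a minimal Python sketch. It is not part of our Evergreen setup, and the helper names fold_diacritics and encoded_bytes are made up for illustration. It prints the bytes a client would put on the wire for the query term under UTF-8 and, as an assumed alternative, ISO-8859-2, together with the diacritic-free form that does return hits. We do not know which encoding the server actually expects for query terms, so this only helps to see what differs between the two queries.

```python
import unicodedata


def fold_diacritics(term: str) -> str:
    """Hypothetical helper: drop combining marks, e.g. 'matoušek' -> 'matousek'."""
    # NFD splits a precomposed letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def encoded_bytes(term: str, encoding: str) -> str:
    """Hypothetical helper: hex dump of the term in the given encoding."""
    return term.encode(encoding).hex(" ")


if __name__ == "__main__":
    term = "matoušek"
    # The encodings listed here are assumptions chosen for comparison only.
    for encoding in ("utf-8", "iso-8859-2"):
        print(f"{encoding:>10}: {encoded_bytes(term, encoding)}")
    print(f"    folded: {fold_diacritics(term)}")
```

The letter "š" is two bytes in UTF-8 but a single byte in ISO-8859-2, so a server that interprets the query term in a different charset than the client sent would look up a different string.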

Our Z39.50 server runs at mojzis.jabok.cuni.cz (port 9999, database Jabok) and it now uses the utf-8 encoding.

We also tried Laurentian (laurentian.concat.ca, port 210, database OSUL), searching for a person in Tellico with the words "francais" and "français". With "francais" we got results, but with "français" none were found. So it is probably not just our installation...

Do you have any ideas what we could do to make the queries with diacritics work correctly?

Thank you in advance for any hints!

Linda
