Hi Tasha, in YAML a leading "-" like that makes the entry an item in a list / array.

This is easier to see with a yaml -> json example


     analyzer_phrase:
       tokenizer: keyword
       filter:
         - icu_folding
       char_filter:
         - punctuation

Becomes something like this in json:

{ "analyzer_phrase": { "tokenizer": "keyword", "filter": ["icu_folding"],
  "char_filter": ["punctuation"] } }

So adding additional "-"-prefixed entries below filter or char_filter would add 
additional entries to those arrays. Just having the bare word punctuation below 
char_filter would not make a list: YAML would read it as a single plain string 
rather than an array containing one entry.
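
If it helps, here is a rough Python sketch of the same idea. It does not parse 
YAML; it just builds by hand the structure a YAML parser would be expected to 
produce for that block, so the dictionary below is my transcription, not output 
from Koha. The extra "lowercase" entry is only an illustration of adding a 
second "-" line (lowercase is a built-in Elasticsearch token filter), not a 
suggestion for your config.

```python
import json

# Hand-built equivalent of the analyzer_phrase YAML block:
# "key: value" lines become dictionary entries, "- item" lines become list items.
config = {
    "analyzer_phrase": {
        "tokenizer": "keyword",
        "filter": ["icu_folding"],
        "char_filter": ["punctuation"],
    }
}

analyzer = config["analyzer_phrase"]

# Each "-" line is one element of the list, referenced by name.
print(analyzer["char_filter"])

# Adding another "- lowercase" line under filter: in the YAML
# corresponds to appending one more element to that list.
analyzer["filter"].append("lowercase")

# Print the JSON form of the whole structure.
print(json.dumps(config, indent=2))
```

So "- punctuation" is neither "yes" nor "no"; it is the name of one char_filter 
that the analyzer applies.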

This means that there should be a definition for the punctuation char_filter 
*somewhere*, but possibly not in that file.
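
As a sketch of what "defined somewhere" means, the Python below checks that 
every char_filter name an analyzer refers to has a matching definition. The 
settings dictionary is hypothetical: it is shaped like the "analysis" section 
of Elasticsearch index settings, and the pattern shown for punctuation is made 
up for illustration (pattern_replace is a real char filter type, but the actual 
pattern in your installation may be quite different). The function name 
undefined_char_filters is my own.

```python
# Char filters that Elasticsearch provides under a fixed name,
# so they need no custom definition.
BUILT_IN = {"html_strip"}

# Hypothetical settings, shaped like an Elasticsearch "analysis" section.
analysis = {
    "analyzer": {
        "analyzer_phrase": {
            "tokenizer": "keyword",
            "filter": ["icu_folding"],
            "char_filter": ["punctuation"],
        }
    },
    "char_filter": {
        # Illustrative definition only; the real pattern may differ.
        "punctuation": {
            "type": "pattern_replace",
            "pattern": "[(),.:;!?]",
            "replacement": "",
        }
    },
}


def undefined_char_filters(analysis):
    """Return (analyzer, char_filter) pairs that have no definition."""
    defined = set(analysis.get("char_filter", {})) | BUILT_IN
    missing = []
    for name, analyzer in analysis.get("analyzer", {}).items():
        for char_filter in analyzer.get("char_filter", []):
            if char_filter not in defined:
                missing.append((name, char_filter))
    return missing


print(undefined_char_filters(analysis))
```

An empty result means every referenced name was found; if the punctuation 
entry were missing, the pair ("analyzer_phrase", "punctuation") would be 
reported instead.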


Jason Boyer
Senior System Administrator
Equinox Open Library Initiative
+1 (877) Open-ILS (673-6457)

> On Jun 24, 2021, at 5:55 PM, Bales (US), Tasha R <tasha.r.ba...@boeing.com> 
> wrote:
> Follow-up to my problem searching the OPAC for phrases containing punctuation 
> (e.g., Electroactive polymer (EAP) actuators).  This Bywater Solutions 
> article (https://bywatersolutions.com/education/elastic-searching) 
> suggests that the problem is a feature of Elasticsearch.
> FYI, we are using Elasticsearch 6.1.1 on its own dedicated server, and I 
> don’t believe we’ve installed the ICU Analysis plug-in (looks like it’s 
> required for Zebra, but I can’t tell if it’s required for Elasticsearch), 
> which could be a factor.  I couldn’t replicate all aspects of my experience 
> in a sandbox, although searches for phrases containing punctuation still 
> failed in an OPAC “Library Catalog” sandbox search.  I concluded that I 
> needed to review our configuration.
> I’ve been reviewing the Koha Wiki and elastic.co documentation, and comparing 
> to our index_config.yaml.
> By chance does anyone know how to interpret the syntax below?  The 
> documentation describes the parameters below, but I don’t see any usage of 
> “-” before options.  Does the option “- punctuation” mean “yes, remove 
> punctuation”, or “no, don’t remove punctuation”, or does the phrase refer to 
> some additional configuration file, or perhaps it’s commented out?
>      analyzer_phrase:
>        tokenizer: keyword
>        filter:
>          - icu_folding
>        char_filter:
>          - punctuation
> Thanks for your time and consideration,
> Tasha Bales
> Business Support Team | Information Services
> Enterprise Services | Enterprise Operations, Finance and Sustainability
> -----Original Message-----
> From: Bales (US), Tasha R
> Sent: Tuesday, June 22, 2021 7:54 AM
> To: 'Jonathan Druart' <jonathan.dru...@bugs.koha-community.org>
> Cc: Discussion Group Koha <koha@lists.katipo.co.nz>
> Subject: RE: [EXTERNAL] Re: [Koha] Title search works, but Library catalog 
> search fails in OPAC
> Jonathan, thank you!
> It does work without the parentheses.
> I would suspect an encoding problem, except that the problem only manifests 
> in the OPAC, and not the intranet.
> I came across this issue while testing after migrating from MariaDB to 
> Percona MySQL.  Your reply prompted me to check the encoding of the new 
> database, and it's unfortunately Latin-1.  Since these are parentheses and 
> not diacritics, I’m not sure what my expectations should be, but changing to 
> UTF-8 is a place to start.  Httpd.conf does have UTF-8 set as the default.
> FWIW, my source records were encoded in MARC-8.  I used MarcEdit to convert 
> them to UTF-8, and it appears that Koha automatically converts anyway on 
> import.  When I loaded these records into Koha, I used bulkmarcimport.pl on 
> the command line.
> I'll ask that the default character set of the database be changed, and see 
> if that helps.  Thanks again.  I'm embarrassed that I didn't think to omit 
> the parentheses, or rather was belligerently insisting to myself that they 
> should not have been a problem,
> Tasha Bales
> Business Support Team | Information Services Enterprise Services | Enterprise 
> Operations, Finance and Sustainability
> (480) 509-5415
> https://is.web.boeing.com
> -----Original Message-----
> From: Jonathan Druart [mailto:jonathan.dru...@bugs.koha-community.org]
> Sent: Tuesday, June 22, 2021 12:01 AM
> To: Bales (US), Tasha R <tasha.r.ba...@boeing.com>
> Cc: Discussion Group Koha <koha@lists.katipo.co.nz>
> Subject: [EXTERNAL] Re: [Koha] Title search works, but Library catalog search 
> fails in OPAC
> Importance: High
> Hello Tasha,
> I've created 2 records with
>  245$a Electroactive polymer (EAP) actuators as artificial muscles and the 
> following query returns the 2 results.
> /opac-search.pl?idx=&q=Electroactive%20polymer%20%28EAP%29%20actuators&weight_search=1
> Tried on master and 20.11.06.
> Maybe a silly idea: does it work without the parentheses?
> Could you try and recreate it on a sandbox
> (https://wiki.koha-community.org/wiki/Sandboxes) and provide us a step by 
> step plan to reproduce the problem?
> Regards,
> Jonathan
> Le mar. 22 juin 2021 à 00:46, Bales (US), Tasha R <tasha.r.ba...@boeing.com> 
> a écrit :
>> Good afternoon,
>> I’m having trouble with Title vs. Library catalog keyword searching with 
>> several example titles.  Searching the same phrase with either method yields 
>> different results.  This problem occurs only in the OPAC.  I hope to confirm 
>> whether the behavior I’m seeing is intended (i.e., the problem is me) or 
>> not.  Thanks in advance.
>> For example, given the ebook title, Electroactive polymer (EAP)
>> actuators as artificial muscles, a Title keyword search in the OPAC is 
>> successful, but a plain, Library catalog (i.e., no index specified), keyword 
>> search fails.
>> For reference, the title is recorded in the MARC record as:
>>  a Title Electroactive polymer (EAP) actuators as artificial muscles :
>> Below I’ve copied in my search history as well as the tail of the search URL 
>> that shows the search parameters.
>> * Library catalog keyword search with 0 results
>>   - 2021-06-21 02:34 PM   Electroactive polymer (EAP) actuators, 
>>     suppress:false  0
>>   - …opac-search.pl?idx=&q=Electroactive%20polymer%20%28EAP%29%20actuators&weight_search=1
>> * Title keyword search with 2 results
>>   - 2021-06-21 02:34 PM   Electroactive polymer (EAP) actuators, 
>>     suppress:false  2
>>   - …opac-search.pl?idx=ti&q=Electroactive+polymer+%28EAP%29+actuators&weight_search=1
>> As a test, I decided to enclose my Library catalog search terms in quotes, 
>> which yielded the desired results.  However, I did not at all anticipate 
>> that quotes would be required to get hits:
>> * Library catalog quoted keyword search with 2 results
>>   - 2021-06-21 02:46 PM   "Electroactive polymer (EAP) actuators", 
>>     suppress:false  2
>>   - …opac-search.pl?idx=&q=%22Electroactive+polymer+%28EAP%29+actuators%22&weight_search=1
>> On comparing the above URL query strings, it appears that the unquoted terms 
>> in the Library catalog keyword search aren’t “anded” together with a “+” the 
>> way other searches are, but I’m not sure what the implications are, if any.  
>> Also, the Koha manual indicates the following, which suggests to me that I 
>> ought to get hits on the unquoted string:
>> When you have more than one word in the search box, Koha will still do a 
>> keyword search, but a bit differently. Each word will be searched on its 
>> own, then the Boolean connector ‘and’ will narrow your search to those items 
>> with all words contained in matching records.
>> I understand and can predict pretty well the way our old ILS (Millennium, if 
>> context helps) will perform a keyword search, but I’m a little confused 
>> here.  My expectation for this particular case is that all of the above 
>> methods would yield results.  If there are any pointers to be had, I would 
>> be grateful if you could point me to them so that I may be better poised to 
>> help users.
>> I’m using Elasticsearch with Koha 20.11.06.  I reindexed both authorities 
>> and biblios today, but that didn’t impact my experience.  The records are 
>> not newly added.
>> Thanks!
>> Tasha Bales
>> Business Support Team | Information Services Enterprise Services |
>> Enterprise Operations, Finance and Sustainability
>> _______________________________________________
>> Koha mailing list  http://koha-community.org Koha@lists.katipo.co.nz
>> Unsubscribe: https://lists.katipo.co.nz/mailman/listinfo/koha