https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=42485

--- Comment #3 from Martin Renvoize (ashimema) 
<[email protected]> ---
Created attachment 199926
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=199926&action=edit
Bug 42485: Disable date and numeric detection on ES mappings

When ElasticsearchMARCFormat is set to ARRAY, the marc_data_array
field uses a dynamic:true mapping, so Elasticsearch guesses each new
field's type from the first value it indexes for that field. If a
MARC subfield value looks like a date (e.g. 2024-01-15 in 952$e), ES
locks that subfield to type date, and later records with non-date
values in the same subfield fail to index with a
document_parsing_exception.

This patch sets date_detection and numeric_detection to false at
the mapping root level, so ES maps dynamically added string values
as text instead of guessing a date or numeric type.
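
For reviewers who want to see the two options in isolation, here is a
minimal sketch against the plain Elasticsearch REST API. It is not the
code in this patch: the index name dd_sketch and the ES_URL and
RUN_ES_SKETCH variables are made up for illustration, and the host
es:9200 is taken from the test plan below.

```shell
#!/bin/sh
# Sketch only: "dd_sketch" is a hypothetical throwaway index, and
# ES_URL / RUN_ES_SKETCH are illustrative names, not Koha settings.
ES_URL="${ES_URL:-es:9200}"

# Root-level mapping options that switch off type guessing for strings.
mapping_body() {
  cat <<'JSON'
{
  "mappings": {
    "dynamic": true,
    "date_detection": false,
    "numeric_detection": false
  }
}
JSON
}

# Create the index, index a date-like string, then show the field mapping.
# Guarded so that sourcing this file does not touch any cluster.
if [ "${RUN_ES_SKETCH:-0}" = "1" ]; then
  curl -s -X PUT "$ES_URL/dd_sketch" \
    -H 'Content-Type: application/json' -d "$(mapping_body)"
  curl -s -X POST "$ES_URL/dd_sketch/_doc?refresh=true" \
    -H 'Content-Type: application/json' -d '{"e": "2024-01-15"}'
  curl -s "$ES_URL/dd_sketch/_mapping/field/e"
fi
```

With RUN_ES_SKETCH=1 the last request should report the field e as
text rather than date. Delete the dd_sketch index afterwards.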

Test plan:
1. $ ktd --search-engine es8 --name test-es --proxy up -d
   $ ktd --name test-es --wait-ready 180
   $ ktd --name test-es --shell
2. k$ koha-mysql kohadev -e "UPDATE systempreferences
     SET value='ARRAY'
     WHERE variable='ElasticsearchMARCFormat';"
   k$ echo 'flush_all' | nc -q1 memcached 11211
   k$ koha-mysql kohadev -e "DELETE FROM biblio_metadata;
     DELETE FROM items; DELETE FROM biblioitems;
     DELETE FROM biblio;"
   k$ rebuild_elasticsearch.pl -d -r --biblios
3. Add a record with date-like 952$e and index it:
   k$ koha-shell kohadev -p -c 'perl -MC4::Biblio
     -MMARC::Record -MMARC::Field -MKoha::Item -e "
     my \$r = MARC::Record->new();
     \$r->append_fields(
       MARC::Field->new(q{245},q{1},q{0},a=>q{Record one}),
       MARC::Field->new(q{942},q{ },q{ },c=>q{BK}));
     my (\$bn) = C4::Biblio::AddBiblio(\$r, q{});
     Koha::Item->new({biblionumber=>\$bn,
       biblioitemnumber=>\$bn, homebranch=>q{CPL},
       holdingbranch=>q{CPL}, itype=>q{BK},
       booksellerid=>q{2024-01-15}})->store;
     print qq{Added biblio \$bn\n};"'
   k$ rebuild_elasticsearch.pl --biblios -bn <biblionumber>
4. Check the mapping for 952$e:
   k$ curl -s es:9200/koha_kohadev_biblios/_mapping/field/marc_data_array.fields.952.subfields.e
=> FAIL: type is "date" (should be "text")
5. Add a record with non-date 952$e and index it:
   k$ koha-shell kohadev -p -c 'perl -MC4::Biblio
     -MMARC::Record -MMARC::Field -MKoha::Item -e "
     my \$r = MARC::Record->new();
     \$r->append_fields(
       MARC::Field->new(q{245},q{1},q{0},a=>q{Record two}),
       MARC::Field->new(q{942},q{ },q{ },c=>q{BK}));
     my (\$bn) = C4::Biblio::AddBiblio(\$r, q{});
     Koha::Item->new({biblionumber=>\$bn,
       biblioitemnumber=>\$bn, homebranch=>q{CPL},
       holdingbranch=>q{CPL}, itype=>q{BK},
       booksellerid=>q{My Vendor Name}})->store;
     print qq{Added biblio \$bn\n};"'
   k$ rebuild_elasticsearch.pl --biblios -bn <biblionumber> -v -v -v
=> FAIL: document_parsing_exception - failed to parse date field
   [My Vendor Name]
6. Apply this patch, then repeat steps 2 and 3
7. Repeat step 4:
=> SUCCESS: type is "text"
8. Repeat step 5:
=> SUCCESS: No indexing errors
9. Sign off :-D
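
To make the checks in steps 4 and 7 easier to read, the mapping
response can be reduced to just the type. This is a minimal sketch:
field_type is a hypothetical helper, and the sample JSON is written by
hand in the shape of a _mapping/field response, not captured output.

```shell
#!/bin/sh
# Sketch: print the first "type" value found in a field-mapping response.
field_type() {
  grep -o '"type" *: *"[^"]*"' | head -n 1 | cut -d '"' -f 4
}

# Hand-written sample in the shape of a _mapping/field response
# (illustrative only, not real output from a Koha index).
sample='{"koha_kohadev_biblios":{"mappings":{"marc_data_array.fields.952.subfields.e":{"full_name":"marc_data_array.fields.952.subfields.e","mapping":{"e":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}}}}}'

printf '%s\n' "$sample" | field_type   # prints: text
```

In the test environment the curl command from step 4 can be piped into
field_type in place of the sample.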

Signed-off-by: Andrew Fuerste Henry <[email protected]>
Signed-off-by: Martin Renvoize <[email protected]>
