https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38913
Bug ID: 38913
Summary: Elasticsearch indexing explodes with oversized records
Change sponsored?: ---
Product: Koha
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: major
Priority: P5 - low
Component: Searching - Elasticsearch
Assignee: koha-bugs@lists.koha-community.org
Reporter: janus...@gmail.com
QA Contact: testo...@bugs.koha-community.org
Depends on: 38416

After Bug 38416, Elasticsearch indexing explodes with oversized records, especially those containing UTF-8 encoded data.

In Koha::SearchEngine::Elasticsearch::marc_records_to_documents the following snippet was introduced:

    my $usmarc_record = $record->as_usmarc();
    #NOTE: Try to round-trip the record to prove it will work for retrieval after searching
    my $decoded_usmarc_record = MARC::Record->new_from_usmarc($usmarc_record);

If $record is oversized (> 99999 bytes), it is still fine as a MARC::Record object, but not for $record->as_usmarc: the ISO 2709 leader reserves only five digits for the record length, so the produced ISO 2709 string is not correct and cannot be converted back into a MARC::Record object by new_from_usmarc. A typical failure looks like:

    UTF-8 "\x85" does not map to Unicode at /usr/share/perl5/MARC/File/Encode.pm line 35.

Since the round-trip is done without any eval / try, the whole reindex procedure (for instance rebuild_elasticsearch.pl) is randomly interrupted with no explanation.

Referenced Bugs:
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416
[Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
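The structural limit behind the bug can be sketched outside of Koha. This is a minimal illustration in Python (not Koha's Perl, and not the MARC::Record implementation): the ISO 2709 leader stores the total record length in exactly five ASCII digits (positions 0-4), so any record longer than 99999 bytes cannot be represented in that field at all. The function name is hypothetical, for illustration only.

```python
def leader_length_field(record_length: int) -> str:
    """Format a record length as the ISO 2709 leader's 5-digit
    length field (leader positions 0-4).

    Raises ValueError when the length does not fit, which is the
    situation an oversized MARC record puts as_usmarc() in.
    """
    field = f"{record_length:05d}"
    if len(field) > 5:
        raise ValueError(
            f"record length {record_length} exceeds the "
            "ISO 2709 limit of 99999 bytes"
        )
    return field

# 99999 is the largest representable record length.
print(leader_length_field(99999))   # -> 99999

# One byte more and the length field overflows.
try:
    leader_length_field(100000)
except ValueError as exc:
    print(exc)
```

A serializer that ignores this overflow emits a leader whose length field no longer matches the record, which is why the later new_from_usmarc call misreads byte offsets and dies mid-decode; guarding the round-trip (eval / try) turns that hard crash into a per-record failure that can fall back to MARCXML, as Bug 38416's summary suggests.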