[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

David Cook changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=25600

--
You are receiving this mail because:
You are watching all bug changes.
___
Koha-bugs mailing list
Koha-bugs@lists.koha-community.org
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #30 from David Cook ---
(In reply to David Cook from comment #29)
> (In reply to Tomás Cohen Arazi (tcohen) from comment #27)
> > Was it too bad to just do MARCXML? This ended up a bit hacky for me.
>
> I agree. Check out bug 38270.
>
> I've added MARCXML and MARCXML_COMPRESSED options to ElasticsearchMARCFormat
> there.
>
> From memory MARCXML_COMPRESSED actually ends up being even more compact than
> USMARC and still has good performance.

As I note in Comment 1 on bug 38270:

base64 marcxml: 4.9K
base64 isomarc: 1.3K
base64 zlib compressed marcxml: 1.1K (or 1012 bytes if you don't use newlines in the base64 encoding)

5KB x 1,000,000 records = 4.7GB
1KB x 1,000,000 records = 0.95 GB
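The size comparison above can be tried out with a small sketch. This is illustrative Python, not Koha code: the sample XML is a made-up stand-in for a MARCXML record, and the helper names (`b64_plain`, `b64_zlib`, `from_b64_zlib`, `projected_gib`) are assumptions introduced here. Actual sizes depend entirely on the record being encoded.

```python
import base64
import zlib

KIB = 1024
GIB = 1024 ** 3


def b64_plain(payload: bytes) -> bytes:
    """Base64-encode the payload without any compression."""
    return base64.b64encode(payload)


def b64_zlib(payload: bytes) -> bytes:
    """zlib-compress the payload, then Base64-encode it (no newlines)."""
    return base64.b64encode(zlib.compress(payload))


def from_b64_zlib(encoded: bytes) -> bytes:
    """Reverse b64_zlib: Base64-decode, then zlib-decompress."""
    return zlib.decompress(base64.b64decode(encoded))


def projected_gib(kib_per_record: float, records: int = 1_000_000) -> float:
    """Projected total size in GiB for a given per-record size in KiB."""
    return kib_per_record * KIB * records / GIB


# Hypothetical stand-in for a MARCXML record; repetitive markup compresses well.
SAMPLE_XML = (
    "<record>"
    + "".join(
        '<datafield tag="500" ind1=" " ind2=" ">'
        f'<subfield code="a">Note number {i}</subfield></datafield>'
        for i in range(40)
    )
    + "</record>"
).encode("utf-8")

if __name__ == "__main__":
    print("plain base64 bytes:", len(b64_plain(SAMPLE_XML)))
    print("zlib + base64 bytes:", len(b64_zlib(SAMPLE_XML)))
    print("5 KiB x 1,000,000 records in GiB:", round(projected_gib(5), 2))
    print("1 KiB x 1,000,000 records in GiB:", round(projected_gib(1), 2))
```

Taking 1 KiB as 1024 bytes, 5 KiB per record across 1,000,000 records comes to roughly 4.77 GiB and 1 KiB per record to roughly 0.95 GiB, which is in line with the figures quoted in the comment.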
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #29 from David Cook ---
(In reply to Tomás Cohen Arazi (tcohen) from comment #27)
> Was it too bad to just do MARCXML? This ended up a bit hacky for me.

I agree. Check out bug 38270.

I've added MARCXML and MARCXML_COMPRESSED options to ElasticsearchMARCFormat there.

From memory MARCXML_COMPRESSED actually ends up being even more compact than USMARC and still has good performance.
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #28 from Tomás Cohen Arazi (tcohen) ---
(In reply to Tomás Cohen Arazi (tcohen) from comment #27)
> Was it too bad to just do MARCXML? This ended up a bit hacky for me.

FTR: These are the sizes of the first record in KTD in different formats [1]:

-rw-r--r-- 1 kohadev-koha kohadev-koha 1.8K Feb 4 18:40 record.b64
-rw-r--r-- 1 kohadev-koha kohadev-koha 2.3K Feb 4 18:41 record.json
-rw-r--r-- 1 kohadev-koha kohadev-koha 4.0K Feb 4 18:40 record.xml

[1] Extracted with the following commands respectively:

```shell
perl -MMIME::Base64 -MKoha::Biblios -MEncode -e 'my $b = Koha::Biblios->new->next(); print encode_base64( encode( "UTF-8", $b->metadata_record->as_usmarc))' > record.b64
perl -MMARC::Record::MiJ -MKoha::Biblios -e 'my $b = Koha::Biblios->new->next(); print $b->metadata_record->to_mij' > record.json
perl -MKoha::Biblios -e 'my $b = Koha::Biblios->new->next(); print $b->metadata_record->as_xml' > record.xml
```
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #27 from Tomás Cohen Arazi (tcohen) ---
Was it too bad to just do MARCXML? This ended up a bit hacky for me.
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #26 from David Cook ---
(In reply to Janusz Kaczmarek from comment #23)
> Please have a look at the patch @ Bug 38913.

Thanks again, Janusz. I've passed QA on your patch and added an updated unit test.

It was hard to reproduce, but once reproduced it's so obvious.

Thanks too, Andrii, for first raising the issue.
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #25 from David Cook ---
(In reply to Janusz Kaczmarek from comment #24)
> (In reply to Janusz Kaczmarek from comment #23)
> > Please have a look at the patch @ Bug 38913.
>
> There is also test data added that provokes this issue on KTD.

Awesome. Thanks for providing that. I'll take a look shortly.
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #24 from Janusz Kaczmarek ---
(In reply to Janusz Kaczmarek from comment #23)
> Please have a look at the patch @ Bug 38913.

There is also test data added that provokes this issue on KTD.
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #23 from Janusz Kaczmarek ---
(In reply to Janusz Kaczmarek from comment #22)
> Wouldn't this be enough and OK?

Please have a look at the patch @ Bug 38913.
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #22 from Janusz Kaczmarek ---
Wouldn't this be enough and OK?

@@ -842,8 +842,9 @@ sub marc_records_to_documents {
                 my $usmarc_record = $record->as_usmarc();

                 #NOTE: Try to round-trip the record to prove it will work for retrieval after searching
-                my $decoded_usmarc_record = MARC::Record->new_from_usmarc($usmarc_record);
-                if ( $decoded_usmarc_record->warnings() ) {
+                my $decoded_usmarc_record;
+                eval { $decoded_usmarc_record = MARC::Record->new_from_usmarc($usmarc_record); } ;
+                if ( $@ || $decoded_usmarc_record->warnings() ) {

                     #NOTE: We override the warnings since they're many and misleading

It seems to work for me...
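The control flow proposed in the diff above (treat a fatal decode error the same as a decode warning, and fail over in both cases) can be sketched outside Koha. This is a Python illustration of the pattern only, not the Perl patch itself: `choose_marc_format` and `toy_decode` are hypothetical names, the toy decoder stands in for `MARC::Record->new_from_usmarc`, and the returned format labels are placeholders.

```python
from typing import Callable, List


def choose_marc_format(payload: bytes, decode: Callable[[bytes], List[str]]) -> str:
    """Return "USMARC" only if the payload decodes without errors or warnings."""
    try:
        warnings = decode(payload)
    except Exception:
        # Plays the role of checking $@ after eval { ... }: a fatal decode
        # error triggers the failover instead of aborting the caller.
        return "MARCXML"
    return "MARCXML" if warnings else "USMARC"


def toy_decode(payload: bytes) -> List[str]:
    """Hypothetical decoder: strict UTF-8 decode, plus a warning for long input."""
    payload.decode("utf-8")  # raises UnicodeDecodeError on invalid bytes
    if len(payload) > 99_999:
        return ["record length exceeds 99999 bytes"]
    return []


if __name__ == "__main__":
    print(choose_marc_format(b"plain ascii", toy_decode))
    print(choose_marc_format(b"\xc3", toy_decode))
    print(choose_marc_format(b"a" * 100_000, toy_decode))
```

The point of the guard is that one undecodable record changes only its own storage format rather than stopping the surrounding indexing run.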
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Janusz Kaczmarek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |janus...@gmail.com

--- Comment #21 from Janusz Kaczmarek ---
There is a serious issue introduced by this patch (cf. Bug 38913).

It is not rare when you have UTF-8 encoded records rich in non-basic-Latin characters. The ISO 2709 string produced by as_usmarc will then often end not between Unicode characters (as normally happens with English-only text) but in the middle of a composed character. In that case you will always get this error (UTF-8 "\x85" does not map to Unicode at /usr/share/perl5/MARC/File/Encode.pm).

There should be a way to save the initial idea behind the patch without making reindexing virtually impossible...
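The failure mode described above, a byte string that ends inside a multi-byte UTF-8 character, is easy to reproduce in isolation. The following Python sketch is only an illustration of that encoding behaviour and makes no claim about where as_usmarc cuts the data. The helper names `decodes_cleanly` and `cut_on_char_boundary` are assumptions, and the second helper assumes its input is valid UTF-8.

```python
def decodes_cleanly(data: bytes) -> bool:
    """True if the bytes form valid UTF-8 under strict decoding."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def cut_on_char_boundary(data: bytes, limit: int) -> bytes:
    """Truncate valid UTF-8 to at most `limit` bytes without splitting a character."""
    if limit >= len(data):
        return data
    cut = max(limit, 0)
    # Continuation bytes have the bit pattern 10xxxxxx; step back until the
    # byte at the cut position starts a new character.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut]


if __name__ == "__main__":
    sample = "Ääkkönen Кириллица".encode("utf-8")
    print(decodes_cleanly(sample))                           # complete string
    print(decodes_cleanly(sample[:1]))                       # cut inside "Ä"
    print(decodes_cleanly(cut_on_char_boundary(sample, 1)))  # cut moved to a boundary
```

Text made up only of ASCII letters never hits this, because every character is a single byte, which matches the observation that English-only records are unaffected.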
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Janusz Kaczmarek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |38913

Referenced Bugs:

https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38913
[Bug 38913] Elasticsearch indexing explodes with oversized records
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #20 from David Cook ---
(In reply to Andrii Nugged from comment #14)
> - so, on rebuild_elasticsearch.pl it dies with such message:
>
> UTF-8 "\xC3" does not map to Unicode at /usr/share/perl5/MARC/File/Encode.pm
> line 35.
>
> - it happens in the line:
>
> $decoded_usmarc_record = MARC::Record->new_from_usmarc($usmarc_record);

> I am researching why, but I am still in the process.

After reviewing the main branch and v24.11.00, this seems very unlikely.

If you had bad UTF8 data, the MARC::Record object would fail to get created from the MARCXML within an eval{}.

To fail at '$decoded_usmarc_record = MARC::Record->new_from_usmarc($usmarc_record);' with a UTF8 encoding error... it just doesn't make sense.

I wrote a little script to inject a "\xC3" byte into the UTF-8 record to try to update a record using Koha APIs with mixed encodings, but something along the way converted it into the EFBFBD UTF-8 replacement character...

I was more brutal and I tried injecting C3 bytes into the text, but either DBI or MySQL itself seems to automatically try to do damage control and turns a C3 byte into a C383 UTF-8 byte.

There might be some sort of configuration of bad bytes out there that can trigger this error you're having, but I can't find it.
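The two byte transformations mentioned above can be reproduced with standard codecs. This Python sketch shows generic encoding behaviour only and does not claim to identify which layer (Koha, DBI or MySQL) performed the conversion in the experiment. The function names are assumptions, and reading the C3 to C383 change as a Latin-1 reinterpretation is a guess that happens to yield the same bytes.

```python
def replace_invalid(raw: bytes) -> bytes:
    """Decode as UTF-8, substituting U+FFFD for invalid bytes, then re-encode."""
    return raw.decode("utf-8", errors="replace").encode("utf-8")


def reinterpret_latin1(raw: bytes) -> bytes:
    """Treat each byte as a Latin-1 character, then encode the text as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")


if __name__ == "__main__":
    lone_lead_byte = b"\xc3"
    print(replace_invalid(lone_lead_byte).hex().upper())     # EFBFBD
    print(reinterpret_latin1(lone_lead_byte).hex().upper())  # C383
```

Either conversion leaves valid UTF-8 behind, which would explain why injecting a stray C3 byte this way did not reproduce the decode error.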
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #19 from David Cook ---
(In reply to David Cook from comment #18)
> (In reply to Fridolin Somers from comment #17)
> > I see there may be side effects, not backported to 23.11.x
>
> So far I'm not able to reproduce but in theory there might be.

Yeah no... koha-testing-docker comes with 1 bad record and I've added a 2nd bad record, and 434/436 records are indexed when using 'perl misc/search_tools/rebuild_elasticsearch.pl -d -v -b -c 10'

Koha::BiblioUtils runs the Koha::Biblio->metadata_record() function within an eval, so if you can't get a MARC::Record from the XML, then the exception for that record is caught.

So we're talking about a record that is valid MARCXML and a valid MARC::Record but an invalid USMARC that is bad enough to trigger a fatal error during decoding (which is interesting since creating a MARC::Record from bad USMARC typically works even when it shouldn't).

I've got 1 more idea to try...
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #18 from David Cook ---
(In reply to Fridolin Somers from comment #17)
> I see there may be side effects, not backported to 23.11.x

So far I'm not able to reproduce but in theory there might be.
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Fridolin Somers changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fridolin.som...@biblibre.com

--- Comment #17 from Fridolin Somers ---
I see there may be side effects, not backported to 23.11.x
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #16 from David Cook ---
(In reply to David Cook from comment #15)
> I'll create a new bug report and add a patch with an eval or try/catch so
> that a bad record doesn't cause a larger crash, but I am curious about the
> underlying cause too.

Actually, before doing that, I think it would be good to be able to reproduce it in koha-testing-docker...

I've tried adding in bad data and MySQL is actually fighting me.

Andrii, perhaps you can open a new bug report, link it to this one, and provide some data and steps for reproducing your problem?
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #15 from David Cook ---
(In reply to Andrii Nugged from comment #14)
> After this patch, it DIES inside this sub for something like 1/5 of my
> records,

Thanks for reporting this. Looking again at my code, I can see how that could be a risk.

> UTF-8 "\xC3" does not map to Unicode at /usr/share/perl5/MARC/File/Encode.pm
> line 35.
>
> - it happens in the line:
>
> $decoded_usmarc_record = MARC::Record->new_from_usmarc($usmarc_record);
>
>
> Note 1: I used the 24.11.xx branches, just with Elasticsearch.pm code
> reverted (removed), and it works on old code properly: it does reindex and
> has all records. But with this patch, I have ~2500 records lost from the
> index.

Are you able to see these records in your Koha search results? If they're failing in the indexing code, surely they should be failing in the search code too?

> Note 2: we have a lot of non-ASCII symbols in Finnish language texts and
> Cyrillic texts.

I haven't seen any problems in my non-English libraries, but perhaps they haven't triggered much indexing recently.

Looking at the above, it seems like you might have some data problems? What do you have for position 09 in the leader? I do have a vague memory that there might be some Koha code somewhere for forcing UTF-8 on records even when the MARC records themselves aren't marked as UTF-8...

> I am researching why, but I am still in the process.

I'll create a new bug report and add a patch with an eval or try/catch so that a bad record doesn't cause a larger crash, but I am curious about the underlying cause too.
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Andrii Nugged changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nug...@gmail.com

--- Comment #14 from Andrii Nugged ---
After this patch, it DIES inside this sub for something like 1/5 of my records, i.e., it's not recording "warnings" into an array, but crashes, and because it dies, it crashes the WHOLE block of IDs being processed. So if I am processing this in rebuild_elasticsearch.pl with -c 1000 for block submission, it loses the whole block from the index. (For example, I have 198136 biblios in DB, 198136 records in index with old code and this patch removed, and 195464 records in index with this above new patch present.)

- so, on rebuild_elasticsearch.pl it dies with such message:

UTF-8 "\xC3" does not map to Unicode at /usr/share/perl5/MARC/File/Encode.pm line 35.

- it happens in the line:

$decoded_usmarc_record = MARC::Record->new_from_usmarc($usmarc_record);

Note 1: I used the 24.11.xx branches, just with Elasticsearch.pm code reverted (removed), and it works on old code properly: it does reindex and has all records. But with this patch, I have ~2500 records lost from the index.

Note 2: we have a lot of non-ASCII symbols in Finnish language texts and Cyrillic texts.

I am researching why, but I am still in the process.
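The report above describes one failing record taking its whole chunk of IDs out of the index. A small Python sketch can show why that multiplies the loss. It is a simplified model and not the logic of rebuild_elasticsearch.pl: the function names, the `convert` callback and the use of `ValueError` for a bad record are all assumptions made for illustration.

```python
from typing import Callable, List


def index_chunk_at_once(ids: List[int], chunk_size: int,
                        convert: Callable[[int], str]) -> List[str]:
    """Convert each chunk as a unit; any failure discards the entire chunk."""
    indexed: List[str] = []
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        try:
            docs = [convert(record_id) for record_id in chunk]
        except ValueError:
            continue  # every record in this chunk is lost
        indexed.extend(docs)
    return indexed


def index_record_by_record(ids: List[int], chunk_size: int,
                           convert: Callable[[int], str]) -> List[str]:
    """Guard each conversion separately; only the failing record is skipped."""
    indexed: List[str] = []
    for start in range(0, len(ids), chunk_size):
        for record_id in ids[start:start + chunk_size]:
            try:
                indexed.append(convert(record_id))
            except ValueError:
                continue  # skip just this record
    return indexed


def convert_with_one_bad_record(record_id: int) -> str:
    """Hypothetical converter in which record 2 cannot be round-tripped."""
    if record_id == 2:
        raise ValueError("cannot round-trip record")
    return f"doc-{record_id}"


if __name__ == "__main__":
    ids = list(range(10))
    print(len(index_chunk_at_once(ids, 5, convert_with_one_bad_record)))
    print(len(index_record_by_record(ids, 5, convert_with_one_bad_record)))
```

For scale, the two index counts given in the report differ by 198136 - 195464 = 2672 records.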
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Lucas Gass (lukeg) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Pushed to stable            |Pushed to oldstable
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Lucas Gass (lukeg) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Pushed to main              |Pushed to stable
                 CC|                            |lu...@bywatersolutions.com
         Version(s)|24.11.00                    |24.11.00,24.05.06
        released in|                            |

--- Comment #13 from Lucas Gass (lukeg) ---
Backported to 24.05.x for upcoming 24.05.06
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #12 from Katrin Fischer ---
Pushed for 24.11!

Well done everyone, thank you!
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Katrin Fischer changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Passed QA                   |Pushed to main
         Version(s)|                            |24.11.00
        released in|                            |
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Marcel de Rooy changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #174460|0                           |1
        is obsolete|                            |

--- Comment #10 from Marcel de Rooy ---
Created attachment 174543
  --> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174543&action=edit
Bug 38416: Add unit tests

Signed-off-by: Martin Renvoize
Signed-off-by: Marcel de Rooy
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Marcel de Rooy changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Signed Off                  |Passed QA
   Patch complexity|---                         |Small patch
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Marcel de Rooy changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #174459|0                           |1
        is obsolete|                            |

--- Comment #9 from Marcel de Rooy ---
Created attachment 174542
  --> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174542&action=edit
Bug 38416: Failover to MARCXML if cannot roundtrip USMARC during indexing

This change fails over to MARCXML from USMARC if there are any warnings generated by MARC::File::USMARC::decode when trying to roundtrip the record.

Test plan:
0. Apply the patch
1. Set up your koha-testing-docker to use Elasticsearch
2. Create a new record with 15,000 characters in the 500$a field
3. Index that record (e.g. perl misc/search_tools/rebuild_elasticsearch.pl --biblios -v -v)
4. Note that a warning saying the following appears: "Warnings encountered while roundtripping a MARC record to/from USMARC. Failing over to MARCXML"
5. View the "Elasticsearch record" on the detail page and note that the marc_format is MARCXML
6. Perform a search for the record (the keyword should be something that brings up other results too)
7. Note that the record appears correctly in the search results

Signed-off-by: Martin Renvoize
Signed-off-by: Marcel de Rooy
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Marcel de Rooy changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |m.de.r...@rijksmuseum.nl
         QA Contact|testo...@bugs.koha-community.org|m.de.r...@rijksmuseum.nl
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Marcel de Rooy changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #174461|0                           |1
        is obsolete|                            |

--- Comment #11 from Marcel de Rooy ---
Created attachment 174544
  --> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174544&action=edit
Bug 38416: Tidy

Signed-off-by: Martin Renvoize
Signed-off-by: Marcel de Rooy
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Martin Renvoize (ashimema) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Needs Signoff               |Signed Off
                 CC|                            |martin.renvoize@ptfs-europe.com
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Martin Renvoize (ashimema) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #174389|0                           |1
        is obsolete|                            |

--- Comment #8 from Martin Renvoize (ashimema) ---
Created attachment 174461
  --> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174461&action=edit
Bug 38416: Tidy

Signed-off-by: Martin Renvoize
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Martin Renvoize (ashimema) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #174319|0                           |1
        is obsolete|                            |

--- Comment #7 from Martin Renvoize (ashimema) ---
Created attachment 174460
  --> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174460&action=edit
Bug 38416: Add unit tests

Signed-off-by: Martin Renvoize
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Martin Renvoize (ashimema) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #174318|0                           |1
        is obsolete|                            |

--- Comment #6 from Martin Renvoize (ashimema) ---
Created attachment 174459
  --> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174459&action=edit
Bug 38416: Failover to MARCXML if cannot roundtrip USMARC during indexing

This change fails over to MARCXML from USMARC if there are any warnings generated by MARC::File::USMARC::decode when trying to roundtrip the record.

Test plan:
0. Apply the patch
1. Set up your koha-testing-docker to use Elasticsearch
2. Create a new record with 15,000 characters in the 500$a field
3. Index that record (e.g. perl misc/search_tools/rebuild_elasticsearch.pl --biblios -v -v)
4. Note that a warning saying the following appears: "Warnings encountered while roundtripping a MARC record to/from USMARC. Failing over to MARCXML"
5. View the "Elasticsearch record" on the detail page and note that the marc_format is MARCXML
6. Perform a search for the record (the keyword should be something that brings up other results too)
7. Note that the record appears correctly in the search results

Signed-off-by: Martin Renvoize
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

David Cook changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jonathan.dru...@gmail.com, n...@bywatersolutions.com, tomasco...@gmail.com
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #5 from David Cook ---
Created attachment 174389
  --> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174389&action=edit
Bug 38416: Tidy
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #4 from David Cook ---
Note: When rebuilding the koha-testing-docker Elasticsearch indexes, I noticed zero impact on indexing time by adding the roundtripping step.
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #3 from David Cook ---
If we did push bug 38270, it would be tempting to failover to MARCXML_COMPRESSED actually...
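For readers unfamiliar with the compressed option mentioned here: elsewhere in this thread it is described as zlib-compressed MARCXML that is then base64-encoded. A minimal Python sketch of that idea follows. The function names are hypothetical, and this is not the code from bug 38270, whose exact encoding details may differ.

```python
# Illustrative sketch only; function names are hypothetical.
import base64
import zlib


def compress_marcxml(marcxml: str) -> str:
    """zlib-compress a MARCXML string, then base64-encode it (no newlines)."""
    compressed = zlib.compress(marcxml.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


def decompress_marcxml(payload: str) -> str:
    """Reverse compress_marcxml: base64-decode, then zlib-decompress."""
    return zlib.decompress(base64.b64decode(payload)).decode("utf-8")
```

Because MARCXML repeats the same element and attribute names for every field, it tends to compress well, which is consistent with the size comparison quoted elsewhere in this thread.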
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #2 from David Cook ---
Created attachment 174319
  --> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174319&action=edit
Bug 38416: Add unit tests
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #1 from David Cook ---
Created attachment 174318
  --> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174318&action=edit
Bug 38416: Failover to MARCXML if cannot roundtrip USMARC during indexing

This change fails over from USMARC to MARCXML if MARC::File::USMARC::decode generates any warnings when trying to roundtrip the record.

Test plan:
0. Apply the patch
1. Set up your koha-testing-docker to use Elasticsearch
2. Create a new record with 15,000 characters in the 500$a field
3. Index that record (e.g. perl misc/search_tools/rebuild_elasticsearch.pl --biblios -v -v)
4. Note that the following warning appears: "Warnings encountered while roundtripping a MARC record to/from USMARC. Failing over to MARCXML"
5. View the "Elasticsearch record" on the detail page and note that the marc_format is MARCXML
6. Perform a search for the record (the keyword should be something that brings up other results too)
7. Note that the record appears correctly in the search results
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

David Cook changed:

What   | Removed | Added
Status | NEW     | Needs Signoff
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

David Cook changed:

What     | Removed                            | Added
Assignee | koha-b...@lists.koha-community.org | dc...@prosentient.com.au
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

David Cook changed:

What   | Removed | Added
Blocks |         | 27365

Referenced Bugs:

https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=27365
[Bug 27365] Koha doesn't check marcxml field size is < 1 and fails in various places
[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

David Cook changed:

What     | Removed | Added
See Also |         | https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38270