[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2025-02-06 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

David Cook  changed:

   What|Removed |Added

   See Also||https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=25600

-- 
You are receiving this mail because:
You are watching all bug changes.
___
Koha-bugs mailing list
Koha-bugs@lists.koha-community.org
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/


[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2025-02-04 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #30 from David Cook  ---
(In reply to David Cook from comment #29)
> (In reply to Tomás Cohen Arazi (tcohen) from comment #27)
> > Was it too bad to just do MARCXML? This ended up a bit hacky for me.
> 
> I agree. Check out bug 38270. 
> 
> I've added MARCXML and MARCXML_COMPRESSED options to ElasticsearchMARCFormat
> there.
> 
> From memory MARCXML_COMPRESSED actually ends up being even more compact than
> USMARC and still has good performance.

As I note in Comment 1 on bug 38270:

base64 marcxml: 4.9K
base64 isomarc: 1.3K
base64 zlib compressed marcxml: 1.1K
(or 1012 bytes if you don't use newlines in the base64 encoding)

5KB x 1,000,000 records = 4.7GB
1KB x 1,000,000 records = 0.95 GB
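
The comparison above can be reproduced in miniature with core Perl modules. This is a rough sketch only: `$marcxml` is a made-up stand-in rather than a real Koha record, `index_size_gib` is a hypothetical helper, and it assumes the compressed format is zlib-compressed MARCXML wrapped in base64, as the labels above suggest. The actual code on bug 38270 may differ.

```perl
use strict;
use warnings;
use Compress::Zlib qw(compress uncompress);
use MIME::Base64 qw(encode_base64 decode_base64);

# Hypothetical stand-in for a MARCXML record: repetitive markup compresses well.
my $marcxml = '<record>'
    . join( '', map { qq{<datafield tag="500" ind1=" " ind2=" "><subfield code="a">Note $_</subfield></datafield>} } 1 .. 20 )
    . '</record>';

my $plain_b64      = encode_base64( $marcxml, '' );              # '' = no line breaks
my $compressed_b64 = encode_base64( compress($marcxml), '' );    # zlib, then base64
my $wrapped_b64    = encode_base64( compress($marcxml) );        # default: newline-wrapped

printf "base64 marcxml:                  %d bytes\n", length $plain_b64;
printf "base64 zlib marcxml (wrapped):   %d bytes\n", length $wrapped_b64;
printf "base64 zlib marcxml (unwrapped): %d bytes\n", length $compressed_b64;

# Whatever is stored must decode back to the original XML.
my $roundtrip = uncompress( decode_base64($compressed_b64) );
print "round-trip ok\n" if $roundtrip eq $marcxml;

# Index-size estimate: KB per record times record count, expressed in GiB.
sub index_size_gib {
    my ( $kb_per_record, $records ) = @_;
    return $kb_per_record * $records / ( 1024 * 1024 );
}

printf "5KB x 1,000,000 = %.2f GiB\n", index_size_gib( 5, 1_000_000 );
printf "1KB x 1,000,000 = %.2f GiB\n", index_size_gib( 1, 1_000_000 );
```

The ratio depends heavily on the record, so the sample's numbers say nothing about real catalogue data; only the method carries over.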



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2025-02-04 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #29 from David Cook  ---
(In reply to Tomás Cohen Arazi (tcohen) from comment #27)
> Was it too bad to just do MARCXML? This ended up a bit hacky for me.

I agree. Check out bug 38270. 

I've added MARCXML and MARCXML_COMPRESSED options to ElasticsearchMARCFormat
there.

From memory MARCXML_COMPRESSED actually ends up being even more compact than
USMARC and still has good performance.



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2025-02-04 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #28 from Tomás Cohen Arazi (tcohen)  ---
(In reply to Tomás Cohen Arazi (tcohen) from comment #27)
> Was it too bad to just do MARCXML? This ended up a bit hacky for me.

FTR: These are the sizes for the first record in KTD in different formats [1]:

-rw-r--r-- 1 kohadev-koha kohadev-koha 1.8K Feb  4 18:40 record.b64
-rw-r--r-- 1 kohadev-koha kohadev-koha 2.3K Feb  4 18:41 record.json
-rw-r--r-- 1 kohadev-koha kohadev-koha 4.0K Feb  4 18:40 record.xml


[1] Extracted with the following commands respectively:

```shell
perl -MMIME::Base64 -MKoha::Biblios -MEncode -e '
    my $b = Koha::Biblios->new->next();
    print encode_base64( encode( "UTF-8", $b->metadata_record->as_usmarc ) );
' > record.b64
perl -MMARC::Record::MiJ -MKoha::Biblios -e '
    my $b = Koha::Biblios->new->next();
    print $b->metadata_record->to_mij;
' > record.json
perl -MKoha::Biblios -e '
    my $b = Koha::Biblios->new->next();
    print $b->metadata_record->as_xml;
' > record.xml
```



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2025-02-04 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #27 from Tomás Cohen Arazi (tcohen)  ---
Was it too bad to just do MARCXML? This ended up a bit hacky for me.



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2025-01-19 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #26 from David Cook  ---
(In reply to Janusz Kaczmarek from comment #23)
> Please have a look at the patch @ Bug 38913.

Thanks again, Janusz. I've passed QA on your patch and added an updated unit test.

It was hard to reproduce, but once reproduced it's so obvious.

Thanks too, Andrii, for first raising the issue.



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2025-01-19 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #25 from David Cook  ---
(In reply to Janusz Kaczmarek from comment #24)
> (In reply to Janusz Kaczmarek from comment #23)
> > Please have a look at the patch @ Bug 38913.
> 
> There is also test data added that provokes this issue on KTD.

Awesome. Thanks for providing that. I'll take a look shortly.



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2025-01-16 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #24 from Janusz Kaczmarek  ---
(In reply to Janusz Kaczmarek from comment #23)
> Please have a look at the patch @ Bug 38913.

There is also test data added that provokes this issue on KTD.



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2025-01-16 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #23 from Janusz Kaczmarek  ---
(In reply to Janusz Kaczmarek from comment #22)
> Wouldn't this be enough and OK?

Please have a look at the patch @ Bug 38913.



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2025-01-16 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #22 from Janusz Kaczmarek  ---
Wouldn't this be enough and OK?

@@ -842,8 +842,9 @@ sub marc_records_to_documents {
 my $usmarc_record = $record->as_usmarc();

 #NOTE: Try to round-trip the record to prove it will work for retrieval after searching
-my $decoded_usmarc_record = MARC::Record->new_from_usmarc($usmarc_record);
-if ( $decoded_usmarc_record->warnings() ) {
+my $decoded_usmarc_record;
+eval { $decoded_usmarc_record = MARC::Record->new_from_usmarc($usmarc_record); };
+if ( $@ || $decoded_usmarc_record->warnings() ) {

 #NOTE: We override the warnings since they're many and misleading


It seems to work for me...
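
The eval guard in the diff above can be tried in isolation. This is a minimal sketch, not the Koha code: `strict_decode` is a hypothetical stand-in for `MARC::Record->new_from_usmarc` (a strict UTF-8 decode from the core Encode module that dies on malformed bytes), and `choose_format` and its return values are illustrative names only.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for MARC::Record->new_from_usmarc(): a strict
# UTF-8 decode that dies on malformed input.
sub strict_decode {
    my ($bytes) = @_;
    return decode( 'UTF-8', $bytes, Encode::FB_CROAK() );
}

# Choose a storage format for one record. The eval turns a fatal decode
# error into a per-record fallback instead of killing the whole batch.
sub choose_format {
    my ($bytes) = @_;
    my $decoded = eval { strict_decode($bytes) };
    return defined $decoded ? 'USMARC' : 'MARCXML';
}

my @batch = (
    encode( 'UTF-8', "Hyv\x{E4}\x{E4} p\x{E4}iv\x{E4}\x{E4}" ),         # valid Finnish text
    "Hyv\xC3",                                                          # cut in mid-character
    encode( 'UTF-8', "\x{41F}\x{440}\x{438}\x{432}\x{435}\x{442}" ),    # valid Cyrillic text
);

my @formats = map { choose_format($_) } @batch;
print join( ', ', @formats ), "\n";
```

Without the eval, the second entry would end the run and the third would never be reached.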



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2025-01-16 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Janusz Kaczmarek  changed:

   What|Removed |Added

 CC||janus...@gmail.com

--- Comment #21 from Janusz Kaczmarek  ---
There is a serious issue introduced by this patch (cf. Bug 38913).  It is not
rare when you have UTF-8 encoded records rich in characters outside basic
Latin. In such records the ISO 2709 string produced by as_usmarc will often end
not on a character boundary (as it normally does with English-only text) but in
the middle of a multi-byte character.  When that happens you will always get
this error (UTF-8 "\x85" does not map to Unicode at
/usr/share/perl5/MARC/File/Encode.pm).

There should be a way to save the initial idea behind the patch without making
reindexing virtually impossible...
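
The failure mode described here can be shown with the core Encode module alone. This is a sketch under a stated assumption: the cut at 9,999 bytes is picked only because that is the largest value a 4-digit length field can hold. It is not a claim about where MARC::File::USMARC actually cuts an oversized record.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# 6,000 copies of a 2-byte character: 12,000 bytes of valid UTF-8.
my $bytes = encode( 'UTF-8', "\x{E4}" x 6000 );

# Assumed cut point. 9,999 is odd, so the cut falls between the two bytes
# of one character and leaves a lead byte with no continuation byte.
my $truncated = substr( $bytes, 0, 9999 );

my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();

my $full_ok = eval { decode( 'UTF-8', $bytes, $check ); 1 } ? 1 : 0;
my $error   = eval { decode( 'UTF-8', $truncated, $check ); 1 } ? '' : $@;

print "full string decodes: ", ( $full_ok ? 'yes' : 'no' ), "\n";
print "truncated string:    ", ( $error || 'decoded without error' ), "\n";
```

Pure ASCII text never hits this, since every byte is a character boundary, which fits the observation that English-only records are unaffected.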



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2025-01-16 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Janusz Kaczmarek  changed:

   What|Removed |Added

 Blocks||38913


Referenced Bugs:

https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38913
[Bug 38913] Elasticsearch indexing explodes with oversized records


[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-12-18 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #20 from David Cook  ---
(In reply to Andrii Nugged from comment #14)
> - so, on rebuild_elasticsearch.pl it dies with such message:
> 
> UTF-8 "\xC3" does not map to Unicode at /usr/share/perl5/MARC/File/Encode.pm
> line 35.
> 
> - it happens in the line:
> 
> $decoded_usmarc_record = MARC::Record->new_from_usmarc($usmarc_record);

> I am researching why, but I am still in the process.

After reviewing the main branch and v24.11.00, this seems very unlikely.

If you had bad UTF8 data, the MARC::Record object would fail to get created
from the MARCXML within an eval{}. 

To fail at '$decoded_usmarc_record =
MARC::Record->new_from_usmarc($usmarc_record);' with a UTF8 encoding error...
it just doesn't make sense.

I wrote a little script to inject a "\xC3" byte into the UTF-8 record to try to
update a record using Koha APIs with mixed encodings, but something along the
way converted it into the UTF-8 replacement character (EF BF BD)...

I got more brutal and tried injecting C3 bytes into the text, but either DBI
or MySQL itself seems to automatically attempt damage control and turns a C3
byte into the two-byte UTF-8 sequence C3 83.

There might be some sort of configuration of bad bytes out there that can
trigger this error you're having, but I can't find it.
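
Both conversions mentioned above can be reproduced with the core Encode module. This sketch shows only the byte-level effect. Which layer (Koha, DBI or MySQL) performed each conversion in the experiments is not established here.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $lone_byte = "\xC3";    # a UTF-8 lead byte with no continuation byte

# A lenient decode (no CHECK argument) substitutes U+FFFD for malformed
# input; re-encoding gives the bytes of the replacement character.
my $replaced = unpack 'H*', encode( 'UTF-8', decode( 'UTF-8', $lone_byte ) );

# Reading the byte as Latin-1 gives U+00C3, which takes two bytes in UTF-8.
my $reencoded = unpack 'H*', encode( 'UTF-8', decode( 'ISO-8859-1', $lone_byte ) );

print "lenient UTF-8 decode, re-encoded: $replaced\n";
print "Latin-1 decode, re-encoded:       $reencoded\n";
```

In both cases the result is valid UTF-8, so a byte injected this way will not provoke the strict-decoding error.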



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-12-18 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #19 from David Cook  ---
(In reply to David Cook from comment #18)
> (In reply to Fridolin Somers from comment #17)
> > I see there may be side effects, not backported to 23.11.x
> 
> So far I'm not able to reproduce but in theory there might be.

Yeah no... koha-testing-docker comes with 1 bad record and I've added a 2nd bad
record, and 434/436 records are indexed when using 'perl
misc/search_tools/rebuild_elasticsearch.pl -d -v -b -c 10'

Koha::BiblioUtils runs the Koha::Biblio->metadata_record() function within an
eval, so if you can't get a MARC::Record from the XML, then the exception for
that record is caught.

So we're talking about a record that is valid MARCXML and a valid MARC::Record
but an invalid USMARC that is bad enough to trigger a fatal error during
decoding (which is interesting since creating a MARC::Record from bad USMARC
typically works even when it shouldn't).

I've got 1 more idea to try...



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-12-18 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #18 from David Cook  ---
(In reply to Fridolin Somers from comment #17)
> I see there may be side effects, not backported to 23.11.x

So far I'm not able to reproduce but in theory there might be.



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-12-18 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Fridolin Somers  changed:

   What|Removed |Added

 CC||fridolin.som...@biblibre.com

--- Comment #17 from Fridolin Somers  ---
I see there may be side effects, not backported to 23.11.x



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-12-15 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #16 from David Cook  ---
(In reply to David Cook from comment #15)
> I'll create a new bug report and add a patch with an eval or try/catch so
> that a bad record doesn't cause a larger crash, but I am curious about the
> underlying cause too.

Actually, before doing that, I think it would be good to be able to reproduce
it in koha-testing-docker...

I've tried adding in bad data and MySQL is actually fighting me. 

Andrii, perhaps you can open a new bug report, link it to this one, and provide
some data and steps for reproducing your problem?



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-12-15 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #15 from David Cook  ---
(In reply to Andrii Nugged from comment #14)
> After this patch, it DIES inside this sub for something like 1/5 of my
> records,

Thanks for reporting this. Looking again at my code, I can see how that could
be a risk.

> UTF-8 "\xC3" does not map to Unicode at /usr/share/perl5/MARC/File/Encode.pm
> line 35.
> 
> - it happens in the line:
> 
> $decoded_usmarc_record = MARC::Record->new_from_usmarc($usmarc_record);
> 
> 
> Note 1: I used the 24.11.xx branches, just with Elasticsearch.pm code
> reverted (removed), and it works on old code properly: it does reindex and
> has all records. But with this patch, I have ~2500 records lost from the
> index.

Are you able to see these records in your Koha search results? If they're
failing in the indexing code, surely they should be failing in the search code
too?

> Note 2: we have a lot of non-ASCII symbols in Finnish language texts and
> Cyrillic texts.

I haven't seen any problems in my non-English libraries, but perhaps they
haven't triggered much indexing recently. Looking at the above, it seems like
you might have some data problems? What do you have for position 09 in the
leader? I do have a vague memory that there might be some Koha code somewhere
for forcing UTF-8 on records even when the MARC records themselves aren't
marked as UTF-8...

> I am researching why, but I am still in the process.

I'll create a new bug report and add a patch with an eval or try/catch so that
a bad record doesn't cause a larger crash, but I am curious about the
underlying cause too.



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-12-15 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Andrii Nugged  changed:

   What|Removed |Added

 CC||nug...@gmail.com

--- Comment #14 from Andrii Nugged  ---
After this patch, it DIES inside this sub for something like 1/5 of my records.

That is, instead of recording "warnings" into an array, it crashes, and because
it dies, it takes down the processing of the WHOLE block of IDs. So if I run
rebuild_elasticsearch.pl with -c 1000 for block submission, every record in the
affected block is lost from the index. For example, I have 198136 biblios in
the DB: 198136 records in the index with the old code (this patch removed), and
195464 records in the index with this new patch present.

- so, on rebuild_elasticsearch.pl it dies with such message:

UTF-8 "\xC3" does not map to Unicode at /usr/share/perl5/MARC/File/Encode.pm
line 35.

- it happens in the line:

$decoded_usmarc_record = MARC::Record->new_from_usmarc($usmarc_record);


Note 1: I used the 24.11.xx branches, just with Elasticsearch.pm code reverted
(removed), and it works on old code properly: it does reindex and has all
records. But with this patch, I have ~2500 records lost from the index.

Note 2: we have a lot of non-ASCII symbols in Finnish language texts and
Cyrillic texts.


I am researching why, but I am still in the process.



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-12-05 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Lucas Gass (lukeg)  changed:

   What|Removed |Added

 Status|Pushed to stable|Pushed to oldstable



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-12-05 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Lucas Gass (lukeg)  changed:

   What|Removed |Added

 Status|Pushed to main  |Pushed to stable
 CC||lu...@bywatersolutions.com
 Version(s) released in|24.11.00|24.11.00,24.05.06

--- Comment #13 from Lucas Gass (lukeg)  ---
Backported to 24.05.x for upcoming 24.05.06



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-18 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #12 from Katrin Fischer  ---
Pushed for 24.11!

Well done everyone, thank you!



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-18 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Katrin Fischer  changed:

   What|Removed |Added

 Status|Passed QA   |Pushed to main
 Version(s) released in||24.11.00



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-15 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Marcel de Rooy  changed:

   What|Removed |Added

 Attachment #174460 is obsolete|0|1

--- Comment #10 from Marcel de Rooy  ---
Created attachment 174543
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174543&action=edit
Bug 38416: Add unit tests

Signed-off-by: Martin Renvoize 

Signed-off-by: Marcel de Rooy 



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-15 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Marcel de Rooy  changed:

   What|Removed |Added

 Status|Signed Off  |Passed QA
   Patch complexity|--- |Small patch



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-15 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Marcel de Rooy  changed:

   What|Removed |Added

 Attachment #174459 is obsolete|0|1

--- Comment #9 from Marcel de Rooy  ---
Created attachment 174542
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174542&action=edit
Bug 38416: Failover to MARCXML if cannot roundtrip USMARC during indexing

This change fails over to MARCXML from USMARC if there are any
warnings generated by MARC::File::USMARC::decode when trying to
roundtrip the record.

Test plan:
0. Apply the patch
1. Set up your koha-testing-docker to use Elasticsearch
2. Create a new record with 15,000 characters in the 500$a field
3. Index that record
(e.g. perl misc/search_tools/rebuild_elasticsearch.pl --biblios -v -v)
4. Note that a warning saying the following appears:
"Warnings encountered while roundtripping a MARC record to/from USMARC.
Failing over to MARCXML"
5. View the "Elasticsearch record" on the detail page and note that the
marc_format is MARCXML
6. Perform a search for the record (the keyword should be something that
brings up other results too)
7. Note that the record appears correctly in the search results

Signed-off-by: Martin Renvoize 

Signed-off-by: Marcel de Rooy 
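
Why a 15,000-character 500$a cannot survive the round trip can be illustrated with the ISO 2709 directory layout, in which each entry is 12 characters: a 3-character tag, a 4-digit field length and a 5-digit starting offset. In the sketch below, `directory_entry` is a hypothetical helper and the field length of 15,004 is a rough figure (15,000 single-byte characters plus indicators, subfield delimiter, subfield code and field terminator). It is not taken from the MARC::File::USMARC source.

```perl
use strict;
use warnings;

# Build one ISO 2709 directory entry: tag (3), field length (4), offset (5).
sub directory_entry {
    my ( $tag, $field_length, $offset ) = @_;
    return sprintf( '%03s%04d%05d', $tag, $field_length, $offset );
}

my $normal    = directory_entry( '245', 120,   0 );
my $oversized = directory_entry( '500', 15004, 120 );

printf "%s (%d chars)\n", $normal,    length $normal;
printf "%s (%d chars)\n", $oversized, length $oversized;
```

A length above 9,999 does not fit its 4-digit slot, so the entry no longer has the fixed width a reader of the directory expects, and the fields after it are likely to be read from the wrong positions.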



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-15 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Marcel de Rooy  changed:

   What|Removed |Added

 CC||m.de.r...@rijksmuseum.nl
 QA Contact|testo...@bugs.koha-community.org|m.de.r...@rijksmuseum.nl



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-15 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Marcel de Rooy  changed:

   What|Removed |Added

 Attachment #174461 is obsolete|0|1

--- Comment #11 from Marcel de Rooy  ---
Created attachment 174544
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174544&action=edit
Bug 38416: Tidy

Signed-off-by: Martin Renvoize 

Signed-off-by: Marcel de Rooy 



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-13 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Martin Renvoize (ashimema)  changed:

   What|Removed |Added

 Status|Needs Signoff   |Signed Off
 CC||martin.renvoize@ptfs-europe.com



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-13 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Martin Renvoize (ashimema)  changed:

   What|Removed |Added

 Attachment #174389 is obsolete|0|1

--- Comment #8 from Martin Renvoize (ashimema)  ---
Created attachment 174461
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174461&action=edit
Bug 38416: Tidy

Signed-off-by: Martin Renvoize 



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-13 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Martin Renvoize (ashimema)  changed:

   What|Removed |Added

 Attachment #174319 is obsolete|0|1

--- Comment #7 from Martin Renvoize (ashimema)  ---
Created attachment 174460
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174460&action=edit
Bug 38416: Add unit tests

Signed-off-by: Martin Renvoize 



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-13 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

Martin Renvoize (ashimema)  changed:

   What|Removed |Added

 Attachment #174318 is obsolete|0|1

--- Comment #6 from Martin Renvoize (ashimema)  ---
Created attachment 174459
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174459&action=edit
Bug 38416: Failover to MARCXML if cannot roundtrip USMARC during indexing

This change fails over to MARCXML from USMARC if there are any
warnings generated by MARC::File::USMARC::decode when trying to
roundtrip the record.

Test plan:
0. Apply the patch
1. Set up your koha-testing-docker to use Elasticsearch
2. Create a new record with 15,000 characters in the 500$a field
3. Index that record
(e.g. perl misc/search_tools/rebuild_elasticsearch.pl --biblios -v -v)
4. Note that a warning saying the following appears:
"Warnings encountered while roundtripping a MARC record to/from USMARC.
Failing over to MARCXML"
5. View the "Elasticsearch record" on the detail page and note that the
marc_format is MARCXML
6. Perform a search for the record (the keyword should be something that
brings up other results too)
7. Note that the record appears correctly in the search results

Signed-off-by: Martin Renvoize 
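
The 15,000-character field in the test plan matters because of the ISO 2709 (USMARC) record layout: each directory entry stores the field length in 4 digits and the leader stores the record length in 5 digits, so a field longer than 9,999 bytes or a record longer than 99,999 bytes cannot be represented. The Python sketch below illustrates that limit. It is not the patch itself: the patch is Perl and detects the problem from warnings raised while roundtripping through MARC::File::USMARC::decode, whereas this sketch checks the size limits directly. The function names and the "USMARC"/"MARCXML" return labels are hypothetical, and only data fields (with indicators) are modelled.

```python
# Sketch only: models the ISO 2709 size limits, not Koha's Perl code.
FIELD_TERMINATOR = b"\x1e"
SUBFIELD_DELIMITER = b"\x1f"
LEADER_LENGTH = 24
DIRECTORY_ENTRY_LENGTH = 12   # 3-char tag + 4-digit length + 5-digit offset
MAX_FIELD_LENGTH = 9999       # largest value a 4-digit field length can hold
MAX_RECORD_LENGTH = 99999     # largest value the 5-digit record length can hold


def encode_data_field(indicators, subfields):
    """Serialize one data field: indicators, subfields, field terminator."""
    body = indicators.encode("utf-8")
    for code, value in subfields:
        body += SUBFIELD_DELIMITER + code.encode("utf-8") + value.encode("utf-8")
    return body + FIELD_TERMINATOR


def fits_in_usmarc(fields):
    """Return True if every field and the whole record fit the ISO 2709 limits.

    `fields` is a list of (tag, indicators, subfields) tuples, where
    subfields is a list of (code, value) pairs.
    """
    encoded = [encode_data_field(ind, subs) for _tag, ind, subs in fields]
    if any(len(body) > MAX_FIELD_LENGTH for body in encoded):
        return False
    record_length = (
        LEADER_LENGTH
        + DIRECTORY_ENTRY_LENGTH * len(encoded)
        + 1  # field terminator that closes the directory
        + sum(len(body) for body in encoded)
        + 1  # record terminator
    )
    return record_length <= MAX_RECORD_LENGTH


def choose_marc_format(fields):
    """Hypothetical helper: fall back to MARCXML when USMARC cannot hold the record."""
    return "USMARC" if fits_in_usmarc(fields) else "MARCXML"


if __name__ == "__main__":
    long_note = [("500", "  ", [("a", "x" * 15000)])]
    print(choose_marc_format(long_note))
```

Checking sizes up front and roundtripping are two ways to reach the same decision; the roundtrip approach used by the patch has the advantage of catching any other decode warning as well.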



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-11 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

David Cook changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jonathan.dru...@gmail.com,
                   |                            |n...@bywatersolutions.com,
                   |                            |tomasco...@gmail.com



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-11 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #5 from David Cook ---
Created attachment 174389
  --> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174389&action=edit
Bug 38416: Tidy



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-10 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #4 from David Cook ---
Note: When rebuilding the koha-testing-docker Elasticsearch indexes, I noticed
zero impact on indexing time by adding the roundtripping step.



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-10 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #3 from David Cook ---
If bug 38270 is pushed, it would be tempting to fail over to
MARCXML_COMPRESSED instead...
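
For readers unfamiliar with the idea, a later comment on this bug describes the compressed option as "base64 zlib compressed marcxml". The Python sketch below shows what that encoding could look like, assuming plain zlib compression followed by base64 encoding. The function names are hypothetical, and the actual implementation on bug 38270 (which is Perl) may differ in its details.

```python
import base64
import zlib


def compress_marcxml(marcxml: str) -> str:
    """Compress MARCXML with zlib, then base64-encode it for storage as text."""
    compressed = zlib.compress(marcxml.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


def decompress_marcxml(encoded: str) -> str:
    """Reverse compress_marcxml: base64-decode, then inflate."""
    return zlib.decompress(base64.b64decode(encoded)).decode("utf-8")


if __name__ == "__main__":
    # Toy record; real MARCXML also carries a namespace, leader and control fields.
    sample = (
        "<record>"
        + '<datafield tag="500" ind1=" " ind2=" ">'
        + '<subfield code="a">A general note</subfield></datafield>'
        + "</record>"
    )
    stored = compress_marcxml(sample)
    assert decompress_marcxml(stored) == sample
    print(len(sample), len(stored))
```

Because MARCXML is highly repetitive markup, it tends to compress well, which is consistent with the size figures reported later in this thread.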



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-10 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #2 from David Cook ---
Created attachment 174319
  --> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174319&action=edit
Bug 38416: Add unit tests



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-10 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

--- Comment #1 from David Cook ---
Created attachment 174318
  --> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=174318&action=edit
Bug 38416: Failover to MARCXML if cannot roundtrip USMARC during indexing

This change fails over from USMARC to MARCXML if
MARC::File::USMARC::decode generates any warnings when trying to
roundtrip the record.

Test plan:
0. Apply the patch
1. Set up your koha-testing-docker to use Elasticsearch
2. Create a new record with 15,000 characters in the 500$a field
3. Index that record
(e.g. perl misc/search_tools/rebuild_elasticsearch.pl --biblios -v -v)
4. Note that a warning saying the following appears:
"Warnings encountered while roundtripping a MARC record to/from USMARC.
Failing over to MARCXML"
5. View the "Elasticsearch record" on the detail page and note that the
marc_format is MARCXML
6. Perform a search for the record (the keyword should be something that
brings up other results too)
7. Note that the record appears correctly in the search results



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-10 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

David Cook changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |Needs Signoff



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-10 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

David Cook changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|koha-b...@lists.koha-commun |dc...@prosentient.com.au
                   |ity.org                     |



[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-10 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

David Cook changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |27365


Referenced Bugs:

https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=27365
[Bug 27365] Koha doesn't check marcxml field size is < 1 and  fails in
various places


[Koha-bugs] [Bug 38416] Failover to MARCXML if cannot roundtrip USMARC when indexing

2024-11-10 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38416

David Cook changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.koha-community
                   |                            |.org/bugzilla3/show_bug.cgi
                   |                            |?id=38270
