[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-14 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #43 from Thomas Klausner  ---
Created attachment 180906
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=180906&action=edit
MARCXML file with 75 authorities from GND

Here's another MARCXML file with 75 auths from GND (2 of which cannot be
properly imported into ktd, which is ok for testing).

-- 
You are receiving this mail because:
You are watching all bug changes.
___
Koha-bugs mailing list
Koha-bugs@lists.koha-community.org
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/


[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-14 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #42 from Thomas Klausner  ---
Created attachment 180905
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=180905&action=edit
Bug 37020: (Followup) Fix txn handling, loop label and record_number counter

So, I reviewed the code, which has a bunch of issues:

* there's a call to `next RECORD`, but no `RECORD` loop label.
* $record_number is incremented twice
* and the call to txn_begin was moved from outside the loop into the loop
body, starting one transaction per record. Not sure how mysql handles this,
but postgres would not like it.

This patch fixes those issues:

After I fixed the first two issues, I ran the patched script on a file
containing 75 authorities (from GND), and not one was added to the DB,
presumably because nested transactions don't work in mysql?

So I also moved the BEGIN transaction outside of the main loop, started a
new one after each commit, and added a final commit after the loop.
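
The loop shape described above can be sketched roughly as follows. This is only an illustration and not the attached patch: `MockSchema`, `import_records` and their arguments are hypothetical names, and the mock only counts calls to `txn_begin` / `txn_commit` so that the sketch runs without a database.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a schema object: it only counts calls,
# so the sketch runs without a database.
package MockSchema {
    sub new        { return bless { begun => 0, committed => 0 }, shift }
    sub txn_begin  { $_[0]{begun}++ }
    sub txn_commit { $_[0]{committed}++ }
}

package main;

# import_records() is an illustrative name, not code from bulkmarcimport.pl.
sub import_records {
    my ( $schema, $records, $commit_size ) = @_;
    my $record_number = 0;
    my $imported      = 0;

    $schema->txn_begin;    # one transaction opened before the loop
  RECORD: for my $record (@$records) {
        $record_number++;                      # incremented exactly once
        next RECORD unless defined $record;    # skip unreadable records

        $imported++;
        if ( $record_number % $commit_size == 0 ) {
            $schema->txn_commit;
            $schema->txn_begin;                # start the next batch
        }
    }
    $schema->txn_commit;    # final commit for the last, partial batch

    return ( $record_number, $imported );
}
```

Note that in this simplified form a skipped record also skips the commit check for that iteration, so a batch can occasionally be larger than `$commit_size`; the final commit after the loop still covers it.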



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-14 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Thomas Klausner  changed:

   What|Removed |Added

 Status|In Discussion   |Needs Signoff

--- Comment #44 from Thomas Klausner  ---
Here's a test plan:

1: Start KTD with ES, eg ktd --es7 start
2: Connect to DB: ktd --dbshell
3: Get the count of current auth: `SELECT count(*) FROM auth_header;`
   I get 1706. Remember that number
4: Apply the patch, copy the file "MARCXML file with 75 authorities from GND" /
75_authorities_gnd.xml into your koha dir
5: enter a koha shell: ktd --shell
6: import the 75_authorities_gnd.xml file:

kohadev-koha@kohadevbox:koha$ misc/migration_tools/bulkmarcimport.pl -m MARCXML \
    -c utf8 -a --insert --file 75_authorities_gnd.xml -l testlog

You will see two DBIx errors, but at the end:

75 MARC records done in 0.464853048324585 seconds

7: Again in the DB-Shell, run the count: SELECT count(*) FROM auth_header;

You should now get 75 more auths (eg 1781 in my ktd)

you can verify that in the DB with: select 1781 - 1706;

8: Take a look at the testlog file, which will have 73 OKs and two errors :
eg:
1713;insert;ok
000152145;insert;ERROR
1715;insert;ok

Notice that 1714 is skipped (because it had an error).

But it still exists in the DB (in a useless form):

select * from auth_header where authid = 1714;

I'm not exactly sure why the two non-importable auths show up in the DB. But as
this is the same behavior as on main (without this patch), I consider it out of
scope for this bug.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-14 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #41 from Thomas Klausner  ---
Yes, I agree: Let's fix this bug, and (later / in another issue) think about
what benefits XML::LibXML::Reader might have.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-11 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #40 from Leo Stoyanov  ---
Sounds like a good plan to me as well: If Dave's patch solves the immediate
issue, then that could be the official patch for this bug; then, another
Bugzilla issue could be made for implementing XML::LibXML::Reader as an
enhancement.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-10 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #39 from David Gustafsson  ---
Sounds good, though I don't understand the need to use XML::LibXML::Reader
instead of MARC::File::XML. MARC::File::XML does perhaps have some odd
behaviour when it comes to encoding, but there already is a workaround for
this, and it's just much nicer and more readable to have the MARC::Batch
interface for all formats, with no special handling for MARC XML. There is also
a slight risk of introducing new bugs by rewriting code that has already been
tried and tested; to me it's better to leave it as it is.

> Anyway, I think what we should do is:
> 
> * review your patch
> * merge it if it works
> * start another bugzilla issue to refactor bulkmarcimport to use
> XML::LibXML::Reader for streaming XML input files



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-09 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #38 from Thomas Klausner  ---
> I have had a look at the XML::Twig patch and now that the records are again 
> streamed and not loaded all at once I don't see why we should add a new 
> dependency and increase code complexity with very little to gain in terms of 
> memory consumption.

I agree. Using XML::Twig makes little sense, esp. as XML::LibXML (and thus
XML::LibXML::Reader) is already available.

> I may be wrong, but memory consumed by XML::LibXML should be released after a 
> record has been read, so there should not really be any significant 
> difference.

The old code (prior to pushing all the records onto an array) had no memory
problem. But I would still like to move away from using MARC::Batch when
reading XML files, because the code there is crazy. But maybe this could be a
second, later patch / bug.

> With regard to the patch by Kyle M there is a bug introduced where the main 
> loop can be terminated without indexing the last batch of records, since the 
> conditions for indexing the last batch will not be met. There needs to be an 
> additional check for if we have run out of records, and after that a check 
> whether to terminate the loop.

Yeah, reading that patch was also quite hard...

Anyway, I think what we should do is:

* review your patch
* merge it if it works
* start another bugzilla issue to refactor bulkmarcimport to use
XML::LibXML::Reader for streaming XML input files



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-08 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #37 from David Gustafsson  ---
(In reply to Thomas Klausner from comment #35)
> @David Gustafsson: I also mentioned all of this in my previous comments. And
> I still think we should move away from MARC::Batch and use XML::LibXML. And
> I also liked the cleanups Leo Stoyanov did in his XML::Twig version.
> 
> I'm on a train with very flaky internet at the moment, so it's a bit hard to
> review your patch (because of all the indentation changes..), but at first
> glance it seems very similar to Kyle M Hall's patch?
> 
> So I don't think we should just go with the minimal fix, but properly
> improve bulkmarcimport.
> 
> What do you think of my analysis in the comments prior to your commits?
> 
> I would have loved a bit of discussion / feedback...

I have had a look at the XML::Twig patch, and now that the records are again
streamed and not loaded all at once, I don't see why we should add a new
dependency and increase code complexity with very little to gain in terms of
memory consumption. I may be wrong, but memory consumed by XML::LibXML should
be released after a record has been read, so there should not really be any
significant difference.

With regard to the patch by Kyle M, there is a bug introduced where the main
loop can be terminated without indexing the last batch of records, since the
conditions for indexing the last batch will not be met. There needs to be an
additional check for whether we have run out of records, and after that a check
whether to terminate the loop.
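
As a rough illustration of that ordering (flush first, then decide whether to stop), here is a minimal sketch. It is not taken from any of the attached patches: `index_in_batches` and its parameters are hypothetical, `$next_record` is assumed to be a code ref returning the next record or undef at end of input, and `$index_batch` stands in for whatever updates the search index.

```perl
use strict;
use warnings;

# index_in_batches() is a hypothetical helper for illustration only.
sub index_in_batches {
    my ( $next_record, $batch_size, $limit, $index_batch ) = @_;
    my @batch;
    my $count = 0;

    while (1) {
        my $record         = $next_record->();
        my $out_of_records = !defined $record;

        unless ($out_of_records) {
            push @batch, $record;
            $count++;
        }
        my $limit_reached = defined $limit && $count >= $limit;

        # Flush when the batch is full, or when this is the last chance.
        if ( @batch
            && ( @batch >= $batch_size || $out_of_records || $limit_reached ) )
        {
            $index_batch->( [@batch] );
            @batch = ();
        }

        # Only after flushing do we decide whether to leave the loop.
        last if $out_of_records || $limit_reached;
    }
    return $count;
}
```

With this ordering, both running out of input and hitting a record limit still index the final, partial batch before the loop ends.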



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-08 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Jonathan Druart  changed:

   What|Removed |Added

 CC||jonathan.dru...@gmail.com
 Status|Needs Signoff   |In Discussion

--- Comment #36 from Jonathan Druart  ---
David, can you reply to Thomas please? See previous comments.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-05 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #24 from Thomas Klausner  ---
In our script, we used XML::LibXML. I can provide an example patch, but before
I spend that time, here's a quick preview of what the usage would look like:

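use XML::LibXML::Reader;    # exports XML_READER_TYPE_ELEMENT by default
use MARC::Record;
use MARC::File::XML;        # provides MARC::Record->new_from_xml

open( my $fh, '<', $input_marc_file )
    or die "Cannot open $input_marc_file: $!";

my $reader = XML::LibXML::Reader->new( IO => $fh );

while ( $reader->read ) {
    # only look at opening <record> elements
    next unless $reader->nodeType == XML_READER_TYPE_ELEMENT;
    next unless $reader->name eq 'record';

    # serialize just this record and parse it into a MARC::Record
    my $xml    = $reader->readOuterXml;
    my $record = MARC::Record->new_from_xml( $xml, 'UTF8', 'MARC21' );
}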


I find that a bit easier to read than the XML::Twig way of installing a
handler subref for 'record'.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-05 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #26 from Thomas Klausner  ---
One potential killer argument against XML::Twig and in favor of XML::LibXML:
Koha currently already comes with XML::LibXML, but not with XML::Twig. So we
would need to introduce a new dependency.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-05 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Thomas Klausner  changed:

   What|Removed |Added

 CC||janus...@gmail.com

--- Comment #20 from Thomas Klausner  ---
*** Bug 39537 has been marked as a duplicate of this bug. ***



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-05 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #35 from Thomas Klausner  ---
@David Gustafsson: I also mentioned all of this in my previous comments. And I
still think we should move away from MARC::Batch and use XML::LibXML. And I
also liked the cleanups Leo Stoyanov did in his XML::Twig version.

I'm on a train with very flaky internet at the moment, so it's a bit hard to
review your patch (because of all the indentation changes..), but at first
glance it seems very similar to Kyle M Hall's patch?

So I don't think we should just go with the minimal fix, but properly improve
bulkmarcimport.

What do you think of my analysis in the comments prior to your commits?

I would have loved a bit of discussion / feedback...



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-04 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #30 from David Gustafsson  ---
I think this also resolved a bug where if a limit is set for number of imported
records, the loop will terminate prematurely without the last batch of records
being committed to Elasticsearch.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-04 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #33 from David Gustafsson  ---
Created attachment 180644
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=180644&action=edit
Bug 37020: Don't update Elasticsearch index if--test option is set



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-04 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

David Gustafsson  changed:

   What|Removed |Added

 Attachment #176425|0   |1
is obsolete||
 Attachment #176921|0   |1
is obsolete||
 Attachment #180639|0   |1
is obsolete||

--- Comment #28 from David Gustafsson  ---
Created attachment 180640
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=180640&action=edit
Bug 37020: bulkmarcimport gets killed after update to 24.05. when inserting
large files

Include checking for invalid marc data in main loop to avoid
loading all marc data into memory at once.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-04 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #27 from David Gustafsson  ---
Created attachment 180639
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=180639&action=edit
Bug 37020: bulkmarcimport gets killed after update to 24.05. when inserting
large files

Include checking for invalid marc data in main loop to avoid
loading all marc data into memory at once.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-04 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #32 from David Gustafsson  ---
Created attachment 180641
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=180641&action=edit
Bug 37020: Don't update Elasticsearch index --test option is set



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-04 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

David Gustafsson  changed:

   What|Removed |Added

 Attachment #180640|0   |1
is obsolete||
 Attachment #180641|0   |1
is obsolete||
 Attachment #180644|0   |1
is obsolete||

--- Comment #34 from David Gustafsson  ---
Created attachment 180645
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=180645&action=edit
Bug 37020: bulkmarcimport gets killed after update to 24.05. when inserting
large files

Include checking for invalid marc data in main loop to avoid
loading all marc data into memory at once.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-04 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #31 from David Gustafsson  ---
In addition, the database transaction was previously not handled in a way that
made sure that both the records were inserted/updated and the Elasticsearch
index was updated; that should now be fixed as well.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-04 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

David Gustafsson  changed:

   What|Removed |Added

 Status|Failed QA   |Needs Signoff
 CC||glask...@gmail.com

--- Comment #29 from David Gustafsson  ---
Had a look at this, and the reason the script now consumes a lot more memory is
that all records are loaded into memory when checking for bad XML records and
encoding errors. Not sure why this was moved into a separate step, but I have
now moved it into the main loop so that the maximum number of records loaded is
the commit size, which should resolve the issue.
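
To make the difference concrete, here is a small sketch comparing the two shapes. Both subs are hypothetical illustrations rather than code from the script: `$next_record` is assumed to return the next record or undef at end of input, and `$is_valid` stands in for the XML/encoding check. Each sub returns the peak number of records held in memory at once.

```perl
use strict;
use warnings;

# Two-pass shape: every valid record is kept before any processing starts.
sub peak_two_pass {
    my ( $next_record, $is_valid ) = @_;
    my @records;
    while ( defined( my $record = $next_record->() ) ) {
        push @records, $record if $is_valid->($record);
    }
    return scalar @records;    # everything is held at once
}

# Single-pass shape: validate inside the main loop, hold at most one batch.
sub peak_single_pass {
    my ( $next_record, $is_valid, $commit_size ) = @_;
    my @batch;
    my $peak = 0;
    while ( defined( my $record = $next_record->() ) ) {
        next unless $is_valid->($record);
        push @batch, $record;
        $peak  = @batch if @batch > $peak;
        @batch = ()     if @batch >= $commit_size;    # committed and released
    }
    return $peak;
}
```

In the first shape the peak grows with the size of the input file; in the second it is bounded by `$commit_size`, whatever the file size.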



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-03 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #25 from Thomas Klausner  ---
BTW, here is a link to Bug 37478, which introduced the duplicate loop that
caused the memory issues (and the --skip_bad_records option does not protect
us from the problem, because the loop pushing the records onto a Perl array
runs whether or not it is set).



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-03 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #23 from Thomas Klausner  ---
Review of the patch "Bug 37020: [alternate] Changed script to use XML:Twig to
reduce memory usage.":

This patch changes a lot, mainly using XML::Twig to pull-parse the (huge) XML
file instead of using the regex-"parser" in MARC::Batch.

It is a bit hard to read because it adds some comments (which would have to be
removed) and adds a bunch of functions before the main code (which IMO is bad
style, but maybe OK for Koha?), also moving the helper function up in the code.

It proposes two different parsers, one removing the namespaces ('marc:record'),
one not. Not sure why we need this.

I like that processing a record is moved into a function.

So I guess if this approach is taken, the code layout should be changed to have
all the functions AFTER the main code, and we need to decide which of the two
parser functions should be used (or we have both and a flag to choose?)

I like this approach better than the other one because it removes MARC::Batch
(for reading XML).



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-03 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #22 from Thomas Klausner  ---
The patch "Bug 37020: Stream records from file instead of loading all into
memory" basically "just" removes the first loop that pushed all (valid) records
onto an array. Yes, this solves the memory problem, but still uses the IMO not
very sane regex-based pull "parser" in MARC::Batch.

But it's a rather small change that works (even though the patch looks huge
because it contains a lot of whitespace-only changes; use `git diff -w ...` to
see the relevant changes).



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-04-03 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Thomas Klausner  changed:

   What|Removed |Added

 CC||d...@plix.at

--- Comment #21 from Thomas Klausner  ---
I have a similar but slightly different solution (developed for importing 10
million authority records). Instead of using MARC::Batch, I use
XML::LibXML::Reader (which also implements a memory-efficient pull parser).

I guess XML::Twig works just as well, so I will (hopefully today) review this
patch.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-01-22 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #19 from Leo Stoyanov  ---
Created attachment 176921
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=176921&action=edit
Bug 37020: [alternate] Changed script to use XML:Twig to reduce memory usage.

In the original script, MARC::Batch->new('XML', ...) was holding onto large
chunks of parsed data for MARCXML files, resulting in excessive memory usage
even after the script tried to free references. A hacky approach was to
manually clear the batch’s internal structures, but Data::Dumper showed that
MARC::Batch wasn’t storing data in those specific fields. Hence, the hack
offered no solution to the underlying caching. By contrast, XML::Twig can
stream XML elements one by one, calling a handler and then discarding the
parsed chunk.

Note, there is no guarantee this implementation works for non-XML files as it
stands (although in theory it should). The focus is on the XML::Twig
implementation for reference as a solution. Overall, batching seems to be
eating up memory.

To test:
1. Run perl misc/migration_tools/bulkmarcimport.pl -m=MARCXML -b -d -v
--commit=1000 --file=file_path_here on a large MARCXML/XML file (for example, 2
GB or greater).
2. On whatever machine or container it is run, the script will likely cause an
out-of-memory error and crash the environment.
3. Apply the patch, run "restart_all", and redo step 1. The script should
utilize much less memory to import records from MARCXML/XML files.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-01-16 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #18 from Jan Kissig  ---
(In reply to Kyle M Hall (khall) from comment #13)
> (In reply to Jan Kissig from comment #9)
> > Just tried to import 120k records via 
> 
> Jan, is there any chance you can attach your records file to this bug?

Hi Kyle, here are the records I tried. Modified to fit ktd.

https://nextcloud.th-wildau.de/nextcloud/index.php/s/fB3or6bXNX9fo9E



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-01-16 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Leo Stoyanov  changed:

   What|Removed |Added

 CC||leo.stoyanov@bywatersolutio
   ||ns.com

--- Comment #17 from Leo Stoyanov  ---
(In reply to Jan Kissig from comment #15)
> (In reply to Kyle M Hall (khall) from comment #13)
> > (In reply to Jan Kissig from comment #9)
> > > Just tried to import 120k records via 
> > 
> > Jan, is there any chance you can attach your records file to this bug?
> 
> I will find a place to upload my 120k-record file. But for the beginning I
> attached a single record. Can you import and verify that result is something
> like: 
> 
> 3 MARC records done in ...

I imported your attached file, and the result was indeed "3 MARC records done
in 0.290374994277954 seconds".



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-01-16 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #16 from Michał  ---
This really highlights a limitation of the default patch files as attached in
Bugzilla. There should be a way to generate a separate git diff that ignores
whitespace changes completely. IDEs can do it, but there doesn't seem to be any
popular way to convert an existing diff so that it skips displaying them. Even
with better diff viewers such as Kompare, it's still not much easier to read.
So I guess we can only ask for a complementary git diff -w (--ignore-all-space)
attachment to make review easier (I would generate it myself, but I can't clone
Koha on the internet connection I have today).
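
To illustrate what a whitespace-insensitive comparison does, here is a minimal
Python sketch (not part of Koha or git; the function names are made up for this
example). It assumes both versions of the file are available, so it does not
convert an existing diff. It only approximates what git diff -w reports.

```python
import difflib


def normalize(line: str) -> str:
    """Drop all whitespace, roughly what `git diff -w` ignores."""
    return "".join(line.split())


def diff_ignoring_whitespace(old: str, new: str) -> list[str]:
    """Return -/+ lines for changes that are not whitespace-only."""
    old_lines = old.splitlines()
    new_lines = new.splitlines()
    # Compare the normalized lines, but report the original ones.
    matcher = difflib.SequenceMatcher(
        a=[normalize(line) for line in old_lines],
        b=[normalize(line) for line in new_lines],
        autojunk=False,
    )
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        out.extend("-" + line for line in old_lines[i1:i2])
        out.extend("+" + line for line in new_lines[j1:j2])
    return out
```

With this, a block that was only re-indented (for example because it moved into
another loop) produces no output, and only real changes are listed.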



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-01-16 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #15 from Jan Kissig  ---
(In reply to Kyle M Hall (khall) from comment #13)
> (In reply to Jan Kissig from comment #9)
> > Just tried to import 120k records via 
> 
> Jan, is there any chance you can attach your records file to this bug?

I will find a place to upload my 120k-record file. But for a start I attached
a single record. Can you import it and verify that the result is something
like: 

3 MARC records done in ...



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-01-16 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #14 from Jan Kissig  ---
Created attachment 176653
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=176653&action=edit
MARCXML with 1 record

perl bulkmarcimport.pl -m=MARCXML -b --commit=1000 --file=path/to/bib-262.marcxml



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-01-16 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #13 from Kyle M Hall (khall)  ---
(In reply to Jan Kissig from comment #9)
> Just tried to import 120k records via 

Jan, is there any chance you can attach your records file to this bug?



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-01-16 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #12 from Kyle M Hall (khall)  ---
(In reply to Katrin Fischer from comment #11)
> @Kyle: I think it's better to keep the tidy separate here; it's hard to see
> the actual changes.

There isn't any tidying in this patch (that I can recall :). It really just
involved moving most of the code from the lower loop into the upper loop
(which does change the indentation, fwiw) so there is only one loop instead of
two.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-01-15 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #11 from Katrin Fischer  ---
@Kyle: I think it's better to keep the tidy separate here; it's hard to see
the actual changes.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-01-15 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Katrin Fischer  changed:

   What|Removed |Added

   Assignee|koha-b...@lists.koha-community.org |kyle.m.h...@gmail.com
 Status|Signed Off  |Failed QA

--- Comment #10 from Katrin Fischer  ---
Hi Jan, please set to Failed QA (and don't feel bad)



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-01-13 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #9 from Jan Kissig  ---
Just tried to import 120k records via 

kohadev-koha@kohadevbox:migration_tools(bug_37020)$ perl bulkmarcimport.pl -b -d --m=MARCXML --file=records.xml --commit=1000

At first it went as expected, but after 120k records it went on and on, and
finally I had to kill the process in order to regain control of my machine.

So I tried a small file with 1 record: 

perl bulkmarcimport.pl -b -d --m=MARCXML --file=../../../thw/journal.xml --commit=1
Deleting biblios
.Use of uninitialized value in concatenation (.) or string at
/usr/share/perl5/MARC/File/XML.pm line 399,  chunk 2.

3 MARC records done in 0.0895359516143799 seconds

So there seems to be an error in the loop, but I did not dig into it.
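
For reference, the behaviour one would expect from a single import loop with a
--commit batch size can be sketched in Python (this is not the script's Perl
code; the table name, column name and function name are hypothetical, and
sqlite3 stands in for the real database). Each record is counted exactly once,
a commit happens after every full batch, and a final commit covers the last
partial batch.

```python
import sqlite3


def import_with_batched_commits(conn, records, commit_every):
    """Insert records in one loop, committing after every commit_every rows."""
    cur = conn.cursor()
    pending = 0
    total = 0
    for record in records:
        # Hypothetical table/column, used only for this illustration.
        cur.execute("INSERT INTO biblio (marcxml) VALUES (?)", (record,))
        pending += 1
        total += 1  # counted exactly once per record
        if pending >= commit_every:
            conn.commit()
            pending = 0
    if pending:
        conn.commit()  # final commit for the last partial batch
    return total
```

With a one-record input and commit_every=1, such a loop would report 1 record,
not 3.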



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-01-13 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Magnus Enger  changed:

   What|Removed |Added

 Attachment #176388 is obsolete|0   |1

--- Comment #8 from Magnus Enger  ---
Created attachment 176425
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=176425&action=edit
Bug 37020: Stream records from file instead of loading all into memory

One of the issues with bulkmarcimport.pl is that the script loads the entire
record set into memory before processing those records. If we load and process
those records one at a time, the memory needed will be minuscule.

Test Plan:
1) Import a very large record set, note the RAM consumption by the script
2) Apply this patch
3) Import the same record set, the RAM consumption should be reduced!

Signed-off-by: Magnus Enger 
The patch reduces the RAM used when importing a large file.
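
The streaming idea can be illustrated with a short Python sketch (an
illustration only, not the Perl code in the patch; the function names are made
up, and the MARCXML namespace is assumed to be the standard
http://www.loc.gov/MARC21/slim). Records are yielded one at a time while the
file is being parsed, instead of being collected into a list first.

```python
import xml.etree.ElementTree as ET

# Assumption: records use the standard MARC21 slim namespace.
MARC_NS = "{http://www.loc.gov/MARC21/slim}"


def stream_records(source):
    """Yield one <record> element at a time from a MARCXML collection."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == MARC_NS + "record":
            yield elem
            # Once the caller is done with the record, drop its children
            # so they can be garbage-collected.
            elem.clear()


def import_records(source, handle):
    """Process records one by one; return how many were handled."""
    count = 0
    for record in stream_records(source):
        handle(record)
        count += 1
    return count
```

The caller must finish with each record before asking for the next one, since
the element is emptied afterwards. The emptied elements stay attached to the
root, so memory still grows slightly with the number of records.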



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-01-13 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Magnus Enger  changed:

   What|Removed |Added

   Patch complexity|--- |Small patch
 Status|Needs Signoff   |Signed Off



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-01-10 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #7 from Kyle M Hall (khall)  ---
Created attachment 176388
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=176388&action=edit
Bug 37020: Stream records from file instead of loading all into memory

One of the issues with bulkmarcimport.pl is that the script loads the entire
record set into memory before processing those records. If we load and process
those records one at a time, the memory needed will be minuscule.

Test Plan:
1) Import a very large record set, note the RAM consumption by the script
2) Apply this patch
3) Import the same record set, the RAM consumption should be reduced!
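
The effect the test plan looks for can be reproduced in miniature with the
following Python sketch (an illustration under stated assumptions, not a
measurement of bulkmarcimport.pl: the records are fake 1 kB strings, all
function names are made up, and tracemalloc only tracks Python allocations, not
the resident memory of a process). It compares the peak allocation of a
"load everything, then process" import with a "process one record at a time"
import.

```python
import tracemalloc


def make_records(n):
    """Stand-in for a record reader: yields n fake records of 1024 chars."""
    for i in range(n):
        yield ("%08d" % i) * 128


def peak_bytes(func, *args):
    """Run func(*args) and return the peak traced allocation in bytes."""
    tracemalloc.start()
    try:
        func(*args)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def import_buffered(n):
    records = list(make_records(n))  # everything in memory first
    return sum(len(r) for r in records)


def import_streamed(n):
    return sum(len(r) for r in make_records(n))  # one record at a time
```

Calling peak_bytes(import_buffered, n) and peak_bytes(import_streamed, n) for a
large n should show the buffered peak growing with n while the streamed peak
stays roughly constant. For the real script, a tool that reports the peak
resident set size of the process (GNU time with -v, for example) would be the
closer equivalent.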



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2025-01-10 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Kyle M Hall (khall)  changed:

   What|Removed |Added

 Status|NEW |Needs Signoff



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2024-12-27 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Enica Davis  changed:

   What|Removed |Added

 CC||en...@bywatersolutions.com



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2024-12-27 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Andrew Fuerste-Henry  changed:

   What|Removed |Added

 CC||martin.renvoize@ptfs-europe.com, tomasco...@gmail.com



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2024-12-27 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Andrew Fuerste-Henry  changed:

   What|Removed |Added

 CC||and...@bywatersolutions.com, k...@bywatersolutions.com, n...@bywatersolutions.com



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2024-08-12 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #6 from Katrin Fischer  ---
(In reply to Aditya from comment #5)
> I am having even bigger problems due to the MySQL warnings. I used koha-dump
> to back up Koha 23.11 on server 1, installed the latest Koha 24.05 on server
> 2, and ran koha-restore. As it's a version upgrade, the web installer is
> expected to run.
> 
> When I go through the database upgrade step, I get errors and can't proceed
> further: "WARNING: MYSQL_OPT_RECONNECT is deprecated and will be removed in
> a future version.".
> 
> I ran "apt install --reinstall koha-common", which seemed to fix the issue
> (though the same warning appeared in the terminal), but the database doesn't
> seem to have been upgraded: I now get "Error 500" on the Circulation and Fine
> Rules page, and empty fields when using the "Identity Provider" edit buttons,
> both of which worked perfectly fine in version 23.11.
> 
> I checked the installer codebase, and it seems that whenever DBD::mysql
> generates a warning, Koha's error handling mechanism treats it as an error
> instead of a warning (line 441 of
> "/usr/share/koha/intranet/cgi-bin/installer/install.pl"). I believe the same
> error handling is present either throughout the Koha codebase or in the
> DBD::mysql package.
> 
> Possible fixes:
> 1. Change the error handling logic
> 2. Add "SET sql_notes=0" before every SQL query
> 3. Bundle a DBD::mysql Perl package with Koha that does not have this issue

Hi, this bug report is about the import script bulkmarcimport.pl. I think you
might be experiencing different bugs, like bug 37533 (error 500 on circulation
rules). Please check on the mailing list or Mattermost chat first if you are
experiencing different issues like this. We will often be able to point you to
the right bugs or solution.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2024-08-11 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Aditya  changed:

   What|Removed |Added

 CC||adifb...@gmail.com

--- Comment #5 from Aditya  ---
I am having even bigger problems due to the MySQL warnings. I used koha-dump
to back up Koha 23.11 on server 1, installed the latest Koha 24.05 on server 2,
and ran koha-restore. As it's a version upgrade, the web installer is expected
to run.

When I go through the database upgrade step, I get errors and can't proceed
further: "WARNING: MYSQL_OPT_RECONNECT is deprecated and will be removed in a
future version.".

I ran "apt install --reinstall koha-common", which seemed to fix the issue
(though the same warning appeared in the terminal), but the database doesn't
seem to have been upgraded: I now get "Error 500" on the Circulation and Fine
Rules page, and empty fields when using the "Identity Provider" edit buttons,
both of which worked perfectly fine in version 23.11.

I checked the installer codebase, and it seems that whenever DBD::mysql
generates a warning, Koha's error handling mechanism treats it as an error
instead of a warning (line 441 of
"/usr/share/koha/intranet/cgi-bin/installer/install.pl"). I believe the same
error handling is present either throughout the Koha codebase or in the
DBD::mysql package.

Possible fixes:
1. Change the error handling logic
2. Add "SET sql_notes=0" before every SQL query
3. Bundle a DBD::mysql Perl package with Koha that does not have this issue



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2024-06-18 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #4 from Jan Kissig  ---
I can confirm on ktd with 24.06:

- 8GB RAM, 4GB swap, 120k records -> Killed
- 12GB RAM, 4GB swap, 120k records -> Worked

No entries in the ktd Koha logs.

As mentioned before, the script seems to hang in the first loop; no commit of
records shows up in my output.

Shall I link to my MARCXML file for someone else to test?



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2024-06-18 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #3 from Jan Kissig  ---
Hi Michał

(In reply to Michał from comment #2)
> Are you sure it's a regression and wouldn't happen with the older version of
> the script? 

I just checked: I have run bulkmarcimport around 20 times before on the same
machine, with the same number of records (~120k), on versions < 24.05. Every
time it worked.

> How much swap space does your installation have?

4GB

> Generally back when I ran Koha VM with very low RAM (like 2 or 4 GB), some
> older version like 23.05, I'd have issues with various cron tasks crashing
> due to OOM (out-of-memory), up until I gave it some significant swap space
> (like 16 or 24 GB), which I think was even also a recommendation somewhere.
> 
> Just thinking if it could perchance be simply a matter of that rather than a
> regression per se (though it might be both!).

I also tried on ktd with 8GB RAM, and it seems that Docker is crashing because
of bulkmarcimport. I am trying to give it more resources.

I also gave the script a little extra output in both loops, and it always
stopped in the first while-loop
(https://git.koha-community.org/Koha-community/Koha/src/branch/main/misc/migration_tools/bulkmarcimport.pl#L317).
Maybe that @marc_records array is causing my error.



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2024-06-17 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Michał  changed:

   What|Removed |Added

 CC||schodkowy.omegi-0r@icloud.com

--- Comment #2 from Michał  ---
Are you sure it's a regression and wouldn't happen with the older version of
the script? How much swap space does your installation have?

Generally back when I ran Koha VM with very low RAM (like 2 or 4 GB), some
older version like 23.05, I'd have issues with various cron tasks crashing due
to OOM (out-of-memory), up until I gave it some significant swap space (like 16
or 24 GB), which I think was even also a recommendation somewhere.

Just thinking if it could perchance be simply a matter of that rather than a
regression per se (though it might be both!).



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2024-06-17 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Katrin Fischer  changed:

   What|Removed |Added

   Keywords||RM_priority



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2024-06-17 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

--- Comment #1 from Katrin Fischer  ---
Is there anything in the other logs maybe that could give us some more
information on the issue?



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2024-06-04 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Magnus Enger  changed:

   What|Removed |Added

 CC||mag...@libriotech.no



[Koha-bugs] [Bug 37020] bulkmarcimport gets killed after update to 24.05. when inserting large files

2024-06-04 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020

Jan Kissig  changed:

   What|Removed |Added

 Depends on||29440


Referenced Bugs:

https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=29440
[Bug 29440] Refactor/clean up bulkmarcimport.pl