https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38913

--- Comment #14 from Janusz Kaczmarek <janus...@gmail.com> ---
(In reply to Jonathan Druart from comment #13)
> With this patch:
> "1 MARC records done in 81.9053399562836 seconds"
> 
> However, I have deleted all biblios and background_jobs before the import, and
> now I have:
> 
> MariaDB [koha_kohadev]> select count(*) from biblio\G
> count(*): 1
> 
> 
> MariaDB [koha_kohadev]> select count(*) from background_jobs\G
> count(*): 2508
> 
> Interesting!...

Well, every Koha::Item->store triggers $indexer->index_records, so no wonder --
the test record has 2508 occurrences of field 952 (one per item) :)
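
As an aside, here is a hedged sketch of how the per-item indexing could be
avoided when saving many items, assuming Koha::Item->store still honours a
skip_record_index flag and Koha::SearchEngine::Indexer keeps its current
interface (verify both against your Koha version):

use Koha::SearchEngine::Indexer;

my $indexer = Koha::SearchEngine::Indexer->new( { index => 'biblios' } );
for my $item (@items) {
    # skip the per-item reindex that store() would otherwise trigger
    $item->store( { skip_record_index => 1 } );
}
# queue a single indexing job for the biblio instead of one per item
$indexer->index_records( [$biblionumber], 'specialUpdate', 'biblioserver' );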

> > 2. Stage the file and import using the UI
> 
> Same with this patch.

This is also more or less clear.  In C4::ImportBatch::BatchCommitRecords,
called by the worker, we call:

my $marc_record = MARC::Record->new_from_usmarc($rowref->{'marc'});

despite also having the marcxml representation of the record in the
import_records table (import_records.marcxml).

This is exactly the same issue that made David's patch die with this kind of
record.  The worker dies because of the uncaught die raised by
new_from_usmarc.  It has nothing to do with this patch (or with David's
previous patch) -- it is just another case of calling a function that can
potentially die without any eval / try around it.
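
A minimal sketch of the kind of guard meant here, in the Try::Tiny style
used elsewhere in Koha (the RECORD loop label is hypothetical):

use Try::Tiny;

my $marc_record = try {
    MARC::Record->new_from_usmarc( $rowref->{'marc'} );
} catch {
    # the error is in $_ inside catch; log which record failed and move on
    warn "Skipping import_record_id $rowref->{import_record_id}: $_";
    undef;
};
next RECORD unless $marc_record;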


Now, if we create and save both representations (iso2709 and marcxml) to the
import_records table in C4::ImportBatch::_create_import_record, why not use
the marcxml version in C4::ImportBatch::BatchCommitRecords instead of
iso2709, which causes trouble with oversized records?  (ISO 2709 stores the
total record length in the first five bytes of the leader, so records over
99,999 bytes cannot be serialized reliably; MARCXML has no such limit.)
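
For illustration only, a hedged sketch of that format limit (this check is
not in Koha; it just demonstrates why the iso2709 copy is the fragile one):

# as_usmarc() serializes to ISO 2709; beyond 99,999 bytes the leader's
# five-digit length field can no longer describe the record correctly
my $iso2709 = $marc_record->as_usmarc();
if ( length($iso2709) > 99_999 ) {
    warn "Record exceeds the ISO 2709 length limit; only the MARCXML copy is safe to use";
}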

After this little change it seems to work: I was able to import the huge test
record via the UI:

diff --git a/C4/ImportBatch.pm b/C4/ImportBatch.pm
index 5aebaafacf..799b69f0ca 100644
--- a/C4/ImportBatch.pm
+++ b/C4/ImportBatch.pm
@@ -531,7 +531,7 @@ sub BatchCommitRecords {
     my $item_tag;
     my $item_subfield;
     my $dbh = C4::Context->dbh;
-    my $sth = $dbh->prepare("SELECT import_records.import_record_id, record_type, status, overlay_status, marc, encoding
+    my $sth = $dbh->prepare("SELECT import_records.import_record_id, record_type, status, overlay_status, marc, marcxml, encoding
                              FROM import_records
                              LEFT JOIN import_auths ON (import_records.import_record_id=import_auths.import_record_id)
                              LEFT JOIN import_biblios ON (import_records.import_record_id=import_biblios.import_record_id)
@@ -568,7 +568,7 @@ sub BatchCommitRecords {
         } else {
             $marc_type = 'USMARC';
         }
-        my $marc_record = MARC::Record->new_from_usmarc($rowref->{'marc'});
+        my $marc_record = MARC::Record->new_from_xml($rowref->{'marcxml'}, $rowref->{'encoding'});

         if ($record_type eq 'biblio') {
             # remove any item tags - rely on _batchCommitItems

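One caveat with the change above: new_from_xml can also die on malformed
input, so a guard would still be prudent there.  A hedged sketch (note that
MARC::File::XML must be loaded for new_from_xml to be available):

use MARC::File::XML ( BinaryEncoding => 'utf8' );

my $marc_record = eval {
    MARC::Record->new_from_xml( $rowref->{'marcxml'}, $rowref->{'encoding'} );
};
# $@ holds the error if the parse died; report the record instead of crashing
warn "Failed to parse MARCXML for import_record_id $rowref->{import_record_id}: $@"
    if $@;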

> 
> > 3. Now the record is in the DB, start a full reindex:
> > 
> > % koha-elasticsearch --rebuild -b  kohadev
> > UTF-8 "\xC4" does not map to Unicode at /usr/share/perl5/MARC/File/Encode.pm
> > line 35.
> > Something went wrong rebuilding indexes for kohadev
> > 
> > No info on the problematic record! We should tell which record failed.
> 
> We don't have anything in the output, which is problematic IMO.

Yes, this is problematic, because new_from_usmarc died and we didn't catch
it.  But now, since we call it inside an eval, we should be safe.
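
To address Jonathan's point about the missing record info, the rebuild loop
could also name the culprit when that eval fails.  A hedged sketch (the $row
fields are assumptions about the rebuild script's internals, not its actual
variables):

my $record = eval { MARC::Record->new_from_usmarc( $row->{marc} ) };
if ( $@ or !$record ) {
    # tell the operator which record broke instead of aborting silently
    warn "Skipping biblionumber $row->{biblionumber}: " . ( $@ || 'empty record' );
    next;
}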
