[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-02-02 Thread bugzilla-daemon--- via Koha-bugs
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

Lucas Gass  changed:

   What|Removed |Added

 Status|Pushed to stable|Pushed to oldstable
 CC||lu...@bywatersolutions.com
 Version(s)|24.05.00,23.11.02   |24.05.00,23.11.02,23.05.09
released in||

--- Comment #25 from Lucas Gass  ---
Backported to 23.05.x for upcoming 23.05.09

-- 
You are receiving this mail because:
You are watching all bug changes.
___
Koha-bugs mailing list
Koha-bugs@lists.koha-community.org
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/


[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-17 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

Fridolin Somers  changed:

   What|Removed |Added

 Status|Pushed to master|Pushed to stable
 Version(s)|24.05.00|24.05.00,23.11.02
released in||

--- Comment #24 from Fridolin Somers  ---
Pushed to 23.11.x for 23.11.02



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-16 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

David Nind  changed:

   What|Removed |Added

  Text to go in the release notes

Removed:
This enables breaking large Elasticsearch or Open Search indexing requests
into smaller chunks (for example, from batch modifications). It adds a
chunk_size configuration to the elasticsearch section in koha-conf.xml (the
default is 5,000: <chunk_size>5000</chunk_size>). So instead of sending a
single background request for indexing, which could exceed the limits of the
search server or take up too many resources, this limits index update
requests to a more manageable size.

NOTE: This doesn't change the command line indexing script, as this already
allows passing a commit size defining how many records to send.

Added:
This enables breaking large Elasticsearch or Open Search indexing requests
into smaller chunks (for example, when updating many records using batch
modifications).

This means that instead of sending a single background request for indexing,
which could exceed the limits of the search server or take up too many
resources, it limits index update requests to a more manageable size.

The default chunk size is 5,000. To configure a different chunk size, add a
<chunk_size> directive to the elasticsearch section of the instance's
koha-conf.xml (for example: <chunk_size>2000</chunk_size>).

NOTE: This doesn't change the command line indexing script, as this already
allows passing a commit size defining how many records to send.

--- Comment #23 from David Nind  ---
Tweaked the release notes text - feel free to improve!
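
As a rough illustration of the default described in the release notes, here is a minimal sketch of how a chunk size could be resolved from the elasticsearch section of the configuration. The helper name resolve_chunk_size and the plain hash standing in for the parsed koha-conf.xml section are assumptions made for this example, not Koha's actual code. Treating a non-numeric value like a blank one is also an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not Koha's API): picks the chunk size from a parsed
# <elasticsearch> config section, represented here as a plain hash reference.
sub resolve_chunk_size {
    my ($es_conf) = @_;
    my $size = $es_conf->{chunk_size};

    # A missing, blank, or non-positive-integer value falls back to 5000.
    return 5000 unless defined $size && $size =~ /^\d+$/ && $size > 0;
    return $size;
}

print resolve_chunk_size( {} ), "\n";                        # 5000
print resolve_chunk_size( { chunk_size => '' } ), "\n";      # 5000
print resolve_chunk_size( { chunk_size => 2000 } ), "\n";    # 2000
```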



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-16 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

--- Comment #22 from Katrin Fischer  ---
Pushed for 24.05!

Well done everyone, thank you!



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-16 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

Katrin Fischer  changed:

   What|Removed |Added

 Version(s)||24.05.00
released in||
 Status|Passed QA   |Pushed to master



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-11 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

David Nind  changed:

   What|Removed |Added

  Text to go in the release notes

Removed:
This enables breaking large Elasticsearch or Open Search indexing requests
into smaller chunks (for example, from batch modifications). It adds a
chunk_size configuration to the elasticsearch section in koha-conf.xml (for
example: <chunk_size>250</chunk_size>). So instead of sending a single
background request for indexing, which could exceed the limits of the search
server or take up too many resources, this limits index update requests to a
more manageable size.

NOTE: This doesn't change the command line indexing script, as this already
allows passing a commit size defining how many records to send.

Added:
This enables breaking large Elasticsearch or Open Search indexing requests
into smaller chunks (for example, from batch modifications). It adds a
chunk_size configuration to the elasticsearch section in koha-conf.xml (the
default is 5,000: <chunk_size>5000</chunk_size>). So instead of sending a
single background request for indexing, which could exceed the limits of the
search server or take up too many resources, this limits index update
requests to a more manageable size.

NOTE: This doesn't change the command line indexing script, as this already
allows passing a commit size defining how many records to send.



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-11 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

--- Comment #21 from Jonathan Druart  ---
Created attachment 160856
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=160856&action=edit
Bug 35086: (follow-up) Use 5000 as example in conf file

Signed-off-by: Jonathan Druart 



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-11 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

--- Comment #20 from Jonathan Druart  ---
Created attachment 160855
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=160855&action=edit
Bug 35086: Tidy tests

Signed-off-by: David Nind 

Signed-off-by: Jonathan Druart 



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-11 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

--- Comment #19 from Jonathan Druart  ---
Created attachment 160854
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=160854&action=edit
Bug 35086: Also split chunks when indexing from background job

The ES background indexer is designed to combine background jobs when started,
based on the 'batch_size' option.

While this is helpful for combining individual updates, it can be problematic
when there are several large batch modifications, or when the worker has
stopped and is restarted.

This patch uses the same logic as in the indexer to split the chunks that are
sent directly for indexing.

To test:
1 - Follow test plan on previous patch
2 - Confirm items are correctly indexed and jobs marked

Signed-off-by: David Nind 

Signed-off-by: Jonathan Druart 
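
To make the idea in this commit message concrete, here is a minimal sketch of a worker that combines the record ids from several queued jobs and then re-splits them before sending. The job hashes, the variable names, and the tiny chunk size are all made up for the example. This is not the actual worker code from the patch.

```perl
use strict;
use warnings;

# Hypothetical queued jobs, each carrying the record ids it wants reindexed.
my @jobs = (
    { id => 1, record_ids => [ 1 .. 4 ] },
    { id => 2, record_ids => [ 5 .. 10 ] },
    { id => 3, record_ids => [11] },
);
my $chunk_size = 4;    # stands in for the configured chunk_size

# Combine the ids from every job the worker picked up ...
my @combined = map { @{ $_->{record_ids} } } @jobs;
my $total    = scalar @combined;

# ... then send them on in pieces no larger than $chunk_size.
my @requests;
while ( my @chunk = splice( @combined, 0, $chunk_size ) ) {
    push @requests, [@chunk];
}

printf "%d ids sent as %d requests\n", $total, scalar @requests;
```

Without the re-split, the combined list would go out as one request, however many jobs had piled up while the worker was stopped.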



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-11 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

Jonathan Druart  changed:

   What|Removed |Added

 Attachment #160560|0   |1
is obsolete||
 Attachment #160561|0   |1
is obsolete||
 Attachment #160562|0   |1
is obsolete||
 Attachment #160572|0   |1
is obsolete||

--- Comment #18 from Jonathan Druart  ---
Created attachment 160853
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=160853&action=edit
Bug 35086: Add chunk_size option to elasticsearch configuration

When performing batch operations we can send a large number of records for
reindexing at once. Currently this can create requests that are too large for
Elasticsearch to process. We need to break these requests into chunks.

This patch adds a chunk_size configuration to the elasticsearch stanza in
koha-conf.xml.

If blank, we default to 5000.

To test:
0 - Have Koha using Elasticsearch
1 - Create and download a report of all barcodes:
    SELECT barcode FROM items
2 - Batch modify these items
3 - Note a single ES indexing job is created
4 - Create and download a report of all authority ids:
    SELECT auth_header.authid FROM auth_header
5 - Set up a MARC modification template, and batch modify all the authorities
6 - Again note a single ES background job is created
7 - Apply patch
8 - Repeat the modifications above - you still get a single job
9 - Edit koha-conf.xml and add <chunk_size>250</chunk_size> to the
    elasticsearch stanza
10 - Repeat modifications - you now get several background ES jobs
11 - prove -v t/db_dependent/Koha/SearchEngine/Elasticsearch/Indexer.t

Signed-off-by: David Nind 

Signed-off-by: Jonathan Druart 
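
The chunking this patch describes can be sketched with a small iterator that hands back a fixed number of record ids per call. The function make_chunker below is a hypothetical stand-in written with core Perl only. The patch itself may use a different helper, so read this as an illustration of the technique rather than the committed code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a natatime-style iterator: returns a closure
# that hands back up to $size elements per call, and an empty list when done.
sub make_chunker {
    my ( $size, @ids ) = @_;
    return sub { return splice( @ids, 0, $size ) };
}

my $next_chunk = make_chunker( 3, 1 .. 8 );
while ( my @chunk = $next_chunk->() ) {
    # In a real indexer each chunk would become one index update request.
    print "would index: @chunk\n";
}
```

With eight ids and a chunk size of three, the loop runs three times, and the last chunk holds the two leftover ids.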



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-11 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

Jonathan Druart  changed:

   What|Removed |Added

 Status|Signed Off  |Passed QA



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-11 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

--- Comment #17 from Nick Clemens  ---
(In reply to Jonathan Druart from comment #16)
> Should not we surround the while with a try instead?... Not sure what's best
> here!

I'd rather try to index what we can, and only fail the bits that didn't work.
If we have a big job and encounter an error early, a try around the whole loop
would fail on the first chunk and stop. This way it tries each chunk, so one
might fail but the rest succeed. If there are errors, let's minimize what
needs to be reindexed.
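
A minimal sketch of the per-chunk approach argued for here, using core eval in place of the try/catch from the patch so the example runs without extra modules. The update_index stub and its failure condition are invented for the illustration.

```perl
use strict;
use warnings;

# Hypothetical indexer stub: fails for any chunk containing the id 5.
sub update_index {
    my ($chunk) = @_;
    die "simulated indexing error\n" if grep { $_ == 5 } @$chunk;
    return scalar @$chunk;
}

my @ids = ( 1 .. 9 );
my ( @indexed, @failed );

while ( my @chunk = splice( @ids, 0, 3 ) ) {
    # eval per chunk: one failure does not abort the remaining chunks.
    if ( eval { update_index( \@chunk ); 1 } ) {
        push @indexed, @chunk;
    }
    else {
        warn "Update of elastic index failed with: $@";
        push @failed, @chunk;
    }
}

print "indexed: @indexed\n";    # indexed: 1 2 3 7 8 9
print "failed:  @failed\n";     # failed:  4 5 6
```

Wrapping the whole loop in a single eval instead would stop at the second chunk, leaving ids 7 to 9 unindexed as well.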



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-11 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

--- Comment #16 from Jonathan Druart  ---
I am wondering about the changes made to the worker.

+    while ( ( my @auth_chunk = $auth_chunks->() ) ) {
+        try {
+            $auth_indexer->update_index( \@auth_chunk );
+        } catch {
+            $logger->warn( sprintf "Update of elastic index failed with: %s", $_ );
+        };
+    }

Should not we surround the while with a try instead?... Not sure what's best
here!



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-05 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

--- Comment #15 from Nick Clemens  ---
(In reply to Jonathan Druart from comment #13)
> you have 500 in conf and 5000 in pm, is that expected?

500 seemed a more reasonable size in my head, but 5000 is more consistent with
our default indexing so I updated it.



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-05 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

--- Comment #14 from Nick Clemens  ---
Created attachment 160572
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=160572&action=edit
Bug 35086: (follow-up) Use 5000 as example in conf file



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-04 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

--- Comment #13 from Jonathan Druart  ---
you have 500 in conf and 5000 in pm, is that expected?



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-04 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

--- Comment #12 from David Nind  ---
Here is the list of jobs from testing using a freshly started KTD:

-1316 - Before patch, for both item modification and authority record changes
1317-1318 - After patch, no chunking, item modifications
1319-2632 - After patch, no chunking, authority record changes
==> no change (as expected) to number of jobs after patch applied and no
    chunking set
2633-2635 - After patch, chunking (250), item modifications
. 1 job for batch item modifications, 2 jobs for Elasticsearch
  updates (1 batch of 250 and 1 batch of 161)
2636-3955 - After patch, chunking (250), authority record changes
. 2636 - Batch authority record modification
. 2637-3948 - Elasticsearch index updates for 1,312 individual
  bibliographic record updates
. 3949-3955 - 6 batches of 250, 1 batch of 206

I'm assuming this is what is expected, feel free to change the bug status if it
isn't.


Testing notes:

1. I tested using ES8 (ktd --es8 up).

2. For the modification of bibliographic records, I updated the 'z - Public
note' with some text.

3. For the modification of authority records, I had a rule to add some text to
680$i subfield.
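
The batch sizes reported above follow from integer division. 411 item updates (250 + 161) and 1,706 authority updates (6 x 250 + 206) split at a chunk size of 250 give exactly those batches. The helper batch_sizes below is a hypothetical name used only to show the arithmetic.

```perl
use strict;
use warnings;

# Returns the sizes of the batches produced when $total records are
# split into chunks of at most $chunk_size.
sub batch_sizes {
    my ( $total, $chunk_size ) = @_;
    my @sizes = ($chunk_size) x int( $total / $chunk_size );
    my $rest  = $total % $chunk_size;
    push @sizes, $rest if $rest;
    return @sizes;
}

print join( ', ', batch_sizes( 411,  250 ) ), "\n";    # 250, 161
print join( ', ', batch_sizes( 1706, 250 ) ), "\n";    # 250, 250, 250, 250, 250, 250, 206
```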



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-04 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

David Nind  changed:

   What|Removed |Added

 Attachment #160532|0   |1
is obsolete||

--- Comment #11 from David Nind  ---
Created attachment 160562
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=160562&action=edit
Bug 35086: Tidy tests

Signed-off-by: David Nind 



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-04 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

David Nind  changed:

   What|Removed |Added

 Attachment #160531|0   |1
is obsolete||

--- Comment #10 from David Nind  ---
Created attachment 160561
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=160561&action=edit
Bug 35086: Also split chunks when indexing from background job

The ES background indexer is designed to combine background jobs when started,
based on the 'batch_size' option.

While this is helpful for combining individual updates, it can be problematic
when there are several large batch modifications, or when the worker has
stopped and is restarted.

This patch uses the same logic as in the indexer to split the chunks that are
sent directly for indexing.

To test:
1 - Follow test plan on previous patch
2 - Confirm items are correctly indexed and jobs marked

Signed-off-by: David Nind 



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-04 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

David Nind  changed:

   What|Removed |Added

 Attachment #160530|0   |1
is obsolete||

--- Comment #9 from David Nind  ---
Created attachment 160560
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=160560&action=edit
Bug 35086: Add chunk_size option to elasticsearch configuration

When performing batch operations we can send a large number of records for
reindexing at once. Currently this can create requests that are too large for
Elasticsearch to process. We need to break these requests into chunks.

This patch adds a chunk_size configuration to the elasticsearch stanza in
koha-conf.xml.

If blank, we default to 5000.

To test:
0 - Have Koha using Elasticsearch
1 - Create and download a report of all barcodes:
    SELECT barcode FROM items
2 - Batch modify these items
3 - Note a single ES indexing job is created
4 - Create and download a report of all authority ids:
    SELECT auth_header.authid FROM auth_header
5 - Set up a MARC modification template, and batch modify all the authorities
6 - Again note a single ES background job is created
7 - Apply patch
8 - Repeat the modifications above - you still get a single job
9 - Edit koha-conf.xml and add <chunk_size>250</chunk_size> to the
    elasticsearch stanza
10 - Repeat modifications - you now get several background ES jobs
11 - prove -v t/db_dependent/Koha/SearchEngine/Elasticsearch/Indexer.t

Signed-off-by: David Nind 



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-04 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

David Nind  changed:

   What|Removed |Added

 Status|Needs Signoff   |Signed Off



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-04 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

--- Comment #8 from Nick Clemens  ---
Created attachment 160532
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=160532&action=edit
Bug 35086: Tidy tests



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-04 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

--- Comment #7 from Nick Clemens  ---
Created attachment 160531
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=160531&action=edit
Bug 35086: Also split chunks when indexing from background job

The ES background indexer is designed to combine background jobs when started,
based on the 'batch_size' option.

While this is helpful for combining individual updates, it can be problematic
when there are several large batch modifications, or when the worker has
stopped and is restarted.

This patch uses the same logic as in the indexer to split the chunks that are
sent directly for indexing.

To test:
1 - Follow test plan on previous patch
2 - Confirm items are correctly indexed and jobs marked



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-04 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

Nick Clemens  changed:

   What|Removed |Added

 Attachment #160301|0   |1
is obsolete||

--- Comment #6 from Nick Clemens  ---
Created attachment 160530
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=160530&action=edit
Bug 35086: Add chunk_size option to elasticsearch configuration

When performing batch operations we can send a large number of records for
reindexing at once. Currently this can create requests that are too large for
Elasticsearch to process. We need to break these requests into chunks.

This patch adds a chunk_size configuration to the elasticsearch stanza in
koha-conf.xml.

If blank, we default to 5000.

To test:
0 - Have Koha using Elasticsearch
1 - Create and download a report of all barcodes:
    SELECT barcode FROM items
2 - Batch modify these items
3 - Note a single ES indexing job is created
4 - Create and download a report of all authority ids:
    SELECT auth_header.authid FROM auth_header
5 - Set up a MARC modification template, and batch modify all the authorities
6 - Again note a single ES background job is created
7 - Apply patch
8 - Repeat the modifications above - you still get a single job
9 - Edit koha-conf.xml and add <chunk_size>250</chunk_size> to the
    elasticsearch stanza
10 - Repeat modifications - you now get several background ES jobs
11 - prove -v t/db_dependent/Koha/SearchEngine/Elasticsearch/Indexer.t

Signed-off-by: David Nind 



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2024-01-04 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

Nick Clemens  changed:

   What|Removed |Added

 Status|Signed Off  |Needs Signoff



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2023-12-26 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

David Nind  changed:

   What|Removed |Added

 CC||da...@davidnind.com
  Text to go in the release notes

Removed: (empty)

Added:
This enables breaking large Elasticsearch or Open Search indexing requests
into smaller chunks (for example, from batch modifications). It adds a
chunk_size configuration to the elasticsearch section in koha-conf.xml (for
example: <chunk_size>250</chunk_size>). So instead of sending a single
background request for indexing, which could exceed the limits of the search
server or take up too many resources, this limits index update requests to a
more manageable size.

NOTE: This doesn't change the command line indexing script, as this already
allows passing a commit size defining how many records to send.

--- Comment #5 from David Nind  ---
I've signed off, as everything in the test plan worked.

However, I did note one thing when updating authority records:

1. Starting fresh (patch applied, shutting down KTD, then starting up again so
that there are no previous jobs, adding the <chunk_size>250</chunk_size>
setting, restarting everything).

2. After updating all the authority records by adding text to 680$i, there are
1320 entries for jobs:
   - the last 7 job entries (1314-1320) are for the Elasticsearch index
updates, split into chunks of 250 (1320 is for 206 record updates)
   - the first job entry (1) is for the batch authority record modification
(1706 modifications)
   - the rest of the job entries (2-1313) are for Elasticsearch index updates
to individual bibliographic records (from a sample I checked)

3. I'm assuming that the individual bibliographic record updates are because
the authority terms updated are linked to them.

4. I don't know whether this is what is expected, or whether there are plans to
chunk the subsequent individual bibliographic records updated because of
authority term changes. Or whether that is even possible.

5. Irrespective of that, this is still a great improvement!

Testing notes (using KTD):

1. I tested using ES8 (ktd --es8 up).

2. For the modification of bibliographic records, I updated the 'z - Public
note' with some text.

3. For the modification of authority records, I had a rule to add some text to
680$i subfield.



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2023-12-26 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

David Nind  changed:

   What|Removed |Added

 Attachment #160278|0   |1
is obsolete||

--- Comment #4 from David Nind  ---
Created attachment 160301
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=160301&action=edit
Bug 35086: Add chunk_size option to elasticsearch configuration

When performing batch operations we can send a large number of records for
reindexing at once. Currently this can create requests that are too large for
Elasticsearch to process. We need to break these requests into chunks.

This patch adds a chunk_size configuration to the elasticsearch stanza in
koha-conf.xml.

If blank, we default to 5000.

To test:
0 - Have Koha using Elasticsearch
1 - Create and download a report of all barcodes:
    SELECT barcode FROM items
2 - Batch modify these items
3 - Note a single ES indexing job is created
4 - Create and download a report of all authority ids:
    SELECT auth_header.authid FROM auth_header
5 - Set up a MARC modification template, and batch modify all the authorities
6 - Again note a single ES background job is created
7 - Apply patch
8 - Repeat the modifications above - you still get a single job
9 - Edit koha-conf.xml and add <chunk_size>250</chunk_size> to the
    elasticsearch stanza
10 - Repeat modifications - you now get several background ES jobs
11 - prove -v t/db_dependent/Koha/SearchEngine/Elasticsearch/Indexer.t

Signed-off-by: David Nind 



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2023-12-26 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

David Nind  changed:

   What|Removed |Added

 Status|Needs Signoff   |Signed Off



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2023-12-22 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

Dani Elder  changed:

   What|Removed |Added

 CC||danielle.elder@law.utexas.e
   ||du



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2023-12-22 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

Nick Clemens  changed:

   What|Removed |Added

   Assignee|koha-b...@lists.koha-commun |n...@bywatersolutions.com
   |ity.org |

--- Comment #3 from Nick Clemens  ---
Note: this patch doesn't affect the command line indexing script, which already
allows you to pass a commit size defining how many records to send at a time.



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2023-12-22 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

--- Comment #2 from Nick Clemens  ---
Created attachment 160278
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=160278&action=edit
Bug 35086: Add chunk_size option to elasticsearch configuration

When performing batch operations we can send a large number of records for
reindexing at once.
Currently this can create requests that are too large for Elasticsearch to
process. We need to break these requests into chunks.

This patch adds a chunk_size configuration to the elasticsearch stanza in
koha-conf.xml

If blank we default to 5000.

To test:
0 - Have Koha using Elasticsearch
1 - Create and download a report of all barcodes:
SELECT barcode FROM items
2 - Batch modify these items
3 - Note a single ES indexing job is created
4 - Create and download a report of all authority ids:
SELECT auth_header.authid FROM auth_header
5 - Setup a marc modification template, and batch modify all the authorities
6 - Again note a single ES background job is created
7 - Apply patch
8 - Repeat the modifications above - you still get a single job
9 - Edit koha-conf.xml and add a chunk_size of 250 to the elasticsearch
stanza
10 - Repeat modifications - you now get several background ES jobs
11 - prove -v t/db_dependent/Koha/SearchEngine/Elasticsearch/Indexer.t
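
The chunking behaviour exercised by the test plan above (one job before the
change, several jobs once chunk_size is set to 250) can be sketched as follows.
This is a minimal illustration in Python rather than the Perl from the patch;
chunk_record_ids and enqueue_index_jobs are hypothetical names, and the job
dictionaries are stand-ins for whatever the background job queue really stores.

```python
from typing import Dict, Iterator, List, Sequence

# Default stated in the commit message: "If blank we default to 5000."
DEFAULT_CHUNK_SIZE = 5000


def chunk_record_ids(
    record_ids: Sequence[int], chunk_size: int = DEFAULT_CHUNK_SIZE
) -> Iterator[List[int]]:
    """Yield consecutive slices of at most chunk_size record ids."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(record_ids), chunk_size):
        yield list(record_ids[start:start + chunk_size])


def enqueue_index_jobs(
    record_ids: Sequence[int], chunk_size: int = DEFAULT_CHUNK_SIZE
) -> List[Dict[str, List[int]]]:
    """Hypothetical stand-in for enqueueing: build one job per chunk."""
    return [
        {"record_ids": chunk}
        for chunk in chunk_record_ids(record_ids, chunk_size)
    ]
```

With 1000 modified records, the default of 5000 produces a single job, while a
chunk_size of 250 produces four, which matches the difference between steps 8
and 10.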



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2023-12-22 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

Nick Clemens  changed:

   What|Removed |Added

 Status|NEW |Needs Signoff



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2023-10-28 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

Fridolin Somers  changed:

   What|Removed |Added

 CC||fridolin.som...@biblibre.co
   ||m

--- Comment #1 from Fridolin Somers  ---
I would prefer kconf, so that test environments using the same database can be
configured differently.

We usually have production using a real cluster and the test environment using
a single-node side container.



[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2023-10-18 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

Nick Clemens  changed:

   What|Removed |Added

 Depends on||32594


Referenced Bugs:

https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=32594
[Bug 32594] Add a dedicated ES indexing background worker


[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

2023-10-17 Thread bugzilla-daemon
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

Nick Clemens  changed:

   What|Removed |Added

 CC||jonathan.druart+koha@gmail.
   ||com,
   ||katrin.fisc...@bsz-bw.de,
   ||martin.renvoize@ptfs-europe
   ||.com, tomasco...@gmail.com
