Hi, Michele!

My replies to specific points of yours appear under the relevant text below.

On 8/6/21 6:03 PM, Morgan, Michele via Evergreen-dev wrote:
We've been looking at record loading speed, specifically records with located uri's. We do a big business in electronic resources, and expect it will only increase.

We currently load the records using Vandelay via the client. I know there are scripts available that others have kindly shared and we're looking at scripting the loads.

Most of our loads are done by staff using Vandelay. We do have a script to load large batches of records with located URIs that we use when staff request it. I have shared a link to it in IRC several times. I will include it here for the sake of anyone who missed it and is interested in having a look:

https://pastebin.com/g4RGDJLr


We've been monitoring the vandelay session tracker in the database and have found that records with 1 uri take on average 2 seconds to load, 3 uris take on average 4 seconds, 10 uris take on average 12 seconds. We are running postgres 9.6.

I have found that, depending on your database hardware and configuration, it can take two seconds to update a bib record regardless of whether or not it has located URIs. This seemed to improve in 2018 when we bought super fast hardware.

Authority updates, for authorities with many linked bibs, can take even longer.

In general, upgrading PostgreSQL to Pg 11 or later improves the performance of authority updates markedly. Upgrading PostgreSQL seems to have less impact on bibliographic record updates and appears to have a detrimental effect on updating or inserting bib records with located URIs.

PostgreSQL 10 seems to perform at a level comparable to PostgreSQL version 9.6 for most things. In some areas, it seems to make a slight improvement.


I know Jason Stephenson is testing newer versions of postgres and finding the loading even slower.

There are so many database triggers that run when a marc is updated, and we are wondering if efficiency could be improved in some of the functions. Does anyone have ideas about avenues to make the processing more efficient?

At CW MARS, we are using a modified version of the script at the link that I shared above to test the speed of updating and inserting records with located URIs on the various releases of PostgreSQL. This is being done in conjunction with a PostgreSQL database extension called plprofiler (https://github.com/bigsql/plprofiler). The extension will gather statistics on how long the process spends in each database function, and a report can then be generated to pinpoint those areas that require attention.

The actual profiling of a load began yesterday morning with PostgreSQL version 10. As it takes days to load 12,567 records on the hardware that I have for testing, I anticipate that I will be able to do 1 or 2 test loads per week, at most.

We tried PostgreSQL 12 on our training server briefly toward the beginning of the year. We found additional places where Evergreen performance seemed to be negatively impacted by the newer database version, with at least one of these being noticeable in the web staff client. (I am sorry to say that I do not remember the specific interface that was affected at this time, and I've had a difficult time finding it in my emails, tickets, or notes.) Suffice it to say that more areas of Evergreen functionality should be tested with more recent releases of PostgreSQL before we can recommend anyone use a version more recent than Pg 10. As I have pointed out elsewhere, there is some urgency for this work since PostgreSQL community support for 9.6 ends in November 2021 and support for 10 end in November 2022.

See: http://list.evergreen-ils.org/pipermail/evergreen-dev/2021-July/000172.html

I endeavor to document and coordinate work with others in a shared Google folder:

https://drive.google.com/drive/folders/1sRZ8P1RHCOcZx42DxUehvOpoNkJJnfqn

I highly encourage anyone who wants to see Evergreen's database performance improve over the next year to get involved.

Cheers,
Jason

_______________________________________________
Evergreen-dev mailing list
[email protected]
http://list.evergreen-ils.org/cgi-bin/mailman/listinfo/evergreen-dev

Reply via email to