Brian,

If marc_export causes a crash, then most likely your database server is underpowered. There could also be something buggy or misconfigured somewhere.

marc_export does not use Apache or the Evergreen back end in any serious way. It looks up some settings via OpenSRF::Utils::SettingsClient and that's it. The rest of the time it runs select statements in the database.

If you want to batch export records, short of writing your own tool, marc_export is it. I also doubt you would gain much performance from a custom tool because the database is the main bottleneck. It's possible that some of the queries used by marc_export could be improved, particularly for more recent PostgreSQL versions.

I implement many custom exports, and they are typically wrappers around marc_export. I will implement queries to find the set of records that I want and then pipe the record IDs into marc_export, or the script might determine what options to use when running marc_export based on a configuration file.
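
A minimal sketch of such a wrapper, assuming the record IDs are already in a file with one ID per line (for example, the output of a query). The function name export_in_batches, the EXPORT_CMD variable, and the batch file naming are hypothetical, not part of Evergreen. The default command is copied from the invocation quoted further down in this thread and may need adjusting for your marc_export version. Whether smaller batches actually reduce the load on the database server is untested here.

```shell
#!/bin/bash
# Hypothetical wrapper: feed a list of record IDs to an export command in batches.
# EXPORT_CMD must read record IDs on stdin and write records to stdout.
# The default is the invocation quoted in this thread (an assumption; adjust as needed).
EXPORT_CMD=${EXPORT_CMD:-"marc_export --reporter -i -c /openils/conf/opensrf_core.xml -x /openils/conf/fm_IDL.xml -f XML --timeout 5"}

# Usage: export_in_batches ID_FILE BATCH_SIZE OUT_DIR
# Writes OUT_DIR/batch-0001.xml, batch-0002.xml, ... and prints the batch count.
export_in_batches() {
    local id_file=$1 batch_size=$2 out_dir=$3
    local work_dir chunk name n=0

    mkdir -p "$out_dir" || return 1
    work_dir=$(mktemp -d) || return 1

    # Split the ID list into chunk files of at most batch_size lines each.
    split -l "$batch_size" -a 4 "$id_file" "$work_dir/ids." || { rm -rf "$work_dir"; return 1; }

    for chunk in "$work_dir"/ids.*; do
        [ -e "$chunk" ] || continue   # no chunks were created (empty ID file)
        n=$((n + 1))
        name=$(printf 'batch-%04d.xml' "$n")
        # EXPORT_CMD is left unquoted on purpose so it splits into command and arguments.
        $EXPORT_CMD < "$chunk" > "$out_dir/$name" || { rm -rf "$work_dir"; return 1; }
    done

    rm -rf "$work_dir"
    echo "$n"
}
```

For example, export_in_batches ids.txt 3000 /tmp/marc-batches (both paths are placeholders) would run the export command once per 3,000 IDs and stop at the first batch that fails, so a problem shows up early instead of after the whole run.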

If you have any specific ideas to improve marc_export or the export process from the staff client, feel free to file bugs on Launchpad: https://bugs.launchpad.net/evergreen.

Sorry that I can't be more helpful at this time.
Jason Stephenson

On 10/9/23 08:56, Brian Holda via Evergreen-dev wrote:
Or maybe there's a better way to ask this: have people found a good way to export a large number of MARC records from Evergreen? We found the staff client way to do it, and it processes files of 5,000-10,000 records at a time. But if we want to export 1 million records, let's say, it's a bit tedious. So then I found the marc_export script <https://docs.evergreen-ils.org/3.2/_marc_export_exporting_bibliographic_records_into_marc_files.html>. But that crashed our server when run with 3,000 records at a time. We have ideas on how to modify the process, and the staff client way is not terrible, but I figure this must be a somewhat common task that others have good solutions for. Anyone willing to share 🙂?

Thanks,
Brian

Brian Holda
Library Technology Manager
Hekman Library
Calvin University
(616) 526-8673

<https://library.calvin.edu/>

------------------------------------------------------------------------
*From:* Evergreen-dev <[email protected]> on behalf of Brian Holda via Evergreen-dev <[email protected]>
*Sent:* Thursday, October 5, 2023 4:25 PM
*To:* Evergreen Development Discussion List <[email protected]>
*Subject:* [Evergreen-dev] marc_export - apache crashes
Hi all,

Not sure if it's user error or something else going on, so I wanted to see if any of you have experience using the marc_export script <https://docs.evergreen-ils.org/3.2/_marc_export_exporting_bibliographic_records_into_marc_files.html> and have had similar problems.

In brief:

  * Tue, 5pm - I ran the following test (this is for a file of 3,100
    records). This took about 30 sec. and successfully created the
    export file without any noticeable effects on our apache2 server:

    cat /home/opensrf/marc-test.txt | marc_export --reporter -i -c /openils/conf/opensrf_core.xml -x /openils/conf/fm_IDL.xml -f XML --timeout 5 > exported_files.xml

  * Wed, 11:40am - I ran what I thought was essentially the same test
    (for the same file of 3,100 records). This also took about 30 sec.
    and successfully created the export file. However, 8 min. later
    apache crashed and had to be restarted. In the error log, it said
    "couldn't grab the accept mutex" immediately before crashing. Here's
    the code I ran:

    cat /tmp/marc-output/marc1.txt | marc_export --reporter -i -c /openils/conf/opensrf_core.xml -x /openils/conf/fm_IDL.xml -f XML --timeout 5 > /tmp/marc-output/exported-marc1.xml

  * Wed, 4pm - I ran essentially the same command (for the same file of
    3,100 records), but without using the tmp folder. This time it
    stalled, and after waiting a few minutes we pressed Ctrl+C,
    which I assumed stopped everything cleanly, as it returned me to the
    command prompt. However, at 4:50pm apache quit again, with the same
    "couldn't grab the accept mutex" messages beforehand. Here's the
    code I ran this time:

    cat /home/opensrf/marc2.txt | marc_export --reporter -i -c /openils/conf/opensrf_core.xml \
        -x /openils/conf/fm_IDL.xml -f XML --timeout 5 > /home/opensrf/exported-marc2.xml

Anyone know what might be happening here?

Brian Holda
Library Technology Manager
Hekman Library
Calvin University
(616) 526-8673

<https://library.calvin.edu/>


_______________________________________________
Evergreen-dev mailing list
[email protected]
http://list.evergreen-ils.org/cgi-bin/mailman/listinfo/evergreen-dev