I think I've got it working by adding a small loop that checks the indexer
state via the indexer's response body (i.e., whether it contains the string
"paused") and then sends a PUT with the duration parameter set to either pause
(you can specify an exact duration or let it default to 900 seconds) or resume
(duration = 0). Things to note:

1) For some reason, the indexer actually listens at 
AppConfig[:indexer_url]/aspace-indexer/ not just AppConfig[:indexer_url]
2) There's no nice ASHTTP wrapper for put, so you have to construct the 
Net::HTTP for the put yourself
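
For anyone wanting to try the same thing, here is a minimal sketch of the two
notes above, not the exact code from my job. The helper names (indexer_uri,
indexer_paused?, indexer_state, set_indexer_pause) are made up for
illustration, and base_url stands in for AppConfig[:indexer_url]. It assumes
the duration is picked up from the query string.

```ruby
require "net/http"
require "uri"

# Build the indexer endpoint, which lives under /aspace-indexer/ rather than
# at the root of the indexer URL. `base_url` stands in for
# AppConfig[:indexer_url]; all helper names here are hypothetical.
def indexer_uri(base_url, duration = nil)
  uri = URI.join(base_url.chomp("/") + "/", "aspace-indexer/")
  uri.query = URI.encode_www_form(duration: duration) unless duration.nil?
  uri
end

# The GET response body contains the word "paused" while the indexer is paused.
def indexer_paused?(body)
  body.to_s.include?("paused")
end

# Fetch the current status text from the indexer.
def indexer_state(base_url)
  Net::HTTP.get(indexer_uri(base_url))
end

# Send the PUT by hand. A nil duration lets the indexer fall back to its
# 900 second default; duration = 0 resumes indexing.
def set_indexer_pause(base_url, duration = nil)
  uri = indexer_uri(base_url, duration)
  request = Net::HTTP::Put.new(uri)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request).body
  end
end
```

With those in place, something like
set_indexer_pause(AppConfig[:indexer_url], 3600) unless
indexer_paused?(indexer_state(AppConfig[:indexer_url])) is roughly what the
loop does on each pass.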

I don't see why similar logic couldn't be incorporated into any import job so 
that the import has a chance to finish up before the indexer runs again, 
preventing the sync issues in ANW-902.
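
As a rough illustration of how that might look, here is a sketch of a wrapper
an import job could use. The name with_indexer_paused is hypothetical, and
set_pause is assumed to be any callable that takes a duration in seconds and
sends the corresponding PUT to the indexer.

```ruby
# Hypothetical wrapper: pause the indexer, run the job body, then resume.
# `set_pause` is any callable that accepts a duration in seconds and issues
# the PUT to the indexer; it is an assumption, not an ArchivesSpace API.
def with_indexer_paused(set_pause, duration: 900)
  set_pause.call(duration) # pause before the bulk work starts
  yield                    # run the import or update work
ensure
  set_pause.call(0)        # duration = 0 resumes, even if the job raised
end
```

A job that can outlast the pause duration would still need to re-check the
state and re-send the pause periodically, which is what the loop does.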

I've got about 280k objects to check and update, so I'll see if I run into any 
indexer issues once the job is completed. The only thing I've seen that may be 
related to that is a snapshot failure when doing a large index run (full or 
otherwise), but I don't think I've ever seen it completely fail due to a commit 
timeout. That sounds more like a disk access or network problem (if you're
running a separate Solr instance).

Joshua

________________________________
From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
<archivesspace_users_group-boun...@lyralists.lyrasis.org> on behalf of Andrew 
Morrison <andrew.morri...@bodleian.ox.ac.uk>
Sent: Wednesday, February 19, 2020 8:32 AM
To: Archivesspace Users Group <archivesspace_users_group@lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Method to Pause Indexer during Job Run?

I'd be interested in hearing if you get this to work, because it could be 
useful in fixing this issue:

https://archivesspace.atlassian.net/browse/ANW-902

Also, if you're making a truly mammoth update, which will be followed by a 
re-index of nearly everything, you might want to consider increasing the 
AppConfig[:indexer_solr_timeout_seconds] config setting. It may be our 
infrastructure, but I've found that Solr's commit phase can take so long that 
ArchivesSpace times out before it finishes, causing it to start the whole 
re-index again from scratch. We've set it to 1800 to avoid this, but YMMV.

Andrew.

________________________________
From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
<archivesspace_users_group-boun...@lyralists.lyrasis.org> on behalf of Joshua 
D. Shaw <joshua.d.s...@dartmouth.edu>
Sent: 19 February 2020 13:05
To: Archivesspace Users Group <archivesspace_users_group@lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Method to Pause Indexer during Job Run?

Thanks, James. I glanced at that, but somehow didn't realize those were 
endpoints I could hit. I'll give it a go!

Joshua

________________________________
From: archivesspace_users_group-boun...@lyralists.lyrasis.org 
<archivesspace_users_group-boun...@lyralists.lyrasis.org> on behalf of James 
Bullen <ja...@hudmol.com>
Sent: Tuesday, February 18, 2020 7:16 PM
To: Archivesspace Users Group <archivesspace_users_group@lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Method to Pause Indexer during Job Run?


Hi Joshua,

I haven’t used it, but I see these endpoints in indexer/app/main.rb

  get "/" do
    if IndexerCommon.paused?
      "Indexers paused until #{IndexerCommon.class_variable_get(:@@paused_until)}"
    else
      "Running every #{AppConfig[:solr_indexing_frequency_seconds].to_i} seconds. "
    end
  end

  # this pauses the indexer so that bulk update and migrations can happen
  # without bogging down the server
  put "/" do
    duration = params[:duration].nil? ? 900 : params[:duration].to_i
    IndexerCommon.pause duration
    "#{IndexerCommon.class_variable_get(:@@paused_until)}"
  end


Seems to do what you want.


Cheers,
James


On Feb 19, 2020, at 6:29 AM, Joshua D. Shaw <joshua.d.s...@dartmouth.edu> 
wrote:

Hey all-

I'm writing a job that may take a *long* time (hours) to complete and that 
will update a *lot* of AO records. I'm wondering if there's a way to pause the 
Indexer during a job so that I can let the Indexer do its thing *after* the job 
completes. I know I can toggle the AppConfig value for the indexer and do a 
stop/start for the app, but ideally I'd like to do the pause/resume of the 
Indexer while the job runs.

I could also set this up as a migration, but the updates include a bunch of 
tables (I'm adding an instance to AOs which meet certain criteria) and I'd 
prefer to use the API to do things to be safe.

Any thoughts on pausing the Indexer during a job, or do I bite the bullet and 
do this as a migration?

Thanks!
Joshua

___________________
Joshua Shaw (he, him)
Technology Coordinator
Rauner Special Collections Library & Digital Library Technologies Group
Dartmouth College
603.646.0405
_______________________________________________
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group@lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group