Hi Mark,

Indeed; to your point of having the data available, we publish our HTML finding aids from EAD, so we have a cache of all EADs that is updated nightly with any deltas. So it's an easy and up-to-date data source for me to turn to.
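
For anyone who would like to try this kind of search without an XML database, here is a minimal sketch in Python of scanning a directory of cached EAD files for unittitles that contain a given string. It is not the query used for our own review: the directory layout, the function names (`find_unittitles`, `scan_cache`), and the plain substring match are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def local_name(tag):
    """Strip any XML namespace prefix, e.g. '{urn:...}unittitle' -> 'unittitle'."""
    return tag.rsplit("}", 1)[-1]


def find_unittitles(ead_xml, needle="Mrs"):
    """Return the text of every <unittitle> in one EAD document that contains `needle`.

    `ead_xml` may be a str or bytes holding the whole EAD file.
    """
    root = ET.fromstring(ead_xml)
    hits = []
    for elem in root.iter():
        if isinstance(elem.tag, str) and local_name(elem.tag) == "unittitle":
            # itertext() also collects text nested in child tags such as <persname>
            text = " ".join("".join(elem.itertext()).split())
            if needle in text:
                hits.append(text)
    return hits


def scan_cache(cache_dir, needle="Mrs"):
    """Map each EAD file name in `cache_dir` (hypothetical layout: *.xml) to its hits."""
    results = {}
    for path in sorted(Path(cache_dir).glob("*.xml")):
        hits = find_unittitles(path.read_bytes(), needle)
        if hits:
            results[path.name] = hits
    return results
```

A substring match like this will also return titles that need no remediation, so the output is only a list for human review.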

We turned up over 6,000 unittitles with the string 'Mrs', although some of them will need no remediation (e.g., Mrs Doubtfire -- a unittitle from a movie reviewer's collection.)

Kevin

On Tue, May 19, 2020 at 4:08 PM Custer, Mark <mark.cus...@yale.edu> wrote:

> Kevin,
>
> I love that approach! Much cleaner to write those queries against the
> EAD/EAC exports than having to do all of the table joins directly on the
> database. Of course, it does require having all of that data exported from
> ArchivesSpace, but that’s already a good goal to have to share your
> description with services like ArchiveGrid, etc. 😊
>
> To partially answer your question Lara, the database query that I shared
> previously will export bioghist notes that are attached to agent records,
> but it keeps those notes in the format of the ArchivesSpace JSON records.
> Good to have in the reports to eyeball, at least, but not something that
> you’d want to edit directly. Here’s an example, which gets stuffed into a
> single cell in our report:
>
> {
>   "jsonmodel_type": "note_bioghist",
>   "subnotes": [
>     {
>       "jsonmodel_type": "note_text",
>       "content": "English stage actor.",
>       "publish": true,
>       "subnote_guid": "7e7523151fa57827315ceb1772346741"
>     }
>   ],
>   "persistent_id": "497d3dd4284731aa1331eac06928d10a"
> }
>
> Of course, you could further process that output, or grab the contents of
> the note directly with an API query, etc. But as Blake mentioned, I don’t
> think that any of the built-in reports include those notes right now since
> it does require extra processing to make the notes human readable. But
> ASpace already includes a lot of code to do just that… I just don’t know
> if any of the current reports tap into that. The previous reporting system
> in ArchivesSpace, which used Jasper Reports, had an option to easily get
> those notes out and add them to the reports (e.g.
> that big blob above would
> become “English stage actor”), but even that wouldn’t take any EAD tags
> that might be present and convert those to something else.
>
> Kevin, your sample reports also include searching across unit titles in a
> finding aid, which is another great use case. I’m afraid to see how many
> we have of those, but we should definitely look. I know we’ve got quite a
> few, like
> https://archives.yale.edu/repositories/11/archival_objects/526696, which
> I’ve been meaning to update for quite some time. In that case, though, my
> thought was just to add a new agent heading for
> http://id.loc.gov/authorities/names/no93005770, which I hope would make
> it clear what’s going on / not also require updating folder labels and the
> like… that said, I’d be curious if folks are doing that, too.
>
> Mark
>
> *From:* archivesspace_users_group-boun...@lyralists.lyrasis.org [mailto:
> archivesspace_users_group-boun...@lyralists.lyrasis.org] *On Behalf Of *Kevin
> W. Schlottmann
> *Sent:* Tuesday, 19 May, 2020 3:10 PM
> *To:* Archivesspace Users Group <
> archivesspace_users_group@lyralists.lyrasis.org>
> *Subject:* Re: [Archivesspace_Users_Group] Mrs. Husband's Name / updates
> in our ArchivesSpace database
>
> Hi all,
>
> As an alternative to direct database access to run these sorts of queries,
> one could download the data as EAD, load it into a local XML database such
> as BaseX, and run queries using XQuery there. (I wrote a couple of quick
> queries and posted them here:
> https://gist.github.com/kschlottmann/4c7a3125780c18cd175a29c9ba237928) I
> put the results into a Google sheet for distributed review by our team. If
> we have suggested updates to the free-text description (scope notes and
> unit titles), we'll note that in the Google sheet and then run those back
> in with the API.
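
The note_bioghist JSON quoted earlier in this thread can be reduced to its readable text with a few lines of Python. This is a minimal sketch under stated assumptions, not how any ArchivesSpace report does it: `bioghist_text` is a hypothetical helper, it assumes each report cell holds exactly one note object shaped like the quoted example, and it leaves any EAD tags inside the content untouched.

```python
import json


def bioghist_text(note_json):
    """Join the text subnotes of one note_bioghist JSON string into plain text."""
    note = json.loads(note_json)
    parts = [
        sub.get("content", "").strip()
        for sub in note.get("subnotes", [])
        # Only note_text subnotes carry free text; other subnote types are skipped
        if sub.get("jsonmodel_type") == "note_text"
    ]
    # Separate multiple text subnotes with a blank line
    return "\n\n".join(p for p in parts if p)


# The example record quoted in the thread, as it would sit in one report cell
example = """
{
  "jsonmodel_type": "note_bioghist",
  "subnotes": [
    {
      "jsonmodel_type": "note_text",
      "content": "English stage actor.",
      "publish": true,
      "subnote_guid": "7e7523151fa57827315ceb1772346741"
    }
  ],
  "persistent_id": "497d3dd4284731aa1331eac06928d10a"
}
"""

print(bioghist_text(example))
```

Applied to a whole column of a spreadsheet export, the same function would give reviewers the note text instead of the raw JSON.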
>
> This approach obviously has technical preconditions as well, including an API
> or OAI script that will download the EADs en masse and a script that will
> write the updates back in, but it can be done in a hosted environment where
> one doesn't have database access.
>
> Kevin
>
> On Tue, May 19, 2020 at 10:27 AM Karen Miller <k-mill...@northwestern.edu>
> wrote:
>
> Good morning.
>
> At Northwestern University, our ArchivesSpace is hosted by Atlas, who has
> given us ODBC read-only access to the MySQL database. Just yesterday I ran
> a report that harvests BiogHist notes from Agents for a cleanup project. My
> SQL is pretty hacky too (although Mark’s is a good bit more neat looking
> than mine!), but I could post it on GitHub if it’s of interest.
>
> Karen
>
> *Karen D. Miller*
> Monographic Cataloger/Metadata Specialist
> Northwestern University Libraries
> Northwestern University
> 1970 Campus Drive
> Evanston, IL 60208
> www.library.northwestern.edu
> k-mill...@northwestern.edu
> 874.467.3462
>
> *From:* archivesspace_users_group-boun...@lyralists.lyrasis.org <
> archivesspace_users_group-boun...@lyralists.lyrasis.org> *On Behalf Of *Lara
> Friedman-Shedlov
> *Sent:* Monday, May 18, 2020 4:07 PM
> *To:* Archivesspace Users Group <
> archivesspace_users_group@lyralists.lyrasis.org>
> *Subject:* Re: [Archivesspace_Users_Group] Mrs. Husband's Name / updates
> in our ArchivesSpace database
>
> Thanks for sharing this Mark. At the University of Minnesota, we are in
> the early stages of doing some agent record clean-up as well, so this
> information is very useful.
>
> For one project we were looking at doing with our agents, we wanted to
> create a report of all of our agent records that would also include the
> biographical/historical notes from any linked resource records (or at
> least the URLs to any associated resource records, so we could easily find
> the relevant bio/hist notes).
> Our instance of ArchivesSpace is hosted by
> Lyrasis so we contacted them to ask about it and were told this is "not
> possible." I found that surprising and wonder if anyone else has done
> something like this and if so, how.
>
> / Lara Friedman-Shedlov
>
> On Mon, May 18, 2020 at 4:00 PM Custer, Mark <mark.cus...@yale.edu> wrote:
>
> All,
>
> Is anyone else working on Agent-related cleanup projects in ArchivesSpace
> right now? We’ve got a couple of those going on at Yale, and I wanted to
> mention one of them on this listserv since I said that I would last night
> on Twitter 😊. My reasoning was that it would be better to share widely,
> even early on, in the likely event that others were working on similar
> projects, and in hopes that it might save time for anyone else looking to
> get started with such a project.
>
> Anyhow, just for the sake of sharing, here’s a really hacky SQL database
> query that you can use in ArchivesSpace to get a list of agents that have
> any name forms that include “Mrs.” or “Miss”:
> https://gist.github.com/fordmadox/d78656fceb04b62000b662a3f2464488
>
> A few caveats:
>
> - I do **not** know SQL very well, so I know that this could be
>   improved dramatically, but it gets data out of ArchivesSpace.
> - The query casts a rather wide net, since it looks for Mrs. or Miss
>   in any of the name forms, but it could be altered to just look for those
>   two terms in the “sort_name” only if desired.
> - The result of the query has at least one Yale-specific field in
>   there, since we store our local ILS bibliographic IDs in ASpace’s “user
>   defined string 2” field. You can ignore that, or add something else, etc.,
>   but the gist is that this query should work in any ASpace instance. It
>   should return one row per agent, with multiple name forms, and a bit more
>   information like which Resources, Archival Objects, and/or Accessions the
>   agent is linked to.
> - And last, it just searches for variations of “Mrs.” and “Miss”,
>   which works for our dataset, but you could modify the HAVING clause at the
>   end of the query to search for other honorific terms, if needed.
>
> Most importantly, though, getting a dataset to review and act on is just
> the first step. The hard work comes next! Jessica Tai, Alison Clemens,
> and Karen Spicher are spearheading this project at Yale. If anyone has
> specific questions about the project, I’d encourage you to reach out
> directly to them.
>
> For now, though, here’s one interesting example from the project: we’ve
> got an agent record now in ArchivesSpace for “Brady, John G., Mrs.”,
> https://archives.yale.edu/agents/people/77599. Originally this would
> just have been text in the finding aid, later a link in a catalog record,
> and now it’s a full-blown, standalone record for a person in ArchivesSpace.
> But nowhere in that agent record is her given name, Elizabeth, although her
> name is thankfully listed multiple times in the finding aid (check out the
> finding aid author! 😊), in the Wikipedia entry for John Green Brady, and
> elsewhere. So, that agent record will eventually be one of a few hundred
> local records that we update in ArchivesSpace during the course of this
> project. And, one of the things that I like about ArchivesSpace --which
> helps to make sure that this project is possible without too many
> workarounds-- is that even if an agent record has a corresponding
> authority record in, say, the Library of Congress name authority file (e.g.
> http://id.loc.gov/authorities/names/n2008076910), you can still choose to
> use a different name variant for that agent’s display name, rather than an
> authorized heading.
>
> Anyhow, I hope everyone is doing well!
>
> Mark
>
> _______________________________________________
> Archivesspace_Users_Group mailing list
> Archivesspace_Users_Group@lyralists.lyrasis.org
> http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group
>
> --
> Lara D. Friedman-Shedlov (she, her, hers)
> Description and Access Archivist | Kautz Family YMCA Archives |
> www.lib.umn.edu/ymca
> Digital Records Archivist | Archives & Special Collections |
> www.lib.umn.edu/special
> University of Minnesota Libraries | lib.umn.edu | 612.626.7972
>
> --
> Kevin Schlottmann
> Head of Archives Processing
> Rare Book & Manuscript Library
> Butler Library, Room 801
> Columbia University
> 535 W. 114th St., New York, NY 10027
> (212) 854-8483

--
Kevin Schlottmann
Head of Archives Processing
Rare Book & Manuscript Library
Butler Library, Room 801
Columbia University
535 W. 114th St., New York, NY 10027
(212) 854-8483