[Arches] Re: Adapting the Lincoln HER reference data and resource models

Adam Cox Mon, 22 Jan 2018 10:06:44 -0800

Hi David, I apologize for the late response to this. First, I just wanted 
to say that I'm glad you have found those scripts to be helpful, but they 
do have a limitation that may affect your workflow: At present, there is no 
way to create a hierarchical concept structure. As you can tell from 
looking at the CSVs, there is no way to indicate that a given concept 
should be created as a child of another concept; they are all created equal 
and added to the concept collection that is named after the CSV.


I would really like to add this capability to that script, because it's the 
kind of thing that would be a very helpful utility to have within Arches. 
For example, it could be implemented as something like:

    python manage.py make_thesaurus -s path/to/csv/directory

etc.
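
As a rough illustration of what hierarchy support might involve, here is a 
minimal sketch that reads a parent/child structure from a CSV. The "parent" 
column, the sample labels, and the function names are all assumptions made 
for illustration; the current scripts do not read such a column, and this 
is not code from fpan-data or Arches.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout: the "parent" column is an assumption for this
# sketch; an empty parent marks a top-level concept.
SAMPLE = """label,parent
Building,
House,Building
Cottage,House
Church,Building
"""


def build_tree(csv_text):
    """Return (roots, children), where children maps a label to its child labels."""
    children = defaultdict(list)
    roots = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = row["label"].strip()
        parent = (row.get("parent") or "").strip()
        if parent:
            children[parent].append(label)
        else:
            roots.append(label)
    return roots, dict(children)


def render(labels, children, depth=0):
    """Return the hierarchy as a list of lines, indented two spaces per level."""
    lines = []
    for label in labels:
        lines.append("  " * depth + label)
        lines.extend(render(children.get(label, []), children, depth + 1))
    return lines


if __name__ == "__main__":
    roots, children = build_tree(SAMPLE)
    print("\n".join(render(roots, children)))
```

A real implementation would still need to turn this structure into 
broader/narrower relationships in the thesaurus file, which is not shown 
here.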

I have added an issue on the fpan-data 
repo https://github.com/legiongis/fpan-data/issues/4 to at least record my 
thoughts on this. Feel free to chime in on there if you like.

That said, your best bet with your current endeavor would be, as Ryan 
suggests, to load DISCO, delete the resources, and then load your CSV using 
Lincoln terminology. As Ryan mentioned, you should be able to use the 
prefLabels for the concepts, instead of the UUIDs. For example, just put 
"Primary" in for name type, not the UUID for the Value that "Primary" 
represents (which is what is listed in the Assets_concepts.json file).
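
If your CSV already contains value UUIDs, a small script could swap them 
for prefLabels before import. This is only a sketch: the column name 
NAME_TYPE, the placeholder UUIDs, and the lookup dictionary are 
hypothetical, and you would need to build the real lookup yourself from 
your own concepts file.

```python
import csv
import io

# Hypothetical lookup of value UUID -> prefLabel. These UUIDs are made-up
# placeholders, not real Arches value ids.
VALUE_LABELS = {
    "00000000-0000-0000-0000-000000000001": "Primary",
    "00000000-0000-0000-0000-000000000002": "Alternative",
}


def uuids_to_labels(csv_text, column, lookup):
    """Replace UUIDs in one column with their prefLabels; unknown values are kept."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = lookup.get(row[column], row[column])
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical business data row; the column names are assumptions.
    data = (
        "ResourceID,NAME,NAME_TYPE\n"
        "1,Old Mill,00000000-0000-0000-0000-000000000001\n"
    )
    print(uuids_to_labels(data, "NAME_TYPE", VALUE_LABELS))
```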

In a broader sense, you've hit on one of the challenges of importing 
business data and new concepts into an existing package: 1) the resource 
model graphs store the UUIDs of concept collections, and 2) the values in a 
CSV must be prefLabels of valid concepts within the concept collection. 
Because of 1, resource models and thesauri are inextricably linked; if you 
remove and create a new concept collection, you must also update the node 
on the graph. Because of 2, if you are adding new concepts to an existing 
set of concept collections (which are linked to resource model graphs), you 
must anticipate the concept collection UUID. Ultimately, what your use case 
illustrates is the usefulness of being able to "swap out" existing concepts 
in an existing concept collection (whose UUID is linked to the graphs) 
without actually affecting the UUID of the concept collection. I don't have 
a great workflow or any helper scripts for that, but it is a direction I'm 
hoping to be able to go before too long.
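
To make the "swap out" idea concrete, here is a toy model of the 
relationship. It is a sketch only: the Collection and Node classes and both 
functions are hypothetical stand-ins and do not reflect how Arches stores 
or edits this data. The point it shows is that the node keeps pointing at 
the same collection UUID while the concepts inside the collection change.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Collection:
    """Hypothetical stand-in for a concept collection."""
    name: str
    collection_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    pref_labels: list = field(default_factory=list)


@dataclass
class Node:
    """Hypothetical stand-in for a graph node that stores a collection UUID."""
    name: str
    collection_id: str


def swap_concepts(collection, new_labels):
    """Replace the concepts in place; the collection UUID is left untouched."""
    collection.pref_labels = list(new_labels)
    return collection


def is_valid(node, collections, label):
    """A CSV value is valid if it is a prefLabel in the node's collection."""
    return label in collections[node.collection_id].pref_labels


if __name__ == "__main__":
    # Sample labels below are made up for illustration.
    style = Collection("Style", pref_labels=["Gothic", "Georgian"])
    collections = {style.collection_id: style}
    node = Node("Style", style.collection_id)

    swap_concepts(style, ["Vernacular", "Modern"])
    print(is_valid(node, collections, "Vernacular"))
    print(is_valid(node, collections, "Gothic"))
```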

Adam


On Wednesday, January 17, 2018 at 4:28:17 PM UTC-6, David Osborne wrote:
>
> Hi Ryan
>
> Thanks for such a detailed reply, which fills in some of the gaps in my 
> knowledge. I'll try your suggestions and report back!
>
> cheers
> David
>
> On Tuesday, 16 January 2018 01:51:56 UTC, Ryan Anderson wrote:
>>
>> Hi David,
>>
>> It sounds like you are on the right track, and that's awesome. Here's 
>> some information that might help push you along. I'm going to work my way 
>> backwards from the error you encountered.
>>
>> The error you're seeing in your concepts file appears when a concept node 
>> in a resource model or branch is not associated with a concept collection 
>> from the RDM.
>>
>> A little background on the concepts file . . . the concepts file 
>> (Assets_concepts.json) was/is meant to assist Arches users in looking up 
>> valueids of concepts from the RDM. Until recently it was necessary to use 
>> these valueids in your business data CSV files. With few exceptions, you 
>> can now use the value itself in your CSV file and Arches will figure out 
>> the valueid 'behind the scenes' on import. The concepts file lists all the 
>> concept (and concept-value-list, and domain, and domain-value-list) nodes 
>> in a resource model and all the valid values that can be used in your 
>> business data csv to populate that node.
>>
>> Back to your error . . . nodes in a resource model or branch of data type 
>> concept (or concept-value-list) are 'fed' by a collection of concepts. This 
>> collection is stored in the Collections tab of the RDM. But the Arches user 
>> must tell the node which collection will be the source of that node's 
>> values. You can assign a collection to a node in the graph manager by 
>> clicking on the node then going to the node config panel (between the 
>> resource model diagram and the node list in the graph manager) and 
>> selecting the collection in the Concept Collection dropdown and clicking 
>> save. The node will now know that values from the assigned collection are 
>> valid input values. The concept file will now also know which collection is 
>> associated with that node and in turn, the list of valid values for that 
>> node as well. 
>>
>> In your case you can create a Style and Identifier Type collection in the 
>> RDM from your imported concepts by selecting the Top Node from the Thesauri 
>> tab of the RDM then clicking Manage -> Make Collection. Once your 
>> collection is created, follow the steps above to associate the collection 
>> with the correct node in the graph manager.
>>
>>
>> If this all seems too onerous, there's no harm in loading the DISCO 
>> package, deleting out the business data and replacing it with your own. 
>> Deleting the business data with the remove_resources command should not 
>> modify either the RDM or the resource models.
>>
>>
>> Let us know how it goes.
>>
>>
>> Cheers,
>> Ryan
>>
>>
>>
>>
>> On Sunday, January 14, 2018 at 3:52:53 AM UTC-8, David Osborne wrote:
>>>
>>> I am working on setting up a new HER installation and would like to 
>>> model it, more or less, on the structures used by Lincoln in the demo data. 
>>> Some of the reference data, such as building periods and archaeological 
>>> periods, need to be supplemented, extended or replaced, so I have worked 
>>> out how to construct a thesaurus from a directory of CSV files, using the 
>>> add_uuid_to_csvs and thesaurus_from_csvs utilities from Adam's Github repo 
>>> *fpan-data* which he mentioned on the forum in September. However, when 
>>> trying to judiciously add to that only some of the Lincoln reference data 
>>> CSVs (and ignoring those from the DISCO project), I must have missed some 
>>> crucial steps: if I try to download the mapping for the Assets resource 
>>> model, my Assets_concepts.json file says that each concept such as 
>>> *Style* or *External Identifier Type* "does not appear to be configured 
>>> with a valid concept 
>>> collectionid". My problem is that I realise I don't yet fully understand 
>>> the inter-relationships between all the parts of the reference data.
>>>
>>> What would be the simplest course of action to load our relatively small 
>>> initial set of HER data (only 205 sites/buildings), to have it ready for an 
>>> important demo in just over two weeks' time? At present, I'm thinking: load 
>>> the DISCO_data package; delete all the resources from the database via the 
>>> command-line; delete all the DISCO reference data/models via the web 
>>> interface; then load our data from CSV, with appropriate fields edited to 
>>> use the uuids of the relevant concepts.
>>>
>>> Can I safely delete anything relating to the DISCO project without 
>>> affecting the HER structures?
>>>
>>> It would be nice to use our thesaurus, if it's just a simple matter of 
>>> referring to its uuids for items such as building type instead of the 
>>> nearest Lincoln equivalent, or is more work needed to make the thesaurus 
>>> usable than generating the XML file using Adam's *fpan-data* utilities 
>>> mentioned above? At a pinch, we could just use the Lincoln terms for now 
>>> and move to using our thesaurus later: it's essential to have something 
>>> working by the end of the month which resembles the Lincoln demo but 
>>> displays our data instead!
>>>
>>> Any suggestions welcome. I can upload some files I'm using, if that 
>>> would help, just let me know what would be useful to see.
>>>
>>> Thanks in advance,
>>> David
>>>
>>

