I’ve found the issue in my long-running data load process that appears to fail, 
but I don’t see an obvious way to correct it.

My process creates a temporary content database that contains the latest 
content version of content previously loaded. This temp database is then the 
source for a process that creates a set of where-used index records in another 
database that point to the nodes in the temp content database by node ID.

The node-recording elements look like this:

<noderef node-id="43617"
database="pce-test-data"
tagname="mapref"
baseuri="/pce-test-data/encryption-support.ditamap" 
href="cloud-encryption.ditamap"
/>

Note the “database” attribute: it’s the name of the database the node ID is 
from.

After the process has completed constructing all the where-used records and is 
ready to swap these new databases into production, I have an XSLT transform 
that updates the values of the @database attributes to replace the temporary 
database name with the production name (i.e., it removes the leading “_temp_” 
from the database name).
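
A minimal sketch of the logic of that rename step, written in Python purely for illustration (the real step is an XSLT transform). The <where-used> wrapper element, the second <noderef>, and the function names are assumptions made up for the example; only the “_temp_” prefix and the @database attribute come from the description above.

```python
import xml.etree.ElementTree as ET

TEMP_PREFIX = "_temp_"  # the temporary-database prefix described above


def to_production_name(database: str) -> str:
    """Strip a leading "_temp_" from a database name, if present."""
    if database.startswith(TEMP_PREFIX):
        return database[len(TEMP_PREFIX):]
    return database


def rewrite_noderefs(root: ET.Element) -> int:
    """Rewrite @database on every <noderef> under root; return how many changed."""
    changed = 0
    for ref in root.iter("noderef"):
        old = ref.get("database")
        if old is None:
            continue
        new = to_production_name(old)
        if new != old:
            ref.set("database", new)
            changed += 1
    return changed


# Hypothetical where-used document; <where-used> is an assumed wrapper name.
doc = ET.fromstring(
    '<where-used>'
    '<noderef node-id="43617" database="_temp_pce-test-data" tagname="mapref"/>'
    '<noderef node-id="43618" database="pce-test-data" tagname="topicref"/>'
    '</where-used>'
)
changed = rewrite_noderefs(doc)
```

The point of the sketch is that every <noderef> has to be visited and rewritten, which is why the cost grows with the number of links.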

I then swap the temp databases in place of the old databases, putting the new 
data into production.

This works fine at small scales, but when I attempt it with my 200K-link 
database, the XSLT transform either never completes, fails in the background, 
or would take so long to complete that it would be impractical. In any case, 
this approach does not work for my full-scale case ☹

So my question is: How can I avoid this need to update my node reference 
elements to reflect the new database name?

One solution that comes to mind is simply not recording the database name on 
the <noderef> element but somewhere else, say in the root element of the 
document that contains the <noderef>. That requires that all the <noderef> 
elements in that context target the same database, which will be true in this 
case but might not be true in the future (I had designed <noderef> to enable 
mixing references to nodes in different databases).
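
One possible variation on this idea, sketched in Python for illustration only: keep a default database name on the root element and let an individual <noderef> override it with its own @database. The <where-used> root, the default-database attribute, and the "other-db" reference are all hypothetical names invented for the example, not part of the existing design.

```python
import xml.etree.ElementTree as ET


def effective_database(ref: ET.Element, root: ET.Element):
    """Return the noderef's own @database, else the root's default (assumed attribute)."""
    own = ref.get("database")
    return own if own is not None else root.get("default-database")


# Hypothetical document: one reference uses the default, one overrides it.
doc = ET.fromstring(
    '<where-used default-database="_temp_pce-test-data">'
    '<noderef node-id="43617" tagname="mapref"/>'
    '<noderef node-id="50001" database="other-db" tagname="xref"/>'
    '</where-used>'
)
refs = list(doc.iter("noderef"))
before = [effective_database(r, doc) for r in refs]

# The swap to production touches one attribute on the root, not every noderef.
doc.set("default-database", "pce-test-data")
after = [effective_database(r, doc) for r in refs]
```

This would still be an update, but of one attribute per document rather than one per reference, and it would keep the option of mixing references to different databases.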

I could also have the code that’s creating these where-used records manage the 
prod-to-temp database name dynamically (and that may be my best solution the 
more I think about it), but that starts to look like magic, and I try to avoid 
magic code.
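
A rough sketch of what that dynamic mapping might look like, in Python for illustration only. It assumes the records always store the production name and that the lookup code is handed an explicit alias table while the load is running; resolve_database, temp_aliases, and the table itself are hypothetical.

```python
TEMP_PREFIX = "_temp_"  # the temporary-database prefix described above


def resolve_database(recorded_name: str, aliases: dict) -> str:
    """Map a recorded (production) name to the database to actually open."""
    return aliases.get(recorded_name, recorded_name)


def temp_aliases(production_names) -> dict:
    """Build the alias table used while the temp databases are still in place."""
    return {name: TEMP_PREFIX + name for name in production_names}


# During the load, references recorded against "pce-test-data" resolve to the temp copy.
load_aliases = temp_aliases(["pce-test-data"])
during_load = resolve_database("pce-test-data", load_aliases)

# After the swap, an empty table means the recorded name is used as-is.
after_swap = resolve_database("pce-test-data", {})
```

Passing the table explicitly, rather than inferring the prefix inside the lookup code, might make the mapping visible enough that it no longer feels like magic, and nothing stored would need rewriting at swap time.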

So a solution that is less fragile would be ideal.

Changing the value requires an update of some sort, and whether it’s via XSLT 
or XQuery Update, it’s going to be problematic at this scale.

Is there any solution I’ve overlooked?

Thanks,

Eliot
_____________________________________________
Eliot Kimber
Sr. Staff Content Engineer
O: 512 554 9368
