#689: BibAuthorID: improve indexing gateway
-------------------------+--------------------
Reporter: simko | Owner: scarli
Type: defect | Status: new
Priority: blocker | Milestone: v1.0
Component: BibAuthorID | Version:
Keywords: |
-------------------------+--------------------
Besides name tokens, the author index also indexes canonical author IDs
(e.g. `G.Aad.1`). BibIndex calls the BibAuthorID gateway, which returns
the canonical author IDs to be indexed for a given record ID. However,
the whole process is driven from the indexing side, i.e. it starts from
the list of records whose metadata have been modified since the last
run.
This creates a problem when, for example, a cataloguer changes a
canonical author ID from `G.Aad.1` to, say, `G.X.Aad.3`: the record
metadata are untouched, so the indexer does not learn about the change
and does not know that the corresponding records must be re-indexed.
We therefore need a new BibAuthorID gateway function that BibIndex would
call with a timestamp parameter (the timestamp of the last indexing run)
and that would return the list of record IDs that must be re-indexed as
a result of paper claiming activity between the given timestamp and the
current time. The BibAuthorID tables hold all the necessary information,
so BibAuthorID can answer such a request.
Technically, in addition to `get_persons_from_recids()`, we would need a
new function `get_recids_affected_since(yyyymmddhhmmss)`, or at least
`get_persons_affected_since(yyyymmddhhmmss)`; BibIndex would then do the
adding/cleaning job. The function should return not only records whose
canonical IDs were modified, but also records whose IDs were deleted, etc.
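
A minimal sketch of what `get_recids_affected_since()` could look like.
Everything beyond the function name and its `yyyymmddhhmmss` argument is
an assumption made for illustration: the `claim_log` table, its columns,
the extra `conn` parameter and the `make_demo_db()` helper are
hypothetical and do not reflect the real BibAuthorID schema.

```python
import sqlite3
from datetime import datetime


def make_demo_db():
    """Build an in-memory stand-in for a claiming log (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE claim_log ("
        "recid INTEGER, canonical_id TEXT, action TEXT, changed_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO claim_log VALUES (?, ?, ?, ?)",
        [
            (100, "G.Aad.1", "assigned", "2011-05-01 10:00:00"),
            (101, "G.X.Aad.3", "modified", "2011-06-02 09:30:00"),
            (103, "J.Doe.2", "deleted", "2011-06-03 12:00:00"),
            (101, "G.X.Aad.3", "modified", "2011-06-04 08:00:00"),
        ],
    )
    return conn


def get_recids_affected_since(conn, yyyymmddhhmmss):
    """Return the sorted, de-duplicated record IDs touched by any claiming
    activity (modifications and deletions alike) at or after the timestamp."""
    # Convert 'yyyymmddhhmmss' into the 'YYYY-MM-DD HH:MM:SS' form stored
    # in the demo table, so that plain string comparison orders correctly.
    since = datetime.strptime(yyyymmddhhmmss, "%Y%m%d%H%M%S").strftime(
        "%Y-%m-%d %H:%M:%S"
    )
    rows = conn.execute(
        "SELECT DISTINCT recid FROM claim_log "
        "WHERE changed_at >= ? ORDER BY recid",
        (since,),
    ).fetchall()
    return [row[0] for row in rows]
```

BibIndex could then merge the returned record IDs with its usual list
of records modified since the last run before re-indexing the author
index.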
(An alternative would be to have the BibAuthorID pages force re-indexing
of certain records live, as the claiming happens, by calling bibindex with
`-w author -a -i 123456` arguments; but the former approach, changing
what `last-modified-since` means in the case of author indexes, would be
more robust.)
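
For illustration only, the live alternative would amount to building
the argument list quoted above for each claimed record. The helper name
`bibindex_args_for_recid()` is hypothetical; only the flags are taken
from this ticket, and how the command would actually be launched is
left open.

```python
def bibindex_args_for_recid(recid):
    """Build the bibindex argument list from the ticket for one record ID."""
    # int() rejects anything that is not a plain record number.
    return ["bibindex", "-w", "author", "-a", "-i", str(int(recid))]
```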
--
Ticket URL: <http://invenio-software.org/ticket/689>
Invenio <http://invenio-software.org>