Re: weak documents

Thomas Scheffler Wed, 27 Nov 2013 01:45:06 -0800

On 27.11.2013 09:58, Paul Libbrecht wrote:

Thomas,


our experience with Curriki.org is that evaluating what I call the
"related documents" is a procedure that needs access to the complete
content and thus is run at the DB level and not at the Solr level.

For example, if a user changes part of his name, we need to reindex
all of his resources. Sure we could try to run a solr query for this,
and maybe add index fields for it, but we felt it better to run this
on the index-trigger side, the thing in our (XWiki) wiki which
listens to changes and requests the reindexing of a few documents
(including deletions).

For the maintenance operation, the same issue has appeared. So, if
the indexer or listener or solr has been down for a few minutes or
hours, we'd need to reindex not only all changed documents but all
changed documents and their related documents.

If you are able to work out a solution that is Solr-only, writing
down all depends-on relations at index time, it means you would
index-update all "inverse related" documents every time one of them
changes. For the relation above (documents of a user), it means the
user document needs reindexing every time a new document is added. I
wonder if this makes a difference at scale.

I think both use-cases differ a bit. At index time of my master document I have all information of the dependent documents ready. So instead of committing one document I commit, let's say, four.


In your case you have to query to get all documents of a user first.

Here is a more detailed use-case. I have metadata in 1 to n languages to describe a document (e.g. a journal article).

I commit a master document in a specified default language to SOLR and one document for every language I have metadata for. If a user adds or removes metadata (e.g. an abstract in French) there is one document more or one document less in SOLR. So their number changes and I do not want stale data to be kept in the index.
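The document set described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name build_docs, the "lang" field and the "<master id>_<language>" id scheme are hypothetical, and only the dependsOn field name is taken from this thread. Nothing here talks to SOLR; it just builds the list of documents that would be committed together.

```python
def build_docs(master_id, default_lang, metadata_by_lang):
    """Return the master document followed by one weak document per language.

    All field names except "dependsOn" are hypothetical.
    """
    # The master document carries the metadata of the default language.
    master = {"id": master_id, "lang": default_lang}
    master.update(metadata_by_lang[default_lang])

    docs = [master]
    for lang in sorted(metadata_by_lang):
        # Each weak document points back to its master via dependsOn.
        weak = {"id": f"{master_id}_{lang}", "dependsOn": master_id, "lang": lang}
        weak.update(metadata_by_lang[lang])
        docs.append(weak)
    return docs


metadata = {
    "en": {"title": "A title"},
    "fr": {"title": "Un titre"},
    "de": {"title": "Ein Titel"},
}
docs_before = build_docs("article1", "en", metadata)

# The French metadata is removed: the next commit has one document less.
del metadata["fr"]
docs_after = build_docs("article1", "en", metadata)
```

With three languages the first call yields four documents and the second one three. The id article1_fr no longer appears in the second set, and that leftover is exactly the document that would stay behind in the index if adds only replaced by id.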

A similar use case: I have article documents with authors. I create "author" documents for every article. If someone adds or removes an author I need to track that change. These "dump" author documents are used for an alphabetical person index and hold a unique field that is used to group them, but these documents exist only as long as their master documents do.

My two use-cases are quite similar, so I would like to have this "weak" documents functionality somehow.

SOLR knows that if a document is added with id=foo, it has to replace the document that matches id:"foo". If I could change this behavior to dependsOn:"foo", I would be done. :-D
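To make the wished-for behavior concrete, here is a toy in-memory model in Python. It is not SOLR and not a SOLR API: the class ToyIndex and its methods are hypothetical and only mimic the two replacement rules, by id (what SOLR does today) and by dependsOn (what I am asking for). Against a real SOLR the second rule would presumably have to be emulated with a delete-by-query on dependsOn followed by the adds.

```python
class ToyIndex:
    """Hypothetical stand-in for an index; keeps documents in a dict by id."""

    def __init__(self):
        self.docs = {}

    def add(self, doc):
        # Current behavior: a document with the same id replaces the old one.
        self.docs[doc["id"]] = doc

    def add_group(self, master, weak_docs):
        # Wished-for behavior: drop every document with dependsOn:<master id>
        # first, then add the master and its current weak documents.
        stale = [doc_id for doc_id, doc in self.docs.items()
                 if doc.get("dependsOn") == master["id"]]
        for doc_id in stale:
            del self.docs[doc_id]
        self.add(master)
        for doc in weak_docs:
            self.add(doc)


index = ToyIndex()
master = {"id": "foo"}
english = {"id": "foo_en", "dependsOn": "foo"}
french = {"id": "foo_fr", "dependsOn": "foo"}

index.add_group(master, [english, french])
# The French metadata was removed, so only the English weak document is sent.
index.add_group(master, [english])
remaining_ids = sorted(index.docs)
```

After the second call only foo and foo_en remain. With plain add() calls foo_fr would have stayed in the index as stale data.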


regards

Thomas
