Hi,

For me, there are two kinds of indexes: the property/value indexes and the fulltext index.

The property/value indexes cover property values, node names, paths, node references, and so on. Such indexes (or "indices") are relatively small and fast. In relational databases, these are the secondary indexes (non-primary-key indexes). Updates to these indexes should be done synchronously as part of the transaction (maybe even in the transient space). Currently, we use Apache Lucene for this, but I wouldn't. I would keep those indexes within the repository.

The fulltext index is (potentially) slow, especially text extraction. Therefore, fulltext indexing should be done asynchronously if it takes too long. Also, in a clustered environment, at least text extraction should be done on only one cluster node. I would still use Apache Tika and Apache Lucene for this.

Regards,
Thomas
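
P.S. To illustrate the split between the two kinds of indexes, here is a minimal sketch. It is not the existing implementation: the class IndexSketch and its methods are hypothetical names, plain in-memory maps stand in for the in-repository index and for Lucene, and a simple word split stands in for Tika text extraction. The property/value index is updated inside commit() before it returns; the fulltext index is updated later by a single background worker.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch, not the repository's actual code.
public class IndexSketch {

    // Property/value index: "name=value" -> node paths. Small and fast,
    // so it is updated synchronously as part of the commit.
    private final Map<String, Set<String>> propertyIndex = new TreeMap<>();

    // Fulltext index: term -> node paths. Filled by the background worker.
    private final Map<String, Set<String>> fulltextIndex = new ConcurrentHashMap<>();

    // A single worker thread, so extraction happens in one place only.
    private final ExecutorService fulltextWorker = Executors.newSingleThreadExecutor();

    // Stores a node. The property index is up to date when this returns;
    // the returned future completes once the fulltext index has caught up.
    public synchronized CompletableFuture<Void> commit(
            String path, Map<String, String> properties, String text) {
        for (Map.Entry<String, String> e : properties.entrySet()) {
            String key = e.getKey() + "=" + e.getValue();
            propertyIndex.computeIfAbsent(key, k -> new TreeSet<>()).add(path);
        }
        return CompletableFuture.runAsync(() -> {
            // Stand-in for (potentially slow) text extraction.
            for (String term : text.toLowerCase().split("\\W+")) {
                if (!term.isEmpty()) {
                    fulltextIndex
                        .computeIfAbsent(term, k -> ConcurrentHashMap.newKeySet())
                        .add(path);
                }
            }
        }, fulltextWorker);
    }

    // Returns a copy of the paths whose property has the given value.
    public synchronized Set<String> findByProperty(String name, String value) {
        Set<String> hits = propertyIndex.get(name + "=" + value);
        return hits == null ? new TreeSet<>() : new TreeSet<>(hits);
    }

    // Returns a copy of the paths whose text contains the given term.
    public Set<String> findByText(String term) {
        Set<String> hits = fulltextIndex.get(term.toLowerCase());
        return hits == null ? new TreeSet<>() : new TreeSet<>(hits);
    }

    public void shutdown() {
        fulltextWorker.shutdown();
    }
}
```

With this shape, a query on a property sees the new node as soon as commit() returns, while a fulltext query may lag until the returned future completes. In a cluster, the worker would run on one node only.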
