2011/6/1 Florent André <[email protected]>:
> Hi Rupert,
>
> Thanks for your valuables answers !
>
> In fact, if get it now, the meaning of indexing in entity hub is not just
> about index, but about create a new (offline) entity hub.
>
> You said :
>> The Solr Yard provides better performance especially for big Datasets.
> ...
>> The Clerezza is fine for smaller data sets.
>
> Do you have a "magic number" (a vague will be fine :) ) that define the
> limit for a big dataset ?

The SolrYard implementation should be pretty scalable (tens or hundreds of millions of entities). The ClerezzaYard will suffer from a limitation though: it won't scale to more than a couple of thousand entities as long as the following issue is not fixed:

https://issues.apache.org/jira/browse/CLEREZZA-466

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
