Re: Inserting many documents and update relations

2012-11-20 Thread Mikhail Khludnev
Hello,
I suggest joining the docs externally, e.g. in a tiny RDBMS: just put the ids
there and keep the content in files. Then DIH should, I believe (and only
believe), be able to build the full document representation with the joined
entities.
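To make the external join concrete, here is a minimal sketch in Python with
sqlite3 standing in for the tiny RDBMS. It is an illustration only, not a
tested import pipeline: the table name relation and the field names
related_ids and backlink_ids are made up for this example, and nothing here
talks to Solr. The point is that all relations are collected first, so each
document is built once with both its forward links and its backlinks, and no
existing document has to be loaded, deleted and re-added.

```python
import sqlite3


def build_joined_docs(relations, doc_ids):
    """Return one dict per document with forward links and backlinks resolved.

    relations: iterable of (source_id, target_id) pairs taken from the XML files.
    doc_ids:   ids of all documents that should be emitted, in output order.
    """
    # ":memory:" keeps the sketch self-contained; a real run would probably
    # use a database file so the ids survive between runs.
    db = sqlite3.connect(":memory:")
    # Hypothetical schema: only ids are stored, the content stays in the files.
    db.execute("CREATE TABLE relation (src TEXT NOT NULL, dst TEXT NOT NULL)")
    db.executemany("INSERT INTO relation (src, dst) VALUES (?, ?)", relations)

    docs = {
        doc_id: {"id": doc_id, "related_ids": [], "backlink_ids": []}
        for doc_id in doc_ids
    }
    # A single scan over the relation table fills both directions.
    for src, dst in db.execute("SELECT src, dst FROM relation ORDER BY src, dst"):
        if src in docs:
            docs[src]["related_ids"].append(dst)
        if dst in docs:
            docs[dst]["backlink_ids"].append(src)
    db.close()
    return [docs[doc_id] for doc_id in doc_ids]


joined = build_joined_docs([("a", "b"), ("a", "c"), ("c", "b")], ["a", "b", "c"])
for doc in joined:
    print(doc)
```

Each resulting dict could then be turned into one document and sent to Solr
exactly once, whether through DIH or through the existing SolrJ code.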

As an alternative, you can index the documents as they are, with id references
between them, in a separate Solr core, and then index the joined docs into
another core with DIH's SolrEntityProcessor, querying the first core with
http://wiki.apache.org/solr/Join .
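For reference, a join query against the first core could look roughly like
the sketch below. The {!join from=... to=...} local-params syntax is the one
described on the wiki page above; the core URL, the core name refs and the
field name related_ids are assumptions made up for the example. As far as I
know the join parser shipped with Solr 4.0, so on 3.6.1 this would probably
need an upgrade first.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def join_query_url(core_url, from_field, to_field, inner_query, rows=10):
    """Build a /select URL that uses Solr's join query parser.

    The inner query selects the starting documents; Solr collects their
    from_field values and returns the documents whose to_field matches.
    """
    local_params = "{!join from=%s to=%s}" % (from_field, to_field)
    params = {"q": local_params + inner_query, "rows": rows, "wt": "json"}
    return core_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical core URL and field names, for illustration only.
core = "http://localhost:8983/solr/refs"

# Documents that doc-1 points to (follow related_ids to the matching id).
forward_url = join_query_url(core, "related_ids", "id", "id:doc-1")

# Documents that point to doc-2, i.e. its backlinks.
backlink_url = join_query_url(core, "id", "related_ids", "id:doc-2")

print(forward_url)
print(backlink_url)
```

A URL like this is the kind of query SolrEntityProcessor would run against
the first core while the second core is being built.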

On 19.11.2012 23:55, user uwe72 <uwe.clem...@exxcellent.de> wrote:

 Hi there,

 I have a general question.

 We have around 5 million Lucene documents.

 We start from around 4000 XML files, which we transform into
 SolrInputDocuments using SolrJ and add to the index.

 A document is also related to other documents, so while adding a document
 we have to run some queries (at least one) to identify whether related
 documents are already in the cache, in order to associate it with the
 related document. The related document also has a backlink, so we have to
 update the related document as well (which means load, update, delete and
 re-add).

 We are using Solr 3.6.1.

 Performance is quite slow because of these queries and the modifications
 of documents that already exist in the cache.

 Are there any configuration changes we can make, or anything else we can do?

 Thanks a lot in advance.





 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Inserting-many-documents-and-update-relations-tp4021151.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Inserting many documents and update relations

2012-11-19 Thread uwe72
Hi there,

I have a general question.

We have around 5 million Lucene documents.

We start from around 4000 XML files, which we transform into
SolrInputDocuments using SolrJ and add to the index.

A document is also related to other documents, so while adding a document we
have to run some queries (at least one) to identify whether related documents
are already in the cache, in order to associate it with the related document.
The related document also has a backlink, so we have to update the related
document as well (which means load, update, delete and re-add).

We are using Solr 3.6.1.

Performance is quite slow because of these queries and the modifications of
documents that already exist in the cache.

Are there any configuration changes we can make, or anything else we can do?

Thanks a lot in advance.




