Hello again,

I am still trying to index with SolrCloud and the DIH. Indexing works, but it
seems to be done on only one shard. (My goal was to parallelize it to go
faster.)
This is my configuration:
I have 2 Tomcat instances:
One with Solr 4.4.0 and its embedded ZooKeeper started, hosting 1 shard (port 8080).
The other hosts the second shard (port 9180).
In my admin interface, I see 2 shards, and each one is a leader.


When I launch the DIH, documents are indexed, but only shard1 is
working:
http://localhost:8080/solr-0.4.0-pfd/noticesBIBcollection/dataimportMNb?command=full-import&entity=noticebib&optimize=true&indent=true&clean=true&commit=true&verbose=false&debug=false&wt=json&rows=1000


In my first shard, I see messages coming from my indexation process:
DEBUG 2013-09-03 11:48:57,801 Thread-12
org.apache.solr.handler.dataimport.URLDataSource  (92) - Accessing URL:
file:/X:/3/7/002/37002118.xml
DEBUG 2013-09-03 11:48:57,832 Thread-12
org.apache.solr.handler.dataimport.URLDataSource  (92) - Accessing URL:
file:/X:/3/7/002/37002120.xml
DEBUG 2013-09-03 11:48:57,966 Thread-12
org.apache.solr.handler.dataimport.LogTransformer  (58) - Notice fichier:
3/7/002/37002120.xml
DEBUG 2013-09-03 11:48:57,966 Thread-12 fr.bnf.solr.BnfDateTransformer
(696) - NN=37002120

In the second instance, I only get this kind of log, as if it were receiving
notifications of new updates from ZooKeeper:
INFO 2013-09-03 11:48:57,323 http-9180-7
org.apache.solr.update.processor.LogUpdateProcessor  (198) - [noticesBIB]
webapp=/solr-0.4.0-pfd path=/update params=
{distrib.from=http://172.20.48.237:8080/solr-0.4.0-pfd/noticesBIB/&update.distrib=TOLEADER&wt=javabin&version=2}
 {add=[37001748 (1445149264874307584), 37001757 (1445149264879550464),
37001764 (1445149264883744768), 37001786 (1445149264887939072), 37001817
(1445149264891084800), 37001819 (1445149264896327680), 37001837
(1445149264900521984), 37001861 (1445149264903667712), 37001869
(1445149264907862016), 37001963 (1445149264912056320)]} 0 41

I supposed there was a confusion between the core names and the collection
name, and I tried to change the name of the collection, but it solved nothing.
When I go to the DIH interfaces, on shard1 I see the indexing in progress, and
on shard2 "no information available".
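
One possible way to check whether the documents really all stay on shard1 is
to ask each core for its local document count with distrib=false, so that the
query is not fanned out to the other shard. A minimal sketch in Python,
assuming both cores are reachable under the webapp path and core name that
appear in the logs above (the URLs in `cores` are assumptions and may need to
be adjusted):

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def local_count_url(core_url):
    """Build a select URL that counts only the documents held by this core."""
    # distrib=false keeps the query local instead of querying every shard.
    params = {"q": "*:*", "rows": 0, "wt": "json", "distrib": "false"}
    return core_url.rstrip("/") + "/select?" + urlencode(params)


def parse_num_found(body):
    """Extract numFound from a Solr JSON select response."""
    return json.loads(body)["response"]["numFound"]


def local_count(core_url):
    """Fetch the local document count of one core over HTTP."""
    with urlopen(local_count_url(core_url)) as resp:
        return parse_num_found(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumed core URLs, derived from the ports and paths in the logs above.
    cores = [
        "http://localhost:8080/solr-0.4.0-pfd/noticesBIB",
        "http://localhost:9180/solr-0.4.0-pfd/noticesBIB",
    ]
    for core in cores:
        print(core, local_count(core))
```

If both counts grow during the import, the documents are being routed to both
shards even though the DIH itself runs on only one node.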

Is there something special to do to distribute the indexing process?
Should I run ZooKeeper on both instances (even if it's not mandatory)?
...
Regards
Jerome


