RE: [solr-solrcloud] How does DIH work when there are multiple nodes?
DIH is also not designed to multi-thread very well. One way I've handled this is to have a DIH XML that breaks up a database query into multiple processes by taking the modulo of a row, as follows:

This allows me to do sub-queries within the entity, but it is often better to just write a small program to get this data from the database, and ETL processors such as Pentaho DI (Kettle) and Talend DI do this quite well.

If you can express what you want in a database view, even a complicated one, then your best way to get it into Solr IMO is to use Logstash with the jdbc input plugin. It can do some transformation, but you'll need your database view to process the data.

> -Original Message-
> From: Shawn Heisey
> Sent: Friday, January 4, 2019 12:25 PM
> To: solr-user@lucene.apache.org
> Subject: Re: [solr-solrcloud] How does DIH work when there are multiple nodes?
>
> On 1/4/2019 1:04 AM, 유정인 wrote:
> > The reader was looking for a way to do 'DIH' automatically.
> >
> > The reason was for HA configuration.
>
> If you send a DIH request to the collection (as opposed to a specific
> core), that request will be load balanced across the cloud. You won't
> know which replica/core actually handles it. This means that an import
> command may be handled by a different host than a status command. In
> that situation, the status command will not know about the import,
> because it will be running on a different Solr core.
>
> When doing DIH on SolrCloud, you should send your requests directly to a
> specific core on a specific node. It's the only way to be sure what's
> happening. High availability would have to be handled in your application.
>
> Thanks,
> Shawn
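To illustrate the modulo split mentioned at the top of this reply: the sketch below is not the DIH XML from the original message, only a small Python illustration of the idea. It assumes a numeric key column and a database that supports MOD() (SQL Server, for example, uses the % operator instead). The function name, table name, and column name are all hypothetical.

```python
from typing import List


def partition_queries(base_query: str, key_column: str, partitions: int) -> List[str]:
    """Return one SQL query per partition.

    Each query selects only the rows whose key leaves a distinct
    remainder when divided by `partitions`, so the queries together
    cover every row exactly once and can be run by separate processes.
    """
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    return [
        # MOD() is an assumption about the SQL dialect in use.
        f"SELECT * FROM ({base_query}) src "
        f"WHERE MOD(src.{key_column}, {partitions}) = {k}"
        for k in range(partitions)
    ]


# Hypothetical table and column names, for illustration only.
for query in partition_queries("SELECT id, title FROM documents", "id", 4):
    print(query)
```

Each generated query could then be given to its own import process, whether that is a separate DIH entity, a small program, or an ETL job.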
Re: [solr-solrcloud] How does DIH work when there are multiple nodes?
On 1/4/2019 1:04 AM, 유정인 wrote:
> The reader was looking for a way to do 'DIH' automatically.
>
> The reason was for HA configuration.

If you send a DIH request to the collection (as opposed to a specific core), that request will be load balanced across the cloud. You won't know which replica/core actually handles it. This means that an import command may be handled by a different host than a status command. In that situation, the status command will not know about the import, because it will be running on a different Solr core.

When doing DIH on SolrCloud, you should send your requests directly to a specific core on a specific node. It's the only way to be sure what's happening. High availability would have to be handled in your application.

Thanks,
Shawn
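A minimal Python sketch of that advice, under some assumptions: the node URL and core name are placeholders you would look up in your own cluster, and the code assumes the DIH status response is JSON with a "status" field that reads "busy" while an import is running. The point is only that the import command and the status command go to the same core on the same node.

```python
import json
import time
import urllib.parse
import urllib.request


def dih_url(node_url: str, core: str, command: str, **params: str) -> str:
    """Build a DIH request URL addressed to one core on one node."""
    query = urllib.parse.urlencode({"command": command, "wt": "json", **params})
    return f"{node_url.rstrip('/')}/solr/{core}/dataimport?{query}"


def import_and_wait(node_url: str, core: str, poll_seconds: float = 5.0) -> dict:
    """Start a full import on one core, then poll status on that same core."""
    start = dih_url(node_url, core, "full-import", commit="true")
    with urllib.request.urlopen(start) as resp:
        resp.read()
    while True:
        with urllib.request.urlopen(dih_url(node_url, core, "status")) as resp:
            status = json.load(resp)
        # Assumption: the handler reports "busy" while the import runs.
        if status.get("status") != "busy":
            return status
        time.sleep(poll_seconds)
```

A call might look like import_and_wait("http://node1ip:8983", "yourindexname_shard1_replica_n1"), where the core name is hypothetical. For high availability, the application would need its own logic to pick another node and core if the first one is down.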
RE: [solr-solrcloud] How does DIH work when there are multiple nodes?
Hi

The reader was looking for a way to do 'DIH' automatically.
The reason was for HA configuration.

Thank you for your answer.
If you know how, please reply.

-Original Message-
From: Doss
Sent: Friday, January 04, 2019 3:59 PM
To: solr-user@lucene.apache.org
Subject: RE: [solr-solrcloud] How does DIH work when there are multiple nodes?

Hi,

The data import process will not happen automatically, we have to do it manually through the admin interface or by calling the URL

https://lucene.apache.org/solr/guide/7_5/uploading-structured-data-store-data-with-the-data-import-handler.html

Full Import:
http://node1ip:8983/solr/yourindexname/dataimport?command=full-import&commit=true

Delta Import:
http://node1ip:8983/solr/yourindexname/dataimport?command=delta-import&commit=true

If you want to do the delta import automatically you can setup a cron (linux) which can call the URL periodically.

Best,
Doss.

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
RE: [solr-solrcloud] How does DIH work when there are multiple nodes?
Hi,

The data import process will not happen automatically; we have to do it manually through the admin interface or by calling the URL.

https://lucene.apache.org/solr/guide/7_5/uploading-structured-data-store-data-with-the-data-import-handler.html

Full Import:
http://node1ip:8983/solr/yourindexname/dataimport?command=full-import&commit=true

Delta Import:
http://node1ip:8983/solr/yourindexname/dataimport?command=delta-import&commit=true

If you want to do the delta import automatically, you can set up a cron job (Linux) that calls the URL periodically.

Best,
Doss.
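As a rough sketch of the cron approach, assuming Python 3 is available on the machine running cron: the function below requests the delta-import URL once and reports whether Solr answered with HTTP 200. The URL is the placeholder from the message above, and the function name, script path, and schedule in the comment are hypothetical.

```python
import urllib.request

# Placeholder URL from the message above; replace node1ip and
# yourindexname with a real node and core.
DELTA_IMPORT_URL = (
    "http://node1ip:8983/solr/yourindexname/dataimport"
    "?command=delta-import&commit=true"
)

# Hypothetical crontab entry that would run a script calling
# trigger_delta_import() every 15 minutes:
#   */15 * * * * /usr/bin/python3 /opt/scripts/dih_delta.py


def trigger_delta_import(url: str, opener=urllib.request.urlopen,
                         timeout: float = 30.0) -> bool:
    """Request the URL once; return True if the server answered 200."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # urllib's URLError and HTTPError are both OSError subclasses.
        return False
```

A 200 response only means the handler accepted the command, not that the import has finished. The opener parameter exists only so the function can be exercised without a live Solr node.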
RE: [solr-solrcloud] How does DIH work when there are multiple nodes?
Hi

Are you telling me to call one node directly?
Are you saying that the import runs automatically on one of the three nodes?

I would like to know how to have the import run automatically on one of the three nodes.

-Original Message-
From: Doss
Sent: Friday, January 04, 2019 3:38 PM
To: solr-user@lucene.apache.org
Subject: Re: [solr-solrcloud] How does DIH work when there are multiple nodes?

Hi,

I am assuming you have the same index replicated on all 3 nodes. In that case, doing a full or delta import using DIH on one node will replicate the data to the other nodes, so there is no need to do it on all 3 nodes. Hope this helps!

Best,
Doss.
Re: [solr-solrcloud] How does DIH work when there are multiple nodes?
Hi,

I am assuming you have the same index replicated on all 3 nodes. In that case, doing a full or delta import using DIH on one node will replicate the data to the other nodes, so there is no need to do it on all 3 nodes. Hope this helps!

Best,
Doss.
[solr-solrcloud] How does DIH work when there are multiple nodes?
Hi,

SolrCloud is configured on 3 nodes. DIH is used for collecting/indexing, and each node has the same DIH configuration. The DIH is executed at a fixed interval.

Here is my question: does DIH run on all 3 nodes simultaneously, or only on the leader? And how do you know which node is the leader?

I am wondering how DIH works in a SolrCloud configuration.