RE: [solr-solrcloud] How does DIH work when there are multiple nodes?

2019-01-04 Thread Davis, Daniel (NIH/NLM) [C]
DIH is also not designed to multi-thread very well. One way I've handled this 
is to have a DIH XML that breaks up a database query across multiple processes 
by taking the modulo of a row ID, as follows:



This allows me to do sub-queries within the entity, but it is often better to 
just write a small program to get this data from the database. ETL tools such 
as Pentaho DI (Kettle) and Talend DI also do this quite well.
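
A small program of the kind described above might split the work like this. This is only a sketch in Python, not the DIH XML from the original message, which did not survive in the archive. The table and column names and the `partition_queries` helper are hypothetical, and it assumes a numeric key column and a database that supports `MOD()`.

```python
from typing import List


def partition_queries(base_query: str, key_column: str, partitions: int) -> List[str]:
    """Return one SQL query per partition.

    Each query selects the rows whose numeric key falls into a different
    modulo class, so separate worker processes can each take one query
    without overlapping.
    """
    if partitions < 1:
        raise ValueError("partitions must be >= 1")
    return [
        # MOD() is an assumption about the SQL dialect; some databases
        # use the % operator instead.
        f"SELECT * FROM ({base_query}) src "
        f"WHERE MOD(src.{key_column}, {partitions}) = {k}"
        for k in range(partitions)
    ]


# Hypothetical table and columns, split four ways.
for query in partition_queries("SELECT id, title FROM documents", "id", 4):
    print(query)
```

Each generated query could then be handed to its own worker process, which reads its slice of rows and posts them to Solr.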

If you can express what you want in a database view, even a complicated one, 
then your best way to get it into Solr IMO is to use Logstash with the JDBC 
input plugin. It can do some transformation, but you'll need your database 
view to do most of the data processing.




Re: [solr-solrcloud] How does DIH work when there are multiple nodes?

2019-01-04 Thread Shawn Heisey

On 1/4/2019 1:04 AM, 유정인 wrote:
> My question was about how to run DIH automatically.
>
> The reason is our HA configuration.


If you send a DIH request to the collection (as opposed to a specific 
core), that request will be load balanced across the cloud.  You won't 
know which replica/core actually handles it. This means that an import 
command may be handled by a different host than a status command.  In 
that situation, the status command will not know about the import, 
because it will be running on a different Solr core.


When doing DIH on SolrCloud, you should send your requests directly to a 
specific core on a specific node.  It's the only way to be sure what's 
happening.  High availability would have to be handled in your application.
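
Pinning both the import and the status request to one core, with failover handled by the caller, might look like the following Python sketch. The `/dataimport` handler path follows the URLs used elsewhere in this thread. The helper names and the example node and core names are hypothetical, and the `status` check assumes the usual DIH status response (`idle` or `busy`).

```python
import json
import urllib.parse
import urllib.request
from typing import Iterable, Tuple


def dih_url(node: str, core: str, command: str, **params: str) -> str:
    """Build a DIH URL that targets one specific core on one specific node."""
    query = urllib.parse.urlencode({"command": command, "wt": "json", **params})
    return f"{node.rstrip('/')}/solr/{core}/dataimport?{query}"


def dih_request(node: str, core: str, command: str, timeout: float = 10.0, **params: str) -> dict:
    """Send a DIH command to the given core and return the parsed JSON response."""
    url = dih_url(node, core, command, **params)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def is_busy(node: str, core: str) -> bool:
    """Ask the same core that ran the import whether it is still importing."""
    return dih_request(node, core, "status").get("status") == "busy"


def start_import_with_failover(
    candidates: Iterable[Tuple[str, str]], command: str = "full-import"
) -> Tuple[str, str]:
    """Try each (node, core) pair in order and return the first that accepts.

    The caller must remember the returned pair and send later status
    requests to it, not to the collection.
    """
    for node, core in candidates:
        try:
            dih_request(node, core, command)
            return node, core
        except OSError:
            # Connection refused, timeout, or HTTP error: try the next core.
            continue
    raise RuntimeError("no candidate core accepted the import command")
```

Because every request names an explicit node and core, the status you read back comes from the core that actually ran the import. The failover loop is deliberately naive: if a node starts the import but its response times out, a second import could be started on the next node.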


Thanks,
Shawn



RE: [solr-solrcloud] How does DIH work when there are multiple nodes?

2019-01-04 Thread 유정인
Hi

My question was about how to run DIH automatically.

The reason is our HA configuration.

Thank you for your answer.

If you know how, please reply.



RE: [solr-solrcloud] How does DIH work when there are multiple nodes?

2019-01-03 Thread Doss
Hi,

The data import process does not run automatically; you have to trigger it
manually, either through the admin interface or by calling the URL:

https://lucene.apache.org/solr/guide/7_5/uploading-structured-data-store-data-with-the-data-import-handler.html

Full Import:

http://node1ip:8983/solr/yourindexname/dataimport?command=full-import=true

Delta Import:

http://node1ip:8983/solr/yourindexname/dataimport?command=delta-import=true


If you want the delta import to run automatically, you can set up a cron job
(Linux) that calls the URL periodically.
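
A script for cron to call could look roughly like the Python sketch below. It assumes the handler lives at the `/dataimport` path shown above and that its status response reports `idle` when no import is running. The script path, schedule, and helper names are hypothetical. The import URLs as archived above appear to have lost a parameter name before `=true`, so the sketch sends only `command=delta-import`.

```python
#!/usr/bin/env python3
# Hypothetical crontab entry (path and schedule are assumptions):
#   */15 * * * * /usr/bin/python3 /opt/scripts/dih_delta.py http://node1ip:8983/solr/yourindexname/dataimport
import json
import sys
import urllib.request


def build_url(endpoint: str, command: str) -> str:
    """Append a DIH command to a core-specific /dataimport endpoint."""
    return f"{endpoint}?command={command}&wt=json"


def should_start(status_response: dict) -> bool:
    """Start a new import only when the handler reports that it is idle."""
    return status_response.get("status") == "idle"


def fetch(url: str, timeout: float = 10.0) -> dict:
    """GET the URL and parse the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def main(endpoint: str) -> int:
    try:
        if not should_start(fetch(build_url(endpoint, "status"))):
            print("import already running; skipping this run")
            return 0
        fetch(build_url(endpoint, "delta-import"))
        return 0
    except OSError as exc:
        print(f"DIH request failed: {exc}", file=sys.stderr)
        return 1


if __name__ == "__main__":
    if len(sys.argv) == 2:
        sys.exit(main(sys.argv[1]))
    print("usage: dih_delta.py <core-specific /dataimport URL>", file=sys.stderr)
```

Checking the status first keeps a slow import from being triggered again by the next cron run.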

Best,
Doss.




--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


RE: [solr-solrcloud] How does DIH work when there are multiple nodes?

2019-01-03 Thread 유정인
Hi

Are you telling me to call one node directly?

Or are you saying that the import runs automatically on one of the three nodes?

I would like to know how the import can be run automatically on one of the three nodes.




Re: [solr-solrcloud] How does DIH work when there are multiple nodes?

2019-01-03 Thread Doss
Hi,

I am assuming you have the same index replicated on all 3 nodes. In that case,
running a full or delta import using DIH on one node will replicate the data
to the other nodes, so there is no need to run it on all 3 nodes. Hope this helps!

Best,
Doss.





[solr-solrcloud] How does DIH work when there are multiple nodes?

2019-01-03 Thread 유정인
Hi

I have SolrCloud configured on 3 nodes.

DIH is used for collecting/indexing, and each node has the same DIH
configuration. The DIH is executed at a fixed interval.

Here is my question.

Does the import run on all 3 nodes simultaneously?

Or does it run only on the leader?

And how can I tell which node is the leader?

I am wondering how DIH works in a SolrCloud configuration.