[ 
https://issues.apache.org/jira/browse/SOLR-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15464638#comment-15464638
 ] 

MuḼammad T Jack commented on SOLR-9478:
---------------------------------------

Yeah, we have the same problem. We have two Solr instances running on the same 
shard (actually, we only have one shard).

We are not using the dataimport.properties file stored in ZooKeeper, because it is 
difficult to figure out which collection is in use when a collection alias is used.

We use a cron job to run a delta import every minute, but while a delta 
import is running on a Solr instance, another delta import must wait until the 
server is free. For example, it sometimes takes about 3 minutes to 
finish a delta import, so data changed during that period is lost.

Is there a better way to do delta imports when Solr is running in cloud 
mode?

Thanks
MuḼammad T Jack

> Do delta import will loss some documents, when the documents added in the 
> duration of delta import.
> ---------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-9478
>                 URL: https://issues.apache.org/jira/browse/SOLR-9478
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: 5.5, 5.5.2, 6.0
>            Reporter: ted zhu
>              Labels: delta-import, solrcloud
>             Fix For: 5.5.2
>
>
> Hello guys,
> I ran into a problem when using SolrCloud mode. When a Solr instance runs a 
> delta-import, it may take some time to finish (my data source is a MySQL 
> database). Documents added during this time are lost. In the deltaQuery I use 
> SUBDATE(${dih.last_index_time}, INTERVAL 2 MINUTE) so that the delta-import 
> starts 2 minutes earlier than last_index_time, but if the delta-import takes 
> 5 minutes, the records from the first 3 minutes are lost.
>     Our servers did not use SolrCloud mode before. We dealt with this issue 
> by rewriting the dataimport.properties file: we query max(sys_time_stamp) 
> to record the maximum timestamp and let Solr run the delta import 
> based on the time found in the file, so it never misses documents. 
>    But now we use SolrCloud, dataimport.properties is in ZooKeeper, 
> and we may have multiple collections for the same core. How can I update the 
> dataimport.properties file for a collection now? Do you have any solution to 
> record max(sys_time_stamp) in dataimport.properties, rather than 
> the time at which the delta-import starts to run?
> Cheers
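
The arithmetic in the quoted description (a 2-minute look-back against a 5-minute import) and the max(sys_time_stamp) workaround can be compared in a small simulation. This is only a sketch under stated assumptions, not Solr code: it assumes the recorded time falls at the end of the import, as the reporter's numbers suggest, and that rows changed while the import runs are not picked up by that run. The helper names select_changed and advance_mark and the sample rows are hypothetical.

```python
from datetime import datetime, timedelta


def select_changed(rows, lower_bound):
    """Rows whose sys_time_stamp is strictly later than lower_bound."""
    return [r for r in rows if r["sys_time_stamp"] > lower_bound]


def advance_mark(mark, imported):
    """Move the high-water mark to the newest timestamp actually imported."""
    return max([mark] + [r["sys_time_stamp"] for r in imported])


# Hypothetical timeline: one import starts at `start` and runs for 5 minutes.
start = datetime(2016, 9, 5, 12, 0, 0)
end = start + timedelta(minutes=5)

# Row that the running import did index (changed just before it started).
already_indexed = [{"id": 0, "sys_time_stamp": start - timedelta(seconds=10)}]

# Rows changed during the import, at +0:30, +1:30, +2:30, +3:30 and +4:30.
# Assumption: the running import does not see them.
changed_during_import = [
    {"id": i, "sys_time_stamp": start + timedelta(minutes=i - 1, seconds=30)}
    for i in range(1, 6)
]

# Approach 1: fixed look-back of 2 minutes from the recorded time.
# Assumption: the recorded time is the end of the import.
lookback_bound = end - timedelta(minutes=2)
lookback_hits = select_changed(changed_during_import, lookback_bound)
missed = [r["id"] for r in changed_during_import if r not in lookback_hits]

# Approach 2: high-water mark, i.e. the newest sys_time_stamp imported so far.
mark = advance_mark(datetime.min, already_indexed)
hwm_hits = select_changed(changed_during_import, mark)
mark = advance_mark(mark, hwm_hits)

print("missed by look-back:", missed)
print("found by high-water mark:", [r["id"] for r in hwm_hits])
```

Because the mark comes from the data rather than from the wall clock, the length of an import does not affect which rows the next run selects. A strict `>` could skip a row that shares the newest timestamp but is committed later; comparing with `>=` would re-import a few rows instead, which is likely harmless when documents are keyed by a unique id.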



