Database optimization is not like program optimization: it is wildly unpredictable.

What bugs me about the delta approach is using the last time DIH ran, rather than a timestamp from the DB. Oh well. Also, with SOLR-1499 you can query Solr directly to see what it has.

Lukas Kahwe Smith wrote:
Hi,

I think I have mentioned this approach before on this list, but I really think that the 
deltaQuery approach, which is currently explained as the "way to do updates", is 
far from ideal. It seems to add a lot of redundant queries.

I therefore propose to merge the initial import and delta queries using the 
below approach:

         <entity name="person" query="SELECT * FROM foo
         WHERE '${dataimporter.request.clean}' != 'false'
         OR last_updated > '${dataimporter.last_index_time}'">

Using this approach, when clean = true the first condition is always true, so the 
"last_updated > '${dataimporter.last_index_time}'" part should be optimized out by 
any sane RDBMS. And if clean = false the first condition is always false, so only 
the delta part of the query decides which rows are selected.
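
To illustrate how the two cases play out, here is a minimal Python sketch. It is not DIH itself: it mimics the placeholder substitution with plain string replacement and runs the resolved query against an in-memory SQLite table. The table contents, the sample timestamps, and the resolve and run_import helpers are all made up for illustration.

```python
import sqlite3

# Same WHERE clause as the entity query above, selecting only the id column.
QUERY_TEMPLATE = (
    "SELECT id FROM foo "
    "WHERE '${dataimporter.request.clean}' != 'false' "
    "OR last_updated > '${dataimporter.last_index_time}'"
)


def resolve(template, clean, last_index_time):
    """Hypothetical stand-in for DIH variable resolution (plain string replace)."""
    return (template
            .replace("${dataimporter.request.clean}", clean)
            .replace("${dataimporter.last_index_time}", last_index_time))


def run_import(clean, last_index_time="2010-01-01 00:00:00"):
    """Run the resolved query against sample data and return the matching ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE foo (id INTEGER, last_updated TEXT)")
    conn.executemany("INSERT INTO foo VALUES (?, ?)", [
        (1, "2009-12-31 09:00:00"),  # unchanged since the last index run
        (2, "2010-01-02 10:30:00"),  # modified after the last index run
    ])
    sql = resolve(QUERY_TEMPLATE, clean, last_index_time) + " ORDER BY id"
    rows = conn.execute(sql).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(run_import("true"))   # [1, 2]: every row, as in an initial import
print(run_import("false"))  # [2]: only the row changed since the last run
```

In Solr terms this should correspond to always running command=full-import and passing clean=false when only the changes are wanted.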

Is there any downside to this approach? Should this be added to the wiki?

Regards,
Lukas Kahwe Smith
m...@pooteeweet.org


