[
https://issues.apache.org/jira/browse/SOLR-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13496531#comment-13496531
]
Shawn Heisey commented on SOLR-1920:
------------------------------------
In the MySQL database where my data originates, the field that I use for
tracking what's new is an autoincrement field, mapped to a tlong in Solr. New
documents added to the database just get assigned the next autoincrement
number. If Solr could be informed that field X is the tracking field, the
highest value encountered during an import (according to that field's sort
mechanism) could be stored in dataimport.properties and re-used during the next
delta-import.
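
A minimal sketch of the idea above, not how DIH behaves today. It assumes the
imported rows are available as dictionaries and that the placemarker lives in a
simple key=value properties file. The key name `last_tracking_value`, the
function names, and the simplified parser (no Java-style escape handling) are
all hypothetical.

```python
import os


def load_properties(path):
    """Read a simple key=value file into a dict (escapes are not handled)."""
    props = {}
    if not os.path.exists(path):
        return props
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props


def save_properties(path, props):
    """Write the dict back out, one key=value pair per line."""
    with open(path, "w", encoding="utf-8") as fh:
        for key in sorted(props):
            fh.write(f"{key}={props[key]}\n")


def record_highest(rows, field, path, key="last_tracking_value"):
    """Store the highest value of `field` seen so far; never move backwards."""
    props = load_properties(path)
    highest = int(props.get(key, 0))
    for row in rows:
        highest = max(highest, int(row[field]))
    props[key] = str(highest)
    save_properties(path, props)
    return highest
```

The next delta-import would then read the stored value and select only rows
whose tracking field is greater than it.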

If DIH is sufficiently disconnected from Solr schema internals (which actually
seems likely), you'd have to base the sort on the SQL data type, because DIH
would have no way to know what kind of field Solr has.
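
To illustrate why the data type matters when picking the "highest" value, here
is a small sketch. The type-to-comparison mapping and the function name are
hypothetical; a real implementation would take the type from the JDBC metadata
rather than from a string.

```python
from datetime import datetime

# Hypothetical mapping from SQL type names to functions that convert a raw
# column value into something that compares the way the database would.
SQL_SORT_KEYS = {
    "INTEGER": int,
    "BIGINT": int,
    "VARCHAR": str,
    "TIMESTAMP": datetime.fromisoformat,
}


def highest_value(values, sql_type):
    """Return the largest raw value according to the SQL type's ordering."""
    key = SQL_SORT_KEYS[sql_type.upper()]
    return max(values, key=key)


print(highest_value(["9", "10"], "BIGINT"))   # numeric ordering: 10
print(highest_value(["9", "10"], "VARCHAR"))  # lexicographic ordering: 9
```

The same pair of values yields a different placemarker depending on the type,
which is why the sort mechanism has to be known before the value is stored.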

I currently do all delta tracking outside of Solr, so I'm already covered. The
generic idea seemed worthy of opening an issue two years ago, because other
people may run into situations where they cannot use a timestamp for delta
tracking.

I have no idea what kind of tracking problems you'd encounter when dealing with
soft commits. Without a transaction log, that could get ugly. For performance
reasons, I am initially deploying 4.x with no transaction log (see SOLR-3954).

> Need generic placemarker for DIH delta-import
> ---------------------------------------------
>
> Key: SOLR-1920
> URL: https://issues.apache.org/jira/browse/SOLR-1920
> Project: Solr
> Issue Type: Improvement
> Components: contrib - DataImportHandler
> Reporter: Shawn Heisey
> Priority: Minor
> Fix For: 4.1
>
>
> The dataimporthandler currently is only capable of saving the index timestamp
> for later use in delta-import commands. It should be extended to allow any
> arbitrary data to be used as a placemarker for the next import.
> It is possible to use externally supplied variables in data-config.xml and
> send values in via the URL that starts the import, but if the config can
> support it natively, that is better.
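
A sketch of the external-variable workaround mentioned in the issue
description, assuming the value is tracked by the calling program. The host,
core name, and the `min_did` parameter are hypothetical. Inside data-config.xml
such a parameter would be referenced as `${dataimporter.request.min_did}`.

```python
from urllib.parse import urlencode


def build_import_url(base_url, command, **extra_params):
    """Build a DIH request URL carrying extra parameters for data-config.xml."""
    params = {"command": command}
    params.update(extra_params)
    return f"{base_url}?{urlencode(params)}"


# Hypothetical handler URL and parameter name. An SQL query in the config
# could then use something like: WHERE did > ${dataimporter.request.min_did}
url = build_import_url(
    "http://localhost:8983/solr/mycore/dataimport",
    "delta-import",
    min_did=12345,
)
print(url)
```

The caller is responsible for remembering the value between imports, which is
the bookkeeping the issue asks DIH to take over.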