That's what I am trying to do. Thanks for the advice. Once I have it done I
will raise the issue and upload the patch.
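
For reference, here is a minimal sketch of the idea discussed in the quoted thread: keep the last indexed id in a properties file and build the next delta query from it. It uses only java.util.Properties from the JDK. The class name LastIndexedIdStore, the key last_indexed_id and the table name item are assumptions made for illustration; none of them is part of Solr or of the proposed Context#persist API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical helper (not a Solr class): stores the last indexed id in a properties file. */
final class LastIndexedIdStore {
    // Assumed key name, chosen for this sketch only.
    static final String KEY = "last_indexed_id";

    private final Path file;

    LastIndexedIdStore(Path file) {
        this.file = file;
    }

    /** Reads all properties from the file, or returns an empty set if it does not exist yet. */
    private Properties read() throws IOException {
        Properties props = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            }
        }
        return props;
    }

    /** Returns the stored id, or 0 when nothing has been persisted so far. */
    long load() throws IOException {
        return Long.parseLong(read().getProperty(KEY, "0"));
    }

    /** Writes the id back, keeping any other keys already present in the file. */
    void save(long lastId) throws IOException {
        Properties props = read();
        props.setProperty(KEY, Long.toString(lastId));
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "written by LastIndexedIdStore");
        }
    }

    /** Builds an id-based query; the table name "item" is an assumption. */
    static String deltaQuery(long lastId) {
        // lastId is a numeric long, so concatenating it cannot inject SQL.
        return "SELECT id FROM item WHERE id > " + lastId + " ORDER BY id";
    }
}
```

The intended use would be to call load() before an import, run the query returned by deltaQuery(...), and call save(...) with the highest id that was indexed, so that an aborted run can resume from that point.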
 

Noble Paul നോബിള്‍ नोब्ळ् wrote:
> 
> OK, I guess I see it. I am thinking of exposing the writes to the
> properties file via an API.
> 
> say Context#persist(key,value);
> 
> 
> This can write the data to the dataimport.properties.
> 
> You should then be able to retrieve that value with ${dataimport.persist.<key>}
> 
> or through an API, Context.getPersistValue(key).
> 
> You can raise an issue and give a patch, and we can get it committed.
> 
> I guess this is what you wish to achieve.
> 
> --Noble
> 
> 
> 
> On Wed, Dec 3, 2008 at 3:28 AM, Marc Sturlese <[EMAIL PROTECTED]>
> wrote:
>>
>> Do you mean the file used by DataImportHandler called
>> dataimport.properties?
>> If you mean that one, it's written at the end of the indexing process. The
>> date written there is used by the delta query in the next indexing run to
>> identify the new or modified rows in the database.
>>
>> What I am trying to do is save the last indexed id instead of a
>> timestamp. That way, the next execution will start indexing from the
>> last doc that was indexed in the previous run. But I am still a bit
>> confused about how to do that...
>>
>> Noble Paul നോബിള്‍ नोब्ळ् wrote:
>>>
>>> delta-import file?
>>>
>>>
>>> On Wed, Dec 3, 2008 at 12:08 AM, Lance Norskog <[EMAIL PROTECTED]>
>>> wrote:
>>>> Does the DIH delta feature rewrite the delta-import file for each set
>>>> of
>>>> rows? If it does not, that sounds like a bug/enhancement.
>>>> Lance
>>>>
>>>> -----Original Message-----
>>>> From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:[EMAIL PROTECTED]
>>>> Sent: Tuesday, December 02, 2008 8:51 AM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Re: DataImportHandler: Deleteing from index and db;
>>>> lastIndexed
>>>> id feature
>>>>
>>>> You can write the details to a file using a Transformer itself.
>>>>
>>>> It is wise to stick to the public API as far as possible. We will
>>>> maintain back compat and your code will be usable w/ newer versions.
>>>>
>>>>
>>>> On Tue, Dec 2, 2008 at 5:12 PM, Marc Sturlese <[EMAIL PROTECTED]>
>>>> wrote:
>>>>>
>>>>> Thanks, I really appreciate your help.
>>>>>
>>>>> I didn't explain myself very well here:
>>>>>
>>>>>> 2.-This is probably my most difficult goal.
>>>>>> Deltaimport reads a timestamp from dataimport.properties and
>>>>>> modifies/adds all documents from the db which were inserted after that
>>>>>> date. What I want is to be able to save in the field the id of the last
>>>>>> indexed doc, so that the next time I execute the indexer it starts
>>>>>> indexing from that last indexed doc id.
>>>>> You can use a Transformer to write something to the DB.
>>>>> Context#getDataSource(String) for each row
>>>>>
>>>>> When I said:
>>>>>
>>>>>> be able to save in the field the id of the last indexed doc
>>>>> I made a mistake. I meant to say:
>>>>>
>>>>> be able to save in the file (dataimport.properties) the id of the last
>>>>> indexed doc.
>>>>> The point would be to write my own delta query that indexes from the
>>>>> last indexed doc id instead of the timestamp.
>>>>> So I think this would not work in that case (my mistake, because
>>>>> of the bad explanation):
>>>>>
>>>>>>You can use a Transformer to write something to the DB.
>>>>>>Context#getDataSource(String) for each row
>>>>>
>>>>> That is why I was saying:
>>>>>> I think I should begin by modifying SolrWriter.java and
>>>>>> DocBuilder.java,
>>>>>> creating functions like getStartTime, persistStartTime... for ID
>>>>>> control.
>>>>>
>>>>> Am I heading in the right direction?
>>>>> Sorry for my English, and thanks in advance.
>>>>>
>>>>>
>>>>> Noble Paul നോബിള്‍ नोब्ळ् wrote:
>>>>>>
>>>>>> On Tue, Dec 2, 2008 at 3:01 PM, Marc Sturlese
>>>>>> <[EMAIL PROTECTED]>
>>>>>> wrote:
>>>>>>>
>>>>>>> Hey there,
>>>>>>>
>>>>>>> I have my DataImportHandler almost completely configured. I am
>>>>>>> missing three goals. I don't think I can reach them just via XML
>>>>>>> config or a Transformer and SqlEntityProcessor plugin, but I need to
>>>>>>> be sure of that.
>>>>>>> If there's no other way I will hack some Solr source classes, and I
>>>>>>> would like to know the best way to do that. Once I have it solved, I
>>>>>>> can upload or post the source in the forum in case someone thinks it
>>>>>>> can be helpful.
>>>>>>>
>>>>>>> 1.- Every time I execute DataImportHandler (to index data from a
>>>>>>> db), at the start time or end time I need to delete some expired
>>>>>>> documents. I have to delete them from the database and from the
>>>>>>> index. I know which documents must be deleted because a field in
>>>>>>> the db marks them. I would not like to delete everything from the db
>>>>>>> first or everything from the index first, but rather one from the
>>>>>>> index and one from the db each time.
>>>>>>
>>>>>> You can override init() and destroy() of the SqlEntityProcessor and
>>>>>> use it as the processor for the root entity. At this point you can
>>>>>> run the necessary db queries and Solr delete queries. Look at
>>>>>> Context#getSolrCore() and Context#getDataSource(String).
>>>>>>
>>>>>>
>>>>>>> The "delete mark" is set by an update to the db row, so I think I
>>>>>>> could use deltaImport. I don't know if deletedPkQuery is the way to do
>>>>>>> that. I cannot find much information about how to make it work. As
>>>>>>> deltaQuery modifies docs (deletes the old and inserts the new), I
>>>>>>> suppose there must be an easy way to do just the delete and not the
>>>>>>> new insert.
>>>>>> deletedPkQuery does everything first. It runs the query and uses that
>>>>>> to identify the deleted rows.
>>>>>>>
>>>>>>> 2.-This is probably my most difficult goal.
>>>>>>> Deltaimport reads a timestamp from dataimport.properties and
>>>>>>> modifies/adds all documents from the db which were inserted after that
>>>>>>> date. What I want is to be able to save in the field the id of the
>>>>>>> last indexed doc, so that the next time I execute the indexer it
>>>>>>> starts indexing from that last indexed doc id.
>>>>>> You can use a Transformer to write something to the DB.
>>>>>> Context#getDataSource(String) for each row
>>>>>>
>>>>>>> The point of doing this is that if I do a full import from a db with
>>>>>>> lots of rows, the app could encounter a problem in the middle of the
>>>>>>> execution and abort the process. The way deltaQuery works, I would
>>>>>>> have to restart the execution from the beginning. With this new
>>>>>>> functionality I could optimize the index and start from the last
>>>>>>> indexed doc.
>>>>>>> I think I should begin by modifying SolrWriter.java and
>>>>>>> DocBuilder.java,
>>>>>>> creating functions like getStartTime, persistStartTime... for ID
>>>>>>> control.
>>>>>>>
>>>>>>> 3.-I commented on this last point before. I want to boost doc
>>>>>>> fields at indexing time.
>>>>>>>>>Adding fieldboost is a planned item.
>>>>>>>
>>>>>>>>>It must work as follows.
>>>>>>>>>Add a special value $fieldBoost.<fieldname> to the row map
>>>>>>>
>>>>>>>>>And DocBuilder should respect that. You can raise a bug and we can
>>>>>>>>>commit it soon.
>>>>>>> How can I raise a bug?
>>>>>> https://issues.apache.org/jira/secure/CreateIssue!default.jspa
>>>>>>>
>>>>>>> Thanks in advance
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://www.nabble.com/DataImportHandler%3A-Deleteing-from-index-and-db--lastIndexed-id-feature-tp20788755p20788755.html
>>>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> --Noble Paul
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> --Noble Paul
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> --Noble Paul
>>>
>>>
>>
>>
>>
> 
> 
> 
> -- 
> --Noble Paul
> 
> 

-- 
View this message in context: 
http://www.nabble.com/DataImportHandler%3A-Deleteing-from-index-and-db--lastIndexed-id-feature-tp20788755p20808620.html
Sent from the Solr - User mailing list archive at Nabble.com.
