That's what I am trying to do. Thanks for the advice. Once I have it done I will raise the issue and upload the patch.
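For anyone following the thread, here is a minimal sketch of the "last indexed id" idea, written against plain java.util.Properties rather than any DataImportHandler API. The class name LastIndexedIdStore, the key last_indexed_id, and the resumeQuery helper are all hypothetical names made up for illustration; the Context#persist API discussed below is only a proposal, so nothing here should be read as how DIH actually does it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/**
 * Hypothetical helper (not part of Solr): keeps a "last indexed id" in a
 * properties file so that an interrupted import can resume from it.
 */
final class LastIndexedIdStore {
    // Assumed key name; the thread only proposes a generic persist(key, value).
    static final String KEY = "last_indexed_id";

    private final Path file;

    LastIndexedIdStore(Path file) {
        this.file = file;
    }

    /** Returns the stored id, or the fallback if the file or key is missing. */
    long load(long fallback) throws IOException {
        if (!Files.exists(file)) {
            return fallback;
        }
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        }
        String value = props.getProperty(KEY);
        return value == null ? fallback : Long.parseLong(value.trim());
    }

    /** Writes the id, preserving any other keys already in the file. */
    void save(long id) throws IOException {
        Properties props = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            }
        }
        props.setProperty(KEY, Long.toString(id));
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, null);
        }
    }

    /** Builds a resume query; table and column names are placeholders. */
    static String resumeQuery(String table, String idColumn) {
        return "SELECT * FROM " + table + " WHERE " + idColumn
                + " > ? ORDER BY " + idColumn;
    }
}
```

The intended use would be to call load(0) before the import starts, bind the result to the ? placeholder of the resume query, and call save(id) as rows are indexed. Ordering by the id column matters, because otherwise the stored id does not mark a clean resume point.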
Noble Paul നോബിള് नोब्ळ् wrote:
>
> OK. I guess I see it. I am thinking of exposing the writes to the
> properties file via an API.
>
> say Context#persist(key,value);
>
> This can write the data to the dataimport.properties.
>
> You must be able to retrieve that value by ${dataimport.persist.<key>}
> or through an API, Context.getPersistValue(key)
>
> You can raise an issue and give a patch and we can get it committed
>
> I guess this is what you wish to achieve
>
> --Noble
>
> On Wed, Dec 3, 2008 at 3:28 AM, Marc Sturlese <[EMAIL PROTECTED]> wrote:
>>
>> Do you mean the file used by dataimporthandler called
>> dataimport.properties?
>> If you mean this one, it's written at the end of the indexing process. The
>> written date will be used in the next indexing run by delta-query to
>> identify the new or modified rows from the database.
>>
>> What I am trying to do is, instead of saving a timestamp, save the last
>> indexed id. Doing that, in the next execution I will start indexing from
>> the last doc that was indexed in the previous run. But I am still a bit
>> confused about how to do that...
>>
>> Noble Paul നോബിള് नोब्ळ् wrote:
>>>
>>> delta-import file?
>>>
>>> On Wed, Dec 3, 2008 at 12:08 AM, Lance Norskog <[EMAIL PROTECTED]> wrote:
>>>> Does the DIH delta feature rewrite the delta-import file for each set of
>>>> rows? If it does not, that sounds like a bug/enhancement.
>>>> Lance
>>>>
>>>> -----Original Message-----
>>>> From: Noble Paul നോബിള് नोब्ळ् [mailto:[EMAIL PROTECTED]
>>>> Sent: Tuesday, December 02, 2008 8:51 AM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Re: DataImportHandler: Deleteing from index and db; lastIndexed id feature
>>>>
>>>> You can write the details to a file using a Transformer itself.
>>>>
>>>> It is wise to stick to the public API as far as possible. We will
>>>> maintain back compat and your code will be usable w/ newer versions.
>>>>
>>>> On Tue, Dec 2, 2008 at 5:12 PM, Marc Sturlese <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>> Thanks, I really appreciate your help.
>>>>>
>>>>> I didn't explain myself very well here:
>>>>>
>>>>>> 2.-This is probably my most difficult goal.
>>>>>> Deltaimport reads a timestamp from the dataimport.properties and
>>>>>> modifies/adds all documents from the db which were inserted after that date.
>>>>>> What I want is to be able to save in the field the id of the last
>>>>>> indexed doc. So the next time I execute the indexer, it starts
>>>>>> indexing from that last indexed doc id.
>>>>> You can use a Transformer to write something to the DB.
>>>>> Context#getDataSource(String) for each row
>>>>>
>>>>> When I said:
>>>>>
>>>>>> be able to save in the field the id of the last indexed doc
>>>>> I made a mistake. I meant to say:
>>>>>
>>>>> be able to save in the file (dataimport.properties) the id of the last
>>>>> indexed doc.
>>>>> The point would be to do my own delta-query indexing from the last
>>>>> indexed doc id instead of the timestamp.
>>>>> So I think this would not work in that case (it's my mistake because
>>>>> of the bad explanation):
>>>>>
>>>>>>You can use a Transformer to write something to the DB.
>>>>>>Context#getDataSource(String) for each row
>>>>>
>>>>> That is why I was saying:
>>>>>> I think I should begin by modifying SolrWriter.java and
>>>>>> DocBuilder.java.
>>>>>> Creating functions like getStartTime, persistStartTime... for ID
>>>>>> control
>>>>>
>>>>> Am I going in the right direction?
>>>>> Sorry for my English and thanks in advance
>>>>>
>>>>> Noble Paul നോബിള് नोब्ळ् wrote:
>>>>>>
>>>>>> On Tue, Dec 2, 2008 at 3:01 PM, Marc Sturlese <[EMAIL PROTECTED]> wrote:
>>>>>>>
>>>>>>> Hey there,
>>>>>>>
>>>>>>> I have my dataimporthandler almost completely configured. I am
>>>>>>> missing three goals. I don't think I can reach them just via xml
>>>>>>> conf or a transformer and SqlEntityProcessor plugin, but I need to be
>>>>>>> sure of that.
>>>>>>> If there's no other way I will hack some solr source classes, and I would
>>>>>>> like to know the best way to do that. Once I have it solved, I can
>>>>>>> upload or post the source in the forum in case someone thinks it can
>>>>>>> be helpful.
>>>>>>>
>>>>>>> 1.- Every time I execute dataimporthandler (to index data from a
>>>>>>> db), at the start time or end time I need to delete some expired
>>>>>>> documents. I have to delete them from the database and from the
>>>>>>> index. I know which documents must be deleted because of a field in
>>>>>>> the db that says so. I would not like to delete first all from DB or
>>>>>>> first all from index but one from index and one from doc every time.
>>>>>>
>>>>>> You can override the init() destroy() of the SqlEntityProcessor and
>>>>>> use it as the processor for the root entity. At this point you can
>>>>>> run the necessary db queries and solr delete queries. Look at
>>>>>> Context#getSolrCore() and Context#getDataSource(String)
>>>>>>
>>>>>>> The "delete mark" is set as an update in the db row so I think I
>>>>>>> could use deltaImport. I don't know if deletedPkQuery is the way to do
>>>>>>> that. I cannot find much information about how to make it work. As
>>>>>>> deltaQuery modifies docs (delete old and insert new) I suppose there
>>>>>>> must be an easy way to do this, just doing the delete and not the new
>>>>>>> insert.
>>>>>> deletedPkQuery does everything first. It runs the query and uses that
>>>>>> to identify the deleted rows.
>>>>>>>
>>>>>>> 2.-This is probably my most difficult goal.
>>>>>>> Deltaimport reads a timestamp from the dataimport.properties and
>>>>>>> modifies/adds all documents from the db which were inserted after that date.
>>>>>>> What I want is to be able to save in the field the id of the last
>>>>>>> indexed doc. So the next time I execute the indexer, it starts
>>>>>>> indexing from that last indexed doc id.
>>>>>> You can use a Transformer to write something to the DB.
>>>>>> Context#getDataSource(String) for each row
>>>>>>
>>>>>>> The point of doing this is that if I do a full import from a db with
>>>>>>> lots of rows, the app could encounter a problem in the middle of the
>>>>>>> execution and abort the process. The way delta-query works, I would have to
>>>>>>> restart the execution from the beginning. Having this new
>>>>>>> functionality I could optimize the index and start from the last
>>>>>>> indexed doc.
>>>>>>> I think I should begin by modifying SolrWriter.java and
>>>>>>> DocBuilder.java.
>>>>>>> Creating functions like getStartTime, persistStartTime... for ID
>>>>>>> control
>>>>>>>
>>>>>>> 3.-I commented before about this last point. I want to give a boost to
>>>>>>> doc fields at indexing time.
>>>>>>>>>Adding fieldboost is a planned item.
>>>>>>>
>>>>>>>>>It must work as follows.
>>>>>>>>>Add a special value $fieldBoost.<fieldname> to the row map
>>>>>>>
>>>>>>>>>And DocBuilder should respect that. You can raise a bug and we can
>>>>>>>>>commit it soon.
>>>>>>> How can I raise a bug?
>>>>>> https://issues.apache.org/jira/secure/CreateIssue!default.jspa
>>>>>>>
>>>>>>> Thanks in advance
>>>>>>>
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://www.nabble.com/DataImportHandler%3A-Deleteing-from-index-and-db--lastIndexed-id-feature-tp20788755p20788755.html
>>>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>>>
>>>>>> --
>>>>>> --Noble Paul
>>>>>
>>>>> --
>>>>> View this message in context:
>>>>> http://www.nabble.com/DataImportHandler%3A-Deleteing-from-index-and-db--lastIndexed-id-feature-tp20788755p20790542.html
>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>
>>>> --
>>>> --Noble Paul
>>>
>>> --
>>> --Noble Paul
>>
>> --
>> View this message in context:
>> http://www.nabble.com/DataImportHandler%3A-Deleteing-from-index-and-db--lastIndexed-id-feature-tp20788755p20801932.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>
> --
> --Noble Paul

--
View this message in context: http://www.nabble.com/DataImportHandler%3A-Deleteing-from-index-and-db--lastIndexed-id-feature-tp20788755p20808620.html
Sent from the Solr - User mailing list archive at Nabble.com.