You can write the details to a file from the Transformer itself. It is wise to stick to the public API as far as possible. We will maintain backward compatibility, so your code will remain usable with newer versions.
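A minimal sketch of that idea, not a tested DataImportHandler plugin: a custom transformer only needs a method with the signature `transformRow(Map<String, Object>)`, so the class below uses nothing beyond the JDK. The class name, the `id` column, the state file name and the `last_indexed_id` key are all assumptions made for illustration. Note that it records the id when the row is transformed, which is before the document is committed, and that it rewrites the file on every row, so a real version would probably batch the writes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Properties;

// Hypothetical transformer: remembers the id of the last row it has seen.
// Declare it public in its own file before referencing it from data-config.xml.
class LastIdTransformer {
    // Assumed key and column names; adjust to your own schema.
    private static final String KEY = "last_indexed_id";
    private static final String ID_COLUMN = "id";

    private final Path stateFile;

    // No-arg constructor so the class can be instantiated by name.
    public LastIdTransformer() {
        this(Paths.get("last_indexed_id.properties")); // assumed location
    }

    public LastIdTransformer(Path stateFile) {
        this.stateFile = stateFile;
    }

    // Called once per row; the row is returned unchanged.
    public Object transformRow(Map<String, Object> row) {
        Object id = row.get(ID_COLUMN);
        if (id != null) {
            save(id.toString());
        }
        return row;
    }

    // Returns the stored id, or null if nothing has been stored yet.
    public String load() {
        if (!Files.exists(stateFile)) {
            return null;
        }
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(stateFile)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(KEY);
    }

    private void save(String id) {
        Properties props = new Properties();
        props.setProperty(KEY, id);
        try (OutputStream out = Files.newOutputStream(stateFile)) {
            props.store(out, "id of the last row seen by the transformer");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On the next run, the value returned by `load()` could be used to build a query that starts after that id; how that value reaches the query is left open here.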

On Tue, Dec 2, 2008 at 5:12 PM, Marc Sturlese <[EMAIL PROTECTED]> wrote:
>
> Thanks, I really appreciate your help.
>
> I didn't explain myself so well here:
>
>> 2.-This is probably my most difficult goal.
>> Deltaimport reads a timestamp from the dataimport.properties and
>> modify/add all documents from db which were inserted after that date.
>> What I want is to be able to save in the field the id of the last
>> indexed doc. So the next time I execute the indexer, make it start
>> indexing from that last indexed id doc.
> You can use a Transformer to write something to the DB.
> Context#getDataSource(String) for each row
>
> When I said:
>
>> be able to save in the field the id of the last indexed doc
> I made a mistake, I meant:
>
> be able to save in the file (dataimport.properties) the id of the last
> indexed doc.
> The point would be to do my own deltaquery indexing from the last doc
> indexed id instead of the timestamp.
> So I think this would not work in that case (it's my mistake because of the
> bad explanation):
>
>>You can use a Transformer to write something to the DB.
>>Context#getDataSource(String) for each row
>
> It is because I was saying:
>> I think I should begin modifying the SolrWriter.java and DocBuilder.java.
>> Creating functions like getStartTime, persistStartTime... for ID control
>
> Am I going in the right direction?
> Sorry for my English and thanks in advance
>
>
> Noble Paul നോബിള് नोब्ळ् wrote:
>>
>> On Tue, Dec 2, 2008 at 3:01 PM, Marc Sturlese <[EMAIL PROTECTED]>
>> wrote:
>>>
>>> Hey there,
>>>
>>> I have my DataImportHandler almost completely configured. I am missing
>>> three goals. I don't think I can reach them just via xml conf or
>>> transformer and SqlEntityProcessor plugin. But I need to be sure of that.
>>> If there's no other way I will hack some Solr source classes, and would
>>> like to know the best way to do that. Once I have it solved, I can upload
>>> or post the source in the forum in case someone thinks it can be helpful.
>>>
>>> 1.- Every time I execute DataImportHandler (to index data from a db), at
>>> the start time or end time I need to delete some expired documents. I
>>> have to delete them from the database and from the index. I know which
>>> documents must be deleted because of a field in the db that says it.
>>> I would not like to delete first all from DB or first all from index,
>>> but one from index and one from db every time.
>>
>> You can override the init() and destroy() of the SqlEntityProcessor and
>> use it as the processor for the root entity. At this point you can run
>> the necessary db queries and Solr delete queries. Look at
>> Context#getSolrCore() and Context#getDataSource(String)
>>
>>
>>> The "delete mark" is set as an update in the db row so I think I could
>>> use deltaImport. Don't know if deletedPkQuery is the way to do that. I
>>> cannot find much information about how to make it work. As deltaQuery
>>> modifies docs (delete old and insert new) I suppose there must be an
>>> easy way to do this, just doing the delete and not the new insert.
>> deletedPkQuery does everything first. It runs the query and uses that
>> to identify the deleted rows.
>>>
>>> 2.-This is probably my most difficult goal.
>>> Deltaimport reads a timestamp from the dataimport.properties and
>>> modify/add all documents from db which were inserted after that date.
>>> What I want is to be able to save in the field the id of the last
>>> indexed doc. So the next time I execute the indexer, make it start
>>> indexing from that last indexed id doc.
>> You can use a Transformer to write something to the DB.
>> Context#getDataSource(String) for each row
>>
>>> The point of doing this is that if I do a full import from a db with
>>> lots of rows the app could encounter a problem in the middle of the
>>> execution and abort the process. As deltaQuery works I would have to
>>> restart the execution from the beginning. Having this new functionality
>>> I could optimize the index and start from the last indexed doc.
>>> I think I should begin modifying the SolrWriter.java and DocBuilder.java.
>>> Creating functions like getStartTime, persistStartTime... for ID control
>>>
>>> 3.-I commented before about this last point. I want to give boost to doc
>>> fields at indexing time.
>>>>>Adding fieldboost is a planned item.
>>>
>>>>>It must work as follows.
>>>>>Add a special value $fieldBoost.<fieldname> to the row map
>>>
>>>>>And DocBuilder should respect that. You can raise a bug and we can
>>>>>commit it soon.
>>> How can I raise a bug?
>> https://issues.apache.org/jira/secure/CreateIssue!default.jspa
>>>
>>> Thanks in advance
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/DataImportHandler%3A-Deleteing-from-index-and-db--lastIndexed-id-feature-tp20788755p20788755.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>
>> --
>> --Noble Paul
>>
>
> --
> View this message in context:
> http://www.nabble.com/DataImportHandler%3A-Deleteing-from-index-and-db--lastIndexed-id-feature-tp20788755p20790542.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

--
--Noble Paul