Dear Jorge,

Hi,
Could you please tell me more about this Solr plugin? Do you still have it?

Regards.

On Wed, Jul 2, 2014 at 9:44 AM, Jorge Luis Betancourt Gonzalez <[email protected]> wrote:

> Some time ago, for a very particular use case, we abstracted this
> responsibility into a custom Solr plugin for a few stored fields. It would
> handle this case (not just updating a date field, but also keeping a
> counter of how many times a URL is indexed). Of course you need stored
> fields for this, and under the hood a document still gets deleted and added.
>
> On Jul 1, 2014, at 9:54 AM, Markus Jelsma <[email protected]> wrote:
>
> > Hi,
> >
> > NutchIndexAction is indeed prepared to handle updates but the methods
> > are not implemented. In the case of Solr, it still does an internal
> > add/delete for updated documents, and to do so, you must have all fields
> > stored="true". So in almost all cases, it is more efficient not to store
> > all fields and send some additional data over the wire. You can implement
> > it though.
> >
> > Markus
> >
> > -----Original message-----
> >> From: Ali Nazemian <[email protected]>
> >> Sent: Tuesday 1st July 2014 15:31
> >> To: [email protected]
> >> Subject: Changing nutch for update documents instead of add new ones
> >>
> >> Dears,
> >> Hi,
> >> I am going to make some changes to Nutch's default behavior. I want to
> >> change the Nutch Solr index (indexWriter class) so that instead of adding
> >> new documents to Solr, old documents are updated. I saw an "update" method
> >> inside this class. Is that implemented for this purpose? If not, what is
> >> the purpose of this method? Another question: would changing indexWriter
> >> to update documents instead of adding them affect my performance for
> >> whole-web crawling?
> >> Best regards.
> >>
> >> --
> >> A.Nazemian
> >>

--
A.Nazemian
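
For readers of this thread: the behaviour discussed above (updating a date field and keeping a counter, with Solr doing a delete plus add internally) can be illustrated with Solr's atomic update modifiers `set` and `inc`. The following is only a rough sketch, not Jorge's plugin and not Nutch code. The field names `last_indexed` and `index_count`, the use of the URL as the unique key, and both helper functions are assumptions made for illustration. `apply_atomic_update` is a toy in-memory simulation of why stored fields are needed; it is not how Solr is implemented.

```python
import json
from datetime import datetime, timezone


def build_atomic_update(url, indexed_at):
    """Build a Solr-style atomic update document for one re-indexed URL.

    Assumptions (hypothetical schema): the URL is the uniqueKey `id`,
    `last_indexed` is a date field and `index_count` is a numeric field.
    `indexed_at` is assumed to be a UTC datetime.
    """
    return {
        "id": url,
        # "set" replaces the field value
        "last_indexed": {"set": indexed_at.strftime("%Y-%m-%dT%H:%M:%SZ")},
        # "inc" adds the given amount to a numeric field
        "index_count": {"inc": 1},
    }


def apply_atomic_update(stored_doc, update):
    """Toy simulation of the internal delete + add described in the thread.

    The old document is rebuilt from its stored fields, the modifiers are
    applied, and the full new document replaces the old one. Any field that
    was not stored would be missing from `stored_doc` and therefore lost.
    """
    new_doc = dict(stored_doc)  # copy of everything that was stored
    for field, value in update.items():
        if isinstance(value, dict) and "set" in value:
            new_doc[field] = value["set"]
        elif isinstance(value, dict) and "inc" in value:
            # Treating a missing counter as 0 is an assumption of this sketch.
            new_doc[field] = new_doc.get(field, 0) + value["inc"]
        else:
            new_doc[field] = value
    return new_doc


update = build_atomic_update(
    "http://example.com/",
    datetime(2014, 7, 2, 9, 44, 0, tzinfo=timezone.utc),
)
stored = {"id": "http://example.com/", "title": "Example", "index_count": 3}
merged = apply_atomic_update(stored, update)

# A JSON array like this is what would be posted to a core's /update handler.
print(json.dumps([update], indent=2))
print(json.dumps(merged, indent=2))
```

In this sketch only the changed fields travel over the wire, but, as Markus points out, the server can rebuild the document only if all of its fields are stored.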

