Resending to dev@nutch - had sent to markus only
>> We still need to do something about the moreindexing filter.
>>
>> https://issues.apache.org/jira/browse/NUTCH-985
>
> For now a quick fix for the moreindexingfilter would be OK, but we can
> maybe create a new issue for 1.4 and rely on Date objects everywhere, then
> format them properly in the SOLRWriter. We could of course do the latter now,
> but since I have no time to do it in the short term and don't want to twist
> your arm, I'll let you decide.
>
>> On Thursday 05 May 2011 15:34:56 Julien Nioche wrote:
>> > Hi Markus,
>> >
>> > Sorry for the late reply. Definitely +1 to change to Date in the schema;
>> > it is the right thing to do and it's also the right time to do it.
>> >
>> > Thanks
>> >
>> > Julien
>> >
>> > On 28 April 2011 12:43, Markus Jelsma <markus.jel...@openindex.io> wrote:
>> > > Hi devs,
>> > >
>> > > The Solr schema must be updated as well to get dedup to work in 1.3.
>> > > This is because in December last year index-basic seems to have been
>> > > updated to write properly formatted dates to Solr, but the schema
>> > > field was still a long.
>> > >
>> > > Somehow Solr accepted the input (this is a bug) but cannot cope with
>> > > the output, nor could Nutch convert the date to the internally used
>> > > long (which it now can). The remaining issue is to update the field to
>> > > use date instead of long. But this will break existing Solr setups for
>> > > sure because of field incompatibility.
>> > >
>> > > I propose to update the field regardless of current Solr setups,
>> > > because of the assumption that 1) an index can always be recreated
>> > > from segments and 2) the current indexer assumes the Solr bug remains
>> > > in 3.1 and higher as well.
>> > >
>> > > I haven't tested it with 3.1 but the bug is in 1.4.1 for sure.
>> > >
>> > > Thoughts?
>> > >
>> > > Cheers,
>> > > --
>> > > Markus Jelsma - CTO - Openindex
>> > > http://www.linkedin.com/in/markus17
>> > > 050-8536620 / 06-50258350

--
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
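
For reference, the "rely on Date objects everywhere, then format in the SOLRWriter" idea from the thread could look roughly like the sketch below. This is not the actual Nutch code: the class and method names are hypothetical, and it uses java.time (Java 8+) rather than whatever the indexer really uses. The only thing it assumes about Solr is that a date field takes UTC timestamps of the form yyyy-MM-dd'T'HH:mm:ss'Z'.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Date;

// Hypothetical helper, for illustration only; not part of Nutch or SolrJ.
class SolrDateSketch {

    // Format a Date as a UTC ISO-8601 string with a trailing 'Z'.
    // Sub-second precision is dropped so the output has no fractional part.
    static String toSolrDate(Date date) {
        return date.toInstant().truncatedTo(ChronoUnit.SECONDS).toString();
    }

    // Convert the string form back to the epoch-millisecond long
    // that is used internally (for example when deduplicating).
    static long toEpochMillis(String solrDate) {
        return Instant.parse(solrDate).toEpochMilli();
    }

    public static void main(String[] args) {
        Date fetched = new Date(1304609696000L);
        String formatted = toSolrDate(fetched);
        System.out.println(formatted);                // 2011-05-05T15:34:56Z
        System.out.println(toEpochMillis(formatted)); // 1304609696000
    }
}
```

Keeping a Date (or long) inside the document and converting only at the point of writing means the schema type can change without touching the indexing filters.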