Are you using the scoring-link plugin?

On Wednesday 12 October 2011 15:18:12 Marek Bachmann wrote:
> Hey Folks,
>
> sorry for this second request on this topic. I managed to figure out
> that the problem is Nutch related.
>
> Once again: I have a set of URLs (~182k) fetched, parsed and ranked via
> WebGraph. All went very well.
>
> After that I want to index them to Solr. This works fine too, except
> that the boost isn't set.
>
> I have debugged this issue for an example URL:
>
> nutch@hrz-pc318:/nutch/dumps/dbdump$ cat part-00001 | grep -A 9 http://www.mathematik.uni-kassel.de/~fgcaadm/fachgruppe-computeralgebra.de/JdM/beitrag-hohenwarter/bezier3cons.html
>
> http://www.mathematik.uni-kassel.de/~fgcaadm/fachgruppe-computeralgebra.de/JdM/beitrag-hohenwarter/bezier3cons.html Version: 7
> Status: 2 (db_fetched)
> Fetch time: Fri Oct 14 14:03:18 CEST 2011
> Modified time: Thu Jan 01 01:00:00 CET 1970
> Retries since fetch: 0
> Retry interval: 603450 seconds (6 days)
> Score: 0.16124992
> Signature: 02ab7d9e6655082ff139e8a9c9afb97f
> Metadata: _pst_: success(1), lastModified=0
>
> You see the score isn't 1.0.
> I ran the solrindex command and logged the traffic via tcpmon. Here is
> the extract of the document which is sent to Solr:
>
> POST /solr/update?wt=javabin&version=2 HTTP/1.1
> User-Agent: Solr[org.apache.solr.client.solrj.impl.CommonsHttpSolrServer] 1.0
> Host: localhost:8080
> Transfer-Encoding: chunked
> Content-Type: application/xml; charset=UTF-8
>
> 2000
> <add>
>   <doc boost="1.0">
>     <field name="site">
>       www.mathematik.uni-kassel.de
>     </field>
>     <field name="host">
>       www.mathematik.uni-kassel.de
>     </field>
>     <field name="lastModified">
>       2008-03-03T13:22:14.000Z
>     </field>
>     <field name="segment">
>       20111007135815
>     </field>
>     <field name="digest">
>       02ab7d9e6655082ff139e8a9c9afb97f
>     </field>
>     <field name="tstamp">
>       2011-10-07T12:25:48.230Z
>     </field>
>     <field name="date">
>       2008-03-03T13:22:14.000Z
>     </field>
>     <field name="type">
>       text/html
>     </field>
>     <field name="id">
>       http://www.mathematik.uni-kassel.de/~fgcaadm/fachgruppe-computeralgebra.de/JdM/beitrag-hohenwarter/bezier3cons.html
>     </field>
>     <field name="url">
>       http://www.mathematik.uni-kassel.de/~fgcaadm/fachgruppe-computeralgebra.de/JdM/beitrag-hohenwarter/bezier3cons.html
>     </field>
>     <field name="anchor">
>       bezier3cons.html
>     </field>
>     <field name="content">
>       [...]
>     </field>
>     <field name="title">
>       Kubische Bézierkurve - GeoGebra Dynamisches Arbeitsblatt
>     </field>
>     <field name="boost">
>       1.0
>     </field>
>     <field name="contentLength">
>       1570
>     </field>
>   </doc>
>   [...]
> </add>
>
> So the boost is set to 1.0. I can't figure out why this happens. Need
> your help. :)

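
For reference, here is a minimal sketch of what enabling scoring-link in conf/nutch-site.xml might look like, assuming a stock Nutch 1.x setup. Apart from scoring-link itself, the plugin.includes value below is illustrative only and is an assumption about your installation; keep your own list and just check whether scoring-link appears in it. Whether this accounts for the 1.0 boost in your case is something to verify, not a given.

```xml
<!-- conf/nutch-site.xml: illustrative fragment, not a complete configuration -->
<property>
  <name>plugin.includes</name>
  <!-- Assumption: every entry except scoring-link is a placeholder for your own plugin list -->
  <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|scoring-link|urlnormalizer-(pass|regex|basic)</value>
</property>
```

After changing the plugin list, re-running solrindex and capturing the request again would show whether the doc boost still comes out as 1.0.
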
-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

