Thanks Lewis.

>> Is it possible for you to provide some
>> conversation/input to the issue?

I'll describe what I was trying to do. I am running Nutch on top of Hadoop and have integrated Nutch (1.5.1-SNAPSHOT) with Solr (5.0) for indexing. The Nutch trunk has two files, schema.xml and schema-solr4.xml; schema.xml is for the previous version of Solr. So I renamed schema-solr4.xml to schema.xml and copied it into the solr/conf directory.
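That copy step looks roughly like the following. The paths here are placeholders (the sketch uses temp directories so it stands alone); point them at your actual Nutch and Solr installs instead:

```shell
# NUTCH_CONF / SOLR_CONF stand in for something like $NUTCH_HOME/conf and
# $SOLR_HOME/example/solr/conf in a real install; adjust to your layout.
NUTCH_CONF=$(mktemp -d)
SOLR_CONF=$(mktemp -d)
printf '<schema name="nutch"/>\n' > "$NUTCH_CONF/schema-solr4.xml"  # stand-in file

# The actual step: install the Solr 4 schema under the filename Solr expects.
cp "$NUTCH_CONF/schema-solr4.xml" "$SOLR_CONF/schema.xml"
```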
This file has two bugs:

(1) Stopwords are specified as stopwords_en.txt. You need to change this to stopwords.txt.
(2) The boost field type is defined as string. If the boost type is not changed to float, it will give you a ClassCastException at run time while indexing the pages. Your Hadoop job will fail, but you can still go ahead and search in Solr.

And yes, changing the type of boost to float solved my problem.

Regards,
Som

On Fri, Jul 27, 2012 at 12:42 AM, Lewis John Mcgibbney <
[email protected]> wrote:
> Hi,
>
> On Thu, Jul 26, 2012 at 7:48 PM, shekhar sharma <[email protected]>
> wrote:
> > The class cast exception is due to the following reasons:
> > In the schema.xml, the type of the boost is specified as string
>
> Good catch. We always wish to define this field as float. I'll open a
> ticket, submit a fix and get this sorted in due course.
>
> > FYI there is some issue with SolrDeleteDuplicates, most of the users are
> > getting Null pointer exception..
> > https://issues.apache.org/jira/browse/NUTCH-1100
>
> Yes I am aware of this issue. Have you applied Markus' patch? Does it
> solve your problem? Is it possible for you to provide some
> conversation/input to the issue?
>
> > Lewis, i would like to know, what the SolrDeleteDuplicates is doing...?
>
> The algorithm being implemented for the dedup is documented quite
> nicely either on the source [0] or on the new Javadoc [1]
>
> [0]
> http://svn.apache.org/repos/asf/nutch/trunk/src/java/org/apache/nutch/indexer/solr/SolrDeleteDuplicates.java
> [1] http://nutch.apache.org/apidocs-1.5/index.html
>
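P.S. For anyone hitting the same thing, the two schema.xml fixes above correspond roughly to fragments like these. This is a sketch from memory, not the exact trunk file; the field attributes and the surrounding fieldType/analyzer context in your copy may differ:

```xml
<!-- Fix (2): declare boost as float, not string; with string you get a
     ClassCastException at indexing time. -->
<field name="boost" type="float" stored="true" indexed="false"/>

<!-- Fix (1): point the stop filter at stopwords.txt (the file that actually
     ships in solr/conf), not stopwords_en.txt. -->
<filter class="solr.StopFilterFactory"
        ignoreCase="true"
        words="stopwords.txt"/>
```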

