[ 
https://issues.apache.org/jira/browse/NUTCH-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-1486:
----------------------------------------
    Attachment: NUTCH-1486-trunkv4.patch

Patch for trunk. This patch touches a couple of places.
* corrects classes within log4j.properties to indexwriter for SolrWriter
* removes schema-solr4.xml and moves all required fields over to schema.xml
* removes the extra dependencies from ivy/ivy.xml (cf. NUTCH-2056, 
NUTCH-2058) and adds them to the parsefilter-naivebayes plugin. Also upgrades 
the Mahout and Lucene APIs, along with the accompanying dependencies, to play 
nicely with Lucene and Solr 4.10.2. Finally, implements the correct plugin.xml 
runtime dependencies for this plugin as well.
* Removes the transitive dependency for org.apache.httpcomponents httpcore and 
httpclient within index-geoip. These dependencies were leading to hellish 
classpath issues due to newer implementations being used elsewhere. Also 
upgrades index-geoip dependency to 2.3.1. Implements the correct plugin.xml 
runtime dependencies.
* Introduces some new properties within nutch-default.xml which enable us to 
choose between HttpSolrServer, CloudSolrServer, ConcurrentSolrServer or 
LBSolrServer. These have been documented within nutch-site.xml and also within 
the describe() function of SolrWriter.
* upgrades httpclient and httpcore across the board to >= 4.3.1, so that we 
avoid classpath issues when indexing and when building custom plugins on top 
of Nutch that implement newer interfaces from these dependencies. 
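
To illustrate the server-selection properties mentioned above, here is a minimal sketch of a nutch-site.xml override. The property name solr.server.type and the value strings are assumptions for illustration only; check nutch-default.xml in the patch for the actual names and accepted values.

```xml
<!-- Sketch only: property name and values are assumed, not copied from the patch. -->
<property>
  <name>solr.server.type</name>
  <!-- Assumed choices: http, cloud, concurrent, lb, corresponding to
       HttpSolrServer, CloudSolrServer, ConcurrentSolrServer and LBSolrServer. -->
  <value>http</value>
  <description>Which SolrServer implementation SolrWriter should use.</description>
</property>
```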

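For the transitive httpcore/httpclient removal described above, an Ivy exclusion could look roughly like the sketch below. The dependency coordinates (com.maxmind.geoip2, geoip2) are an assumption about which index-geoip dependency is meant; only the 2.3.1 revision and the org.apache.httpcomponents modules come from the description.

```xml
<!-- Sketch only: the org/name coordinates of the dependency are assumed. -->
<dependency org="com.maxmind.geoip2" name="geoip2" rev="2.3.1">
  <!-- Keep the transitive HttpComponents jars off the plugin classpath
       so that the >= 4.3.1 versions used elsewhere win. -->
  <exclude org="org.apache.httpcomponents" module="httpclient"/>
  <exclude org="org.apache.httpcomponents" module="httpcore"/>
</dependency>
```
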
[~asitang] can you please test out this patch along with the 
parsefilter-naivebayes plugin? I want to confirm that it behaves the same as, 
or similarly to, what you expect from your trained models.

@ everyone else, I've tested this indexing into Elasticsearch 1.5.0 and Apache 
Solr 4.10.2 and all is good. It would be very much appreciated if people could 
test before this patch diverges too much from trunk.

> Upgrade to Solr 4.10.2
> ----------------------
>
>                 Key: NUTCH-1486
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1486
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.6, 2.1
>         Environment: Solr 4.0, Nutch trunk 1.6-SNAPSHOT & Probably 2.2-SNAPSHOT
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>             Fix For: 1.11
>
>         Attachments: NUTCH-1486-1.8.patch, NUTCH-1486-1.9-trunk.patch, 
> NUTCH-1486-2.x-v3.patch, NUTCH-1486-2.x.patch, NUTCH-1486-2.x.v2.patch, 
> NUTCH-1486-nutchgora.patch, NUTCH-1486-trunk.patch, 
> NUTCH-1486-trunk.v2.patch, NUTCH-1486-trunk.v3.patch, NUTCH-1486-trunkv4.patch
>
>
> When attempting to configure a multicore Solr 4.0 instance (4 cores) with 
> the Nutch schema-solr4.xml file, I get the following exceptions.
> This has been discussed previously. As I see it we have two options
> 1. Keep maintaining both schema options
> 2. Ditch the more complex schema-solr4.xml in favour of vanilla schema.xml
> Thoughts?
> {code}
> SEVERE: Unable to create core: collection4
> org.apache.solr.common.SolrException: Unable to use updateLog: _version_field 
> must exist in schema, using indexed="true" stored="true" and 
> multiValued="false" (_version_ does not exist)
>       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:721)
>       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:566)
>       at org.apache.solr.core.CoreContainer.create(CoreContainer.java:850)
>       at org.apache.solr.core.CoreContainer.load(CoreContainer.java:534)
>       at org.apache.solr.core.CoreContainer.load(CoreContainer.java:356)
>       at 
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:308)
>       at 
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:107)
>       at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:114)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:754)
>       at 
> org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:258)
>       at 
> org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1221)
>       at 
> org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:699)
>       at 
> org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:454)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:36)
>       at 
> org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:183)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:491)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:138)
>       at 
> org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:142)
>       at 
> org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:53)
>       at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:604)
>       at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:535)
>       at org.eclipse.jetty.util.Scanner.scan(Scanner.java:398)
>       at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:332)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:118)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:552)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:227)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:63)
>       at 
> org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:53)
>       at 
> org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:91)
>       at org.eclipse.jetty.server.Server.doStart(Server.java:263)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1215)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at 
> org.eclipse.jetty.xml.XmlConfiguration.main(XmlConfiguration.java:1138)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.eclipse.jetty.start.Main.invokeMain(Main.java:457)
>       at org.eclipse.jetty.start.Main.start(Main.java:602)
>       at org.eclipse.jetty.start.Main.main(Main.java:82)
> Caused by: org.apache.solr.common.SolrException: Unable to use updateLog: 
> _version_field must exist in schema, using indexed="true" stored="true" and 
> multiValued="false" (_version_ does not exist)
>       at org.apache.solr.update.UpdateLog.init(UpdateLog.java:236)
>       at org.apache.solr.update.UpdateHandler.initLog(UpdateHandler.java:94)
>       at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:123)
>       at 
> org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:97)
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>       at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:476)
>       at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:544)
>       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:705)
>       ... 45 more
> Caused by: org.apache.solr.common.SolrException: _version_field must exist in 
> schema, using indexed="true" stored="true" and multiValued="false" (_version_ 
> does not exist)
>       at 
> org.apache.solr.update.VersionInfo.getAndCheckVersionField(VersionInfo.java:57)
>       at org.apache.solr.update.VersionInfo.<init>(VersionInfo.java:83)
>       at org.apache.solr.update.UpdateLog.init(UpdateLog.java:233)
>       ... 55 more
> 01-Nov-2012 16:26:15 org.apache.solr.common.SolrException log
> SEVERE: null:org.apache.solr.common.SolrException: Unable to use updateLog: 
> _version_field must exist in schema, using indexed="true" stored="true" and 
> multiValued="false" (_version_ does not exist)
>       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:721)
>       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:566)
>       at org.apache.solr.core.CoreContainer.create(CoreContainer.java:850)
>       at org.apache.solr.core.CoreContainer.load(CoreContainer.java:534)
>       at org.apache.solr.core.CoreContainer.load(CoreContainer.java:356)
>       at 
> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:308)
>       at 
> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:107)
>       at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:114)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:754)
>       at 
> org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:258)
>       at 
> org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1221)
>       at 
> org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:699)
>       at 
> org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:454)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:36)
>       at 
> org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:183)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:491)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:138)
>       at 
> org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:142)
>       at 
> org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:53)
>       at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:604)
>       at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:535)
>       at org.eclipse.jetty.util.Scanner.scan(Scanner.java:398)
>       at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:332)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:118)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:552)
>       at 
> org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:227)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:63)
>       at 
> org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:53)
>       at 
> org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:91)
>       at org.eclipse.jetty.server.Server.doStart(Server.java:263)
>       at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:59)
>       at 
> org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1215)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at 
> org.eclipse.jetty.xml.XmlConfiguration.main(XmlConfiguration.java:1138)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.eclipse.jetty.start.Main.invokeMain(Main.java:457)
>       at org.eclipse.jetty.start.Main.start(Main.java:602)
>       at org.eclipse.jetty.start.Main.main(Main.java:82)
> Caused by: org.apache.solr.common.SolrException: Unable to use updateLog: 
> _version_field must exist in schema, using indexed="true" stored="true" and 
> multiValued="false" (_version_ does not exist)
>       at org.apache.solr.update.UpdateLog.init(UpdateLog.java:236)
>       at org.apache.solr.update.UpdateHandler.initLog(UpdateHandler.java:94)
>       at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:123)
>       at 
> org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:97)
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>       at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:476)
>       at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:544)
>       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:705)
>       ... 45 more
> Caused by: org.apache.solr.common.SolrException: _version_field must exist in 
> schema, using indexed="true" stored="true" and multiValued="false" (_version_ 
> does not exist)
>       at 
> org.apache.solr.update.VersionInfo.getAndCheckVersionField(VersionInfo.java:57)
>       at org.apache.solr.update.VersionInfo.<init>(VersionInfo.java:83)
>       at org.apache.solr.update.UpdateLog.init(UpdateLog.java:233)
>       ... 55 more
> 01-Nov-2012 16:26:15 org.apache.solr.servlet.SolrDispatchFilter init
> INFO: user.dir=/home/lewis/ASF/solr/example
> 01-Nov-2012 16:26:15 org.apache.solr.servlet.SolrDispatchFilter init
> INFO: SolrDispatchFilter.init() done
> 2012-11-01 16:26:15.228:INFO:oejs.AbstractConnector:Started 
> SocketConnector@0.0.0.0:8983
> {code}
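
The exception above says that the update log needs a _version_ field that is indexed, stored and single-valued. A minimal sketch of the schema.xml entry that satisfies this is shown below; it assumes a field type named long is already defined in the schema (as in the stock Solr 4 example schema), and it is not taken from the attached patches.

```xml
<!-- Sketch only: assumes a "long" fieldType already exists in schema.xml. -->
<!-- Required by the updateLog; multiValued defaults to false for this field. -->
<field name="_version_" type="long" indexed="true" stored="true"/>
```

The alternative is to disable the updateLog in solrconfig.xml, at the cost of the features that depend on it.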



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
