[jira] [Commented] (NUTCH-2162) Nutch Webapp Crawl fails as it tries to index
[ https://issues.apache.org/jira/browse/NUTCH-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826439#comment-15826439 ]

Lewis John McGibbney commented on NUTCH-2162:
---------------------------------------------

The workaround is to set up a Solr server locally so that the Webapp can connect to it. By default, when you start the Webapp, it does want to index, but no Solr server is started. You need to manually ensure that a Solr server is available.

> Nutch Webapp Crawl fails as it tries to index
> ---------------------------------------------
>
>                 Key: NUTCH-2162
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2162
>             Project: Nutch
>          Issue Type: Bug
>          Components: web gui
>    Affects Versions: 1.11
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>             Fix For: 1.13
>
>         Attachments: nutch_webapp.log
>
>
> Right now a crawl task fails on the trunk version of the WebApp due to it
> attempting to index. No indexer is defined by default so this is a major bug.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (NUTCH-2162) Nutch Webapp Crawl fails as it tries to index
[ https://issues.apache.org/jira/browse/NUTCH-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823524#comment-15823524 ]

suyash commented on NUTCH-2162:
-------------------------------

Lewis, I am also trying to get Nutch to work with Solr by setting the property solr.server.url in nutch-site.xml, but the crawl still stalls at 83% in the GUI. What is the workaround here?
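
For reference, a minimal sketch of the override described in the comment above, as it might appear in conf/nutch-site.xml. The property name solr.server.url comes from the comment; the URL value is an assumption (a local Solr on port 8983 with a core named collection1) and should be replaced with the address of the Solr instance actually in use.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Assumed value: a Solr server on this machine, default port 8983,
       core "collection1". Adjust host, port, and core name as needed. -->
  <property>
    <name>solr.server.url</name>
    <value>http://localhost:8983/solr/collection1</value>
  </property>
</configuration>
```

Setting the property does not start Solr. The server still has to be running and reachable at that URL when the crawl reaches its indexing step.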
[jira] [Commented] (NUTCH-2162) Nutch Webapp Crawl fails as it tries to index
[ https://issues.apache.org/jira/browse/NUTCH-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994168#comment-14994168 ]

Lewis John McGibbney commented on NUTCH-2162:
---------------------------------------------

Ack. I also got it working well with Solr and ES, but, as the crawl script indicates, an indexing engine is not a prerequisite. Making it mandatory from within the GUI is backwards IMHO.

I think driving for crawl metrics via a statistics panel would be a good goal for this web app. An indexing engine may not be required for that either if we can make the data accessible through REST.

On Friday, November 6, 2015, Chris A. Mattmann (JIRA)

--
*Lewis*

> Nutch Webapp Crawl fails as it tries to index
> ---------------------------------------------
>
>                 Key: NUTCH-2162
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2162
>             Project: Nutch
>          Issue Type: Bug
>          Components: web gui
>    Affects Versions: 1.11
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>             Fix For: 1.11
>
>         Attachments: nutch_webapp.log
>
>
> Right now a crawl task fails on the trunk version of the WebApp due to it
> attempting to index. No indexer is defined by default so this is a major bug.
[jira] [Commented] (NUTCH-2162) Nutch Webapp Crawl fails as it tries to index
[ https://issues.apache.org/jira/browse/NUTCH-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994029#comment-14994029 ]

Chris A. Mattmann commented on NUTCH-2162:
------------------------------------------

So I tried this out. It actually works fine as long as you keep everything at the defaults, e.g., if you install Solr on port 8983, install the Nutch schema in that Solr, and install it into the default collection 1. I have it fully working with that config. It's brittle, but it doesn't require a code update and it works.

One other thing to note: you can't change properties (yet) from the Nutch config, so you *must* update http.agent.name to something in your runtime/*/conf/nutch-{site|default}.xml file before starting the web services REST layer and using the Wicket App.

One other thing we should think about is Maven, and then Maven WAR overlays here, once we get a version of Nutch working with Maven.
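
As an illustration of the http.agent.name step in the comment above, a minimal sketch of the override in the nutch-site.xml under the runtime conf directory. The agent name shown is a hypothetical placeholder, not a value from the thread; any non-empty name identifying your crawler should do.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- "MyNutchCrawler" is a made-up example value. Set this before
       starting the REST layer, since the web app cannot change it yet. -->
  <property>
    <name>http.agent.name</name>
    <value>MyNutchCrawler</value>
  </property>
</configuration>
```

Because the properties cannot yet be changed from the web app, the file has to be edited before the REST service is started; a change made afterwards would presumably need a restart to take effect.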
[jira] [Commented] (NUTCH-2162) Nutch Webapp Crawl fails as it tries to index
[ https://issues.apache.org/jira/browse/NUTCH-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14993376#comment-14993376 ]

Lewis John McGibbney commented on NUTCH-2162:
---------------------------------------------

In all honesty, a workaround for this is merely to comment out the following line:

https://github.com/apache/nutch/blob/trunk/src/java/org/apache/nutch/webui/client/impl/RemoteCommandsBatchFactory.java#L59

However, the correct solution is to bake in optional logic that lets the user decide whether indexing is required. I'll have a crack at it when I can.