
[jira] [Commented] (NUTCH-2162) Nutch Webapp Crawl fails as it tries to index

2017-01-17 Thread Lewis John McGibbney (JIRA)


[ 
https://issues.apache.org/jira/browse/NUTCH-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826439#comment-15826439
 ] 

Lewis John McGibbney commented on NUTCH-2162:
-

The workaround is to set up a Solr server locally so that the Webapp can connect 
to it. By default, when you start the Webapp, it does want to index, but no Solr 
server is started for you. You need to manually ensure that a Solr server is available.
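
One way to confirm that precondition before starting a crawl from the Webapp is 
a quick reachability check. The following is only a minimal sketch and not part 
of Nutch: the function name solr_is_reachable is hypothetical, and the URL in the 
usage note assumes a default local Solr install, so adjust it for your setup.

```python
import urllib.error
import urllib.request


def solr_is_reachable(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at base_url, False otherwise.

    This only checks that something responds over HTTP. It does not verify
    that the Nutch schema is installed or that a given collection exists.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return False
```

For example, solr_is_reachable("http://localhost:8983/solr/") should return False 
while no Solr server is running on that (assumed) address.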

> Nutch Webapp Crawl fails as it tries to index
> -
>
> Key: NUTCH-2162
> URL: https://issues.apache.org/jira/browse/NUTCH-2162
> Project: Nutch
>  Issue Type: Bug
>  Components: web gui
>Affects Versions: 1.11
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 1.13
>
> Attachments: nutch_webapp.log
>
>
> Right now a crawl task fails on the trunk version of the WebApp due to it 
> attempting to index. No indexer is defined by default so this is a major bug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (NUTCH-2162) Nutch Webapp Crawl fails as it tries to index

2017-01-15 Thread suyash (JIRA)


[ 
https://issues.apache.org/jira/browse/NUTCH-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823524#comment-15823524
 ] 

suyash commented on NUTCH-2162:
---

Lewis, I am also trying to get Nutch to work well with Solr by setting the 
property solr.server.url in nutch-site.xml, but it still stalls at 83% in the GUI. 
What is the workaround here?
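
For reference, setting a property such as solr.server.url in a Nutch config file 
can be sketched as below. This is an illustration under assumptions, not a fix 
for the stall: the helper name set_property is hypothetical, and the Solr URL 
used in the usage note is only an example value.

```python
import xml.etree.ElementTree as ET


def set_property(config_xml: str, name: str, value: str) -> str:
    """Set a <property> in a <configuration> document and return the new XML.

    If a property with this name exists, its value is overwritten;
    otherwise a new <property> element is appended.
    """
    root = ET.fromstring(config_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # No existing property matched, so add one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")
```

Calling set_property(text, "solr.server.url", "http://localhost:8983/solr") on the 
contents of nutch-site.xml returns the updated document as a string. Note that 
ElementTree drops XML comments and the XML declaration when it re-serializes, so 
editing the file by hand may be preferable if you want to keep those.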


[jira] [Commented] (NUTCH-2162) Nutch Webapp Crawl fails as it tries to index

2015-11-06 Thread Lewis John McGibbney (JIRA)


[ 
https://issues.apache.org/jira/browse/NUTCH-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994168#comment-14994168
 ] 

Lewis John McGibbney commented on NUTCH-2162:
-

Ack.
I also got it working well with Solr and ES, but an indexing engine is not a
prerequisite, as indicated in the crawl script. Making it mandatory from
within the GUI is backwards IMHO.
I think driving for crawl metrics via a statistics panel would be a good goal
for this web app. An indexing engine may not be required for that either if
we can make the data accessible through REST.

-- 
*Lewis*


> Nutch Webapp Crawl fails as it tries to index
> -
>
> Key: NUTCH-2162
> URL: https://issues.apache.org/jira/browse/NUTCH-2162
> Project: Nutch
>  Issue Type: Bug
>  Components: web gui
>Affects Versions: 1.11
>Reporter: Lewis John McGibbney
>Assignee: Lewis John McGibbney
> Fix For: 1.11
>
> Attachments: nutch_webapp.log
>
>
> Right now a crawl task fails on the trunk version of the WebApp due to it 
> attempting to index. No indexer is defined by default so this is a major bug.




[jira] [Commented] (NUTCH-2162) Nutch Webapp Crawl fails as it tries to index

2015-11-06 Thread Chris A. Mattmann (JIRA)


[ 
https://issues.apache.org/jira/browse/NUTCH-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994029#comment-14994029
 ] 

Chris A. Mattmann commented on NUTCH-2162:
--

So I tried this out. It actually works fine as long as you keep everything at the 
defaults, e.g., you install Solr on port 8983, install the Nutch schema in 
that Solr, and install it into collection 1 as per the default. I have it fully 
working with that config. It's brittle, but it doesn't require a code update and it 
works. 

One other thing to note: you can't change properties (yet) from the Nutch 
config, so you *must* set http.agent.name to something in your 
runtime/*/conf/nutch-{site|default}.xml file before starting the web services 
REST layer and using the Wicket App.
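
A pre-flight check for that http.agent.name requirement could look like the 
sketch below. It is an illustration only: the helper names get_property and 
agent_name_is_set are hypothetical, and the code assumes the file uses the usual 
configuration/property/name/value layout.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def get_property(config_xml: str, name: str) -> Optional[str]:
    """Return the stripped value of the named property, or None if absent."""
    root = ET.fromstring(config_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return (prop.findtext("value") or "").strip()
    return None


def agent_name_is_set(config_xml: str) -> bool:
    """True only if http.agent.name is present and non-empty."""
    return bool(get_property(config_xml, "http.agent.name"))
```

Running agent_name_is_set on the contents of the config file before starting the 
REST layer gives an early warning if the property is missing or still empty.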

One other thing we should think about is Maven, and then Maven WAR overlays 
here, once we get a version of Nutch working with Maven.


[jira] [Commented] (NUTCH-2162) Nutch Webapp Crawl fails as it tries to index

2015-11-06 Thread Lewis John McGibbney (JIRA)


[ 
https://issues.apache.org/jira/browse/NUTCH-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14993376#comment-14993376
 ] 

Lewis John McGibbney commented on NUTCH-2162:
-

In all honesty, a workaround for this is merely to comment out the following 
line:
https://github.com/apache/nutch/blob/trunk/src/java/org/apache/nutch/webui/client/impl/RemoteCommandsBatchFactory.java#L59
However, the correct solution is to bake in optional logic which allows the 
user to determine whether indexing is required or not. I'll have a crack when I 
can.
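
The idea of making indexing optional can be sketched as follows. This is a 
language-neutral illustration written in Python, not the Java code in 
RemoteCommandsBatchFactory: the function build_batch, the index flag, and the 
step names are all hypothetical placeholders and do not claim to mirror the 
real factory.

```python
from typing import List

# Placeholder step names for one crawl round (assumed, for illustration).
ROUND_STEPS = ["generate", "fetch", "parse", "updatedb"]


def build_batch(rounds: int, index: bool = False) -> List[str]:
    """Build an ordered list of crawl steps, adding "index" only on request."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    steps = ["inject"]
    for _ in range(rounds):
        steps.extend(ROUND_STEPS)
    if index:
        # Only schedule indexing when the user has asked for it, so a crawl
        # can complete without any indexing engine being available.
        steps.append("index")
    return steps
```

With the flag defaulting to off, a crawl started without an indexer never 
schedules the index step; a user who has one configured would opt in explicitly.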

