[ https://issues.apache.org/jira/browse/NUTCH-1393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13612275#comment-13612275 ]
Roland von Herget commented on NUTCH-1393:
------------------------------------------
This is not a problem for me; I found a way around it, but the "Usage" output
does not show it.
I think the behaviour is inconsistent right now. Without parameters it tells us:
{code}
# ./bin/nutch generate
Usage: GeneratorJob [-topN N] [-crawlId id] [-noFilter] [-noNorm] [-adddays numDays]
    -topN <N> - number of top URLs to be selected, default is Long.MAX_VALUE
    -crawlId <id> - the id to prefix the schemas to operate on, (default: storage.crawl.id)");
    -noFilter - do not activate the filter plugin to filter the url, default is true
    -noNorm - do not activate the normalizer plugin to normalize the url, default is true
    -adddays - Adds numDays to the current time to facilitate crawling urls already fetched sooner then db.default.fetch.interval. Default value is 0.
----------------------
Please set the params.
{code}
As far as I know, parameters in "[]" are optional, so the command should run
without any parameters. Compare the usage printed by java:
{code}
# java
Usage: java [-options] class [args...]
(to execute a class)
or java [-options] -jar jarfile [args...]
(to execute a jar file)
{code}
("-options" and "args" are optional, "class" must be set)
So I think there are two ways to make it more consistent:
- add a mandatory switch, something like '-run'
- move the usage info to a '-help' or '-h' switch and run by default
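
The second option could look roughly like the sketch below. This is only an illustration of the idea, not the actual GeneratorJob code: the class name GeneratorUsageSketch, the Action enum and the decide method are all made up for this example, and the "-help"/"-h" switches are the proposed ones, not existing options.

```java
public class GeneratorUsageSketch {

    /** What the command line asks the tool to do (hypothetical type). */
    enum Action { RUN, SHOW_USAGE }

    // Usage line with every parameter optional, matching the "[]" notation.
    // The trailing [-help] is the proposed switch, not an existing one.
    static final String USAGE =
        "Usage: GeneratorJob [-topN N] [-crawlId id] [-noFilter] [-noNorm]"
        + " [-adddays numDays] [-help]";

    /** Print usage only when explicitly asked; otherwise run with defaults. */
    static Action decide(String[] args) {
        for (String arg : args) {
            if ("-help".equals(arg) || "-h".equals(arg)) {
                return Action.SHOW_USAGE;
            }
        }
        return Action.RUN;
    }

    public static void main(String[] args) {
        if (decide(args) == Action.SHOW_USAGE) {
            System.out.println(USAGE);
            return;
        }
        // A real job would parse the remaining options and start here.
        System.out.println("GeneratorJob: starting (sketch only, nothing is generated)");
    }
}
```

With this shape, running without arguments starts the job with defaults, which matches what the "[]" notation promises, and the usage text is still available on request.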
> Display consistent usage of GeneratorJob with 1.X
> -------------------------------------------------
>
> Key: NUTCH-1393
> URL: https://issues.apache.org/jira/browse/NUTCH-1393
> Project: Nutch
> Issue Type: Bug
> Components: administration gui, generator
> Affects Versions: nutchgora
> Reporter: Lewis John McGibbney
> Fix For: 2.2
>
> Attachments: NUTCH-1393.patch, NUTCH-1393-v2.patch,
> NUTCH-1393-v3.patch
>
>
> If we pass the generate argument to the nutch script, the Generator
> auto-springs into action and begins generating fetchlists. This should not be
> the case; instead it should print traditional usage to stdout. An example is
> below
> {code}
> lewis@lewis:~/ASF/nutchgora/runtime/local$ ./bin/nutch generate
> GeneratorJob: Selecting best-scoring urls due for fetch.
> GeneratorJob: starting
> GeneratorJob: filtering: true
> GeneratorJob: done
> GeneratorJob: generated batch id: 1339628223-1694200031
> {code}
> All I wanted to do was get the usage params printed to stdout but instead it
> generated my batch willy nilly.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira