[ 
https://issues.apache.org/jira/browse/NUTCH-1393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13612275#comment-13612275
 ] 

Roland von Herget commented on NUTCH-1393:
------------------------------------------

It's not a problem for me; I found a way around it, but the "Usage" output 
does not show it.
I think it's inconsistent right now. Without parameters it tells us:
{code}
# ./bin/nutch generate
Usage: GeneratorJob [-topN N] [-crawlId id] [-noFilter] [-noNorm] [-adddays numDays]
    -topN <N>      - number of top URLs to be selected, default is Long.MAX_VALUE
    -crawlId <id>  - the id to prefix the schemas to operate on,
                    (default: storage.crawl.id)");
    -noFilter      - do not activate the filter plugin to filter the url, default is true
    -noNorm        - do not activate the normalizer plugin to normalize the url, default is true
    -adddays       - Adds numDays to the current time to facilitate crawling urls already
                     fetched sooner then db.default.fetch.interval. Default value is 0.
----------------------
Please set the params.
{code}

As far as I know, parameters in "[]" are optional, so it should run without any 
parameters, e.g.:
{code}
# java
Usage: java [-options] class [args...]
           (to execute a class)
   or  java [-options] -jar jarfile [args...]
           (to execute a jar file)
{code}
("-options" and "args" are optional, "class" must be set)

So I think there are two ways to make it more consistent:
- create something like a '-run' switch, which is mandatory
- move the usage info to a '-help' or '-h' switch and run by default
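
A minimal sketch of the second option (print usage only on '-help' or '-h', run 
by default), just to illustrate the idea. The class name UsageSketch and the 
handle() method are hypothetical and are not taken from the Nutch source or 
from any of the attached patches; only two of the real switches are parsed here.

```java
public class UsageSketch {
    // Hypothetical usage text; the added [-help] switch is an assumption.
    static final String USAGE =
        "Usage: GeneratorJob [-topN N] [-crawlId id] [-noFilter] [-noNorm] "
        + "[-adddays numDays] [-help]";

    /**
     * Returns the usage text when help is requested or the arguments are
     * invalid; otherwise returns a summary of the settings the job would
     * run with. With no arguments at all it runs with the defaults.
     */
    static String handle(String[] args) {
        long topN = Long.MAX_VALUE;   // default, as stated in the usage text
        boolean filter = true;        // filtering is on unless -noFilter is given
        for (int i = 0; i < args.length; i++) {
            switch (args[i]) {
                case "-help":
                case "-h":
                    return USAGE;
                case "-topN":
                    if (i + 1 >= args.length) {
                        return USAGE; // -topN needs a value
                    }
                    topN = Long.parseLong(args[++i]);
                    break;
                case "-noFilter":
                    filter = false;
                    break;
                default:
                    return USAGE;     // unknown switch
            }
        }
        return "run topN=" + topN + " filter=" + filter;
    }

    public static void main(String[] args) {
        System.out.println(handle(args));
    }
}
```

With this layout every switch shown in "[]" really is optional, which matches 
the usage line. The first option ('-run') would instead keep usage as the 
default and make the job start only when that switch is present.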

                
> Display consistent usage of GeneratorJob with 1.X
> -------------------------------------------------
>
>                 Key: NUTCH-1393
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1393
>             Project: Nutch
>          Issue Type: Bug
>          Components: administration gui, generator
>    Affects Versions: nutchgora
>            Reporter: Lewis John McGibbney
>             Fix For: 2.2
>
>         Attachments: NUTCH-1393.patch, NUTCH-1393-v2.patch, 
> NUTCH-1393-v3.patch
>
>
> If we pass the generate argument to the nutch script, the Generator 
> auto-springs into action and begins generating fetchlists. This should not be 
> the case; instead it should print traditional usage to stdout. An example is 
> below:
> {code}
> lewis@lewis:~/ASF/nutchgora/runtime/local$ ./bin/nutch generate
> GeneratorJob: Selecting best-scoring urls due for fetch.
> GeneratorJob: starting
> GeneratorJob: filtering: true
> GeneratorJob: done
> GeneratorJob: generated batch id: 1339628223-1694200031
> {code}
> All I wanted to do was get the usage params printed to stdout, but instead it 
> generated my batch willy-nilly.
