[jira] [Updated] (NUTCH-3014) Standardize Job names

2023-10-22 Thread Lewis John McGibbney (Jira)


 [ 
https://issues.apache.org/jira/browse/NUTCH-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-3014:

Description: 
There is a large degree of variability when we set the job name:

 

{{Job job = NutchJob.getInstance(getConf());}}

{{job.setJobName("read " + segment);}}

 

Some examples mention the job name, others don't. Some use upper case, others 
don't, etc.

I think we can standardize the NutchJob job names. This would help when 
filtering jobs in YARN ResourceManager UI as well.

I propose we implement the following convention
 * *Nutch* (mandatory) - static value which prepends the job name, assists with 
distinguishing the Job as a NutchJob and making it easily findable.
 * *${ClassName}* (mandatory) - literally the name of the Class the job is 
encoded in
 * *${additional info}* (optional) - value could further distinguish the type 
of job (LinkRank Counter, LinkRank Initializer, LinkRank Inverter, etc.)

_*Nutch ${ClassName}*: *${additional info}*_

_Examples:_
 * _Nutch LinkRank: Inverter_
 * _Nutch CrawlDb: + $crawldb_
 * _Nutch LinkDbReader: + $linkdb_
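
A minimal sketch of how the convention above could be applied in code. The {{JobNameSketch}} class, its {{of}} methods, and the nested stand-in {{LinkRank}} class are hypothetical names used only for illustration; they are not part of Nutch, and the issue does not prescribe an implementation.

```java
/** Hypothetical helper illustrating the proposed naming convention; not part of Nutch. */
final class JobNameSketch {

  /** Static prefix that marks the job as a Nutch job. */
  private static final String PREFIX = "Nutch";

  private JobNameSketch() {
  }

  /** Builds "Nutch ${ClassName}: ${additional info}"; the info part is optional. */
  static String of(Class<?> jobClass, String additionalInfo) {
    String name = PREFIX + " " + jobClass.getSimpleName();
    if (additionalInfo == null || additionalInfo.trim().isEmpty()) {
      // No additional info: only the two mandatory parts are emitted.
      return name;
    }
    return name + ": " + additionalInfo.trim();
  }

  /** Builds "Nutch ${ClassName}" with no additional info. */
  static String of(Class<?> jobClass) {
    return of(jobClass, null);
  }

  /** Stand-in for a real job class, used only so the sketch is self-contained. */
  static final class LinkRank {
  }

  public static void main(String[] args) {
    System.out.println(of(LinkRank.class, "Inverter")); // Nutch LinkRank: Inverter
    System.out.println(of(LinkRank.class));             // Nutch LinkRank
  }
}
```

Under these assumptions, a job class could pass the result to {{job.setJobName(...)}} in place of an ad hoc string such as {{"read " + segment}}, so every job shares the same prefix and is easy to filter.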

Thanks for any suggestions/comments.

  was:
There is a large degree of variability when we set the job name:

 

{{Job job = NutchJob.getInstance(getConf());}}

{{job.setJobName("read " + segment);}}

 

Some examples mention the job name, others don't. Some use upper case, others 
don't, etc.

I think we can standardize the NutchJob job names. This would help when 
filtering jobs in YARN ResourceManager UI as well.

I propose we implement the following convention
 * *Nutch* (mandatory) - static value which prepends the job name, assists with 
distinguishing the Job as a NutchJob and making it easily findable.
 * *${ClassName}* (mandatory) - literally the name of the Class the job is 
encoded in
 * *${additional info}* (optional) - value could further distinguish the type 
of job (LinkRank Counter, LinkRank Initializer, LinkRank Inverter, etc.)

_*Nutch ${ClassName}*: *${additional info}*_

_Examples:_
 * _Nutch LinkRank Inverter_
 * _Nutch CrawlDb + $crawldb_
 * _Nutch LinkDbReader + $linkdb_

Thanks for any suggestions/comments.


> Standardize Job names
> ---------------------
>
> Key: NUTCH-3014
> URL: https://issues.apache.org/jira/browse/NUTCH-3014
> Project: Nutch
> Issue Type: Improvement
> Components: configuration, runtime
> Affects Versions: 1.19
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Minor
> Fix For: 1.20
>
>
> There is a large degree of variability when we set the job name:
>  
> {{Job job = NutchJob.getInstance(getConf());}}
> {{job.setJobName("read " + segment);}}
>  
> Some examples mention the job name, others don't. Some use upper case, others 
> don't, etc.
> I think we can standardize the NutchJob job names. This would help when 
> filtering jobs in YARN ResourceManager UI as well.
> I propose we implement the following convention
>  * *Nutch* (mandatory) - static value which prepends the job name, assists 
> with distinguishing the Job as a NutchJob and making it easily findable.
>  * *${ClassName}* (mandatory) - literally the name of the Class the job is 
> encoded in
>  * *${additional info}* (optional) - value could further distinguish the type 
> of job (LinkRank Counter, LinkRank Initializer, LinkRank Inverter, etc.)
> _*Nutch ${ClassName}*: *${additional info}*_
> _Examples:_
>  * _Nutch LinkRank: Inverter_
>  * _Nutch CrawlDb: + $crawldb_
>  * _Nutch LinkDbReader: + $linkdb_
> Thanks for any suggestions/comments.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (NUTCH-3014) Standardize Job names

2023-10-22 Thread Lewis John McGibbney (Jira)


 [ 
https://issues.apache.org/jira/browse/NUTCH-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-3014:

Description: 
There is a large degree of variability when we set the job name:

 

{{Job job = NutchJob.getInstance(getConf());}}

{{job.setJobName("read " + segment);}}

 

Some examples mention the job name, others don't. Some use upper case, others 
don't, etc.

I think we can standardize the NutchJob job names. This would help when 
filtering jobs in YARN ResourceManager UI as well.

I propose we implement the following convention
 * *Nutch* (mandatory) - static value which prepends the job name, assists with 
distinguishing the Job as a NutchJob and making it easily findable.
 * *${ClassName}* (mandatory) - literally the name of the Class the job is 
encoded in
 * *${additional info}* (optional) - value could further distinguish the type 
of job (LinkRank Counter, LinkRank Initializer, LinkRank Inverter, etc.)

_*Nutch ${ClassName}*: *${additional info}*_

_Examples:_
 * _Nutch LinkRank Inverter_
 * _Nutch CrawlDb + $crawldb_
 * _Nutch LinkDbReader + $linkdb_

Thanks for any suggestions/comments.

  was:
There is a large degree of variability when we set the job name:

 

{{Job job = NutchJob.getInstance(getConf());}}

{{job.setJobName("read " + segment);}}

 

Some examples mention the job name, others don't. Some use upper case, others 
don't, etc.

I think we can standardize the NutchJob job names. This would help when 
filtering jobs in YARN ResourceManager UI as well.

I propose we implement the following convention
 * *Nutch* (mandatory) - static value which prepends the job name, assists with 
distinguishing the Job as a NutchJob and making it easily findable.
 * *${ClassName}* (mandatory) - literally the name of the Class the job is 
encoded in
 * *${additional info}* (optional) - value could further distinguish the type 
of job (LinkRank Counter, LinkRank Initializer, LinkRank Inverter, etc.)

_*Nutch ${ClassName}* *${additional info}*_

_Examples:_
 * _Nutch LinkRank Inverter_
 * _Nutch CrawlDb + $crawldb_
 * _Nutch LinkDbReader + $linkdb_

Thanks for any suggestions/comments.


> Standardize Job names
> ---------------------
>
> Key: NUTCH-3014
> URL: https://issues.apache.org/jira/browse/NUTCH-3014
> Project: Nutch
> Issue Type: Improvement
> Components: configuration, runtime
> Affects Versions: 1.19
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Minor
> Fix For: 1.20
>
>
> There is a large degree of variability when we set the job name:
>  
> {{Job job = NutchJob.getInstance(getConf());}}
> {{job.setJobName("read " + segment);}}
>  
> Some examples mention the job name, others don't. Some use upper case, others 
> don't, etc.
> I think we can standardize the NutchJob job names. This would help when 
> filtering jobs in YARN ResourceManager UI as well.
> I propose we implement the following convention
>  * *Nutch* (mandatory) - static value which prepends the job name, assists 
> with distinguishing the Job as a NutchJob and making it easily findable.
>  * *${ClassName}* (mandatory) - literally the name of the Class the job is 
> encoded in
>  * *${additional info}* (optional) - value could further distinguish the type 
> of job (LinkRank Counter, LinkRank Initializer, LinkRank Inverter, etc.)
> _*Nutch ${ClassName}*: *${additional info}*_
> _Examples:_
>  * _Nutch LinkRank Inverter_
>  * _Nutch CrawlDb + $crawldb_
>  * _Nutch LinkDbReader + $linkdb_
> Thanks for any suggestions/comments.





[jira] [Updated] (NUTCH-3014) Standardize Job names

2023-10-22 Thread Lewis John McGibbney (Jira)


 [ 
https://issues.apache.org/jira/browse/NUTCH-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-3014:

Summary: Standardize Job names  (was: Standardize NutchJob job names)

> Standardize Job names
> ---------------------
>
> Key: NUTCH-3014
> URL: https://issues.apache.org/jira/browse/NUTCH-3014
> Project: Nutch
> Issue Type: Improvement
> Components: configuration, runtime
> Affects Versions: 1.19
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Minor
> Fix For: 1.20
>
>
> There is a large degree of variability when we set the job name:
>  
> {{Job job = NutchJob.getInstance(getConf());}}
> {{job.setJobName("read " + segment);}}
>  
> Some examples mention the job name, others don't. Some use upper case, others 
> don't, etc.
> I think we can standardize the NutchJob job names. This would help when 
> filtering jobs in YARN ResourceManager UI as well.
> I propose we implement the following convention
>  * *Nutch* (mandatory) - static value which prepends the job name, assists 
> with distinguishing the Job as a NutchJob and making it easily findable.
>  * *${ClassName}* (mandatory) - literally the name of the Class the job is 
> encoded in
>  * *${additional info}* (optional) - value could further distinguish the type 
> of job (LinkRank Counter, LinkRank Initializer, LinkRank Inverter, etc.)
> _*Nutch ${ClassName}* *${additional info}*_
> _Examples:_
>  * _Nutch LinkRank Inverter_
>  * _Nutch CrawlDb + $crawldb_
>  * _Nutch LinkDbReader + $linkdb_
> Thanks for any suggestions/comments.


