Re: [DISCUSS] contents of nutch release artifact

2009-03-19 Thread Eric J. Christeson


On Mar 19, 2009, at 8:48 AM, Sami Siren wrote:



Jukka Zitting was suggesting we should rethink the Nutch release  
packaging because of its size. I don't see this as a blocker for  
1.0 but we could perhaps start the discussion about this anyway so  
throw in your opinions...


+1 for both binary and source releases.  As I see it, it's not much  
more work and it gives people options.  If we're looking to get more  
interest in Nutch, making things as easy as possible for people is a  
good thing.


Eric

--
Eric J. Christeson  
eric.christe...@ndsu.edu

Enterprise Computing and Infrastructure    (701) 231-8693 (Voice)
North Dakota State University, Fargo, North Dakota, USA



[jira] Created: (NUTCH-713) Config options for webgraph Scoring not documented

2009-03-09 Thread Eric J. Christeson (JIRA)
Config options for webgraph Scoring not documented
--------------------------------------------------

                 Key: NUTCH-713
                 URL: https://issues.apache.org/jira/browse/NUTCH-713
             Project: Nutch
          Issue Type: Improvement
          Components: indexer
    Affects Versions: 1.0.0
         Environment: All
            Reporter: Eric J. Christeson
            Priority: Minor


There are a number of properties for webgraph scoring that are only documented 
in code.  I have found these:

link.ignore.internal.host
link.ignore.internal.domain
link.ignore.limit.domain
link.ignore.limit.host
link.ignore.limit.page
link.loops.depth
link.analyze.initial.score
link.analyze.damping.factor
link.analyze.rank.one
link.analyze.iteration
link.analyze.num.iterations

I have a patch to add these to conf/nutch-default.xml with the best description 
I could find.
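
For illustration only, here is a minimal sketch of how two of the properties listed above might appear in conf/nutch-default.xml, using the Hadoop-style property layout that file follows. The values and description text are assumptions made up for this example; they are not taken from the attached patch and should not be read as confirmed defaults.

```xml
<!-- Illustrative sketch only: the values and descriptions below are
     assumptions, not the contents of the NUTCH-713 patch. -->
<property>
  <name>link.analyze.num.iterations</name>
  <value>10</value>
  <description>Assumed meaning: how many iterations of link analysis
  to run.</description>
</property>

<property>
  <name>link.analyze.damping.factor</name>
  <value>0.85</value>
  <description>Assumed meaning: damping factor applied while link
  scores are computed.</description>
</property>
```

Site-specific overrides of entries like these would normally go in conf/nutch-site.xml rather than in nutch-default.xml itself.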

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (NUTCH-713) Config options for webgraph Scoring not documented

2009-03-09 Thread Eric J. Christeson (JIRA)

 [ https://issues.apache.org/jira/browse/NUTCH-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric J. Christeson updated NUTCH-713:
-------------------------------------

Attachment: webgraph-scoring.diff

Patch to add config options to conf/nutch-default.xml

 Config options for webgraph Scoring not documented
 --

                 Key: NUTCH-713
                 URL: https://issues.apache.org/jira/browse/NUTCH-713
             Project: Nutch
          Issue Type: Improvement
          Components: indexer
    Affects Versions: 1.0.0
         Environment: All
            Reporter: Eric J. Christeson
            Priority: Minor
         Attachments: webgraph-scoring.diff

   Original Estimate: 1h
  Remaining Estimate: 1h

 There are a number of properties for webgraph scoring that are only 
 documented in code.  I have found these:
 link.ignore.internal.host
 link.ignore.internal.domain
 link.ignore.limit.domain
 link.ignore.limit.host
 link.ignore.limit.page
 link.loops.depth
 link.analyze.initial.score
 link.analyze.damping.factor
 link.analyze.rank.one
 link.analyze.iteration
 link.analyze.num.iterations
 I have a patch to add these to conf/nutch-default.xml with the best 
 description I could find.




Re: [VOTE] Release Apache Nutch 1.0

2009-03-09 Thread Eric J. Christeson


non-binding +1

--
Eric J. Christeson  
eric.christe...@ndsu.edu

Enterprise Computing and Infrastructure    (701) 231-8693 (Voice)
North Dakota State University





NUTCH-635 LinkAnalysis Tool for Nutch

2009-02-12 Thread Eric J. Christeson
I went through org.apache.nutch.scoring.webgraph.*, found all the
config settings I could, threw them into nutch-default.xml, and tried
to document them.  Who wants the patches?


Eric
--
Eric J. Christeson  
eric.christe...@ndsu.edu

Enterprise Computing and Infrastructure    (701) 231-8693 (Voice)
North Dakota State University