Hi Dennis

The logging.conf file is in the /hdd2/jobstream/ folder along with the
Python script. I haven't modified the logging.conf file at all -
should I?
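
One thing that may be worth ruling out: logconf is set to the relative
name "logging.conf", so it is resolved against the directory the script
is started from, not the directory the script lives in. ConfigParser's
read() silently skips files it cannot open, so a file that is never
found can end up reported as a missing 'formatters' section. Below is a
minimal sketch of a check for this. It is not part of JobStream.py, the
helper names check_logging_conf and conf_next_to_script are made up for
illustration, and it is written for Python 3 rather than the 2.4 shown
in the traceback.

```python
import configparser
import os

# Sections that logging.config.fileConfig() reads from the config file.
REQUIRED_SECTIONS = ("loggers", "handlers", "formatters")


def conf_next_to_script(script_path, name="logging.conf"):
    """Return the absolute path of `name` in the directory of `script_path`."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(script_dir, name)


def check_logging_conf(path):
    """Return a list of problems with the config file (empty if none found)."""
    if not os.path.isfile(path):
        # Report the absolute path so a wrong working directory is obvious.
        return ["file not found: %s" % os.path.abspath(path)]

    # interpolation=None keeps '%(asctime)s'-style format strings untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)

    problems = []
    for section in REQUIRED_SECTIONS:
        if not parser.has_section(section):
            problems.append("missing section: [%s]" % section)
    return problems
```

Calling check_logging_conf(conf_next_to_script("/hdd2/jobstream/JobStream.py"))
should return an empty list if the file is where I think it is and has
the three sections. If that is the case, passing the absolute path to
logging.config.fileConfig() instead of the bare file name would be the
next thing to try.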

Regards
Justin

On 1/29/07, Dennis Kubes <[EMAIL PROTECTED]> wrote:
> Justin,
>
> Thanks for the update.  I will update the script and the wiki so that
> it can be run from a clean start with no previous fetches.  Currently
> it assumes that there are at least some previous fetches, with a
> crawldb and segments to go with them.
>
> As to your error, I think it is looking for the logging.conf file.  Is
> that file in the same directory as the JobStream.py script?  At the top
> of the logging.conf file there is a section called formatters like this:
>
> [formatters]
> keys=simple
>
>
> Dennis Kubes
>
> Justin Hartman wrote:
> > Hi Dennis
> >
> > This is a great contribution and I personally thank you for making it
> > available to the community.
> >
> > I am having a little difficulty getting it to work and possibly you
> > can provide some assistance in what I'm doing wrong?
> >
> > A little background first:-
> > I'm running the python script in the following location:
> > /hdd2/jobstream/JobStream.py
> > My master directory is: /hdd2/nutch/master
> > My backup directory is: /hdd2/nutch/backup
> >
> > My config in JobStream.py is as follows:-
> >
> > Line 55 to 60 configured as:
> > class JobStream:
> >  nutchdir = "/home/nutch/nutch"
> >  masterdir = "/hdd2/nutch/master"
> >  backupdir = "/hdd2/nutch/backup"
> >  log = logging.getLogger("jobstream")
> >
> > Line 377 onwards configured as:
> > def main(argv):
> >  # set the default values
> >  resume = 0
> >  execute = 0
> >  checkfile = "jobstream.stop"
> >  logconf = "logging.conf"
> >  jobdir = "/hdd2/jobstream"
> >  nutchdir = "/home/nutch/nutch"
> >  masterdir = "/hdd2/nutch/master"
> >  backupdir = "/hdd2/nutch/backup"
> >  dfsdumpdir = "/hdd2/nutch/dump"
> >  tempdir = "/hdd2/nutch/temp"
> >  splitsize = 500000
> >  fetchmerge = 3
> >
> > All the above paths are correct and exist. The master and backup
> > directories are empty and were created for use by the Python
> > script.
> >
> > When executing JobStream.py -e for the first time I got an error
> > telling me it could not find various directories within the master
> > directory so I injected the URLs into the /hdd2/nutch/master
> > directory.
> >
> > This solved my initial error; however, I now get the error below and
> > am not sure what to do about it:
> >
> > /usr/bin/python2.4 /hdd2/jobstream/JobStream.py -e
> > Traceback (most recent call last):
> >  File "/hdd2/jobstream/JobStream.py", line 465, in ?
> >    main(sys.argv[1:])
> >  File "/hdd2/jobstream/JobStream.py", line 444, in main
> >    logging.config.fileConfig(logconf)
> >  File "logging/config.py", line 76, in fileConfig
> >  File "/usr/lib/python2.4/ConfigParser.py", line 511, in get
> >    raise NoSectionError(section)
> > ConfigParser.NoSectionError: No section: 'formatters'
> >
> > Do you have any ideas?
> >
> > Regards
> > Justin
> >
> > On 1/29/07, Dennis Kubes <[EMAIL PROTECTED]> wrote:
> >> It is up on the wiki at the following location.
> >>
> >> http://wiki.apache.org/nutch/Automating_Fetches_with_Python
> >>
> >> It has also been added to the front page.
> >>
> >> Dennis Kubes
> >>
> >> Andrzej Bialecki wrote:
> >> > Dennis Kubes wrote:
> >> >> We have a Python script with logging which fully automates the
> >> >> fetching and updating process, but not the invert links or
> >> >> indexing steps.  If anybody wants a copy, send me an email and I
> >> >> will send you one.
> >> >>
> >> >> We are currently working on a more in-depth framework for automating
> >> >> these types of job streams in python but that is not complete yet.
> >> >>
> >> >> Andrzej, do you think this is something we should post to the wiki?
> >> >
> >> > Sure, if it's ok for you to release it I'm sure many people would find
> >> > it useful.
> >> >
> >>
> >
> >
>


-- 
Regards
Justin Hartman
PGP Key ID: 102CC123

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
