Hi Dennis
The logging.conf file is in the /hdd2/jobstream/ folder along with the
python script. I haven't modified the logging.conf file at all -
should I?
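
One thing that may be worth ruling out first: main() sets logconf = "logging.conf" as a relative path, so it is resolved against whatever directory the script is started from, not necessarily against /hdd2/jobstream. As far as I know, the ConfigParser behind fileConfig skips files it cannot open without complaining, which would leave an empty config and could produce exactly this "No section: 'formatters'" error. Below is a minimal sketch of a check, written for current Python 3 rather than 2.4. The names check_logging_conf and conf_path_next_to_script are made up for illustration and are not part of JobStream.py.

```python
import configparser
import os

# Sections that logging.config.fileConfig expects to find in the file.
REQUIRED_SECTIONS = ("loggers", "handlers", "formatters")


def check_logging_conf(path):
    """Return a list of problems found with a fileConfig-style logging file.

    Hypothetical helper for diagnosis only; an empty list means the file
    exists, parses, and has the three required sections.
    """
    abs_path = os.path.abspath(path)  # resolved against the current directory
    if not os.path.isfile(abs_path):
        return ["file not found: " + abs_path]

    parser = configparser.RawConfigParser()
    try:
        parser.read(abs_path)
    except configparser.Error as exc:
        return ["could not parse %s: %s" % (abs_path, exc)]

    problems = []
    for section in REQUIRED_SECTIONS:
        if not parser.has_section(section):
            problems.append("missing section: [%s]" % section)
    return problems


def conf_path_next_to_script(script_file, name="logging.conf"):
    """Build the path of a config file sitting beside the given script."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, name)
```

Calling check_logging_conf("logging.conf") from the directory the job is launched in should show whether the file is actually being found. Passing conf_path_next_to_script(__file__) to fileConfig instead of the bare file name is one possible way around a working-directory problem, though I have not tried that against the script itself.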
Regards
Justin
On 1/29/07, Dennis Kubes <[EMAIL PROTECTED]> wrote:
> Justin,
>
> Thanks for the update. I will update the script and the wiki so that it
> can run from a clean start with no previous fetches. Currently it
> assumes that some previous fetches, a crawldb, and segments already
> exist.
>
> As to your error, I think it is looking for the logging.conf file. Is
> that file in the same directory as the JobStream.py script? Near the top
> of the logging.conf file there is a section called formatters, like this:
>
> [formatters]
> keys=simple
>
>
> Dennis Kubes
>
> Justin Hartman wrote:
> > Hi Dennis
> >
> > This is a great contribution and I personally thank you for making it
> > available to the community.
> >
> > I am having a little difficulty getting it to work; perhaps you
> > can help me work out what I'm doing wrong.
> >
> > A little background first:-
> > I'm running the python script in the following location:
> > /hdd2/jobstream/JobStream.py
> > My master directory is: /hdd2/nutch/master
> > My backup directory is: /hdd2/nutch/backup
> >
> > My config in JobStream.py is as follows:-
> >
> > Line 55 to 60 configured as:
> > class JobStream:
> >     nutchdir = "/home/nutch/nutch"
> >     masterdir = "/hdd2/nutch/master"
> >     backupdir = "/hdd2/nutch/backup"
> >     log = logging.getLogger("jobstream")
> >
> > Line 377 onwards configured as:
> > def main(argv):
> >     # set the default values
> >     resume = 0
> >     execute = 0
> >     checkfile = "jobstream.stop"
> >     logconf = "logging.conf"
> >     jobdir = "/hdd2/jobstream"
> >     nutchdir = "/home/nutch/nutch"
> >     masterdir = "/hdd2/nutch/master"
> >     backupdir = "/hdd2/nutch/backup"
> >     dfsdumpdir = "/hdd2/nutch/dump"
> >     tempdir = "/hdd2/nutch/temp"
> >     splitsize = 500000
> >     fetchmerge = 3
> >
> > All the above paths are correct and have been created. The master
> > and backup directories are empty and were created specifically for
> > use by the python script.
> >
> > When executing JobStream.py -e for the first time, I got an error
> > telling me it could not find various directories within the master
> > directory, so I injected the URLs into the /hdd2/nutch/master
> > directory.
> >
> > This solved my initial error; however, I now get the error below and
> > am not sure what to do about it:
> >
> > /usr/bin/python2.4 /hdd2/jobstream/JobStream.py -e
> > Traceback (most recent call last):
> >   File "/hdd2/jobstream/JobStream.py", line 465, in ?
> >     main(sys.argv[1:])
> >   File "/hdd2/jobstream/JobStream.py", line 444, in main
> >     logging.config.fileConfig(logconf)
> >   File "logging/config.py", line 76, in fileConfig
> >   File "/usr/lib/python2.4/ConfigParser.py", line 511, in get
> >     raise NoSectionError(section)
> > ConfigParser.NoSectionError: No section: 'formatters'
> >
> > Do you have any ideas?
> >
> > Regards
> > Justin
> >
> > On 1/29/07, Dennis Kubes <[EMAIL PROTECTED]> wrote:
> >> It is up on the wiki at the following location.
> >>
> >> http://wiki.apache.org/nutch/Automating_Fetches_with_Python
> >>
> >> It has also been added to the front page.
> >>
> >> Dennis Kubes
> >>
> >> Andrzej Bialecki wrote:
> >> > Dennis Kubes wrote:
> >> >> We have a python script with logging which fully automates the
> >> >> fetching and updating process, but not the invert links or
> >> >> indexing steps. If anybody wants a copy, send me an email and I
> >> >> will send you one.
> >> >>
> >> >> We are currently working on a more in-depth framework for automating
> >> >> these types of job streams in python but that is not complete yet.
> >> >>
> >> >> Andrzej, do you think this is something we should post to the wiki?
> >> >
> >> > Sure, if it's ok for you to release it I'm sure many people would find
> >> > it useful.
> >> >
> >>
> >
> >
>
--
Regards
Justin Hartman
PGP Key ID: 102CC123
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general