>
> I would state this as "document records," because I assume that's what
> you mean--I can't come up with other "permanent data structures." This
> certainly limits the scope of changes, mostly to inside DocumentDB.cc
> and below.
I think URL state descriptions, robots.txt content, and cookies are all
candidates to be stored on disk. One *very* interesting feature would
be a restartable crawler: run htdig, interrupt it with ^C, then run
htdig again and it restarts where it stopped. Once you store the state
of your crawler in a database, you get that advantage.
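
To make the restart idea concrete, here is a minimal sketch in C++. It is
not ht://Dig code: CrawlState, its members, and the one-record-per-line
file format are all hypothetical, and a plain text file stands in for a
real database. It only shows that once the pending and visited URLs are
written to disk, a second run can pick up where the first one stopped.

```cpp
#include <deque>
#include <fstream>
#include <set>
#include <string>

// Hypothetical crawl state; none of these names exist in ht://Dig.
struct CrawlState {
    std::deque<std::string> pending;  // URLs still to fetch, in order
    std::set<std::string> visited;    // URLs fetched successfully

    // Queue a URL unless it is already visited or already pending.
    void enqueue(const std::string& url) {
        if (visited.count(url)) return;
        for (const auto& p : pending) {
            if (p == url) return;
        }
        pending.push_back(url);
    }

    // Peek at the next URL without removing it. If the crawler is
    // interrupted mid-fetch, the URL is still pending after a restart.
    bool next(std::string& url) const {
        if (pending.empty()) return false;
        url = pending.front();
        return true;
    }

    // Mark the URL at the front of the queue as fetched.
    void done(const std::string& url) {
        if (!pending.empty() && pending.front() == url) pending.pop_front();
        visited.insert(url);
    }

    // Write a checkpoint: "V <url>" for visited, "P <url>" for pending.
    bool save(const std::string& path) const {
        std::ofstream out(path, std::ios::trunc);
        if (!out) return false;
        for (const auto& u : visited) out << "V " << u << '\n';
        for (const auto& u : pending) out << "P " << u << '\n';
        out.flush();
        return out.good();
    }

    // Replace the in-memory state with the checkpoint stored at path.
    bool load(const std::string& path) {
        std::ifstream in(path);
        if (!in) return false;
        pending.clear();
        visited.clear();
        std::string line;
        while (std::getline(in, line)) {
            if (line.size() < 3 || line[1] != ' ') continue;  // skip malformed
            const std::string url = line.substr(2);
            if (line[0] == 'V') visited.insert(url);
            else if (line[0] == 'P') pending.push_back(url);
        }
        return true;
    }
};
```

A crawler built this way would call load() at startup, save() after each
done() (or every N documents), and otherwise run its usual loop. A real
implementation would also want the write to be atomic, which is one of the
things a database gives you for free.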
--
Loic Dachary
24 av Secretan
75019 Paris
Tel: 33 1 42 45 09 16
e-mail: [EMAIL PROTECTED]
URL: http://www.senga.org/
