Geoff Hutchison writes:
> >Finally I will create a class called a Crawler (or maybe I'll use
> >Retriever) which coordinates the traversal of the doc tree. Its only
> >callback from the Parsable will be got_href, which obviously it needs
>
> I think you'd also want a got_redirect in some form too, to handle
> the META Http-Equiv refreshes and so on.

I was thinking that it should be the parser that knows how to derive
absolute URLs from relative ones. After all, resolving them is pretty
HTML-specific, no?

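To make that concrete, here is a rough sketch of the split I have in
mind. It is written in Python only for brevity, not in ht://Dig's C++,
and it is not code from my branch. Crawler, got_href and got_redirect
are the names from this thread; HtmlParsable, base_url and the two
result lists are made up for the illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class Crawler:
    """Hypothetical coordinator: it only ever sees absolute URLs."""

    def __init__(self):
        self.hrefs = []
        self.redirects = []

    def got_href(self, url):
        self.hrefs.append(url)

    def got_redirect(self, url):
        self.redirects.append(url)


class HtmlParsable(HTMLParser):
    """Hypothetical HTML parser that resolves relative URLs itself."""

    def __init__(self, base_url, crawler):
        super().__init__()
        self.base_url = base_url
        self.crawler = crawler

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "base" and attrs.get("href"):
            # <base href> changes how later relative links resolve.
            self.base_url = urljoin(self.base_url, attrs["href"])
        elif tag == "a" and attrs.get("href"):
            self.crawler.got_href(urljoin(self.base_url, attrs["href"]))
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            # The content attribute looks like "5; url=other.html".
            content = attrs.get("content") or ""
            _, _, target = content.partition(";")
            name, _, value = target.strip().partition("=")
            if name.strip().lower() == "url" and value.strip():
                self.crawler.got_redirect(urljoin(self.base_url, value.strip()))
```

Feeding a page to HtmlParsable("http://example.com/docs/index.html",
crawler) would hand the crawler fully resolved URLs through both
callbacks, so the crawler itself never needs to know about <base> tags
or the META refresh syntax.
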
> >I hope to have something to share within the next week or so--before
>
> Any continuing progress? As I stated a while ago, my next project is
> with htsearch/ after attending to some misc. cleanups. So I don't
> expect that there will continue to be many conflicts.

This project has been on hold again for a while, but I would still
like to get back to it soon. Right now my branch is broken because
I'm in the middle of a refactoring, but it shouldn't be too hard to
get it working again. Please give me a little warning before you dive
into htsearch so that I can get my branch into working order and merge
it back in.

Yours,
Michael
--
Michael Haggerty
_______________________________________________
htdig-dev mailing list
http://lists.sourceforge.net/lists/listinfo/htdig-dev