On Sun, 28 Jun 2009 19:13:53 +0200 Patrik Lembke <bla...@chebab.com>
wrote:
> B''H
> 
> On Tue, 16 Jun 2009 21:59:13 +0930 Karl Goetz <k...@kgoetz.id.au> wrote:
> > On Tue, 16 Jun 2009 11:51:10 +0200
> > Let us know how you progress.
> > kk
> 
> Well, currently I have a sort-of-working spider; it is just a bit too
> good at crawling. It is hard to know which parts of the site are
> really wiki and which are not, for example:
> http://wiki.gnewsense.org/ForumMain/
> 
> So I will have to use some sort of blacklist of pages not to fetch
> (currently it only checks that a page is on wiki.gnewsense.org).
> 
> Image download and rewriting, as well as "wiki-link" rewriting, are
> also implemented already.
> 
> So I mostly need a list of "bad" places, like the forum and
> http://wiki.gnewsense.org/Site/FASTMembership
> 

Bumping this, since I really need this information to continue. If this
project is no longer of interest, please let me know.
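
To make the request concrete, here is a minimal sketch of the kind of
filter the blacklist would feed. It is written in Python purely for
illustration; the names WIKI_HOST, BLACKLIST_PREFIXES and should_fetch
are hypothetical and not taken from the actual spider. The only entries
in the list are the two URLs mentioned in the quoted message; the rest
is what I still need.

```python
from urllib.parse import urlsplit

# Host the spider is allowed to crawl (from the quoted message).
WIKI_HOST = "wiki.gnewsense.org"

# Path prefixes to skip. Only these two are known so far; the full
# list of "bad" places is exactly what is being asked for.
BLACKLIST_PREFIXES = (
    "/ForumMain",
    "/Site/FASTMembership",
)


def should_fetch(url):
    """Return True if url is on the wiki host and not blacklisted."""
    parts = urlsplit(url)
    if parts.hostname != WIKI_HOST:
        return False
    # Treat "/ForumMain" and "/ForumMain/" as the same path.
    path = parts.path.rstrip("/") or "/"
    for prefix in BLACKLIST_PREFIXES:
        # Match the prefix itself or anything below it, but not an
        # unrelated page that merely starts with the same letters.
        if path == prefix or path.startswith(prefix + "/"):
            return False
    return True
```

With a list like this, adding a new "bad" place is a one-line change,
so a plain list of URLs or path prefixes is all I need.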

-- 
Patrik Lembke
www: http://blambi.chebab.com/
jabber: bla...@lysator.liu.se
GnuPG-key: http://gpg.chebab.com/8FA11A15.asc


_______________________________________________
gNewSense-dev mailing list
gNewSense-dev@nongnu.org
http://lists.nongnu.org/mailman/listinfo/gnewsense-dev
