On Mon, Jul 10, 2006 at 06:40:58PM -0400, Colin Davis wrote:
> wget works fine too.
>
> httrack is easier to configure to get sites from across domains,
> which is useful from sites with outgoing links. Plus, I was already
> using it on another script ;) If someone wants to post a version that
> uses wget, I'd be happy to switch.
>
>
> I don't agree that this is piracy- Google, Archive.org, and others
> cache and store sites. As long as the robot respects robots.txt, I
> don't feel that this is something that's unprecedented.
They take deletion requests, which isn't possible on Freenet. They also
have fairly strict copyright policies.

>
> -Colin
>
>
> On Jul 10, 2006, at 6:34 PM, Matthew Toseland wrote:
>
> >Why httrack and not wget?
> >
> >Incidentally the freenet project does not endorse piracy; that includes
> >mirroring sites without permission. :)
> >
> >On Mon, Jul 10, 2006 at 06:29:31PM -0400, Colin Davis wrote:
> >>While exclusive Freenet content is preferred, until we have more of
> >>that, it might make sense to cache some popular websites into
> >>freenet.
> >>
> >>This allows people a way to browse these sites, and their links,
> >>without having to go through the public internet. I've been inserting
> >>a few of these, but it tends to overload my node. If others wanted
> >>to start doing the same thing, it'd be advantageous.
> >>
> >>As I understand it, there isn't harm in inserting the same page
> >>multiple times, since it will just collide on the CHK, and not be
> >>stored twice.
> >>
> >>PyFCP is a great tool for uploading content via a cronscript. You can
> >>set up a site once, and then just run freesitemgr update each day.
> >>http://www.freenet.org.nz/pyfcp/
> >>
> >>
> >>The script I've been using follows-
> >>
> >>#!/bin/sh
> >>
> >>cd /usr/local/freenet
> >>
> >>
> >>#Move to each directory, get the files from the website- recurse two (or 3) levels.
> >>cd /usr/local/freenet/mirror/XXXXXXX
> >>httrack --mirror --update --mirrorlinks -r2 -%e2 -H3 -C2 --near \
> >>  http://XXXXXX.org \
> >>  -F "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; de-de) AppleWebKit/412.6 (KHTML, like Gecko) Safari/412.2" \
> >>  +*.gif +*.jpg +*.png +*.js +*.css
> >>cd /usr/local/freenet/mirror/XXXXXXX
> >>httrack --mirror --update --mirrorlinks -r2 -%e2 -H3 -C2 --near \
> >>  http://XXXXXX.com \
> >>  -F "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; de-de) AppleWebKit/412.6 (KHTML, like Gecko) Safari/412.2" \
> >>  +*.gif +*.jpg +*.png +*.js +*.css
> >>cd /usr/local/freenet/mirror/XXXXXXX
> >>httrack --mirror --update --mirrorlinks -r3 -%e2 -H3 -C2 --near \
> >>  http://XXXXXX.com \
> >>  -F "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; de-de) AppleWebKit/412.6 (KHTML, like Gecko) Safari/412.2" \
> >>  +*.gif +*.jpg +*.png +*.js +*.css
> >>cd /usr/local/freenet/mirror/XXXXXXX
> >>httrack --mirror --update http://XXXXXXX.org/
> >>cd /usr/local/freenet/mirror/XXXXXXX
> >>httrack --mirror --update --mirrorlinks -r3 -%e2 -H3 -C2 --near \
> >>  http://XXXXXX.net \
> >>  -F "Mozilla/5.0 (Macintosh; U; PPC Mac OS X; de-de) AppleWebKit/412.6 (KHTML, like Gecko) Safari/412.2" \
> >>  +*.gif +*.jpg +*.png +*.js +*.css
> >>
> >>
> >>#Move the httrack cache files out of the directories before insert.
> >>#The httrack cache is a waste of space. It's big, and not very helpful
> >>#to the insert, but we want it saved, so we don't keep downloading the
> >>#same thing.
> >>mv /usr/local/freenet/mirror/XXXXXXX/hts-cache/* \
> >>   /usr/local/freenet/mirror/XXXXXXX/hts-cache
> >>mv /usr/local/freenet/mirror/XXXXXXX/hts-cache/* \
> >>   /usr/local/freenet/mirror/XXXXXXX/hts-cache/
> >>mv /usr/local/freenet/mirror/XXXXXXX/hts-cache/* \
> >>   /usr/local/freenet/mirror/XXXXXXX/hts-cache/
> >>mv /usr/local/freenet/mirror/XXXXXXX/hts-cache/* \
> >>   /usr/local/freenet/mirror/XXXXXXX/hts-cache/
> >>mv /usr/local/freenet/mirror/XXXXXXX/hts-cache/* \
> >>   /usr/local/freenet/mirror/XXXXXXX/hts-cache/
> >>
> >>#Tell PyFCP to actually insert the content, and keep a log of it.
> >>/usr/bin/freesitemgr -v -v update | tee ~/nodelog.txt
> >>
> >>
> >>#Move the cache files back from the temp dirs
> >>mv /usr/local/freenet/mirror/XXXXXXX/hts-cache/* \
> >>   /usr/local/freenet/mirror/XXXXXXX/hts-cache/
> >>mv /usr/local/freenet/mirror/XXXXXXX/hts-cache/* \
> >>   /usr/local/freenet/mirror/XXXXXXX/hts-cache/
> >>mv /usr/local/freenet/mirror/XXXXXXX/hts-cache/* \
> >>   /usr/local/freenet/mirror/XXXXXXX/hts-cache/
> >>mv /usr/local/freenet/mirror/XXXXXXX/hts-cache/* \
> >>   /usr/local/freenet/mirror/XXXXXXX/hts-cache/
> >>mv /usr/local/freenet/mirror/XXXXXXX/hts-cache/* \
> >>   /usr/local/freenet/mirror/XXXXXXX/hts-cache/
> >>
> >>
> >>
> >>#request the keys, to help spread them.
> >>cd /usr/local/freenet/mirror/temp
> >>rm -rf /usr/local/freenet/mirror/temp/*
> >>wget --mirror http://127.0.0.1:8888/USK@ XXXXXXX XXXXXXX
> >>rm -rf /usr/local/freenet/mirror/temp/*
> >>wget --mirror http://127.0.0.1:8888/USK@ XXXXXXX XXXXXXX
> >>wget --mirror http://127.0.0.1:8888/USK@ XXXXXXX XXXXXXX
> >>rm -rf /usr/local/freenet/mirror/temp/*
> >>
> >>
> >>#request from elsewhere, to help spread them.
> >>#This requests from Apophis, who is nice enough as to open his node up,
> >>#which lets me use his to spread ;)
> >>cd /usr/local/freenet/mirror/temp
> >>rm -rf /usr/local/freenet/mirror/temp/*
> >>wget --mirror http://apophis.li/fn.php?url=http://127.0.0.1:8888/USK@ XXXXXXX XXXXXXX
> >>rm -rf /usr/local/freenet/mirror/temp/*
> >>wget --mirror http://apophis.li/fn.php?url=http://127.0.0.1:8888/USK@ XXXXXXX XXXXXXX
> >>rm -rf /usr/local/freenet/mirror/temp/*
> >>wget --mirror http://apophis.li/fn.php?url=http://127.0.0.1:8888/USK@ XXXXXXX XXXXXXX
> >>rm -rf /usr/local/freenet/mirror/temp/*
> >>
> >>
> >>_______________________________________________
> >>Devl mailing list
> >>Devl at freenetproject.org
> >>http://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
> >>
> >
> >--
> >Matthew J Toseland - toad at amphibian.dyndns.org
> >Freenet Project Official Codemonkey - http://freenetproject.org/
> >ICTHUS - Nothing is impossible. Our Boss says so.

--
Matthew J Toseland - toad at amphibian.dyndns.org
Freenet Project Official Codemonkey - http://freenetproject.org/
ICTHUS - Nothing is impossible. Our Boss says so.
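
For anyone adapting the quoted script: its comments say the httrack cache should be moved out of each site directory before the insert and moved back afterwards, but with the paths redacted the mv lines show the same source and destination. Below is a minimal sketch of that step only, assuming a stash directory outside the mirror tree. MIRROR_ROOT, STASH_ROOT and the three function names are made up for illustration and do not come from the original script.

```sh
#!/bin/sh
# Sketch: keep each site's hts-cache out of the tree while it is inserted.
# MIRROR_ROOT and STASH_ROOT are assumed locations, not the original paths.
MIRROR_ROOT="${MIRROR_ROOT:-/usr/local/freenet/mirror}"
STASH_ROOT="${STASH_ROOT:-/usr/local/freenet/cache-stash}"

# Move <site>/hts-cache into the stash so it is not part of the insert.
stash_cache() {
    site="$1"
    if [ -d "$MIRROR_ROOT/$site/hts-cache" ]; then
        mkdir -p "$STASH_ROOT/$site"
        mv "$MIRROR_ROOT/$site/hts-cache" "$STASH_ROOT/$site/hts-cache"
    fi
}

# Move the stashed cache back so the next "httrack --update" can reuse it.
restore_cache() {
    site="$1"
    if [ -d "$STASH_ROOT/$site/hts-cache" ]; then
        mv "$STASH_ROOT/$site/hts-cache" "$MIRROR_ROOT/$site/hts-cache"
    fi
}

# Call the given function once for every directory under MIRROR_ROOT.
# Directories without an hts-cache (such as mirror/temp) are left alone.
for_each_site() {
    action="$1"
    for dir in "$MIRROR_ROOT"/*/; do
        [ -d "$dir" ] || continue
        "$action" "$(basename "$dir")"
    done
}
```

In the quoted script this would amount to running `for_each_site stash_cache` before `/usr/bin/freesitemgr -v -v update` and `for_each_site restore_cache` after it. Looping over the directories also avoids repeating one mv line per site, at the cost of treating every directory under the mirror root as a site.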
