> The Perl script that makes the static version of Wikipedia runs fine while
> parsing the whole English Wikipedia (about 300'000 articles).

        There must have been a problem with your original message. I didn't
see the script with your modifications attached for other people to try out
on larger boxes. I have quite a few here that I could try it on with several
gigabytes of RAM to spare.

        All kidding aside, there are known leaks in the Python code (not
sure whether it is a core Python problem or the Plucker implementation of it).
You can see them by running pydb against the spider as it parses a site.
They should be pretty easy to fix, given the right amount of Python
knowledge.
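        For what it's worth, a quick way to spot leak-like growth without a
debugger is to watch traced memory usage while the parser runs. A minimal
sketch using the standard library's tracemalloc (LeakyParser here is a
made-up stand-in for illustration, not actual Plucker spider code):

```python
import tracemalloc

# Hypothetical leaky parser: it keeps a reference to every page it has
# parsed, so memory grows with the number of pages processed -- exactly
# the kind of unbounded growth to watch for while a spider runs.
class LeakyParser:
    def __init__(self):
        self._cache = []            # never cleared: the "leak"

    def parse(self, page):
        tokens = page.split()
        self._cache.append(tokens)  # grows without bound
        return len(tokens)

tracemalloc.start()
parser = LeakyParser()
before, _ = tracemalloc.get_traced_memory()

for _ in range(1000):
    parser.parse("word " * 50)

after, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Memory that keeps climbing in proportion to pages parsed, instead of
# levelling off, is a strong hint of a leak.
growth = after - before
```

If the growth stays roughly flat after fixing the offending reference,
the leak is gone.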

> Is anyone interested in Wikipedia with plucker and ready to give
> suggestions ?

        This just (once again) adds more impetus for me to get my Perl
spider ready for public release. IMHO, Python is simply too inefficient
to deal properly with something of this size.

        Can you send me your modified script, either directly or back to the
list? I'd love to give it a try on a larger box here to see what happens.


d.

_______________________________________________
plucker-list mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list
