contributing to droids-core? (Re: Dormant projects to mothball? (was missing reports ...))

Paul Rogalinski Thu, 18 Nov 2010 13:36:36 -0800

HelloWorld,

I'm currently building a crawler based on the droids-core implementation, trying not to change anything in the core API / interfaces yet. Due to the lack of documentation I was not so eager to dive directly into a lot of crawler code of unclear quality. Perhaps this was a mistake, but on the other hand it currently suits me quite well.

My goal is to have a crawler with a very small footprint to be embedded into a Hadoop map/reduce job. So I am not using Spring (IMHO too much overhead to initialize when running inside map/reduce), recrawling or even multi-threaded crawling. I do plan to spawn a lot of droids, each taking care of one domain. Each droid has no need to jump domains or hosts. Extracted data will be written into an HBase cluster for further processing.
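
To illustrate the "one droid per domain, no jumping hosts" restriction, here is a minimal sketch of a link filter. The class name SameHostFilter and its accept method are hypothetical and are not part of the droids-core API; it only assumes that links arrive as strings and uses nothing but java.net.URI.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

/** Hypothetical helper (not part of droids-core): accepts only links on one host. */
final class SameHostFilter {
    private final String host;

    SameHostFilter(String host) {
        // Host names are case-insensitive, so normalize once up front.
        this.host = host.toLowerCase(Locale.ROOT);
    }

    /** Returns true only for absolute URLs whose host equals the configured host. */
    boolean accept(String link) {
        try {
            String linkHost = new URI(link).getHost();
            // Relative and opaque links (e.g. mailto:) have no host and are rejected.
            return linkHost != null && host.equals(linkHost.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            return false; // malformed links are simply skipped
        }
    }
}
```

A droid assigned to example.org would create new SameHostFilter("example.org") and drop every extracted link for which accept returns false. Relative links would have to be resolved against the page URL before filtering.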

This is not some hobby side project for myself but a project with real-world deployment, and it needs to be pretty much bulletproof. I am not going crazy about beautiful architecture but rather focus on stable, clean and hopefully bug-free code. Along the way I am finding smaller bugs in the droids-core implementation and thinking about additions and minor changes to the API.

I am not sure *all* of this has its place in the droids-core module - in the end my requirements are not very generic. But if somebody is interested, I am open to discussing how my work can help improve droids-core.


Greetings,
Paul.

P.S.

Just parked my butt over at #droids/freenode. My timezone is CET and I'll be checking activity on that channel in the evenings. To wake me up, a ping on any IM mentioned in the signature will help.



Chapuis Bertil wrote:

IMHO one of the primary requirements is to clean the trunk: for example, the
work which has been done in the droids-crawler project has to be integrated
with the droids-core project. Then doing some refactoring and implementing
some new features will be much easier.

--

paul rogalinski · mailto: [email protected] · msn: [email protected] · aim:pu1s4r · icq: 1177279 · skype: pulsar
