Hi Droids guys! +1 for the release from me as well.
** I have in my bag:

- Interactive droid: it doesn't need a list of links to start a crawl and can be ordered link by link (this implies refactoring a Droids class whose name I don't recall right now).
- SAX output: a SAX consumer can be passed in for the output (not a clean integration though; the pros and cons compared to StAX still have to be looked at).
- XML format to parametrize a droid and pass it a todo list [1]: this implementation is tied to Lenya for now, but it can easily be extracted. I think it could be a nice feature for a droids server and for communication between droids entities.

++

[1]: Here is an example file. This namespace can be included in another one, so the result of the crawl ends up included in your original file.

<?xml version="1.0" encoding="UTF-8"?>
<robot xmlns="http://droids.apache.org/droids/0.2">
  <!-- parametrize the droids -->
  <params>
    <!-- TODO : test this configuration, -->
    <delay>10</delay>
    <!-- TODO : use the source resolver in code to get the file, and be able to use fallback -->
    <filters>
      <resource>fallback://lenya/modules/droidsTransformer/samples/regex-urlfilter-null.txt</resource>
    </filters>
  </params>
  <!-- indicate locations -->
  <locations>
    <location>http://www.zegoal.com/foot/france-ligue1/</location>
    <location>http://localhost:8080/lenya/index.html</location>
  </locations>
</robot>

On 02/08/2011 10:52 AM, Chapuis Bertil wrote:
> I agree. We have to release.
>
> The changes I'd like to contribute back are the following.
>
> - TaskQueue replaced by java.util.Queue
> - Handling process reviewed.
> - Factory pattern only used for Worker
> - Extractors inherit from Handler => no need to parse the document twice
> - Entity renamed to Identifier
> - ContentEntity renamed to Resource
> - Crawler moved to droids-crawler
> - Parser moved to droids-parser
> - Walker moved to droids-walker
> - The walker also uses an Extractor
> - some minor changes
>
>
>
> On 8 February 2011 10:32, Thorsten Scherler <[email protected]> wrote:
>
>> On Tue, 2011-02-08 at 09:50 +0100, Chapuis Bertil wrote:
>>> In previous emails and jira comments I saw several people mentioning the
>>> fact they have a local copy of droids which evolved too much to be merged
>>> back with the trunk. This is my case, and I think Paul Rogalinski is in the
>>> same situation.
>>>
>>> Since the patches have only been applied periodically on the trunk during
>>> the last months, I'd love to know if someone else is in the same situation
>>> and what kind of changes they made locally.
>>
>> I am not sure but I see it like you describe.
>>
>> IMO we should release what we have right now and then plan how to merge
>> back all these different versions into a new droids version.
>>
>> IMO the next droids version should focus on ease of reuse and a droid
>> server which starts and monitors the different droids.
>>
>> To start with:
>> * who has a version of droids which (s)he is interested to merge back
>> * what are the main differences between the forge and the "original"
>> * ...
>>
>> salu2
>> --
>> Thorsten Scherler <thorsten.at.apache.org>
>> codeBusters S.L. - web based systems
>> <consulting, training and solutions>
>> http://www.codebusters.es/
>>
>
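
PS: for illustration only, here is a minimal sketch of how a file like [1] could be read with the JDK's SAX parser. The RobotFile class and its field names are made up for this example and are not part of Droids or of the Lenya module; only the element names and the namespace are taken from the sample file. It also assumes that <delay> holds a plain integer.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical holder for the values of a robot file; not an existing Droids class. */
final class RobotFile {
    static final String NS = "http://droids.apache.org/droids/0.2";

    int delay;                                        // content of <delay>
    final List<String> filters = new ArrayList<>();   // content of each <resource>
    final List<String> locations = new ArrayList<>(); // content of each <location>

    /** Parses the XML text and collects delay, filter resources and locations. */
    static RobotFile parse(String xml) {
        final RobotFile robot = new RobotFile();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                text.setLength(0); // start collecting the text of the new element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                String value = text.toString().trim();
                text.setLength(0);
                if (!NS.equals(uri)) {
                    return; // ignore elements from a surrounding namespace
                }
                switch (local) {
                    case "delay": robot.delay = Integer.parseInt(value); break;
                    case "resource": robot.filters.add(value); break;
                    case "location": robot.locations.add(value); break;
                    default: break;
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed to match on namespace URI and local name
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Cannot read robot file", e);
        }
        return robot;
    }
}
```

Because the handler only reacts to elements in the robot namespace, the same code would also work when the <robot> block is embedded in a document that uses another namespace.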
