On Thu, 2009-03-26 at 22:40 +0800, Mingfai wrote:
> hi,
>
> Thanks for creating this very useful project.
:)
Thanks for this nice feedback.
>
> I'm new to Droids, and have just learnt most of the concepts and am able to
> write custom parsers, filters, handlers etc. I have encountered a use case
> where I want to parse and store some custom data in the Parse/ParseData, and
> have the custom data available in the handler.
>
We actually discussed this before, but I am not sure whether it was here
or still on the labs list. Bottom line is that we do need to rethink the API
around that. To begin with, the API imports an implementation
class (ParseData), which is just a bad idea. Further, as you pointed out,
it may make sense to allow an Object so that custom objects can be passed.
> Take a hypothetical example: assume I have a crawler that runs on Google's
> search results. The parser parses a result page and extracts 10 links
> together with the 10 cache links. In the Droids framework, there is no way
> to pass the cache links to the handler, right?
Actually, since they are links, they would enter as "normal" links/tasks
as long as they are not excluded in regex-urlfilter.txt.
> As a workaround, i could just
> use a singleton to store a map of data using the uri as the key, but it
> seems to me it is better if the ParseData could store more than the outlinks
> but also some custom data that we use. What do you think?
How about

  public interface Parse {
    Object getObject();
    ParseData getParseData();
  }

which we would then merge with ParseData, like

  public interface Parse {
    Object getParseObject();
    Collection<Link> getOutlinks();
  }

This way we can reduce the level of depth in the API and make the
relation clearer. We may even think about merging Parser and Parse too.
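
To make the idea a bit more concrete, here is a rough sketch of how a parser
could hand the cache links to a handler through the merged interface above.
Everything except the two Parse methods is made up for illustration: the Link
stand-in, SimpleParse, parseResultPage(), cacheLinkFor() and the example.org
URLs are assumptions, not existing Droids code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ParseSketch {

    // Hypothetical stand-in for the Droids Link type; only the URI matters here.
    static final class Link {
        private final String uri;
        Link(String uri) { this.uri = uri; }
        String getUri() { return uri; }
    }

    // The merged interface as proposed in this mail.
    interface Parse {
        Object getParseObject();
        Collection<Link> getOutlinks();
    }

    // Hypothetical minimal implementation: holds the outlinks plus one custom object.
    static final class SimpleParse implements Parse {
        private final Object parseObject;
        private final Collection<Link> outlinks;

        SimpleParse(Object parseObject, Collection<Link> outlinks) {
            this.parseObject = parseObject;
            this.outlinks = outlinks;
        }

        public Object getParseObject() { return parseObject; }
        public Collection<Link> getOutlinks() { return outlinks; }
    }

    // What a custom parser might return for a result page (hard-coded data).
    static Parse parseResultPage() {
        List<Link> outlinks = new ArrayList<Link>();
        Map<String, String> cacheLinks = new HashMap<String, String>();
        outlinks.add(new Link("http://example.org/a"));
        cacheLinks.put("http://example.org/a", "http://cache.example.org/a");
        // The custom data travels as the parse object, keyed by result URI.
        return new SimpleParse(cacheLinks, outlinks);
    }

    // What a handler might do: look up the cache link for a given URI.
    // Returns null if the parse object is not a Map or has no entry.
    static String cacheLinkFor(Parse parse, String uri) {
        Object custom = parse.getParseObject();
        if (custom instanceof Map) {
            Object value = ((Map<?, ?>) custom).get(uri);
            return value == null ? null : value.toString();
        }
        return null;
    }

    public static void main(String[] args) {
        Parse parse = parseResultPage();
        for (Link link : parse.getOutlinks()) {
            System.out.println(link.getUri() + " -> " + cacheLinkFor(parse, link.getUri()));
        }
    }
}
```

The price of returning a plain Object is the instanceof check and cast in the
handler; a generic Parse<T> would avoid that, but would also touch more of the
API.
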
WDYT?
salu2
>
> The implementation could be very simple, just store a Map
>
> Regards,
> mingfai
--
Thorsten Scherler <thorsten.at.apache.org>
Open Source <consulting, training and solutions>