Hi,

Doug Cutting wrote:
> Doğacan Güney wrote:
>> I think it would make much more sense to change parse plugins to take
>> content and return Parse[] instead of Parse.
>
> You're right.  That does make more sense.

OK, then should I go ahead and implement this? It should be pretty easy,
though I am not sure what to use as the key for each element of the
Parse[].

I mean, when getParse returned a single Parse, ParseSegment output it
as <url, Parse>. But if getParse returns an array, what will be the key
for each element?

Something like <url#i, Parse[i]> may work, but this may cause problems
in dedup. For example, assume we fetched the same RSS feed twice and
indexed the results in different indexes. The two versions' url#0 may be
different items, but since they have the same key, dedup will delete the
older one.
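
To make the collision concrete, here is a minimal sketch. It does not use
the real Nutch classes: ParseKeyDemo, keyByPosition and the toy dedup are
hypothetical names, parses are plain strings, and "the newer entry wins
for an equal key" is an assumption taken from the description above, not
the actual dedup code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ParseKeyDemo {

    /** Assigns positional keys of the form url#i to the parses of one fetch. */
    static Map<String, String> keyByPosition(String url, String[] parses) {
        Map<String, String> keyed = new LinkedHashMap<>();
        for (int i = 0; i < parses.length; i++) {
            keyed.put(url + "#" + i, parses[i]);
        }
        return keyed;
    }

    /** Toy dedup (assumption): for equal keys, the newer entry replaces the older one. */
    static Map<String, String> dedup(Map<String, String> older, Map<String, String> newer) {
        Map<String, String> merged = new LinkedHashMap<>(older);
        merged.putAll(newer);
        return merged;
    }

    public static void main(String[] args) {
        String url = "http://example.com/feed.rss";
        // Same feed fetched twice; a new item pushed the old ones down by one.
        String[] firstFetch = {"item A", "item B"};
        String[] secondFetch = {"item C", "item A"};

        Map<String, String> merged =
            dedup(keyByPosition(url, firstFetch), keyByPosition(url, secondFetch));

        // "item B" shared the key url#1 with a different item and is gone.
        System.out.println(merged);
    }
}
```

In this sketch "item B" disappears even though nothing duplicated it,
because url#1 named a different item in each fetch.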

--
Doğacan Güney



_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers