Hi Aamir,

On Tue, Apr 3, 2012 at 12:05 PM, Aamir Khan <syst3m.w...@gmail.com> wrote:

>
> Exactly, I will have full summer to understand and get up to speed. But
> since my knowledge is very limited my proposal won't be too good.. :)
>

This doesn't need to be the case. In fact it is crucial that the
submission is of a reasonable quality. The original issue was pretty well
discussed iirc, and additionally there is also some code uploaded by the
original author so you could have a look at that over the next few days
before having a crack at the submission. I can say one thing for sure
though, this issue might need to be branded more generically... just now
Nutch would benefit more from a generically oriented plugin for scraping
various parts of html. The original author had a use case driven approach
to this issue which meant he had to extract very specific content from news
sites... this may not suit you, and certainly isn't absolutely everyone's
cup of tea within the community. It would be great if you could discuss
both in your application and on the Jira thread how the issue could be
opened up, subsequently enabling more Nutch users to benefit... as you are
stepping up to apply here, how you wish to do this is entirely your own
choice so I would take the positives from the flexibility you have here and
focus on them within your submission. Does this sound reasonable?

I look forward to seeing any progress you have and will seriously consider
stepping up to be a potential mentor as it was me that added the issue to
the GSoC list of projects.

Thank you

Lewis
