I am facing the same problem.
I don't know how to delete the index entry for a particular URL that has
already been crawled.
I need to eliminate the porn URLs from my search engine.
I have the crawled data with me, and now I need to
find the index entries for the porn URLs.
Please help me with this.
With Thanks,
Franklin.S
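
For the first half of this (finding which crawled URLs are unwanted), a minimal sketch in plain Python follows. It does not use any Nutch or Lucene API and it does not delete anything from the index. It assumes the crawled URLs have already been dumped into a list, and that a simple substring blocklist is acceptable; the names BLOCKED_TERMS, find_blocked and to_urlfilter_rules are made up for illustration. The second helper writes exclusion lines in the "-regex" style that regex-urlfilter.txt uses, on the assumption that the same terms should also be kept out of future crawls.

```python
import re

# Hypothetical blocklist; the real terms would come from your own policy.
BLOCKED_TERMS = ("porn", "xxx")


def find_blocked(urls, terms=BLOCKED_TERMS):
    """Return (position, url) pairs for every URL containing a blocked term."""
    if not terms:
        # An empty alternation would match every URL, so bail out early.
        return []
    # Escape each term so it is matched literally, ignoring case.
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    return [(i, url) for i, url in enumerate(urls) if pattern.search(url)]


def to_urlfilter_rules(terms=BLOCKED_TERMS):
    """Build exclusion lines ('-' followed by a regex), one per term.

    Assumption: the terms are plain words, so the escaped form is also
    a valid regex for the URL filter that reads these lines.
    """
    return ["-.*" + re.escape(t) + ".*" for t in terms]


if __name__ == "__main__":
    crawled = [
        "http://example.com/",
        "http://example.com/PORN/page",
        "http://news.example.org/xxx-rated",
        "http://example.org/about",
    ]
    for position, url in find_blocked(crawled):
        print(position, url)
    for rule in to_urlfilter_rules():
        print(rule)
```

Substring matching will also flag innocent URLs that happen to contain a blocked term, so the matches are best reviewed before anything is removed.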
Ratnesh,V2Solutions India wrote:
>
> No,
> I don't think that we have to deal with that, because if I remove them
> then I won't be able to index my own file, which is what I am crawling for.
>
> But I will surely check, as at this moment I am not very sure.
> Can you tell me about your whereabouts?
>
> Thanks
> Ratnesh, V2Solutions, India
>
> Siddharth Jonathan wrote:
>>
>> Hmmm... I haven't had to do this, but my guess would be to remove the
>> corresponding plugin entries from the nutch-default.xml file.
>> There is a plugin.includes property in that file which includes the
>> default indexing filters (index-basic, index-more, etc.)
>> and the query filter plugins (query-basic, query-more, etc.). Try removing
>> those. That might keep them from getting used.
>>
>> Jonathan
>>
>>
>> On 4/2/07, Ratnesh,V2Solutions India
>> wrote:
>>>
>>>
>>> Exactly, of course.
>>>
>>> That is exactly what I want. Do you have any solution for this?
>>>
>>> Looking forward to your reply.
>>>
>>> Thanks
>>>
>>>
>>> Siddharth Jonathan wrote:
>>> >
>>> > Do you mean how do you get rid of some of the fields that are
>>> > indexed by default? e.g. content, anchor text, etc.
>>> >
>>> > Jonathan
>>> > On 4/2/07, Ratnesh,V2Solutions India
>>> >
>>> > wrote:
>>> >>
>>> >>
>>> >> Hi,
>>> >> I have written a plugin which finds the number of object tags in an
>>> >> HTML page and the corresponding URLs.
>>> >> I am storing "objects" as fields and the page URL as values.
>>> >>
>>> >> Finally, I am interested in seeing only the search results related to
>>> >> the "objects" indexed fields, not those for the fields that are
>>> >> already stored as indexed fields.
>>> >>
>>> >> So how shall I delete those index fields which are already stored?
>>> >>
>>> >> Looking forward to your reply (valuable inputs).
>>> >>
>>> >> Thanks to the Nutch community
>>> >> --
>>> >> View this message in context:
>>> >>
>>> http://www.nabble.com/How-to-delete-already-stored-indexed-fields----tf3504164.html#a9786377
>>> >> Sent from the Nutch - User mailing list archive at Nabble.com.
>>> >>
>>> >>
>>> >
>>> >
>>>
>>>
>>>
>>
>>
>
>
_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general