Hi Franklin,
My application ended with no result, as I faced great difficulty in deleting
unwanted URLs from the index. I haven't been able to delete the unwanted
URLs, but I have applied double filtering to my search results, based on a
list of the wanted URLs' contents.
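
A minimal sketch of what such result-side filtering could look like, written
in Python purely for illustration. It is not Nutch code and not the filter
actually used in the application: the names WANTED_HOSTS, is_wanted and
filter_hits are hypothetical, and matching on the host name is an assumption.
The idea is only that hits whose URL is not on a list of wanted hosts are
dropped before they are shown, even though they are still in the index.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of hosts whose pages may appear in search results.
WANTED_HOSTS = {"example.org", "docs.example.org"}


def is_wanted(url, wanted_hosts=WANTED_HOSTS):
    """Return True if the URL's host is on the allowlist."""
    # urlsplit(...).hostname is lowercased, or None when no host is present.
    host = urlsplit(url).hostname or ""
    return host in wanted_hosts


def filter_hits(hits, wanted_hosts=WANTED_HOSTS):
    """Keep only the hits whose URL passes the allowlist check.

    `hits` is assumed to be an iterable of URL strings, as returned by
    whatever search call the application makes.
    """
    return [url for url in hits if is_wanted(url, wanted_hosts)]


if __name__ == "__main__":
    raw_hits = [
        "http://example.org/page1.html",
        "http://unwanted.example.com/page2.html",
    ]
    for url in filter_hits(raw_hits):
        print(url)
```

This only hides unwanted hits at query time; the entries themselves remain
in the index until they are pruned.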

I think in your case you can use PruneIndexTool, which will prune all the
unwanted URLs related to porn sites.

And if I find anything new, I will let you know later.

Thanks
"Ratnesh,V2Solutions,India"


franklinb4u wrote:
> 
> I am facing the same problem too.
> I don't know how to eliminate or delete the index entry of a particular URL
> that has been crawled.
> I need to eliminate the porn URLs from my search engine.
> 
> I have the crawled data with me after crawling, and now I need to
> find the index entries of the porn URLs.
> 
> Please help me in doing this.
> 
> With Thanks,
> Franklin.S
> 
> Ratnesh,V2Solutions India wrote:
>> 
>> No,
>> I don't think we have to deal with that, because if I remove them
>> then I won't be able to index my own files, which I am crawling for.
>> 
>> But I will surely check, as at this moment I am not very sure.
>> Can you tell me about your whereabouts?
>> 
>> Thanks
>> Ratnesh, V2Solutions, India
>> 
>> Siddharth Jonathan wrote:
>>> 
>>> Hmmm... I haven't had to do this, but my guess would be to remove the
>>> corresponding plugin entries from the nutch-default.xml file.
>>> There is a plugin.includes property in that file which includes the
>>> default indexing filters (index-basic, index-more, etc.)
>>> and the query filter plugins (query-basic, query-more, etc.). Try removing
>>> those. That might keep them from getting used.
>>> 
>>> Jonathan
>>> 
>>> 
>>> On 4/2/07, Ratnesh,V2Solutions India
>>> <[EMAIL PROTECTED]>
>>> wrote:
>>>>
>>>>
>>>> Exactly, of course.
>>>>
>>>> This is exactly what I want. Do you have any solution for this?
>>>>
>>>> Looking forward to your reply.
>>>>
>>>> Thanks
>>>>
>>>>
>>>> Siddharth Jonathan wrote:
>>>> >
>>>> > Do you mean how do you get rid of some of the fields that are
>>>> > indexed by default? E.g. content, anchor text, etc.
>>>> >
>>>> > Jonathan
>>>> > On 4/2/07, Ratnesh,V2Solutions India
>>>> > <[EMAIL PROTECTED]>
>>>> > wrote:
>>>> >>
>>>> >>
>>>> >> Hi,
>>>> >> I have written a plugin which finds the number of object tags in
>>>> >> an HTML page and the corresponding URLs.
>>>> >> I am storing "objects" as fields and the page URL as values.
>>>> >>
>>>> >> Finally, I am interested in seeing only the search results related
>>>> >> to the "objects" indexed fields, not those fields which are already
>>>> >> stored as indexed fields.
>>>> >>
>>>> >> So how shall I delete those index fields which are already stored?
>>>> >>
>>>> >> Looking forward to your reply (valuable inputs).
>>>> >>
>>>> >> Thanks to the Nutch community
>>>> >> --
>>>> >> View this message in context:
>>>> >>
>>>> http://www.nabble.com/How-to-delete-already-stored-indexed-fields----tf3504164.html#a9786377
>>>> >> Sent from the Nutch - User mailing list archive at Nabble.com.
>>>> >>
>>>> >>
>>>> >
>>>> >
>>>>
>>>>
>>>>
>>> 
>>> 
>> 
>> 
> 
> 


