Hi Quang,

This is a very basic question. You can delete the row in the hbase shell. Have a look at http://wiki.apache.org/hadoop/Hbase/Shell, which is very useful for beginners.

The shell help for delete says: Put a delete cell value at specified table/row/column and optionally timestamp coordinates. Deletes must match the deleted cell's coordinates exactly. When scanning, a delete cell suppresses older versions. To delete a cell from 't1' at row 'r1' under column 'c1' marked with the time 'ts1', do:

  hbase> delete 't1', 'r1', 'c1', ts1

For example, say you want to delete the URL http://www.google.com and, I assume, you use the webpage table. Nutch uses the URL as the row key, with the host name reversed. For www.google.com, the row key is com.google.www:http/
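If you need the row key for other URLs, here is a rough Python sketch of how such a key is put together. It is modeled on my understanding of what Nutch 2.x does in TableUtil.reverseUrl (reversed host, then the protocol, then an explicit port if there is one, then the path), so treat it as an assumption and check a few keys against your own table. The function name reverse_url is just for illustration.

```python
from urllib.parse import urlsplit


def reverse_url(url):
    """Build a Nutch-style row key from a URL (illustrative sketch only)."""
    parts = urlsplit(url)
    # Reverse the dot-separated host labels: www.google.com -> com.google.www
    # (note: urlsplit lowercases the host name)
    key = ".".join(reversed(parts.hostname.split(".")))
    # Append the protocol after a colon
    key += ":" + parts.scheme
    # The port is added only when it is written explicitly in the URL
    if parts.port is not None:
        key += ":" + str(parts.port)
    # Path plus query string; add a leading "/" only if something is there
    path = parts.path
    if parts.query:
        path += "?" + parts.query
    if path and not path.startswith("/"):
        path = "/" + path
    return key + path


print(reverse_url("http://www.google.com/"))  # com.google.www:http/
```

You can then paste the printed key into the shell command.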

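Note that delete removes a single cell and expects a column, so to remove the whole row use deleteall:

  hbase> deleteall 'webpage', 'com.google.www:http/'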

This deletes www.google.com. But if there are a lot of URLs you do not want to fetch, you can use the urlfilter-regex plugin and write a regexp rule for them.
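To give an idea of how such rules behave, here is a small Python sketch. It assumes the usual regex-urlfilter.txt conventions: one rule per line, "-" to reject and "+" to accept, the first matching rule decides, and a URL that matches nothing is ignored. The rules themselves are made-up examples, and Python's regex engine is not identical to Java's, so test the real rules in Nutch.

```python
import re

# Hypothetical rules in the style of regex-urlfilter.txt
RULES = r"""
# skip everything on google.com and its subdomains
-^https?://([a-z0-9-]+\.)*google\.com/
# accept anything else
+.
"""


def parse_rules(text):
    """Turn rule lines into (accept, compiled_pattern) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line[0] not in "+-":
            raise ValueError("rule must start with + or -: " + line)
        rules.append((line[0] == "+", re.compile(line[1:])))
    return rules


def accepts(rules, url):
    """Return True if the first matching rule is a '+' rule."""
    for accept, pattern in rules:
        if pattern.search(url):
            return accept
    return False  # no rule matched: the URL is ignored


rules = parse_rules(RULES)
for url in ("http://www.google.com/", "http://example.org/"):
    print(url, "accepted" if accepts(rules, url) else "rejected")
```

The reject line in RULES is the part you would put in the filter file, above the final accept rule.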

Have a nice day,
Talat

On 21-10-2013 05:44, Quang Tri wrote:
I currently use Nutch with HBase. Now I don't want to crawl some links in the
database. How can I delete these links?
Thanks

