Hi Quang,
This is a very basic question. You can delete rows in the hbase shell.
This page is very useful for beginners:
http://wiki.apache.org/hadoop/Hbase/Shell
Here is what the shell help says about delete:
Put a delete cell value at specified table/row/column and optionally
timestamp coordinates. Deletes must match the deleted cell's
coordinates exactly. When scanning, a delete cell suppresses older
versions. To delete a cell from 't1' at row 'r1' under column 'c1'
marked with the time 'ts1', do:
hbase> delete 't1', 'r1', 'c1', ts1
For example, suppose you want to delete the URL http://www.google.com
and you are using the webpage table. Nutch uses the URL, with its host
name reversed, as the row key. For www.google.com, the row key is
com.google.www:http/
hbase> deleteall 'webpage', 'com.google.www:http/'
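If you need the row key for some other URL, the reversal can be sketched in a few lines of Python. This is only an illustration modeled on the com.google.www:http/ example; the function name reverse_url and the handling of ports, empty paths, and query strings are my assumptions, so check a real key in your table before deleting anything.

```python
from urllib.parse import urlsplit


def reverse_url(url: str) -> str:
    """Build a reversed-host row key such as 'com.google.www:http/'.

    Hypothetical helper: it imitates the key format shown in this thread,
    not necessarily every detail of what Nutch itself produces.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    # www.google.com -> com.google.www
    reversed_host = ".".join(reversed(host.split(".")))
    key = f"{reversed_host}:{parts.scheme}"
    # Assumption: an explicit port is appended after the protocol.
    if parts.port is not None:
        key += f":{parts.port}"
    # Assumption: a URL without a path is treated as having the path "/".
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return key + path


print(reverse_url("http://www.google.com/"))
```

The printed key is what you would pass as the row argument in the hbase shell.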
The command above removes the row for www.google.com. If instead you
want to stop a lot of URLs from being fetched, you can use the
urlfilter-regex plugin and write a regexp rule for them.
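To give an idea of what such a rule looks like, here is a small Python sketch of how I understand the rule file to be read: each rule is a '+' (accept) or '-' (reject) followed by a regexp, the first rule that matches decides, and a URL that matches nothing is rejected. The two rules and the function names are made up for illustration, and Nutch uses Java regexps, so treat this as a way to try out a pattern rather than an exact copy of the plugin.

```python
import re

# Hypothetical rules, written the way they would appear in the rule file.
RULES = [
    r"-^https?://www\.google\.com/",  # reject everything on www.google.com
    r"+.",                            # accept anything else
]


def compile_rules(lines):
    """Turn '+regex' / '-regex' lines into (accept, pattern) pairs."""
    compiled = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        sign, pattern = line[0], line[1:]
        if sign not in ("+", "-"):
            raise ValueError(f"rule must start with + or -: {line!r}")
        compiled.append((sign == "+", re.compile(pattern)))
    return compiled


def accepts(url, compiled):
    """Return the verdict of the first matching rule; reject if none match."""
    for allow, pattern in compiled:
        if pattern.search(url):
            return allow
    return False


rules = compile_rules(RULES)
for candidate in ("http://www.google.com/", "http://example.org/"):
    print(candidate, "accepted" if accepts(candidate, rules) else "rejected")
```

Put the reject rules before the final accept rule, because only the first match counts.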
Have a nice day,
Talat
On 21-10-2013 05:44, Quang Tri wrote:
I currently use Nutch with hbase. Now I don't want to crawl some links
in the database. How can I delete these links?
Thanks