How often is often enough depends on what probability of loss you are willing to accept.
I just checked on one of our clusters with 4PB of data: the scanner fixes
about 1 block a day. Assuming an average block size of 64MB (pretty high),
the probability that all 3 replicas of one block go bad within 3 weeks is
on the order of 1e-12. In reality it is probably 2-3 orders of magnitude
less likely.
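To spell out how a figure in that range falls out (this is my own back-of-envelope reconstruction, assuming corruptions are independent and uniform across block replicas, not Raghu's exact math):

```python
# Back-of-envelope for the ~1e-12 figure above.
# Assumption: corruptions are independent and uniformly distributed.
DATA = 4e15          # 4 PB of data on the cluster
BLOCK = 64e6         # assumed 64 MB average block size
n_blocks = DATA / BLOCK            # ~62.5 million blocks

# The scanner fixes ~1 bad block per day, so the per-replica daily
# corruption probability is roughly 1 / n_blocks.
p_3weeks = 21 / n_blocks           # per-replica, over 3 weeks

# Probability that all 3 replicas of one particular block go bad,
# summed over all blocks (union bound).
p_any_loss = n_blocks * p_3weeks ** 3
print(f"{p_any_loss:.1e}")         # ~2.4e-12
```

The union bound overestimates slightly, which is consistent with the real risk being a couple of orders of magnitude lower.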
Raghu.
Brian Bockelman wrote:
On Nov 13, 2008, at 11:32 AM, Raghu Angadi wrote:
Brian Bockelman wrote:
Hey all,
I noticed that the maximum throttle for the datanode block scanner is
hardcoded at 8MB/s.
I think this is insufficient; on a fully loaded Sun Thumper, a full
scan at 8MB/s would take something like 70 days.
Is it possible to make this throttle a bit smarter? At the very
least, would anyone object to a patch which exposed this throttle as
a config option? Alternately, a smarter idea would be to throttle
the block scanner at (8MB/s) * (# of volumes), under the assumption
that there is at least 1 disk per volume.
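A quick sketch of the arithmetic behind both options (the 48TB / 48-volume Thumper figures are assumptions from this thread, not measured values):

```python
MB = 1e6
TB = 1e12
DAY = 86400  # seconds

def scan_days(capacity_bytes, rate_bytes_per_sec):
    """Days for the block scanner to cover all data at a given rate."""
    return capacity_bytes / rate_bytes_per_sec / DAY

# Hardcoded 8MB/s throttle on a fully loaded Thumper (48 TB assumed):
print(f"{scan_days(48 * TB, 8 * MB):.0f} days")       # ~69 days

# Proposed: scale the throttle by volume count (1 disk per volume):
volumes = 48
print(f"{scan_days(48 * TB, 8 * MB * volumes):.1f} days")  # ~1.4 days
```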
Making the max configurable seems useful. Either of the above options
is fine, though the first one might be simpler for configuration.
8MB/s was calculated for around 4TB of data on a node. Given 80k
seconds in a day, that works out to around 6-7 days per scan. 8-10 MB/s
is not too bad a load on a 2-4 disk machine.
Hm... on second thought, however trivial the resulting disk I/O would
be, on the Thumper example the maximum throttle would be about 3Gbps:
that's a nontrivial load on the bus.
How do other "big sites" handle this? We're currently at 110TB raw,
are considering converting ~240TB over from another file system, and
are planning to grow to 800TB during 2009. A quick calculation shows
that to do a weekly scan at that size, we're talking ~10Gbps of
sustained reads.
You have 110 TB on a single datanode and are moving to 800TB nodes? Note
that this rate applies to the amount of data on a single datanode.
Nah - 110TB total in the system (200 datanodes), and we will move to
800TB total (probably 250 datanodes).
However, we do have some larger nodes (we range from 80GB to 48TB per
node); recent and planned purchases are in the 4-8TB per node range, but
I'd sure hate to throw away 48TB of disks :)
On the 48TB node, a scan at 8MB/s would take 70 days. I'd have to run
at a rate of 80MB/s to scan through in 7 days. While 80MB/s over 48
disks is not much, I was curious about how the rest of the system would
perform (the node is in production on a different file system right now,
so borrowing it is not easy...); 80MB/s sounds like an awful lot for
"background noise".
Do any other large sites run such large nodes? How long of a period
between block scans do sites use in order to feel "safe" ?
Brian
Raghu.
I still worry that the rate is too low; if we have a suspicious node,
or users report a problematic file, waiting a week for a full scan is
too long. I've asked a student to implement a tool that can trigger
a full block scan of a path (the idea being to support something like
"hadoop fsck /path/to/file -deep"). What would be the best approach for
him to take to initiate a high-rate "full volume" or "full datanode" scan?