Mark Maybee wrote:
> Ben Rockwood wrote:
>> I need some help with clarification.
>>
>> My understanding is that there are 2 instances in which ZFS will write
>> to disk:
>> 1) TXG Sync
>> 2) ZIL
>>
>> Post-snv_87 a TXG should sync out when the TXG is either over-filled or
>> hits the timeout of 30 seconds.
>>
>> First question is... is there some place I can see what this max TXG
>> size is? If I recall, it's 1/8th of system memory... but there has to be
>> a counter somewhere, right?
>>
> There is both a memory throttle limit (enforced in arc_memory_throttle())
> and a write throughput throttle limit (calculated in dsl_pool_sync(),
> enforced in dsl_pool_tempreserve_space()). The write limit is stored as
> 'dp_write_limit' for each pool.
I cooked up the following:

$ dtrace -qn 'fbt::dsl_pool_sync:entry {
        printf("Throughput is \t %d\n write limit is\t %d\n\n",
            args[0]->dp_throughput, args[0]->dp_write_limit); }'
Throughput is    883975129
 write limit is  3211748352

I'm confused about the units and interpretation. For instance, the write
limit here is almost 3GB on a system with 4GB of RAM. However, if I read
the code right, the value here has already been inflated 6x... so the
real write limit is actually about 510MB, right? As for the throughput,
I need verification... I think the unit here is bytes per second?

>> I'm unclear on ZIL writes. I think that they happen independently of
>> the normal txg rotation, but I'm not sure.
>>
>> So the second question is: do they happen with a TXG sync (expedited)
>> or independent of the normal TXG sync flow?
>>
>> Finally, I'm unclear on exactly what constitutes a TXG stall. I had
>> assumed that it indicated TXGs that exceeded the allotted time, but
>> after some dtracing I'm uncertain.
>>
> I'm not certain what you mean by: "TXG Stall".

I was referring to the following code, which I'm having some trouble
properly understanding:

boolean_t
txg_stalled(dsl_pool_t *dp)
{
        tx_state_t *tx = &dp->dp_tx;

        return (tx->tx_quiesce_txg_waiting > tx->tx_open_txg);
}

Ultimately, what this all comes down to is finding a reliable way to
determine when ZFS is struggling. I'm currently watching (on pre-87
systems) txg sync times, and if a sync exceeds something like 4 seconds
I know there is trouble brewing. I'm considering whether watching either
txg_stalled() or txg_delay() may be a better way to flag trouble.
dp_throughput looks like it might also be a good candidate, although it
was only added in snv_98, unfortunately, so it doesn't help a lot of my
existing installs. Nevertheless, graphing this value could be very
telling, and it would be nice to have it available as a kstat.
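For reference, here's roughly how I watch sync times today. This is just
a sketch; it assumes the fbt probes on spa_sync() are available on your
build, and it times every sync regardless of which pool it belongs to:

$ dtrace -qn '
fbt::spa_sync:entry { self->ts = timestamp; }
fbt::spa_sync:return /self->ts/ {
        /* timestamp is in nanoseconds; convert to milliseconds */
        printf("txg sync took %d ms\n", (timestamp - self->ts) / 1000000);
        self->ts = 0;
}'

Anything consistently over a few thousand milliseconds is what I've been
treating as a warning sign.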
The intended result is to have a reliable means of monitoring ZFS health
via a standard monitoring framework such as Nagios or Zabbix... and from
my studies, simply watching traditional values via iostat isn't the best
method. If the ZIL is either disabled or pushing to a SLOG, then
watching the breathing of TXG syncs should be all that's really
important to me, at least on the write side... that's my theory anyway.
Feel free to flog me. :)

Thank you very much for your help Mark!

benr.
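P.S. In case it's useful to anyone else experimenting with this:
counting txg_delay() invocations looks as simple as the one-liner below.
Again, this is a sketch and assumes the fbt probe exists on your build;
a steady nonzero count would suggest writers are being throttled:

$ dtrace -qn '
fbt::txg_delay:entry { @n = count(); }
tick-10sec {
        printa("txg_delay calls in last 10s: %@d\n", @n);
        clear(@n);
}'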