On 10/17/2016 18:32, Steven Hartland wrote:
>
> On 17/10/2016 22:50, Karl Denninger wrote:
>> I will make some effort on the sandbox machine to see if I can come up
>> with a way to replicate this.  I do have plenty of spare larger drives
>> laying around that used to be in service and were obsolesced due to
>> capacity -- but what I don't know if whether the system will misbehave
>> if the source is all spinning rust.
>>
>> In other words:
>>
>> 1. Root filesystem is mirrored spinning rust (production is mirrored
>> SSDs)
>>
>> 2. Backup is mirrored spinning rust (of approx the same size)
>>
>> 3. Set up auto-snapshot exactly as the production system has now (which
>> the sandbox is NOT since I don't care about incremental recovery on that
>> machine; it's a sandbox!)
>>
>> 4. Run a bunch of build-somethings (e.g. buildworlds, cross-build for
>> the Pi2s I have here, etc) to generate a LOT of filesystem entropy
>> across lots of snapshots.
>>
>> 5. Back that up.
>>
>> 6. Export the backup pool.
>>
>> 7. Re-import it and "zfs destroy -r" the backup filesystem.
>>
>> That is what got me in a reboot loop after the *first* panic; I was
>> simply going to destroy the backup filesystem and re-run the backup, but
>> as soon as I issued that zfs destroy the machine panic'd and as soon as
>> I re-attached it after a reboot it panic'd again.  Repeat until I set
>> trim=0.
>>
>> But... if I CAN replicate it that still shouldn't be happening, and the
>> system should *certainly* survive attempting to TRIM on a vdev that
>> doesn't support TRIMs, even if the removal is for a large amount of
>> space and/or files on the target, without blowing up.
>>
>> BTW I bet it isn't that rare -- if you're taking timed snapshots on an
>> active filesystem (with lots of entropy) and then make the mistake of
>> trying to remove those snapshots (as is the case with a zfs destroy -r
>> or a zfs recv of an incremental copy that attempts to sync against a
>> source) on a pool that has been imported before the system realizes that
>> TRIM is unavailable on those vdevs.
>>
>> Noting this:
>>
>>      Yes need to find some time to have a look at it, but given how rare
>>      this is and with TRIM being re-implemented upstream in a totally
>>      different manor I'm reticent to spend any real time on it.
>>
>> What's in-process in this regard, if you happen to have a reference?
> Looks like it may be still in review: https://reviews.csiden.org/r/263/
>
>

Having increased the kernel stack page count I have not had another
instance of this in the last couple of weeks+, and I am running daily
backup jobs as usual...

So this *does not* appear to be an infinite recursion problem...

-- 
Karl Denninger
k...@denninger.net <mailto:k...@denninger.net>
/The Market Ticker/
/[S/MIME encrypted email preferred]/

Attachment: smime.p7s
Description: S/MIME Cryptographic Signature

Reply via email to