2010/12/8 <gon...@comcast.net>:
> To explain the slow delete problem further:
>
> It is absolutely critical for zfs to manage the incoming data rate.
> This is done reasonably well for write transactions.
>
> Delete transactions, prior to dedup, were very light-weight, nearly free,
> so they are not managed.
>
> Because of dedup, deletes become rather expensive: they introduce a
> substantial seek penalty, mostly because of the need to update the dedup
> metadata (reference counts and such).
>
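
That is the part I had not appreciated: every freed block costs random I/O
into the dedup table to drop its reference count. A toy model of the effect
(my own sketch in Python, not actual ZFS code; the 8 ms per random I/O and
the two I/Os per freed block are assumptions):

    # Toy model (not ZFS code): why dedup makes deletes seek-bound.
    # Freeing a deduped block means a read-modify-write of its dedup-table
    # (DDT) entry to decrement the reference count. The DDT is indexed by
    # checksum, so consecutive deletes touch essentially random locations.

    SEEK_MS = 8.0      # assumed cost of one random I/O (seek + rotation)
    IOS_PER_FREE = 2   # assumed: read the DDT entry, write it back

    def hours_to_free(n_blocks):
        return n_blocks * IOS_PER_FREE * SEEK_MS / 3_600_000.0

    print(hours_to_free(1_000_000))   # ~4.4 hours for a million freed blocks

No wonder a big rm can stall for hours once the DDT no longer fits in memory.
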
> The mechanism of the problem:
> 1) Too many delete transactions are accepted into the
> open transaction group.
>
> 2) When this txg comes up to be synced to disk, the sync takes a very long
> time (minutes, hours, or even days instead of a healthy 1-2 seconds).

OK, I had to look that one up, but the fog is starting to clear.
I reckon that in zfs land, a command like "sync" has no effect at all?

> 3) Because the open txg cannot be closed while the sync of a previous txg
> is in progress, eventually we run out of buffer space in the open txg, and
> all input is severely throttled.
>
> 4) Because of (3), other bad things happen: the ARC tries to shrink, memory
> runs short, and things get worse.
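
So it is a producer/consumer pipeline where deletes are under-charged at
admission time. A toy model of the stall (my own sketch in Python; the
budget and cost numbers are made up, only the ratio matters):

    # Toy model (not ZFS code) of the txg pipeline described above:
    # transactions enter the open txg, which cannot close while the previous
    # txg is still syncing. Writes are charged a realistic cost at admission;
    # deletes were assumed nearly free, so far too many are admitted and the
    # eventual sync takes far longer than budgeted.

    OPEN_TXG_BUDGET = 1000                         # assumed admission budget
    WRITE_COST = 10                                # charged and paid at sync
    DELETE_ADMIT_COST, DELETE_SYNC_COST = 1, 200   # charged vs. actually paid

    def fill_open_txg(ops):
        txg, budget = [], OPEN_TXG_BUDGET
        for op in ops:
            cost = DELETE_ADMIT_COST if op == "delete" else WRITE_COST
            if budget < cost:
                break            # open txg full; remaining ops are throttled
            budget -= cost
            txg.append(op)
        return txg

    def sync_work(txg):
        # the real work happens at sync time, not at admission time
        return sum(DELETE_SYNC_COST if op == "delete" else WRITE_COST
                   for op in txg)

    print(sync_work(fill_open_txg(["write"] * 10_000)))   # 1000, as budgeted
    print(sync_work(fill_open_txg(["delete"] * 10_000)))  # 200000, 200x over
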

Yes, I see. On the ARC point: the ARC size on my system is 1685483656
bytes - that's about 1.6 GB on a system with 6 GB, of which 3942 MB is
allocated to the kernel (according to mdb's ::memstat). So can I assume
that the better part of the rest is allocated in buffers that needlessly
fill up over time? I'd much rather have that memory used for the ARC :)
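
If my arithmetic is right, that is 1685483656 / 2^20 ~= 1607 MB of ARC out
of the 3942 MB held by the kernel, so roughly 3942 - 1607 = 2335 MB (about
2.3 GB) of kernel memory is sitting outside the ARC.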

> 5) Because deletes persist across reboots, you are unable to mount your
> pool.
>
> One solution is booting into maintenance mode and renaming the zfs cache
> file (look in /etc/zfs, I forget the name at the moment).
> You can then boot up and import your pool. The import will take a long time,
> but meanwhile you are up and can do other things.
> At that point you have the option of getting rid of the pool and starting
> over (possibly installing a better kernel first).
> After the update and import, upgrade your pool to the current pool version
> and life will be much better.
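
Noted. For what it's worth, I believe the cache file in question is
/etc/zfs/zpool.cache (the usual name on OpenSolaris-based systems). From
maintenance mode, the renaming step would look something like this (a
sketch; verify the path on your own system):

    # Move the cache file aside so no pool is auto-imported at boot.
    # Assumes the usual OpenSolaris location /etc/zfs/zpool.cache.
    import os

    cache = "/etc/zfs/zpool.cache"
    if os.path.exists(cache):
        os.rename(cache, cache + ".disabled")

After rebooting, the pool can then be imported by hand with "zpool import".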

By now, the system has booted up. It took quite a few hours, though.
This system is actually running Nexenta, but I'll see if I can upgrade
the kernel.

> I hope this helps, good luck

It clarified a few things. Thank you very much. There are one or two
things I still have to change on this system, it seems...

> In addition, there was a virtual-memory-related bug (allocating one of the
> zfs memory caches with the wrong object size) that would cause other
> components to hang waiting for memory allocations.
>
> This was so bad in earlier kernels that systems would become unresponsive
> for a potentially very long time (a phenomenon known as "bricking").
>
> As I recall, a lot of fixes for this came in with the 140-series kernels.
>
> Anything 145 and above should be OK.

I'm on 134f. No wonder.

-- 
Frank Van Damme
No part of this copyright message may be reproduced, read or seen,
dead or alive or by any means, including but not limited to telepathy
without the benevolence of the author.