Re: [zfs-discuss] CR# 6574286, remove slog device

2010-05-10 Thread Moshe Vainer
Did the fix for 6733267 make it to 134a (2010.05)? It isn't marked fixed, and I couldn't find it anywhere in the changelogs. Does that mean we'll have to wait for 2010.11 (or whatever v+2 is named)? Thanks, Moshe -- This message posted from opensolaris.org

Re: [zfs-discuss] CR# 6574286, remove slog device

2010-01-20 Thread Moshe Vainer
Hi George. Any news on this?

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-12-01 Thread Moshe Vainer
Hi Moshe: On Mon, Nov 30, 2009 at 20:30, Moshe Vainer mvai...@doyenz.com wrote: Any news on this bug? We are trying to implement write acceleration, but can't deploy to production with this issue still not fixed. If anyone has

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-11-30 Thread Moshe Vainer
Any news on this bug? We are trying to implement write acceleration, but can't deploy to production with this issue still not fixed. If anyone has an estimate (e.g., would it be part of 10.02?) I would very much appreciate knowing. Thanks, Moshe

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-11-30 Thread Tim Cook
On Mon, Nov 30, 2009 at 2:30 PM, Moshe Vainer mvai...@doyenz.com wrote: Any news on this bug? We are trying to implement write acceleration, but can't deploy to production with this issue still not fixed. If anyone has an estimate (e.g., would it be part of 10.02?) I would very much appreciate

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-11-30 Thread Pablo Méndez Hernández
Hi Moshe: On Mon, Nov 30, 2009 at 20:30, Moshe Vainer mvai...@doyenz.com wrote: Any news on this bug? We are trying to implement write acceleration, but can't deploy to production with this issue still not fixed. If anyone has an estimate (e.g., would it be part of 10.02?) I would very much

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-11-30 Thread Moshe Vainer
I am sorry, I think I confused matters a bit. I meant the bug that prevents importing with slog device missing, 6733267. I am aware that one can remove a slog device, but if you lose your rpool and the device goes missing while you rebuild, you will lose your pool in its entirety. Not a

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-11-30 Thread Moshe Vainer
I was responding to this: "Now I have an exported file system that I can't import because of the log device but the disks are all there. Except the original log device which failed." Which actually means bug #6733267, not the one about slog removal. You can remove now (b125) but only if the pool

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-11-30 Thread George Wilson
Moshe Vainer wrote: I am sorry, I think I confused matters a bit. I meant the bug that prevents importing with slog device missing, 6733267. I am aware that one can remove a slog device, but if you lose your rpool and the device goes missing while you rebuild, you will lose your pool in

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-22 Thread Darren J Moffat
Miles Nordin wrote: djm == Darren J Moffat darr...@opensolaris.org writes: djm I do; because I've done it to my own personal data pool. djm However it is not a procedure I'm willing to tell anyone how djm to do - so please don't ask - k, fine, fair enough and noted. djm a) it

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-22 Thread Mike Gerdts
On Tue, May 19, 2009 at 2:16 PM, Paul B. Henson hen...@acm.org wrote: I was checking with Sun support regarding this issue, and they say The CR currently has a high priority and the fix is understood. However, there is no eta, workaround, nor IDR. If it's a high priority, and it's known how

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-22 Thread Miles Nordin
mg == Mike Gerdts mger...@gmail.com writes: mg A rather interesting putback just happened... yeah, it is good when you can manually offline the same set of devices as the set of those which are allowed to fail without invoking the pool's failmode. I guess the putback means one less such

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-21 Thread Mike Gerdts
On Wed, May 20, 2009 at 9:35 AM, Paul B. Henson hen...@acm.org wrote: On Wed, 20 May 2009, Darren J Moffat wrote: Why do you think there is no progress ? Sorry if that's a wrong assumption, but I posted questions regarding it to this list with no response from a Sun employee until yours, and

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-21 Thread Richard Elling
Miles Nordin wrote: re == Richard Elling richard.ell...@gmail.com writes: re Whoa. re The slog is a top-level vdev like the others. The current re situation is that loss of a top-level vdev results in a pool re that cannot be imported. this taxonomy is wilfully

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-21 Thread Peter Woodman
Well, it worked for me, at least. Note that this is a very limited recovery case- it only works if you have the GUID of the slog device from zpool.cache, which in the case of a fail-on-export and reimport might not be available. The original author of the fix seems to imply that you can use any

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-21 Thread Miles Nordin
re == Richard Elling richard.ell...@gmail.com writes: es == Eric Schrock eric.schr...@sun.com writes: re Another way to look at this, there is no explicit flag set in re the pool that indicates whether the slog is empty or re full. Not that it makes a huge difference to me, but

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-21 Thread Bob Friesenhahn
On Thu, 21 May 2009, Miles Nordin wrote: Anyway, Richard I think your whole argument is ridiculous: you're acting like losing 30 seconds of data and losing the entire pool are equivalent. Who is this line of reasoning supposed to serve? From here it looks like everyone loses the further you

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-21 Thread Richard Elling
Miles Nordin wrote: re == Richard Elling richard.ell...@gmail.com writes: es == Eric Schrock eric.schr...@sun.com writes: re Another way to look at this, there is no explicit flag set in re the pool that indicates whether the slog is empty or re full. Not that it

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-21 Thread Paul B. Henson
On Thu, 21 May 2009, Peter Woodman wrote: Well, it worked for me, at least. Note that this is a very limited recovery case- it only works if you have the GUID of the slog device from zpool.cache, which in the case of a fail-on-export and reimport might not be available. The original author of

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-21 Thread Paul B. Henson
On Thu, 21 May 2009, Bob Friesenhahn wrote: For some people losing 30 seconds of data and losing the entire pool could be equivalent. In fact, it could be a billion dollar error. I don't think anybody's saying to just ignore a missing slog and continue on like nothing's wrong. Let the pool

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-20 Thread Darren J Moffat
Paul B. Henson wrote: I was checking with Sun support regarding this issue, and they say The CR currently has a high priority and the fix is understood. However, there is no eta, workaround, nor IDR. If it's a high priority, and it's known how to fix it, I was curious as to why has there been

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-20 Thread Paul B. Henson
On Wed, 20 May 2009, Darren J Moffat wrote: Why do you think there is no progress ? Sorry if that's a wrong assumption, but I posted questions regarding it to this list with no response from a Sun employee until yours, and the engineer assigned to my support ticket was unable to provide any

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-20 Thread Darren J Moffat
Paul B. Henson wrote: On Wed, 20 May 2009, Darren J Moffat wrote: Why do you think there is no progress ? Sorry if that's a wrong assumption, but I posted questions regarding it to this list with no response from a Sun employee until yours, and the engineer assigned to my support ticket was

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-20 Thread Robert Milkowski
On Tue, 19 May 2009, Dave wrote: Paul B. Henson wrote: I was checking with Sun support regarding this issue, and they say The CR currently has a high priority and the fix is understood. However, there is no eta, workaround, nor IDR. If it's a high priority, and it's known how to fix it, I

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-20 Thread Paul B. Henson
On Wed, 20 May 2009, Darren J Moffat wrote: How Sun Services reports the status of escalations to customers under contract is not a discussion for a public alias like this so I won't comment on this. Heh, but maybe it should be a discussion for some internal forum; more information = less

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-20 Thread Richard Elling
Will Murnane wrote: On Wed, May 20, 2009 at 12:42, Miles Nordin car...@ivy.net wrote: djm == Darren J Moffat darr...@opensolaris.org writes: djm a) it was highly dangerous and involved using multiple djm different zfs kernel modules as well as however...utter hogwash!

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-20 Thread Darren J Moffat
Paul B. Henson wrote: On Wed, 20 May 2009, Darren J Moffat wrote: How Sun Services reports the status of escalations to customers under contract is not a discussion for a public alias like this so I won't comment on this. Heh, but maybe it should be a discussion for some internal forum; more

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-20 Thread Will Murnane
On Wed, May 20, 2009 at 12:42, Miles Nordin car...@ivy.net wrote: djm == Darren J Moffat darr...@opensolaris.org writes: djm a) it was highly dangerous and involved using multiple djm different zfs kernel modules as well as however...utter hogwash! Nothing is ``highly dangerous'' when

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-20 Thread Miles Nordin
re == Richard Elling richard.ell...@gmail.com writes: re in the case of a properly exported pool, we should be allowed re to import sans slog. seems so, but the non properly exported case is still important. for example NFS HA clusters would stop working if slogs were always ignored on

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-20 Thread Miles Nordin
djm == Darren J Moffat darr...@opensolaris.org writes: djm I do; because I've done it to my own personal data pool. djm However it is not a procedure I'm willing to tell anyone how djm to do - so please don't ask - k, fine, fair enough and noted. djm a) it was highly dangerous and

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-20 Thread Dave
Richard Elling wrote: Will Murnane wrote: On Wed, May 20, 2009 at 12:42, Miles Nordin car...@ivy.net wrote: djm == Darren J Moffat darr...@opensolaris.org writes: djm a) it was highly dangerous and involved using multiple djm different zfs kernel modules as well as

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-20 Thread Nicholas Lee
Not sure if this is a wacky question. Given that a slog device does not really need much more than 10 GB, if I were to use a pair of X25-E (or STEC devices or whatever) in a mirror as the boot device and then either 1. created a loopback file vdev or 2. created a separate mirrored slice for the slog, would this

[zfs-discuss] CR# 6574286, remove slog device

2009-05-19 Thread Paul B. Henson
I was checking with Sun support regarding this issue, and they say "The CR currently has a high priority and the fix is understood. However, there is no eta, workaround, nor IDR." If it's a high priority, and it's known how to fix it, I was curious as to why there has been no progress? As I

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-19 Thread Dave
Paul B. Henson wrote: I was checking with Sun support regarding this issue, and they say The CR currently has a high priority and the fix is understood. However, there is no eta, workaround, nor IDR. If it's a high priority, and it's known how to fix it, I was curious as to why has there been

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-19 Thread Richard Elling
Paul B. Henson wrote: I was checking with Sun support regarding this issue, and they say The CR currently has a high priority and the fix is understood. However, there is no eta, workaround, nor IDR. If it's a high priority, and it's known how to fix it, I was curious as to why has there been

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-19 Thread Nicholas Lee
Does Solaris flush a slog device before it powers down? If so, removal during a shutdown cycle wouldn't lose any data. On Wed, May 20, 2009 at 7:57 AM, Dave dave-...@dubkat.com wrote: If you don't have mirrored slogs and the slog fails, you may lose any data that was in a txg group waiting

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-19 Thread Paul B. Henson
On Tue, 19 May 2009, Dave wrote: If you don't have mirrored slogs and the slog fails, you may lose any data that was in a txg group waiting to be committed to the main pool vdevs - you will never know if you lost any data or not. True; but from what I understand the failure rate of SSD's is

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-19 Thread Paul B. Henson
On Tue, 19 May 2009, Richard Elling wrote: Removal of a slog is different than failure of a slog. Removal is an administrative task, not a repair task. You can lump slog removal in with the more general shrink or top-level vdev removal tasks. Granted; however, for shrinking or removing

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-19 Thread Paul B. Henson
On Tue, 19 May 2009, Nicholas Lee wrote: Does Solaris flush a slog device before it powers down? If so, removal during a shutdown cycle wouldn't lose any data. Actually, if you remove a slog from a pool while the pool is exported, it becomes completely inaccessible. I have an open support

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-19 Thread Eric Schrock
On May 19, 2009, at 12:57 PM, Dave wrote: If you don't have mirrored slogs and the slog fails, you may lose any data that was in a txg group waiting to be committed to the main pool vdevs - you will never know if you lost any data or not. None of the above is correct. First off, you

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-19 Thread Nicholas Lee
So the txg is synced to the slog device but retained in memory, and then, rather than being read back from the slog into memory, it is copied to the pool from the in-memory copy? With the txg being a working set of the active commit, so it might be a set of NFS iops? On Wed, May 20, 2009 at 3:43 PM, Eric

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-19 Thread Eric Schrock
On May 19, 2009, at 8:56 PM, Nicholas Lee wrote: So the txg is synced to the slog device but retained in memory, and then, rather than being read back from the slog into memory, it is copied to the pool from the in-memory copy? Yes, that is correct. It is best to think of the ZIL and the txg sync

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-19 Thread Paul B. Henson
On Tue, 19 May 2009, Eric Schrock wrote: The latter half of the above statement is also incorrect. Should you find yourself in the double-failure described above, you will get an FMA fault that describes the nature of the problem and the implications. If the slog is truly dead, you can

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-19 Thread Nicholas Lee
I guess this also means the relative value of a slog is limited by the amount of memory that can be allocated to the txg. On Wed, May 20, 2009 at 4:03 PM, Eric Schrock eric.schr...@sun.com wrote: Yes, that is correct. It is best to think of the ZIL and the txg sync process as orthogonal

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-19 Thread Dave
Eric Schrock wrote: On May 19, 2009, at 12:57 PM, Dave wrote: If you don't have mirrored slogs and the slog fails, you may lose any data that was in a txg group waiting to be committed to the main pool vdevs - you will never know if you lost any data or not. None of the above is

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-19 Thread Richard Elling
Eric Schrock wrote: On May 19, 2009, at 8:56 PM, Nicholas Lee wrote: So the txg is synced to the slog device but retained in memory, and then, rather than being read back from the slog into memory, it is copied to the pool from the in-memory copy? Yes, that is correct. It is best to think of the ZIL

Re: [zfs-discuss] CR# 6574286, remove slog device

2009-05-19 Thread Eric Schrock
On May 19, 2009, at 9:45 PM, Dave wrote: Thanks for correcting my statement. There is still a potential window of approximately 60 seconds for data loss if there are 2 transaction groups waiting to sync with a 30 second txg commit timer, correct? No, only the syncing transaction group is