[zfs-discuss] unallocated block pointers in indirect block

2012-08-23 Thread Andriy Gapon

Guys,

I am curious whether the following is valid or the result of some corruption.
Here is a snippet from a dump of an indirect block (zdb -R with i flag):

...
[many uninterestingly good BPs skipped]
DVA[0]=0:13da6a47a00:2d000 [L0 ZFS plain file] fletcher4 uncompressed LE
contiguous unique single size=2L/2P birth=39369L/39369P fill=1
cksum=3df551808443:f842dcdc8a32e57:af4094c9cac4733c:c7ac3865df0c4ced
DVA[0]=0:13da6a74a00:2d000 [L0 ZFS plain file] fletcher4 uncompressed LE
contiguous unique single size=2L/2P birth=39369L/39369P fill=1
cksum=3e69cad45d5c:f8bc987d5aabad0:e5f72ba62f3b2955:84a2a388440c0d31
hole
hole
hole
hole
hole
hole
hole
hole
hole
hole
DVA[0]=0:ac00:1c400 DVA[1]=0:1ee00:1ea00 DVA[2]=0:1ee00:11400 [L0
unallocated] inherit inherit BE contiguous unique triple size=1c000L/200P
birth=12L/71P fill=93 cksum=9a:97:8f:b
...

From here on, all the BPs appear either as holes or as L0 unallocated.
I wonder how those unallocated BPs came to be, and whether their presence is
valid and expected by the ZFS code.

Just in case, the indirect block passes checksum verification (and also
decompression, in fact).
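For reference, the dump above came from an invocation of roughly this shape (the
pool name, vdev, offset and size below are placeholders, not the real values):

  # 'i' interprets the block as an indirect block, 'd' decompresses it first
  zdb -R mypool 0:4000000:20000:di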

Thank you very much in advance.

-- 
Andriy Gapon
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zdb leaks checking

2011-12-11 Thread Andriy Gapon

Does the zdb leak-checking mechanism also check for the opposite situation,
that is, used/referenced blocks falling into free regions of the space maps?
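For context, I mean the accounting pass that is run by something like this (the
flags are from memory and the pool name is a placeholder; exact options may vary
between builds):

  # -b traverses the whole pool and does the block accounting / leak detection;
  # doubling the b and adding v increases verbosity, c also verifies checksums
  zdb -bbcv mypool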
Thank you.
-- 
Andriy Gapon
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] cross platform (freebsd) zfs pool replication

2011-07-01 Thread Andriy Gapon
on 01/07/2011 00:12 Joeri Vanthienen said the following:
 Hi,
 
 I have two servers running: FreeBSD with a zpool at v28 and a Nexenta
 (OpenSolaris b134) box running zpool v26.

 Replication (with zfs send/receive) from the Nexenta box to the FreeBSD box
 works fine, but I have a problem accessing my replicated volume. When I type
 the command cd /remotepool/us (for /remotepool/users) and autocomplete with
 the Tab key, I get a panic.
 
 check the panic @ http://www.boeri.be/panic.jpg

Since this is a FreeBSD panic, I suggest that you try getting help on the FreeBSD
mailing lists; f...@freebsd.org looks like the best choice.
BTW, your report doesn't contain the actual panic message, and that could be
important.
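If a crash dump was saved, something along these lines usually recovers the panic
string and a stack trace (a sketch, assuming dumpdev is set in /etc/rc.conf and
savecore ran on reboot):

  # the info.N file written by savecore normally records the panic string
  cat /var/crash/info.0

  # for a stack trace, open the core against the matching kernel in kgdb
  kgdb /boot/kernel/kernel /var/crash/vmcore.0
  # ... then type 'bt' at the (kgdb) prompt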


 - tab autocompletion does an ls command in the background, I think

 I think there is a problem with NFSv4 ACL/ID mapping. Normal ZFS file systems
 (initially created on the FreeBSD box) are working fine.

 The Nexenta box is integrated with Active Directory, and the mappings for the
 users on this CIFS share have been created on the fly (ephemeral ID mapping).

 Any solution for this? I really need the ACL permissions to be replicated, so
 rsync is not a solution. Please help :)
 
 root@ ~]# zfs get all remotepool/users
 NAME              PROPERTY              VALUE                    SOURCE
 remotepool/users  type                  filesystem               -
 remotepool/users  creation              Wed Jun 29 14:42 2011    -
 remotepool/users  used                  9.06G                    -
 remotepool/users  available             187G                     -
 remotepool/users  referenced            9.06G                    -
 remotepool/users  compressratio         1.00x                    -
 remotepool/users  mounted               yes                      -
 remotepool/users  quota                 none                     default
 remotepool/users  reservation           none                     default
 remotepool/users  recordsize            128K                     default
 remotepool/users  mountpoint            /remotepool/users        default
 remotepool/users  sharenfs              off                      default
 remotepool/users  checksum              on                       default
 remotepool/users  compression           off                      default
 remotepool/users  atime                 on                       default
 remotepool/users  devices               on                       default
 remotepool/users  exec                  on                       default
 remotepool/users  setuid                on                       default
 remotepool/users  readonly              off                      default
 remotepool/users  jailed                off                      default
 remotepool/users  snapdir               hidden                   received
 remotepool/users  aclinherit            passthrough              received
 remotepool/users  canmount              on                       default
 remotepool/users  xattr                 off                      temporary
 remotepool/users  copies                1                        default
 remotepool/users  version               5                        -
 remotepool/users  utf8only              off                      -
 remotepool/users  normalization         none                     -
 remotepool/users  casesensitivity       insensitive              -
 remotepool/users  vscan                 off                      default
 remotepool/users  nbmand                on                       received
 remotepool/users  sharesmb              name=users,guestok=true  received
 remotepool/users  refquota              none                     default
 remotepool/users  refreservation        none                     default
 remotepool/users  primarycache          all                      default
 remotepool/users  secondarycache        all                      default
 remotepool/users  usedbysnapshots       0                        -
 remotepool/users  usedbydataset         9.06G                    -
 remotepool/users  usedbychildren        0                        -
 remotepool/users  usedbyrefreservation  0                        -
 remotepool/users  logbias               latency                  default
 remotepool/users  dedup                 off                      default
 remotepool/users  mlslabel              -
 remotepool/users  sync                  standard                 default


-- 
Andriy Gapon
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ARC/VM question

2010-07-25 Thread Andriy Gapon

I have a semi-theoretical question about the following code in arc.c,
arc_reclaim_needed() function:

/*
 * take 'desfree' extra pages, so we reclaim sooner, rather than later
 */
extra = desfree;

/*
 * check that we're out of range of the pageout scanner.  It starts to
 * schedule paging if freemem is less than lotsfree and needfree.
 * lotsfree is the high-water mark for pageout, and needfree is the
 * number of needed free pages.  We add extra pages here to make sure
 * the scanner doesn't start up while we're freeing memory.
 */
if (freemem < lotsfree + needfree + extra)
return (1);

If I understand correctly, there is no pageout (the page daemon is idle) until
freemem drops below lotsfree + needfree.
So, let's suppose there is a userland application that consumes memory slowly
enough that, each time the page scheduler and the ARC reclaim thread check
freemem, its value is in the range from lotsfree + needfree to
lotsfree + needfree + extra.
This would mean that there would be no paging, but the ARC reclaim thread would
keep reducing the ARC size and freeing up some pages.
The application would then eat into those pages again, and so on, until the ARC
reached its minimum size.
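One way to watch for this in practice (a rough sketch for an OpenSolaris box,
using the same variable names as the code above) is to sample the VM thresholds
and the ARC size while such an application runs:

  # pageout thresholds and free memory, in pages (64-bit kernel assumed)
  echo "freemem/E"  | mdb -k
  echo "lotsfree/E" | mdb -k
  echo "desfree/E"  | mdb -k
  echo "needfree/E" | mdb -k

  # current and target ARC size, in bytes
  kstat -p zfs:0:arcstats:size zfs:0:arcstats:c

If the scenario above is real, freemem should hover between lotsfree + needfree
and lotsfree + needfree + desfree while arcstats:size keeps shrinking towards its
minimum.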

I think that such an outcome would be undesirable, because in my opinion all
memory consumers should cooperate in freeing / paging out memory when a memory
shortage arises.

So, am I missing anything in the ARC code or the OpenSolaris VM code that would
prevent the described scenario from taking place?
If not, then why was 'extra' added to the condition in question?

I would really like to understand the interaction between VM behavior and ARC
sizing in OpenSolaris.
Thank you very much for any help!
-- 
Andriy Gapon
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Encryption?

2010-07-12 Thread Andriy Gapon
on 11/07/2010 14:21 Roy Sigurd Karlsbakk said the following:
 
 
 
 I'm planning on running FreeBSD in VirtualBox (with a Linux host)
 and giving it raw disk access to four drives, which I plan to
 configure as a raidz2 volume.
 
 Wouldn't it be better or just as good to use fuse-zfs for such a
 configuration? I/O from VirtualBox isn't really very good, but then, I
 haven't tested the linux/fbsd configuration...

Hmm, an unexpected question IMHO - wouldn't it be better to just install FreeBSD
on the hardware? :-)
If the original poster is using Linux as the host OS, then he probably has a
very good reason to do that.  But performance-wise (and otherwise), running
FreeBSD directly should, of course, win over fuse-zfs.  Right?

[Installing and maintaining one OS instead of two is the first thing that comes
to mind]

-- 
Andriy Gapon
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] gang blocks at will?

2010-05-27 Thread Andriy Gapon
on 27/05/2010 07:11 Jeff Bonwick said the following:
 You can set metaslab_gang_bang to (say) 8k to force lots of gang block 
 allocations.

Bill, Jeff,

thanks a lot!
This helped to reproduce the issue and find the bug.

Just in case:
http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/144214
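In case anyone else needs to reproduce gang blocks: a sketch of the recipe Jeff
described, on a throwaway Solaris box (the mdb write and the file-backed pool are
just one way of doing it; an /etc/system setting would work as well):

  # lower the gang-block threshold to 8k in the live kernel (0t = decimal)
  echo "metaslab_gang_bang/Z 0t8192" | mdb -kw

  # create a scratch pool and write files whose blocks are larger than 8k,
  # so that many allocations end up ganged
  mkfile 256m /var/tmp/gang.img
  zpool create gangtest /var/tmp/gang.img
  dd if=/dev/urandom of=/gangtest/junk bs=128k count=500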

 On May 25, 2010, at 11:42 PM, Andriy Gapon wrote:
 
 I am working on improving some ZFS-related bits in the FreeBSD boot chain.
 At the moment things seem to work mostly fine, except for the case where the
 boot code needs to read gang blocks.  We have some reports from users about
 failures, but unfortunately their pools are not available for testing anymore
 and I cannot reproduce the issue at will.
 I am sure that the (Open)Solaris GRUB version has been properly tested,
 including in the above environment.
 Could you please help me with ideas on how to create a pool/filesystem/file
 that would have gang blocks with high probability?
 Perhaps there are some pre-made test pool images available?
 Or some specialized tool?

 Thanks a lot!
 -- 
 Andriy Gapon
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 


-- 
Andriy Gapon
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] gang blocks at will?

2010-05-26 Thread Andriy Gapon

I am working on improving some ZFS-related bits in the FreeBSD boot chain.
At the moment things seem to work mostly fine, except for the case where the
boot code needs to read gang blocks.  We have some reports from users about
failures, but unfortunately their pools are not available for testing anymore
and I cannot reproduce the issue at will.
I am sure that the (Open)Solaris GRUB version has been properly tested,
including in the above environment.
Could you please help me with ideas on how to create a pool/filesystem/file
that would have gang blocks with high probability?
Perhaps there are some pre-made test pool images available?
Or some specialized tool?

Thanks a lot!
-- 
Andriy Gapon
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Oracle to no longer support ZFS on OpenSolaris?

2010-04-23 Thread Andriy Gapon
on 23/04/2010 04:22 BM said the following:
 On Tue, Apr 20, 2010 at 2:18 PM, Ken Gunderson kgund...@teamcool.net wrote:
 Greetings All:

 Granted there has been much fear, uncertainty, and doubt following
 Oracle's take over of Sun, but I ran across this on a FreeBSD mailing
 list post dated 4/20/2010

 ...Seems that Oracle won't offer support for ZFS on opensolaris

 Link here to full post here:

 http://lists.freebsd.org/pipermail/freebsd-questions/2010-April/215269.html
 
 I am not surprised it comes from FreeBSD mail list. :)

Why this attitude about FreeBSD? Did we eat your lunch?

Have you actually bothered to follow the link?
First, it was pure speculation inside a question.
Second, look at what kind of mailing list that was (general FreeBSD-related
questions from anyone).
Third, look at who it came from - just a random person asking a question.
[Paranoia mode: maybe it was even you.]

If, for example, I posted some nonsense about, say, Apple on an OpenSolaris
mailing list, what conclusions would you draw then?

 I am amazed at their BSD conferences, where they present all this *BSD stuff
 using Apple Macs (they claim it is FreeBSD, just a very bad version of it),
 Ubuntu Linux (not yet BSD) or GNU/Microsoft Windows (oh, everybody commits
 that sin, right?) with PowerPoint running on it (sure, who wants ugly
 OpenOffice if there is not brain enough to use LaTeX).

What you wrote tells more about you than about FreeBSD and the FreeBSD community.

P.S.
I am surprised that such random garbage from a random source gets posted on this
useful, mostly technical mailing list at all.  And that it then even gets taken
seriously...

-- 
Andriy Gapon
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Accidentally added disk instead of attaching

2009-12-07 Thread Andriy Gapon
on 06/12/2009 19:40 Volker A. Brandt said the following:
 I wanted to add a disk to the tank pool to create a mirror. I accidentally
 used 'zpool add' instead of 'zpool attach' and now the disk is added. Is
 there a way to remove the disk without losing data?
 
 Been there, done that -- at a customer site while showing off
 ZFS. :-)
 
 Currently, you cannot remove a simple device.  Depending
 on your Solaris version, you can remove things like hot spares and
 cache devices, but not simple vdevs.
 
 Back up the pool and recreate it in the correct way.
 
 
 Sorry for the bad news -- Volker

Yep.  My 2 cents -- 'add' and 'attach' are such similar words that I think the
ZFS tools' UI designers (if any) should reconsider the naming of these commands.
Or the 'add' command should always be interactive and ask for at least two
confirmations that the user knows what he is doing and why.  Perhaps it should
include a ZFS micro-exam too.
Jokes aside, it is too easy to make this mistake, and the consequences are too
hard to correct.  Does anyone disagree?
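For the archives, the difference spelled out on a hypothetical pool and disks
(note that 'zpool add -n' will at least preview the resulting layout before
committing):

  # what was intended: turn the existing disk into a two-way mirror
  zpool attach tank c0t0d0 c0t1d0

  # what was typed: add the disk as a new top-level vdev (not removable later)
  zpool add tank c0t1d0

  # dry run: show what the pool would look like, without changing anything
  zpool add -n tank c0t1d0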

-- 
Andriy Gapon
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] feature proposal

2009-07-29 Thread Andriy Gapon

What do you think about the following feature?

A 'subdirectory is automatically a new filesystem' property - an administrator
turns on this magic property on a filesystem, and after that every mkdir *in the
root* of that filesystem creates a new filesystem. The new filesystems have
default/inherited properties, except for the magic property, which is off.

Right now I see this as being mostly useful for /home. The main benefit in this
case is that various user administration tools can work unmodified and still do
the right thing when an administrator wants a policy of a separate fs per user.
But I am sure there could be other interesting uses for this.
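A sketch of how this might look from the shell; the property name 'autochild'
below is made up purely for illustration:

  # hypothetical property; not something ZFS has today
  zfs set autochild=on tank/home

  # with it on, a plain mkdir in the root of tank/home ...
  mkdir /tank/home/alice

  # ... would be equivalent to what currently requires an explicit
  zfs create tank/home/alice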

-- 
Andriy Gapon
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] feature proposal

2009-07-29 Thread Andriy Gapon
on 29/07/2009 17:24 Andre van Eyssen said the following:
 On Wed, 29 Jul 2009, Andriy Gapon wrote:
 
 A 'subdirectory is automatically a new filesystem' property - an administrator
 turns on this magic property on a filesystem, and after that every mkdir *in
 the root* of that filesystem creates a new filesystem. The new filesystems
 have default/inherited properties, except for the magic property, which is off.

 Right now I see this as being mostly useful for /home. The main benefit in
 this case is that various user administration tools can work unmodified and
 still do the right thing when an administrator wants a policy of a separate
 fs per user.
 But I am sure there could be other interesting uses for this.
 
 It's a nice idea, but zfs filesystems consume memory and have overhead.
 This would make it trivial for a non-root user (assuming they have
 permissions) to crush the host under the weight of .. mkdir.

Well, I specifically stated that this property should not be recursive, i.e. it
should work only in the root of a filesystem.
When setting this property on a filesystem, an administrator should carefully
set permissions to make sure that only trusted entities can create directories
there.
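For example, something as simple as this already limits directory creation in
the filesystem root to root only (a minimal sketch):

  # only root may create entries directly under /tank/home
  chown root:root /tank/home
  chmod 755 /tank/home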

The 'rmdir' question requires some thinking; my first reaction is that it should
do a zfs destroy...


-- 
Andriy Gapon
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] feature proposal

2009-07-29 Thread Andriy Gapon
on 29/07/2009 17:52 Andre van Eyssen said the following:
 On Wed, 29 Jul 2009, Andriy Gapon wrote:
 
 Well, I specifically stated that this property should not be recursive, i.e.
 it should work only in the root of a filesystem.
 When setting this property on a filesystem, an administrator should carefully
 set permissions to make sure that only trusted entities can create
 directories there.
 
 Even limited to the root of a filesystem, it still gives a user the
 ability to consume resources rapidly. While I appreciate the fact that
 it would be restricted by permissions, I can think of a number of usage
 cases where it could suddenly tank a host. One use that might pop up,
 for example, would be cache spools - which often contain *many*
 directories. One runaway and kaboom.

Well, the feature would not be on by default.
So careful evaluation and planning should prevent abuses.

 We generally use hosts now with plenty of RAM and the per-filesystem
 overhead for ZFS doesn't cause much concern. However, on a scratch box,
 try creating a big stack of filesystems - you can end up with a pool
 that consumes so much memory you can't import it!
 
 The 'rmdir' question requires some thinking; my first reaction is that it
 should do a zfs destroy...
 
 .. which will fail if there's a snapshot, for example. The problem seems
 to be reasonably complex - compounded by the fact that many programs
 that create or remove directories do so directly - not by calling
 externals that would be ZFS aware.

Well, the snapshots could be destroyed too; nothing stops us from doing that.
BTW, I am not proposing to implement this feature in the mkdir/rmdir userland
utilities; I am proposing to implement it in the ZFS kernel code responsible for
directory creation/removal.
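In terms of today's commands, the kernel-side rmdir handler would have to do
roughly the equivalent of this (a sketch; 'tank/home/alice' is just an example
dataset):

  # destroy the filesystem together with its snapshots and descendants
  zfs destroy -r tank/home/alice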

-- 
Andriy Gapon
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss