subject:"Re\: fsid change of ZFS\?"

Re: fsid change of ZFS?

2011-09-06 Thread Hiroki Sato

Hi Rick,

Rick Macklem rmack...@uoguelph.ca wrote
  in 468764384.310026.1314219682612.javamail.r...@erie.cs.uoguelph.ca:

rm It sounds like people have agreed that this is a reasonable solution.
rm If hrs@ can confirm that testing shows it fixes the original problem
rm (the ZFS file handles don't change when it's loaded at different times),
rm I'll pass it along to re@.

 I am sorry for the delay, but I tried the patch on several boxes and
 it worked fine:

 [old (fixed array) patch]
 % lsvfs
 Filesystem   Num  Refs Flags
  --- - ---
 ufs2 3
 oldnfs15 0 network
 zfs7 4 jail, delegated-administration
 nfs6 1 network
 cd9660 1 0 read-only
 procfs 3 0 synthetic
 devfs  4 1 synthetic
 msdosfs5 0

 [new (hash-based) patch]
 Filesystem   Num  Refs Flags
  --- - ---
 ufs   53 3
 oldnfs77 0 network
 zfs  222 4 jail, delegated-administration
 nfs   58 0 network
 cd9660   189 0 read-only
 procfs 2 0 synthetic
 devfs113 1 synthetic
 msdosfs   50 0

 [new patch, different loading order of kld modules]
 Filesystem   Num  Refs Flags
  --- - ---
 ufs   53 3
 zfs  222 4 jail, delegated-administration
 cd9660   189 0 read-only
 procfs 2 0 synthetic
 devfs113 1 synthetic
 msdosfs   50 0
 nfs   58 0 network

 Thanks a lot for the patch.  I think it should be committed before
 9.0R is released.

 Even for 8-STABLE this is useful but there is a problem that it will
 make an incomptibility of the fsid calculation between 8.N and
 8.(N+1).  What do you think about adding a loader tunable (something
 like vfs.fsidhash) to control this and making it disable by default?
 It would help sysadmins who will try a upgrade from 8.X to 9.X in the
 future, I think.

-- Hiroki


pgpqylKNsLgiT.pgp
Description: PGP signature

Re: fsid change of ZFS?

2011-08-25 Thread Rick Macklem

Benjamin Kaduk wrote:
 On Wed, 24 Aug 2011, Rick Macklem wrote:
 
  afs
 
 The current OpenAFS codebase uses the all-caps AFS. Judging by the
 omitted text, perhaps this should change. (We also don't use VFS_SET
 to
 set it, which I filed a bug about.)
 
 
  and here is my current rendition of the patch. (I took Gleb's
  suggestion
  and switched to fnv_32_str(). I'll leave it that way unless there is
  a
  collision after adding any names people post to the above list.)
 
  It sounds like people have agreed that this is a reasonable
  solution.
  If hrs@ can confirm that testing shows it fixes the original problem
  (the ZFS file handles don't change when it's loaded at different
  times),
  I'll pass it along to re@.
 
  Thanks for the helpful suggestions, rick
 
  --- kern/vfs_init.c.sav 2011-06-11 18:58:33.0 -0400
  +++ kern/vfs_init.c 2011-08-24 16:15:24.0 -0400
  @@ -39,6 +39,7 @@ __FBSDID($FreeBSD: head/sys/kern/vfs_in
 
  #include sys/param.h
  #include sys/systm.h
  +#include sys/fnv_hash.h
  #include sys/kernel.h
  #include sys/linker.h
  #include sys/mount.h
  @@ -138,6 +139,8 @@ vfs_register(struct vfsconf *vfc)
  struct sysctl_oid *oidp;
  struct vfsops *vfsops;
  static int once;
  + struct vfsconf *tvfc;
  + uint32_t hashval;
 
  if (!once) {
  vattr_null(va_null);
  @@ -152,7 +155,26 @@ vfs_register(struct vfsconf *vfc)
  if (vfs_byname(vfc-vfc_name) != NULL)
  return EEXIST;
 
  - vfc-vfc_typenum = maxvfsconf++;
  + /*
  + * Calculate a hash on vfc_name to use for vfc_typenum. Unless
  + * a collision occurs, it is limited to 8bits since that is
  + * what ZFS uses from vfc_typenum and that also limits how sparsely
  + * distributed vfc_typenum becomes.
  + */
  + hashval = fnv_32_str(vfc-vfc_name, FNV1_32_INIT);
  + hashval = 0xff;
  + do {
  + /* Look for and fix any collision. */
  + TAILQ_FOREACH(tvfc, vfsconf, vfc_list) {
  + if (hashval == tvfc-vfc_typenum) {
  + hashval++; /* Can exceed 8bits, if needed. */
 
 If we're confident that we won't ever fully fill the hash table, I
 would
 think that this should wrap around back to zero (or one?) instead of
 overflowing.
 
I did it that way so that it could overflow for the highly unlikely case
of more than 255 file system types.

I suppose I should scan for an unused value 1-255 and only allow it to
go past 255 if all of them are used.

I'll post yet another patch soon, rick

 Do we need to care about something attempting to add the same vfc_name
 twice? This code will happily add a second entry at the next available
 index.
 
  + break;
  + }
  + }
  + } while (tvfc != NULL);
  + vfc-vfc_typenum = hashval;
  + if (vfc-vfc_typenum = maxvfsconf)
  + maxvfsconf = vfc-vfc_typenum + 1;
 
 I guess we're holding off on killing maxvfsconf until after 9.0 is
 out?
 
 -Ben
 
  TAILQ_INSERT_TAIL(vfsconf, vfc, vfc_list);
 
  /*
 
 
 
 ___
 freebsd-current@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to
 freebsd-current-unsubscr...@freebsd.org
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-25 Thread Rick Macklem

Benjamin Kaduk wrote:
 
 If we're confident that we won't ever fully fill the hash table, I
 would
 think that this should wrap around back to zero (or one?) instead of
 overflowing.
 
Here's my updated patch (it will wrap to 1 the first time and then
exceed 255 if 1-255 are all in use).
--- kern/vfs_init.c.sav 2011-06-11 18:58:33.0 -0400
+++ kern/vfs_init.c 2011-08-25 11:09:14.0 -0400
@@ -39,6 +39,7 @@ __FBSDID($FreeBSD: head/sys/kern/vfs_in
 
 #include sys/param.h
 #include sys/systm.h
+#include sys/fnv_hash.h
 #include sys/kernel.h
 #include sys/linker.h
 #include sys/mount.h
@@ -138,6 +139,9 @@ vfs_register(struct vfsconf *vfc)
struct sysctl_oid *oidp;
struct vfsops *vfsops;
static int once;
+   struct vfsconf *tvfc;
+   uint32_t hashval;
+   int secondpass;
 
if (!once) {
vattr_null(va_null);
@@ -152,7 +156,31 @@ vfs_register(struct vfsconf *vfc)
if (vfs_byname(vfc-vfc_name) != NULL)
return EEXIST;
 
-   vfc-vfc_typenum = maxvfsconf++;
+   /*
+* Calculate a hash on vfc_name to use for vfc_typenum. Unless
+* all of 1-255 are assigned, it is limited to 8bits since that is
+* what ZFS uses from vfc_typenum and is also the preferred range
+* for vfs_getnewfsid().
+*/
+   hashval = fnv_32_str(vfc-vfc_name, FNV1_32_INIT);
+   hashval = 0xff;
+   secondpass = 0;
+   do {
+   /* Look for and fix any collision. */
+   TAILQ_FOREACH(tvfc, vfsconf, vfc_list) {
+   if (hashval == tvfc-vfc_typenum) {
+   if (hashval == 255  secondpass == 0) {
+   hashval = 1;
+   secondpass = 1;
+   } else
+   hashval++;
+   break;
+   }
+   }
+   } while (tvfc != NULL);
+   vfc-vfc_typenum = hashval;
+   if (vfc-vfc_typenum = maxvfsconf)
+   maxvfsconf = vfc-vfc_typenum + 1;
TAILQ_INSERT_TAIL(vfsconf, vfc, vfc_list);
 
/*

 Do we need to care about something attempting to add the same vfc_name
 twice? This code will happily add a second entry at the next available
 index.
 
If file systems use VFS_SET(), I don't think this can happen, since the
same vfc_name would imply same module name and the 2nd one wouldn't load.
(Been there, w.r.t. nfs.)

  + break;
  + }
  + }
  + } while (tvfc != NULL);
  + vfc-vfc_typenum = hashval;
  + if (vfc-vfc_typenum = maxvfsconf)
  + maxvfsconf = vfc-vfc_typenum + 1;
 
 I guess we're holding off on killing maxvfsconf until after 9.0 is
 out?

Well, I still don't know if anything has a use for vfs_sysctl(), so
I'm not volunteering to take it out. (If others feel it should come
out for 9.0, maybe... But I would still consider that a separate patch.)

rick

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-25 Thread Hiroki Sato

Rick Macklem rmack...@uoguelph.ca wrote
  in 468764384.310026.1314219682612.javamail.r...@erie.cs.uoguelph.ca:

rm Pawel Jakub Dawidek wrote:
rm  On Wed, Aug 24, 2011 at 04:41:25PM +0300, Kostik Belousov wrote:
rm   On Wed, Aug 24, 2011 at 09:36:37AM -0400, Rick Macklem wrote:
rmWell, doesn't this result in the same issue as the fixed table?
rmIn other words, the developer has to supply the suggested byte
rmfor
rmfsid and make sure that it doesn't conflict with other suggested
rmbyte
rmvalues or suffer the same consequence as forgetting to update the
rmfixed
rmtable. (ie. It just puts the fixed value in a different place,
rmfrom what
rmI see, for in-tree modules. Also, with a fixed table, they are all
rmin
rmone place, so it's easy to choose a non-colliding value?)
rm   The reason for my proposal was Pawel note that a porter of the
rm   filesystem
rm   should be aware of some place in kern/ where to register, besides
rm   writing
rm   the module.
rm 
rm  Well, he has to be aware, but we should do all we can to minimize the
rm  number of place he needs to update, as it is easy to forget some.
rm 
rm  I agree with Rick that what you proposed is similar to fixed table of
rm  file system names and I'd prefer to avoid that. If we can have
rm  name-based hash that produces no collision for in-tree file systems
rm  and
rm  know current 3rd party file systems plus collision detection for the
rm  future then it is good enough, IMHO. And this is what Rick proposed
rm  with
rm  his patch.
rm 
rm Ok, here is the list of file system names I've been checking and there
rm haven't been any collisions (either hash function). If you know of other
rm well known file type names, please email and I'll add them to the list
rm and check for collisions again.
rm
rm cd9660
rm ufs
rm procfs
rm devfs
rm msdosfs
rm nfs
rm xfs
rm reiserfs
rm tmpfs
rm hpfs
rm ntfs
rm ext2fs
rm udf
rm zfs
rm afs
rm
rm and here is my current rendition of the patch. (I took Gleb's suggestion
rm and switched to fnv_32_str(). I'll leave it that way unless there is a
rm collision after adding any names people post to the above list.)
rm
rm It sounds like people have agreed that this is a reasonable solution.
rm If hrs@ can confirm that testing shows it fixes the original problem
rm (the ZFS file handles don't change when it's loaded at different times),
rm I'll pass it along to re@.

 Thank you!  I will try the latest patch by Saturday.

-- Hiroki


pgpaTZdSmz5WW.pgp
Description: PGP signature

Re: fsid change of ZFS?

2011-08-25 Thread Benjamin Kaduk


On Thu, 25 Aug 2011, Rick Macklem wrote:


Benjamin Kaduk wrote:


If we're confident that we won't ever fully fill the hash table, I
would
think that this should wrap around back to zero (or one?) instead of
overflowing.


Here's my updated patch (it will wrap to 1 the first time and then
exceed 255 if 1-255 are all in use).
--- kern/vfs_init.c.sav 2011-06-11 18:58:33.0 -0400
+++ kern/vfs_init.c 2011-08-25 11:09:14.0 -0400
@@ -39,6 +39,7 @@ __FBSDID($FreeBSD: head/sys/kern/vfs_in

#include sys/param.h
#include sys/systm.h
+#include sys/fnv_hash.h
#include sys/kernel.h
#include sys/linker.h
#include sys/mount.h
@@ -138,6 +139,9 @@ vfs_register(struct vfsconf *vfc)
struct sysctl_oid *oidp;
struct vfsops *vfsops;
static int once;
+   struct vfsconf *tvfc;
+   uint32_t hashval;
+   int secondpass;

if (!once) {
vattr_null(va_null);
@@ -152,7 +156,31 @@ vfs_register(struct vfsconf *vfc)
if (vfs_byname(vfc-vfc_name) != NULL)
return EEXIST;

-   vfc-vfc_typenum = maxvfsconf++;
+   /*
+* Calculate a hash on vfc_name to use for vfc_typenum. Unless
+* all of 1-255 are assigned, it is limited to 8bits since that is
+* what ZFS uses from vfc_typenum and is also the preferred range
+* for vfs_getnewfsid().
+*/
+   hashval = fnv_32_str(vfc-vfc_name, FNV1_32_INIT);
+   hashval = 0xff;
+   secondpass = 0;
+   do {
+   /* Look for and fix any collision. */
+   TAILQ_FOREACH(tvfc, vfsconf, vfc_list) {
+   if (hashval == tvfc-vfc_typenum) {
+   if (hashval == 255  secondpass == 0) {
+   hashval = 1;
+   secondpass = 1;
+   } else
+   hashval++;
+   break;
+   }
+   }
+   } while (tvfc != NULL);
+   vfc-vfc_typenum = hashval;
+   if (vfc-vfc_typenum = maxvfsconf)
+   maxvfsconf = vfc-vfc_typenum + 1;
TAILQ_INSERT_TAIL(vfsconf, vfc, vfc_list);

/*


Do we need to care about something attempting to add the same vfc_name
twice? This code will happily add a second entry at the next available
index.


If file systems use VFS_SET(), I don't think this can happen, since the
same vfc_name would imply same module name and the 2nd one wouldn't load.
(Been there, w.r.t. nfs.)


Ah.  I guess I should get my act together and use VFS_SET, then.
*hangs head sheepishly*




+ break;
+ }
+ }
+ } while (tvfc != NULL);
+ vfc-vfc_typenum = hashval;
+ if (vfc-vfc_typenum = maxvfsconf)
+ maxvfsconf = vfc-vfc_typenum + 1;


I guess we're holding off on killing maxvfsconf until after 9.0 is
out?


Well, I still don't know if anything has a use for vfs_sysctl(), so
I'm not volunteering to take it out. (If others feel it should come
out for 9.0, maybe... But I would still consider that a separate patch.)


I don't particularly have an axe to grind, Danish or otherwise.

Thanks for the update,

Ben
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-24 Thread Kostik Belousov

On Tue, Aug 23, 2011 at 11:23:03PM +0200, Pawel Jakub Dawidek wrote:
 On Tue, Aug 23, 2011 at 04:11:20PM -0400, Rick Macklem wrote:
  Pawel Jakub Dawidek wrote:
   On Tue, Aug 23, 2011 at 10:09:41AM -0400, Rick Macklem wrote:
Ok, I'll admit I wasn't very fond of a fixed table that would
inevitably
get out of date someday, either.
   
I didn't think hashing for the cases not in the table was worth the
effort,
but doing a hash instead of a table seems reasonable.
   
I see that ZFS only uses the low order 8 bits, so I'll try and come
up
with an 8bit hash solution and will post a patch for testing/review
soon.
   
I don't think the vfs_sysctl() is that great a concern, given that
it
appears to be deprecated already anyhow. (With an 8bit hash,
vfs_typenum
won't be that sparse.) I'll also make sure that whatever hash I use
doesn't collide for the current list of file names (although I will
include
code that handles a collision in the patch).
   
   Sounds great. Thanks!
   
  Here's the patch. (Hiroki could you please test this, thanks, rick.)
  ps: If the white space gets trashed, the same patch is at:
 http://people.freebsd.org/~rmacklem/fsid.patch
 
 The patch is fine by me. Thanks, Rick!

Sorry, I am late.

It seems that the probability of the collisions for the hash is quite high.
Due to the fixup procedure, the resulting typenum will depend on the order
of the module initialization, isn't it ? IMO, it makes the patch goal not
met.


pgpvKs4ptQOnr.pgp
Description: PGP signature

Re: fsid change of ZFS?

2011-08-24 Thread Hiroki Sato

Kostik Belousov kostik...@gmail.com wrote
  in 20110824082119.gj17...@deviant.kiev.zoral.com.ua:

ko On Tue, Aug 23, 2011 at 11:23:03PM +0200, Pawel Jakub Dawidek wrote:
ko  On Tue, Aug 23, 2011 at 04:11:20PM -0400, Rick Macklem wrote:
ko   Pawel Jakub Dawidek wrote:
koOn Tue, Aug 23, 2011 at 10:09:41AM -0400, Rick Macklem wrote:
ko Ok, I'll admit I wasn't very fond of a fixed table that would
ko inevitably
ko get out of date someday, either.
ko
ko I didn't think hashing for the cases not in the table was worth the
ko effort,
ko but doing a hash instead of a table seems reasonable.
ko
ko I see that ZFS only uses the low order 8 bits, so I'll try and come
ko up
ko with an 8bit hash solution and will post a patch for testing/review
ko soon.
ko
ko I don't think the vfs_sysctl() is that great a concern, given that
ko it
ko appears to be deprecated already anyhow. (With an 8bit hash,
ko vfs_typenum
ko won't be that sparse.) I'll also make sure that whatever hash I use
ko doesn't collide for the current list of file names (although I will
ko include
ko code that handles a collision in the patch).
ko   
koSounds great. Thanks!
ko   
ko   Here's the patch. (Hiroki could you please test this, thanks, rick.)
ko   ps: If the white space gets trashed, the same patch is at:
ko  http://people.freebsd.org/~rmacklem/fsid.patch
ko 
ko  The patch is fine by me. Thanks, Rick!
ko
ko Sorry, I am late.
ko
ko It seems that the probability of the collisions for the hash is quite high.
ko Due to the fixup procedure, the resulting typenum will depend on the order
ko of the module initialization, isn't it ? IMO, it makes the patch goal not
ko met.

 I tried the following two experiments (the complete results are
 attached) to confirm the probability:

 1. [fsidhash1.txt]
well-known vfc_name and the names [a-z]fs (# of names is 36)
with no fix-up recalculation.

 2. [fsidhash2.txt]
well-known vfc_name and the names [a-z][a-z]fs (# of names is 710)
with no fix-up recalculation.

 There is no collision in the case 1.  And when [a-z][a-z]fs are
 included the average number of the collided names in the same hash
 value is 4.43 (i.e. 160 different hash values are generated, the
 theoretical best number is (710 entries / 256 buckets) = 2.77).

 At least, vfc_names we currently have in our kernel code have no
 collision, fortunately.  As you noticed [a-z][a-z]fs is an
 impractical data set and these results cannot explain the
 characteristics for all possible and practical vfc_names, so whether
 this hash is reasonable or not depends on how we think of them.
 Comments or other better idea?

-- Hiroki
0x09 = tfs
0x0b = tmpfs
0x0f = ifs
0x1f = xfs
0x24 = mfs
0x2a = bfs
0x39 = qfs
0x3b = msdosfs
0x3f = ffs
0x45 = ntfs
0x4e = ufs
0x54 = jfs
0x64 = yfs
0x69 = nfs
0x6f = cfs
0x7e = rfs
0x81 = devfs
0x84 = gfs
0x87 = ext2fs
0x89 = procfs
0x94 = vfs
0x99 = kfs
0x9c = hpfs
0xa1 = cd9660
0xa9 = zfs
0xae = ofs
0xb4 = dfs
0xc3 = sfs
0xca = hfs
0xcb = reiserfs
0xd9 = wfs
0xdf = lfs
0xe5 = afs
0xf4 = pfs
0xfa = efs
0xfe = udf
0x00 = awfs, difs, infs, nsfs, qefs, sxfs, vjfs
0x02 = owfs
0x03 = emfs, jrfs, mdfs, rifs, wnfs
0x05 = fqfs, smfs
0x06 = alfs, icfs, kvfs, nhfs, xrfs
0x08 = jgfs, wcfs
0x09 = bpfs, ebfs, gufs, lzfs, olfs, tfs, tqfs, yvfs
0x0b = aafs, tmpfs, zzfs
0x0c = ctfs, fffs, hyfs, kkfs, ppfs, sbfs, uufs, xgfs
0x0e = dxfs, qtfs
0x0f = befs, gjfs, ifs, lofs, oafs, tffs, vyfs, ykfs
0x11 = hnfs, ujfs
0x12 = cifs, msfs, pefs, rxfs, zofs
0x14 = ldfs
0x15 = dmfs, irfs, nwfs, qifs, vnfs
0x18 = eqfs, hcfs, jvfs, mhfs, rmfs, wrfs, zdfs
0x1a = fufs, sqfs
0x1b = apfs, dbfs, igfs, kzfs, nlfs, vcfs, xvfs
0x1d = jkfs
0x1e = btfs, effs, gyfs, opfs, rbfs, tufs, wgfs, yzfs
0x1f = xfs
0x20 = aefs, nafs
0x21 = cxfs, fjfs, kofs, ptfs, sffs, uyfs, xkfs
0x24 = bifs, gnfs, lsfs, mfs, oefs, qxfs, tjfs, yofs
0x26 = hrfs
0x27 = cmfs, kdfs, mwfs, pifs, unfs, zsfs
0x2a = bfs, dqfs, gcfs, ivfs, lhfs, qmfs, vrfs, ydfs
0x2c = cbfs
0x2d = eufs, hgfs, jzfs, mlfs, rqfs, ucfs, wvfs, zhfs
0x2f = fyfs
0x30 = atfs, dffs, ikfs, npfs, qbfs, sufs, vgfs, xzfs
0x33 = bxfs, ejfs, jofs, mafs, otfs, rffs, tyfs, wkfs
0x36 = aifs, fnfs, ksfs, nefs, pxfs, sjfs, xofs
0x39 = bmfs, grfs, jdfs, lwfs, oifs, qfs, tnfs, ysfs
0x3b = msdosfs
0x3c = cqfs, fcfs, hvfs, khfs, pmfs, urfs, xdfs, zwfs
0x3f = bbfs, dufs, ffs, ggfs, izfs, llfs, qqfs, tcfs, vvfs, yhfs
0x42 = eyfs, hkfs, mpfs, pbfs, rufs, ugfs, wzfs, zlfs
0x43 = cffs
0x45 = axfs, djfs, iofs, lafs, ntfs, qffs, syfs, vkfs
0x48 = enfs, jsfs, mefs, oxfs, rjfs, wofs, zafs
0x4b = frfs, idfs, kwfs, nifs, snfs, xsfs
0x4c = amfs
0x4e = bqfs, gvfs, jhfs, omfs, trfs, ufs, wdfs, ywfs
0x4f = ecfs
0x51 = abfs, cufs, fgfs, hzfs, klfs, pqfs, scfs, uvfs, xhfs
0x54 = bffs, dyfs, gkfs, jfs, lpfs, obfs, qufs, tgfs, vzfs, ylfs
0x57 = hofs, kafs, mtfs, pffs, ryfs, ukfs, zpfs
0x58 = cjfs
0x5a = dnfs, isfs, lefs, nxfs, qjfs, vofs
0x5b = yafs

Re: fsid change of ZFS?

2011-08-24 Thread Kostik Belousov

On Wed, Aug 24, 2011 at 09:34:58PM +0900, Hiroki Sato wrote:
 Kostik Belousov kostik...@gmail.com wrote
   in 20110824082119.gj17...@deviant.kiev.zoral.com.ua:
 
 ko On Tue, Aug 23, 2011 at 11:23:03PM +0200, Pawel Jakub Dawidek wrote:
 ko  On Tue, Aug 23, 2011 at 04:11:20PM -0400, Rick Macklem wrote:
 ko   Pawel Jakub Dawidek wrote:
 koOn Tue, Aug 23, 2011 at 10:09:41AM -0400, Rick Macklem wrote:
 ko Ok, I'll admit I wasn't very fond of a fixed table that would
 ko inevitably
 ko get out of date someday, either.
 ko
 ko I didn't think hashing for the cases not in the table was worth 
 the
 ko effort,
 ko but doing a hash instead of a table seems reasonable.
 ko
 ko I see that ZFS only uses the low order 8 bits, so I'll try and 
 come
 ko up
 ko with an 8bit hash solution and will post a patch for 
 testing/review
 ko soon.
 ko
 ko I don't think the vfs_sysctl() is that great a concern, given that
 ko it
 ko appears to be deprecated already anyhow. (With an 8bit hash,
 ko vfs_typenum
 ko won't be that sparse.) I'll also make sure that whatever hash I 
 use
 ko doesn't collide for the current list of file names (although I 
 will
 ko include
 ko code that handles a collision in the patch).
 ko   
 koSounds great. Thanks!
 ko   
 ko   Here's the patch. (Hiroki could you please test this, thanks, rick.)
 ko   ps: If the white space gets trashed, the same patch is at:
 ko  http://people.freebsd.org/~rmacklem/fsid.patch
 ko 
 ko  The patch is fine by me. Thanks, Rick!
 ko
 ko Sorry, I am late.
 ko
 ko It seems that the probability of the collisions for the hash is quite 
 high.
 ko Due to the fixup procedure, the resulting typenum will depend on the order
 ko of the module initialization, isn't it ? IMO, it makes the patch goal not
 ko met.
 
  I tried the following two experiments (the complete results are
  attached) to confirm the probability:
 
  1. [fsidhash1.txt]
   well-known vfc_name and the names [a-z]fs (# of names is 36)
   with no fix-up recalculation.
 
  2. [fsidhash2.txt]
   well-known vfc_name and the names [a-z][a-z]fs (# of names is 710)
   with no fix-up recalculation.
 
  There is no collision in the case 1.  And when [a-z][a-z]fs are
  included the average number of the collided names in the same hash
  value is 4.43 (i.e. 160 different hash values are generated, the
  theoretical best number is (710 entries / 256 buckets) = 2.77).
 
  At least, vfc_names we currently have in our kernel code have no
  collision, fortunately.  As you noticed [a-z][a-z]fs is an
  impractical data set and these results cannot explain the
  characteristics for all possible and practical vfc_names, so whether
  this hash is reasonable or not depends on how we think of them.
  Comments or other better idea?

Might be, add a new byte field to struct vfsconf, containing the suggested 
byte to be used by fsid. Divide the namespace into two half by the highest
bit 7. Use the byte values with the highest bit clear for statically
assignment for in-tree modules. Kernel will still check for the uniquiness
on initialization.

Reserve the value '0' to mean that kernel must assign the next unused
value = 128. This is done to support out-of-tree modules.

Might be, also reserve the value 0 to mean 'kernel
Main drawback is that is sounds complicated.


pgpF1ekzuaUOK.pgp
Description: PGP signature

Re: fsid change of ZFS?

2011-08-24 Thread Rick Macklem

Kostik Belousov wrote:
 On Tue, Aug 23, 2011 at 11:23:03PM +0200, Pawel Jakub Dawidek wrote:
  On Tue, Aug 23, 2011 at 04:11:20PM -0400, Rick Macklem wrote:
   Pawel Jakub Dawidek wrote:
On Tue, Aug 23, 2011 at 10:09:41AM -0400, Rick Macklem wrote:
 Ok, I'll admit I wasn't very fond of a fixed table that would
 inevitably
 get out of date someday, either.

 I didn't think hashing for the cases not in the table was
 worth the
 effort,
 but doing a hash instead of a table seems reasonable.

 I see that ZFS only uses the low order 8 bits, so I'll try and
 come
 up
 with an 8bit hash solution and will post a patch for
 testing/review
 soon.

 I don't think the vfs_sysctl() is that great a concern, given
 that
 it
 appears to be deprecated already anyhow. (With an 8bit hash,
 vfs_typenum
 won't be that sparse.) I'll also make sure that whatever hash
 I use
 doesn't collide for the current list of file names (although I
 will
 include
 code that handles a collision in the patch).
   
Sounds great. Thanks!
   
   Here's the patch. (Hiroki could you please test this, thanks,
   rick.)
   ps: If the white space gets trashed, the same patch is at:
  http://people.freebsd.org/~rmacklem/fsid.patch
 
  The patch is fine by me. Thanks, Rick!
 
 Sorry, I am late.
 
 It seems that the probability of the collisions for the hash is quite
 high.
 Due to the fixup procedure, the resulting typenum will depend on the
 order
 of the module initialization, isn't it ? IMO, it makes the patch goal
 not
 met.
Well, the about 15 file system types currently in FreeBSD don't collide.
(At least with the patch I posted. I'll test it again with hash  0xff;.)

Also, since the collision results in the second one using the next higher
value (usually, until you get something like 50+ file system types), it works
for those cases unless the order that these 2 file systems call vfs_register()
is reversed. (The likelyhood of this reversal is lower than the likelyhood
of vfs_register() just being called in a different order, I think?)

Since it now changes whenever a different kernel config or different module
loading order is used and there aren't a lot of complaints, I think usually
works is good enough. However, it doesn't matter to me, so I'll let others
decide.

Yet another option is to go back to what hrs@ originally suggested, which
was change ZFS to not use vfc_typenum in its fsid. (I now see that 0 won't
conflict with any assigned vfc_typenum values, since they start at 1 and
255 would also probably be safe, since its unlikely there ever will be
255 different file system types.) The problem with this is no future FS
can use the same trick.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-24 Thread Rick Macklem

Kostik Belousov wrote:
 On Wed, Aug 24, 2011 at 09:34:58PM +0900, Hiroki Sato wrote:
  Kostik Belousov kostik...@gmail.com wrote
in 20110824082119.gj17...@deviant.kiev.zoral.com.ua:
 
  ko On Tue, Aug 23, 2011 at 11:23:03PM +0200, Pawel Jakub Dawidek
  wrote:
  ko  On Tue, Aug 23, 2011 at 04:11:20PM -0400, Rick Macklem wrote:
  ko   Pawel Jakub Dawidek wrote:
  koOn Tue, Aug 23, 2011 at 10:09:41AM -0400, Rick Macklem
  wrote:
  ko Ok, I'll admit I wasn't very fond of a fixed table that
  would
  ko inevitably
  ko get out of date someday, either.
  ko
  ko I didn't think hashing for the cases not in the table
  was worth the
  ko effort,
  ko but doing a hash instead of a table seems reasonable.
  ko
  ko I see that ZFS only uses the low order 8 bits, so I'll
  try and come
  ko up
  ko with an 8bit hash solution and will post a patch for
  testing/review
  ko soon.
  ko
  ko I don't think the vfs_sysctl() is that great a concern,
  given that
  ko it
  ko appears to be deprecated already anyhow. (With an 8bit
  hash,
  ko vfs_typenum
  ko won't be that sparse.) I'll also make sure that whatever
  hash I use
  ko doesn't collide for the current list of file names
  (although I will
  ko include
  ko code that handles a collision in the patch).
  ko   
  koSounds great. Thanks!
  ko   
  ko   Here's the patch. (Hiroki could you please test this,
  thanks, rick.)
  ko   ps: If the white space gets trashed, the same patch is at:
  ko   http://people.freebsd.org/~rmacklem/fsid.patch
  ko 
  ko  The patch is fine by me. Thanks, Rick!
  ko
  ko Sorry, I am late.
  ko
  ko It seems that the probability of the collisions for the hash is
  quite high.
  ko Due to the fixup procedure, the resulting typenum will depend on
  the order
  ko of the module initialization, isn't it ? IMO, it makes the patch
  goal not
  ko met.
 
   I tried the following two experiments (the complete results are
   attached) to confirm the probability:
 
   1. [fsidhash1.txt]
  well-known vfc_name and the names [a-z]fs (# of names is 36)
  with no fix-up recalculation.
 
   2. [fsidhash2.txt]
  well-known vfc_name and the names [a-z][a-z]fs (# of names is
  710)
  with no fix-up recalculation.
 
   There is no collision in the case 1. And when [a-z][a-z]fs are
   included the average number of the collided names in the same hash
   value is 4.43 (i.e. 160 different hash values are generated, the
   theoretical best number is (710 entries / 256 buckets) = 2.77).
 
   At least, vfc_names we currently have in our kernel code have no
   collision, fortunately. As you noticed [a-z][a-z]fs is an
   impractical data set and these results cannot explain the
   characteristics for all possible and practical vfc_names, so
   whether
   this hash is reasonable or not depends on how we think of them.
   Comments or other better idea?
 
 Might be, add a new byte field to struct vfsconf, containing the
 suggested
 byte to be used by fsid. Divide the namespace into two half by the
 highest
 bit 7. Use the byte values with the highest bit clear for statically
 assignment for in-tree modules. Kernel will still check for the
 uniquiness
 on initialization.
 
Well, doesn't this result in the same issue as the fixed table?
In other words, the developer has to supply the suggested byte for
fsid and make sure that it doesn't conflict with other suggested byte
values or suffer the same consequence as forgetting to update the fixed
table. (ie. It just puts the fixed value in a different place, from what
I see, for in-tree modules. Also, with a fixed table, they are all in
one place, so it's easy to choose a non-colliding value?)

 Reserve the value '0' to mean that kernel must assign the next unused
 value = 128. This is done to support out-of-tree modules.
 
 Might be, also reserve the value 0 to mean 'kernel
 Main drawback is that is sounds complicated.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-24 Thread Kostik Belousov

On Wed, Aug 24, 2011 at 09:36:37AM -0400, Rick Macklem wrote:
 Kostik Belousov wrote:
  On Wed, Aug 24, 2011 at 09:34:58PM +0900, Hiroki Sato wrote:
   Kostik Belousov kostik...@gmail.com wrote
 in 20110824082119.gj17...@deviant.kiev.zoral.com.ua:
  
   ko On Tue, Aug 23, 2011 at 11:23:03PM +0200, Pawel Jakub Dawidek
   wrote:
   ko  On Tue, Aug 23, 2011 at 04:11:20PM -0400, Rick Macklem wrote:
   ko   Pawel Jakub Dawidek wrote:
   koOn Tue, Aug 23, 2011 at 10:09:41AM -0400, Rick Macklem
   wrote:
   ko Ok, I'll admit I wasn't very fond of a fixed table that
   would
   ko inevitably
   ko get out of date someday, either.
   ko
   ko I didn't think hashing for the cases not in the table
   was worth the
   ko effort,
   ko but doing a hash instead of a table seems reasonable.
   ko
   ko I see that ZFS only uses the low order 8 bits, so I'll
   try and come
   ko up
   ko with an 8bit hash solution and will post a patch for
   testing/review
   ko soon.
   ko
   ko I don't think the vfs_sysctl() is that great a concern,
   given that
   ko it
   ko appears to be deprecated already anyhow. (With an 8bit
   hash,
   ko vfs_typenum
   ko won't be that sparse.) I'll also make sure that whatever
   hash I use
   ko doesn't collide for the current list of file names
   (although I will
   ko include
   ko code that handles a collision in the patch).
   ko   
   koSounds great. Thanks!
   ko   
   ko   Here's the patch. (Hiroki could you please test this,
   thanks, rick.)
   ko   ps: If the white space gets trashed, the same patch is at:
   ko   http://people.freebsd.org/~rmacklem/fsid.patch
   ko 
   ko  The patch is fine by me. Thanks, Rick!
   ko
   ko Sorry, I am late.
   ko
   ko It seems that the probability of the collisions for the hash is
   quite high.
   ko Due to the fixup procedure, the resulting typenum will depend on
   the order
   ko of the module initialization, isn't it ? IMO, it makes the patch
   goal not
   ko met.
  
I tried the following two experiments (the complete results are
attached) to confirm the probability:
  
1. [fsidhash1.txt]
 well-known vfc_name and the names [a-z]fs (# of names is 36)
 with no fix-up recalculation.
  
2. [fsidhash2.txt]
 well-known vfc_name and the names [a-z][a-z]fs (# of names is
 710)
 with no fix-up recalculation.
  
There is no collision in the case 1. And when [a-z][a-z]fs are
included the average number of the collided names in the same hash
value is 4.43 (i.e. 160 different hash values are generated, the
theoretical best number is (710 entries / 256 buckets) = 2.77).
  
At least, vfc_names we currently have in our kernel code have no
collision, fortunately. As you noticed [a-z][a-z]fs is an
impractical data set and these results cannot explain the
characteristics for all possible and practical vfc_names, so
whether
this hash is reasonable or not depends on how we think of them.
Comments or other better idea?
  
  Might be, add a new byte field to struct vfsconf, containing the
  suggested
  byte to be used by fsid. Divide the namespace into two half by the
  highest
  bit 7. Use the byte values with the highest bit clear for statically
  assignment for in-tree modules. Kernel will still check for the
  uniquiness
  on initialization.
  
 Well, doesn't this result in the same issue as the fixed table?
 In other words, the developer has to supply the suggested byte for
 fsid and make sure that it doesn't conflict with other suggested byte
 values or suffer the same consequence as forgetting to update the fixed
 table. (ie. It just puts the fixed value in a different place, from what
 I see, for in-tree modules. Also, with a fixed table, they are all in
 one place, so it's easy to choose a non-colliding value?)
The reason for my proposal was Pawel note that a porter of the filesystem
should be aware of some place in kern/ where to register, besides writing
the module.

Anyway, do whatever you prefer.
 
  Reserve the value '0' to mean that kernel must assign the next unused
  value = 128. This is done to support out-of-tree modules.
  
  Might be, also reserve the value 0 to mean 'kernel
  Main drawback is that is sounds complicated.


pgptfUeGgQ5a2.pgp
Description: PGP signature

Re: fsid change of ZFS?

2011-08-24 Thread Gleb Kurtsou

On (24/08/2011 21:34), Hiroki Sato wrote:
 Kostik Belousov kostik...@gmail.com wrote
   in 20110824082119.gj17...@deviant.kiev.zoral.com.ua:
 
 ko On Tue, Aug 23, 2011 at 11:23:03PM +0200, Pawel Jakub Dawidek wrote:
 ko  On Tue, Aug 23, 2011 at 04:11:20PM -0400, Rick Macklem wrote:
 ko   Here's the patch. (Hiroki could you please test this, thanks, rick.)
 ko   ps: If the white space gets trashed, the same patch is at:
 ko  http://people.freebsd.org/~rmacklem/fsid.patch
 ko 
 ko  The patch is fine by me. Thanks, Rick!
 ko
 ko Sorry, I am late.
 ko
 ko It seems that the probability of the collisions for the hash is quite 
 high.
 ko Due to the fixup procedure, the resulting typenum will depend on the order
 ko of the module initialization, isn't it ? IMO, it makes the patch goal not
 ko met.
 
  I tried the following two experiments (the complete results are
  attached) to confirm the probability:
 
  1. [fsidhash1.txt]
   well-known vfc_name and the names [a-z]fs (# of names is 36)
   with no fix-up recalculation.
 
  2. [fsidhash2.txt]
   well-known vfc_name and the names [a-z][a-z]fs (# of names is 710)
   with no fix-up recalculation.
 
  There is no collision in the case 1.  And when [a-z][a-z]fs are
  included the average number of the collided names in the same hash
  value is 4.43 (i.e. 160 different hash values are generated, the
  theoretical best number is (710 entries / 256 buckets) = 2.77).
Could you run the same test with fnv_32_buf()? Collision rate is likely
to be lower.

  At least, vfc_names we currently have in our kernel code have no
  collision, fortunately.  As you noticed [a-z][a-z]fs is an
  impractical data set and these results cannot explain the
  characteristics for all possible and practical vfc_names, so whether
  this hash is reasonable or not depends on how we think of them.
  Comments or other better idea?
 
 -- Hiroki
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-24 Thread Hiroki Sato

Gleb Kurtsou gleb.kurt...@gmail.com wrote
  in 20110824150235.GA46460@tops:

gl On (24/08/2011 21:34), Hiroki Sato wrote:
gl  Kostik Belousov kostik...@gmail.com wrote
glin 20110824082119.gj17...@deviant.kiev.zoral.com.ua:
gl 
gl  ko On Tue, Aug 23, 2011 at 11:23:03PM +0200, Pawel Jakub Dawidek wrote:
gl  ko  On Tue, Aug 23, 2011 at 04:11:20PM -0400, Rick Macklem wrote:
gl  ko   Here's the patch. (Hiroki could you please test this, thanks, 
rick.)
gl  ko   ps: If the white space gets trashed, the same patch is at:
gl  ko  http://people.freebsd.org/~rmacklem/fsid.patch
gl  ko 
gl  ko  The patch is fine by me. Thanks, Rick!
gl  ko
gl  ko Sorry, I am late.
gl  ko
gl  ko It seems that the probability of the collisions for the hash is quite 
high.
gl  ko Due to the fixup procedure, the resulting typenum will depend on the 
order
gl  ko of the module initialization, isn't it ? IMO, it makes the patch goal 
not
gl  ko met.
gl 
gl   I tried the following two experiments (the complete results are
gl   attached) to confirm the probability:
gl 
gl   1. [fsidhash1.txt]
glwell-known vfc_name and the names [a-z]fs (# of names is 36)
glwith no fix-up recalculation.
gl 
gl   2. [fsidhash2.txt]
glwell-known vfc_name and the names [a-z][a-z]fs (# of names is 710)
glwith no fix-up recalculation.
gl 
gl   There is no collision in the case 1.  And when [a-z][a-z]fs are
gl   included the average number of the collided names in the same hash
gl   value is 4.43 (i.e. 160 different hash values are generated, the
gl   theoretical best number is (710 entries / 256 buckets) = 2.77).
gl Could you run the same test with fnv_32_buf()? Collision rate is likely
gl to be lower.

 The results by fnv_32_str() are attached.  The number of the
 collisions reduced but this time there were four collisions in the
 well-known ones + [a-z]fs case though no collision within the
 well-known names.

 The generated hash values for the case 2 was 235, so the rate is
 3.02.  This is certainly better than hash32_str().

-- Hiroki
0x0d = yfs
0x13 = mfs
0x19 = procfs
0x1a = kfs
0x30 = bfs
0x34 = ufs
0x3a = gfs
0x40 = sfs, udf
0x56 = jfs
0x5a = ntfs
0x5d = ext2fs
0x61 = devfs
0x6f = hpfs
0x75 = hfs
0x7a = pfs
0x7b = ofs
0x7c = dfs
0x81 = wfs
0x88 = msdosfs
0xa0 = cd9660
0xa1 = lfs, xfs
0xa8 = cfs, tfs
0xbf = tmpfs
0xc0 = afs
0xc2 = reiserfs
0xc6 = ifs
0xc8 = ffs, rfs
0xec = qfs
0xed = zfs
0xee = efs
0xef = nfs
0xf4 = vfs
0x01 = mcfs, prfs, xufs
0x02 = iafs, jifs, kpfs, xifs
0x04 = bzfs, dbfs, ehfs, ewfs, klfs, yhfs
0x05 = ycfs
0x06 = dyfs, ecfs, ipfs, qdfs, qpfs
0x07 = pnfs, ywfs
0x08 = qofs, xafs
0x09 = bmfs, iffs, iyfs, kdfs, pufs, vffs
0x0a = imfs
0x0b = bffs, dmfs
0x0c = idfs, yofs
0x0d = adfs, nmfs, pafs, vqfs, yfs
0x0e = aifs, qwfs
0x0f = hafs, nffs, vzfs
0x10 = zafs
0x11 = wcfs
0x12 = zjfs
0x13 = mfs, nqfs, wwfs
0x14 = nefs, vcfs
0x15 = bkfs
0x16 = clfs, zifs
0x17 = njfs, zrfs
0x18 = ocfs, owfs, tvfs, wdfs
0x19 = fefs, procfs, tbfs, vwfs, zffs
0x1a = fnfs, fqfs, kfs
0x1b = gofs, ryfs
0x1d = fzfs, rmfs, rrfs, skfs
0x1e = odfs, tqfs
0x1f = lvfs, oofs, oxfs, rffs, tzfs
0x20 = eefs, gwfs, kmfs, tmfs, xcfs
0x21 = ulfs, yqfs
0x22 = fbfs, hrfs, mtfs, qifs, sxfs, xjfs
0x23 = dofs, sdfs, ssfs, ugfs
0x24 = etfs, kofs, usfs, uxfs
0x25 = kqfs, ljfs, mkfs, opfs, pcfs
0x26 = jrfs, pwfs, qxfs
0x27 = ghfs
0x28 = hjfs, xvfs
0x29 = hufs, ikfs, jpfs, ksfs, slfs
0x2a = byfs, jefs, khfs, ukfs
0x2b = brfs, dhfs, jnfs, ytfs
0x2c = defs, hwfs, msfs
0x2d = apfs, qlfs
0x2e = pdfs, pifs, vefs
0x2f = befs, cgfs, hmfs, iwfs, jlfs, vnfs
0x30 = bfs, bjfs, jwfs
0x31 = csfs, cxfs
0x33 = alfs, axfs, vyfs
0x34 = ufs
0x35 = acfs, jcfs, vvfs
0x36 = awfs, bcfs, vbfs
0x37 = wkfs
0x38 = tofs
0x3a = fmfs, gfs
0x3b = nvfs
0x3c = nbfs, rqfs
0x3d = tgfs, tsfs, zwfs
0x3e = wlfs, zzfs
0x3f = fkfs, wgfs, znfs
0x40 = dvfs, gcfs, sfs, thfs, udf, wpfs
0x41 = fwfs
0x42 = pzfs, rwfs, udfs
0x43 = dgfs, lgfs, sgfs
0x44 = rkfs
0x45 = dnfs, llfs, lsfs, ogfs
0x46 = lxfs, pkfs, wxfs
0x47 = dqfs, eofs, gdfs, xofs
0x48 = dzfs, pbfs, upfs, xdfs, xpfs
0x49 = qcfs
0x4a = dsfs, pvfs, rcfs
0x4b = exfs, jzfs, mhfs, ohfs
0x4c = esfs, gpfs, ylfs
0x4d = qtfs, ygfs
0x4e = hbfs, hofs, ihfs
0x4f = bhfs, hvfs, icfs, uvfs
0x50 = bwfs, ktfs, mnfs, stfs
0x51 = ejfs
0x53 = itfs
0x54 = qffs, vkfs, vtfs
0x56 = bofs, cofs, jfs, jkfs
0x57 = iofs, yjfs
0x59 = qrfs
0x5a = atfs, ntfs, vsfs
0x5b = akfs
0x5d = ccfs, chfs, ctfs, ext2fs, zlfs, zxfs
0x5e = fvfs, wzfs
0x60 = nsfs, rjfs, wsfs
0x61 = devfs, ftfs, nofs, nxfs
0x62 = agfs, fjfs, xbfs
0x63 = ojfs
0x64 = grfs, kgfs, rhfs, rvfs, tkfs, wofs, zcfs, ztfs
0x65 = lofs, ttfs
0x66 = fhfs, gffs
0x67 = mgfs, ngfs, svfs, xsfs
0x68 = fsfs, sbfs
0x69 = ppfs, rtfs, zkfs
0x6a = jxfs, rofs, uhfs
0x6b = dkfs, elfs, kkfs, mefs, mpfs, ucfs
0x6c = omfs, txfs, yxfs
0x6d = egfs, fgfs, pgfs, ysfs
0x6e = gufs, gxfs, kbfs, plfs, uafs, xgfs
0x6f = hpfs
0x70 = myfs, qkfs, sqfs, uzfs
0x71 = epfs, jtfs, kxfs
0x72 = mbfs, ydfs
0x73 = hgfs, hsfs
0x74 = ekfs, ykfs
0x75 = cvfs, hfs,

Re: fsid change of ZFS?

2011-08-24 Thread Hiroki Sato

Rick Macklem rmack...@uoguelph.ca wrote
  in 920337541.272757.1314192294772.javamail.r...@erie.cs.uoguelph.ca:

rm Kostik Belousov wrote:
rm  On Tue, Aug 23, 2011 at 11:23:03PM +0200, Pawel Jakub Dawidek wrote:
rm   On Tue, Aug 23, 2011 at 04:11:20PM -0400, Rick Macklem wrote:
rmPawel Jakub Dawidek wrote:
rm On Tue, Aug 23, 2011 at 10:09:41AM -0400, Rick Macklem wrote:
rm  Ok, I'll admit I wasn't very fond of a fixed table that would
rm  inevitably
rm  get out of date someday, either.
rm 
rm  I didn't think hashing for the cases not in the table was
rm  worth the
rm  effort,
rm  but doing a hash instead of a table seems reasonable.
rm 
rm  I see that ZFS only uses the low order 8 bits, so I'll try and
rm  come
rm  up
rm  with an 8bit hash solution and will post a patch for
rm  testing/review
rm  soon.
rm 
rm  I don't think the vfs_sysctl() is that great a concern, given
rm  that
rm  it
rm  appears to be deprecated already anyhow. (With an 8bit hash,
rm  vfs_typenum
rm  won't be that sparse.) I'll also make sure that whatever hash
rm  I use
rm  doesn't collide for the current list of file names (although I
rm  will
rm  include
rm  code that handles a collision in the patch).
rm
rm Sounds great. Thanks!
rm
rmHere's the patch. (Hiroki could you please test this, thanks,
rmrick.)
rmps: If the white space gets trashed, the same patch is at:
rm   http://people.freebsd.org/~rmacklem/fsid.patch
rm  
rm   The patch is fine by me. Thanks, Rick!
rm 
rm  Sorry, I am late.
rm 
rm  It seems that the probability of the collisions for the hash is quite
rm  high.
rm  Due to the fixup procedure, the resulting typenum will depend on the
rm  order
rm  of the module initialization, isn't it ? IMO, it makes the patch goal
rm  not
rm  met.
rm Well, the about 15 file system types currently in FreeBSD don't collide.
rm (At least with the patch I posted. I'll test it again with hash  0xff;.)
rm
rm Also, since the collision results in the second one using the next higher
rm value (usually, until you get something like 50+ file system types), it 
works
rm for those cases unless the order that these 2 file systems call 
vfs_register()
rm is reversed. (The likelyhood of this reversal is lower than the likelyhood
rm of vfs_register() just being called in a different order, I think?)
rm
rm Since it now changes whenever a different kernel config or different module
rm loading order is used and there aren't a lot of complaints, I think usually
rm works is good enough. However, it doesn't matter to me, so I'll let others
rm decide.
rm
rm Yet another option is to go back to what hrs@ originally suggested, which
rm was change ZFS to not use vfc_typenum in its fsid. (I now see that 0 won't
rm conflict with any assigned vfc_typenum values, since they start at 1 and
rm 255 would also probably be safe, since its unlikely there ever will be
rm 255 different file system types.) The problem with this is no future FS
rm can use the same trick.

 My opinion is using a hash function which occurs no collision in the
 well-known names is sufficient for our purpose because the number of
 file systems on a running system is small anyway and not changed
 frequently in most cases.  I am not sure which function is best;
 fnv_32_str() seemed better against a large data set but it is not
 likely to have 30 file systems, for example.

-- Hiroki


pgp5gcbSw2jiZ.pgp
Description: PGP signature

Re: fsid change of ZFS?

2011-08-24 Thread Benjamin Kaduk



On Thu, 25 Aug 2011, Hiroki Sato wrote:



My opinion is using a hash function which occurs no collision in the
well-known names is sufficient for our purpose because the number of
file systems on a running system is small anyway and not changed
frequently in most cases.  I am not sure which function is best;
fnv_32_str() seemed better against a large data set but it is not
likely to have 30 file systems, for example.


This was a large part of my reasoning when proposing a hash table 
initially -- there are 256 (maybe 255 if you want to reserve one) slots, 
and only 14 or so filesystems currently in the tree.  It seems very 
unlikely that any single machine would ever have more than another five 
out-of-tree filesystem types on it (right?), so putting 20 items into 250 
hash bins is a pretty good chance of being collision-free.


-Ben
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-24 Thread Pawel Jakub Dawidek

On Wed, Aug 24, 2011 at 04:41:25PM +0300, Kostik Belousov wrote:
 On Wed, Aug 24, 2011 at 09:36:37AM -0400, Rick Macklem wrote:
  Well, doesn't this result in the same issue as the fixed table?
  In other words, the developer has to supply the suggested byte for
  fsid and make sure that it doesn't conflict with other suggested byte
  values or suffer the same consequence as forgetting to update the fixed
  table. (ie. It just puts the fixed value in a different place, from what
  I see, for in-tree modules. Also, with a fixed table, they are all in
  one place, so it's easy to choose a non-colliding value?)
 The reason for my proposal was Pawel note that a porter of the filesystem
 should be aware of some place in kern/ where to register, besides writing
 the module.

Well, he has to be aware, but we should do all we can to minimize the
number of place he needs to update, as it is easy to forget some.

I agree with Rick that what you proposed is similar to fixed table of
file system names and I'd prefer to avoid that. If we can have
name-based hash that produces no collision for in-tree file systems and
know current 3rd party file systems plus collision detection for the
future then it is good enough, IMHO. And this is what Rick proposed with
his patch.

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://yomoli.com


pgpOyu4gfRBq3.pgp
Description: PGP signature

Re: fsid change of ZFS?

2011-08-24 Thread Rick Macklem

Pawel Jakub Dawidek wrote:
 On Wed, Aug 24, 2011 at 04:41:25PM +0300, Kostik Belousov wrote:
  On Wed, Aug 24, 2011 at 09:36:37AM -0400, Rick Macklem wrote:
   Well, doesn't this result in the same issue as the fixed table?
   In other words, the developer has to supply the suggested byte
   for
   fsid and make sure that it doesn't conflict with other suggested
   byte
   values or suffer the same consequence as forgetting to update the
   fixed
   table. (ie. It just puts the fixed value in a different place,
   from what
   I see, for in-tree modules. Also, with a fixed table, they are all
   in
   one place, so it's easy to choose a non-colliding value?)
  The reason for my proposal was Pawel note that a porter of the
  filesystem
  should be aware of some place in kern/ where to register, besides
  writing
  the module.
 
 Well, he has to be aware, but we should do all we can to minimize the
 number of place he needs to update, as it is easy to forget some.
 
 I agree with Rick that what you proposed is similar to fixed table of
 file system names and I'd prefer to avoid that. If we can have
 name-based hash that produces no collision for in-tree file systems
 and
 know current 3rd party file systems plus collision detection for the
 future then it is good enough, IMHO. And this is what Rick proposed
 with
 his patch.
 
Ok, here is the list of file system names I've been checking and there
haven't been any collisions (either hash function). If you know of other
well known file type names, please email and I'll add them to the list
and check for collisions again.

cd9660
ufs
procfs
devfs
msdosfs
nfs
xfs
reiserfs
tmpfs
hpfs
ntfs
ext2fs
udf
zfs
afs

and here is my current rendition of the patch. (I took Gleb's suggestion
and switched to fnv_32_str(). I'll leave it that way unless there is a
collision after adding any names people post to the above list.)

It sounds like people have agreed that this is a reasonable solution.
If hrs@ can confirm that testing shows it fixes the original problem
(the ZFS file handles don't change when it's loaded at different times),
I'll pass it along to re@.

Thanks for the helpful suggestions, rick

--- kern/vfs_init.c.sav 2011-06-11 18:58:33.0 -0400
+++ kern/vfs_init.c 2011-08-24 16:15:24.0 -0400
@@ -39,6 +39,7 @@ __FBSDID($FreeBSD: head/sys/kern/vfs_in
 
 #include sys/param.h
 #include sys/systm.h
+#include sys/fnv_hash.h
 #include sys/kernel.h
 #include sys/linker.h
 #include sys/mount.h
@@ -138,6 +139,8 @@ vfs_register(struct vfsconf *vfc)
struct sysctl_oid *oidp;
struct vfsops *vfsops;
static int once;
+   struct vfsconf *tvfc;
+   uint32_t hashval;
 
if (!once) {
vattr_null(va_null);
@@ -152,7 +155,26 @@ vfs_register(struct vfsconf *vfc)
if (vfs_byname(vfc-vfc_name) != NULL)
return EEXIST;
 
-   vfc-vfc_typenum = maxvfsconf++;
+   /*
+* Calculate a hash on vfc_name to use for vfc_typenum. Unless
+* a collision occurs, it is limited to 8bits since that is
+* what ZFS uses from vfc_typenum and that also limits how sparsely
+* distributed vfc_typenum becomes.
+*/
+   hashval = fnv_32_str(vfc-vfc_name, FNV1_32_INIT);
+   hashval = 0xff;
+   do {
+   /* Look for and fix any collision. */
+   TAILQ_FOREACH(tvfc, vfsconf, vfc_list) {
+   if (hashval == tvfc-vfc_typenum) {
+   hashval++; /* Can exceed 8bits, if needed. */
+   break;
+   }
+   }
+   } while (tvfc != NULL);
+   vfc-vfc_typenum = hashval;
+   if (vfc-vfc_typenum = maxvfsconf)
+   maxvfsconf = vfc-vfc_typenum + 1;
TAILQ_INSERT_TAIL(vfsconf, vfc, vfc_list);
 
/*


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-24 Thread Rick Macklem

Pawel Jakub Dawidek wrote:
 On Wed, Aug 24, 2011 at 04:41:25PM +0300, Kostik Belousov wrote:
  On Wed, Aug 24, 2011 at 09:36:37AM -0400, Rick Macklem wrote:
   Well, doesn't this result in the same issue as the fixed table?
   In other words, the developer has to supply the suggested byte
   for
   fsid and make sure that it doesn't conflict with other suggested
   byte
   values or suffer the same consequence as forgetting to update the
   fixed
   table. (ie. It just puts the fixed value in a different place,
   from what
   I see, for in-tree modules. Also, with a fixed table, they are all
   in
   one place, so it's easy to choose a non-colliding value?)
  The reason for my proposal was Pawel note that a porter of the
  filesystem
  should be aware of some place in kern/ where to register, besides
  writing
  the module.
 
 Well, he has to be aware, but we should do all we can to minimize the
 number of place he needs to update, as it is easy to forget some.
 
 I agree with Rick that what you proposed is similar to fixed table of
 file system names and I'd prefer to avoid that. If we can have
 name-based hash that produces no collision for in-tree file systems
 and
 know current 3rd party file systems plus collision detection for the
 future then it is good enough, IMHO. And this is what Rick proposed
 with
 his patch.
 
Ok, here is the list of file system names I've been checking and there
haven't been any collisions (either hash function). If you know of other
well known file type names, please email and I'll add them to the list
and check for collisions again.

cd9660
ufs
procfs
devfs
msdosfs
nfs
xfs
reiserfs
tmpfs
hpfs
ntfs
ext2fs
udf
zfs
afs

and here is my current rendition of the patch. (I took Gleb's suggestion
and switched to fnv_32_str(). I'll leave it that way unless there is a
collision after adding any names people post to the above list.)

It sounds like people have agreed that this is a reasonable solution.
If hrs@ can confirm that testing shows it fixes the original problem
(the ZFS file handles don't change when it's loaded at different times),
I'll pass it along to re@.

Thanks for the helpful suggestions, rick

--- kern/vfs_init.c.sav 2011-06-11 18:58:33.0 -0400
+++ kern/vfs_init.c 2011-08-24 16:15:24.0 -0400
@@ -39,6 +39,7 @@ __FBSDID($FreeBSD: head/sys/kern/vfs_in
 
 #include sys/param.h
 #include sys/systm.h
+#include sys/fnv_hash.h
 #include sys/kernel.h
 #include sys/linker.h
 #include sys/mount.h
@@ -138,6 +139,8 @@ vfs_register(struct vfsconf *vfc)
struct sysctl_oid *oidp;
struct vfsops *vfsops;
static int once;
+   struct vfsconf *tvfc;
+   uint32_t hashval;
 
if (!once) {
vattr_null(va_null);
@@ -152,7 +155,26 @@ vfs_register(struct vfsconf *vfc)
if (vfs_byname(vfc-vfc_name) != NULL)
return EEXIST;
 
-   vfc-vfc_typenum = maxvfsconf++;
+   /*
+* Calculate a hash on vfc_name to use for vfc_typenum. Unless
+* a collision occurs, it is limited to 8bits since that is
+* what ZFS uses from vfc_typenum and that also limits how sparsely
+* distributed vfc_typenum becomes.
+*/
+   hashval = fnv_32_str(vfc-vfc_name, FNV1_32_INIT);
+   hashval = 0xff;
+   do {
+   /* Look for and fix any collision. */
+   TAILQ_FOREACH(tvfc, vfsconf, vfc_list) {
+   if (hashval == tvfc-vfc_typenum) {
+   hashval++; /* Can exceed 8bits, if needed. */
+   break;
+   }
+   }
+   } while (tvfc != NULL);
+   vfc-vfc_typenum = hashval;
+   if (vfc-vfc_typenum = maxvfsconf)
+   maxvfsconf = vfc-vfc_typenum + 1;
TAILQ_INSERT_TAIL(vfsconf, vfc, vfc_list);
 
/*


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-24 Thread Benjamin Kaduk


On Wed, 24 Aug 2011, Rick Macklem wrote:


afs


The current OpenAFS codebase uses the all-caps AFS.  Judging by the 
omitted text, perhaps this should change.  (We also don't use VFS_SET to 
set it, which I filed a bug about.)




and here is my current rendition of the patch. (I took Gleb's suggestion
and switched to fnv_32_str(). I'll leave it that way unless there is a
collision after adding any names people post to the above list.)

It sounds like people have agreed that this is a reasonable solution.
If hrs@ can confirm that testing shows it fixes the original problem
(the ZFS file handles don't change when it's loaded at different times),
I'll pass it along to re@.

Thanks for the helpful suggestions, rick

--- kern/vfs_init.c.sav 2011-06-11 18:58:33.0 -0400
+++ kern/vfs_init.c 2011-08-24 16:15:24.0 -0400
@@ -39,6 +39,7 @@ __FBSDID($FreeBSD: head/sys/kern/vfs_in

#include sys/param.h
#include sys/systm.h
+#include sys/fnv_hash.h
#include sys/kernel.h
#include sys/linker.h
#include sys/mount.h
@@ -138,6 +139,8 @@ vfs_register(struct vfsconf *vfc)
struct sysctl_oid *oidp;
struct vfsops *vfsops;
static int once;
+   struct vfsconf *tvfc;
+   uint32_t hashval;

if (!once) {
vattr_null(va_null);
@@ -152,7 +155,26 @@ vfs_register(struct vfsconf *vfc)
if (vfs_byname(vfc-vfc_name) != NULL)
return EEXIST;

-   vfc-vfc_typenum = maxvfsconf++;
+   /*
+* Calculate a hash on vfc_name to use for vfc_typenum. Unless
+* a collision occurs, it is limited to 8bits since that is
+* what ZFS uses from vfc_typenum and that also limits how sparsely
+* distributed vfc_typenum becomes.
+*/
+   hashval = fnv_32_str(vfc-vfc_name, FNV1_32_INIT);
+   hashval = 0xff;
+   do {
+   /* Look for and fix any collision. */
+   TAILQ_FOREACH(tvfc, vfsconf, vfc_list) {
+   if (hashval == tvfc-vfc_typenum) {
+   hashval++; /* Can exceed 8bits, if needed. */


If we're confident that we won't ever fully fill the hash table, I would 
think that this should wrap around back to zero (or one?) instead of 
overflowing.


Do we need to care about something attempting to add the same vfc_name 
twice?  This code will happily add a second entry at the next available 
index.



+   break;
+   }
+   }
+   } while (tvfc != NULL);
+   vfc-vfc_typenum = hashval;
+   if (vfc-vfc_typenum = maxvfsconf)
+   maxvfsconf = vfc-vfc_typenum + 1;


I guess we're holding off on killing maxvfsconf until after 9.0 is out?

-Ben


TAILQ_INSERT_TAIL(vfsconf, vfc, vfc_list);

/*




___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-23 Thread Pawel Jakub Dawidek

On Sat, Aug 20, 2011 at 08:15:34PM -0400, Rick Macklem wrote:
 Hiroki, could you please test the attached patch.
 
 One problem with this patch is that I don't know how to create a fixed
 table that matches what systems would already have been getting.
 (I got the first 6 entries by booting a GENERIC i386 kernel with a
  printf in vfs_init(), so I suspect those don't change much, although
  I'm not sure if ZFS will usually end up before or after them?)
 
 Do you guys know what ZFS gets assigned typically? (I realize that
 changes w.r.t. when it gets loaded, so the question also becomes
 do you know how it typically gets loaded so the table can have
 that vfc_typenum value assigned to it?)
 Maybe you could boot a system with a printf like:
 
 printf(%s, %d\n, vfc-vfc_name, vfc-vfc_typenum);
 
 just after vfc-vfc_typenum = maxvfsconf++; in vfs_init() and
 then look in dmesg after booting, to see what your tables look like?
 (Without the attached patch installed.)

Rick, I'm sorry to arrive so late, but in my opinion hardcoding list of
file systems in the kernel is a step in wrong direction, really.
We are trying to keep things modularized, so there are no such things
laying around that have to be cleaned up when file system goes away or
updated when new file system arrives.

I remember for example fts code where I found that it keeps list of file
systems that can be handled faster. ZFS could have been handled faster,
but I found this after few years.
For this case there should be VFCF_* flag that fts shuld recognize and not
hardcore file system names.

This was also the reason that when I added support for jail-friendly
file systems and support for file systems with delegated administration
I haven't created list of file system types that support it, but added
VFCF_JAIL and VFCF_DELEGADMIN flags.

Here you cannot use those flags to solve the problem, but hardcoding
file system types in an array is really not the way to go.

I much prefer Ben's idea of calculating a hash from file system name and
detecting collisions.

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://yomoli.com


pgpJBiqW9g9eb.pgp
Description: PGP signature

Re: fsid change of ZFS?

2011-08-23 Thread Rick Macklem

Pawel Jakub Dawidek wrote:
 On Sat, Aug 20, 2011 at 08:15:34PM -0400, Rick Macklem wrote:
  Hiroki, could you please test the attached patch.
 
  One problem with this patch is that I don't know how to create a
  fixed
  table that matches what systems would already have been getting.
  (I got the first 6 entries by booting a GENERIC i386 kernel with a
   printf in vfs_init(), so I suspect those don't change much,
   although
   I'm not sure if ZFS will usually end up before or after them?)
 
  Do you guys know what ZFS gets assigned typically? (I realize that
  changes w.r.t. when it gets loaded, so the question also becomes
  do you know how it typically gets loaded so the table can have
  that vfc_typenum value assigned to it?)
  Maybe you could boot a system with a printf like:
 
  printf(%s, %d\n, vfc-vfc_name, vfc-vfc_typenum);
 
  just after vfc-vfc_typenum = maxvfsconf++; in vfs_init() and
  then look in dmesg after booting, to see what your tables look like?
  (Without the attached patch installed.)
 
 Rick, I'm sorry to arrive so late, but in my opinion hardcoding list
 of
 file systems in the kernel is a step in wrong direction, really.
 We are trying to keep things modularized, so there are no such things
 laying around that have to be cleaned up when file system goes away or
 updated when new file system arrives.
 
 I remember for example fts code where I found that it keeps list of
 file
 systems that can be handled faster. ZFS could have been handled
 faster,
 but I found this after few years.
 For this case there should be VFCF_* flag that fts shuld recognize and
 not
 hardcore file system names.
 
 This was also the reason that when I added support for jail-friendly
 file systems and support for file systems with delegated
 administration
 I haven't created list of file system types that support it, but added
 VFCF_JAIL and VFCF_DELEGADMIN flags.
 
 Here you cannot use those flags to solve the problem, but hardcoding
 file system types in an array is really not the way to go.
 
 I much prefer Ben's idea of calculating a hash from file system name
 and
 detecting collisions.
 
Ok, I'll admit I wasn't very fond of a fixed table that would inevitably
get out of date someday, either.

I didn't think hashing for the cases not in the table was worth the effort,
but doing a hash instead of a table seems reasonable.

I see that ZFS only uses the low order 8 bits, so I'll try and come up
with an 8bit hash solution and will post a patch for testing/review soon.

I don't think the vfs_sysctl() is that great a concern, given that it
appears to be deprecated already anyhow. (With an 8bit hash, vfs_typenum
won't be that sparse.) I'll also make sure that whatever hash I use
doesn't collide for the current list of file names (although I will include
code that handles a collision in the patch).

Thanks for the comments, rick

 --
 Pawel Jakub Dawidek http://www.wheelsystems.com
 FreeBSD committer http://www.FreeBSD.org
 Am I Evil? Yes, I Am! http://yomoli.com
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-23 Thread Pawel Jakub Dawidek

On Tue, Aug 23, 2011 at 10:09:41AM -0400, Rick Macklem wrote:
 Ok, I'll admit I wasn't very fond of a fixed table that would inevitably
 get out of date someday, either.
 
 I didn't think hashing for the cases not in the table was worth the effort,
 but doing a hash instead of a table seems reasonable.
 
 I see that ZFS only uses the low order 8 bits, so I'll try and come up
 with an 8bit hash solution and will post a patch for testing/review soon.
 
 I don't think the vfs_sysctl() is that great a concern, given that it
 appears to be deprecated already anyhow. (With an 8bit hash, vfs_typenum
 won't be that sparse.) I'll also make sure that whatever hash I use
 doesn't collide for the current list of file names (although I will include
 code that handles a collision in the patch).

Sounds great. Thanks!

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://yomoli.com


pgpDcE0skKKef.pgp
Description: PGP signature

Re: fsid change of ZFS?

2011-08-23 Thread Rick Macklem

Pawel Jakub Dawidek wrote:
 On Tue, Aug 23, 2011 at 10:09:41AM -0400, Rick Macklem wrote:
  Ok, I'll admit I wasn't very fond of a fixed table that would
  inevitably
  get out of date someday, either.
 
  I didn't think hashing for the cases not in the table was worth the
  effort,
  but doing a hash instead of a table seems reasonable.
 
  I see that ZFS only uses the low order 8 bits, so I'll try and come
  up
  with an 8bit hash solution and will post a patch for testing/review
  soon.
 
  I don't think the vfs_sysctl() is that great a concern, given that
  it
  appears to be deprecated already anyhow. (With an 8bit hash,
  vfs_typenum
  won't be that sparse.) I'll also make sure that whatever hash I use
  doesn't collide for the current list of file names (although I will
  include
  code that handles a collision in the patch).
 
 Sounds great. Thanks!
 
Here's the patch. (Hiroki could you please test this, thanks, rick.)
ps: If the white space gets trashed, the same patch is at:
   http://people.freebsd.org/~rmacklem/fsid.patch

--- kern/vfs_init.c.sav 2011-06-11 18:58:33.0 -0400
+++ kern/vfs_init.c 2011-08-23 15:55:30.0 -0400
@@ -39,6 +39,7 @@ __FBSDID($FreeBSD: head/sys/kern/vfs_in
 
 #include sys/param.h
 #include sys/systm.h
+#include sys/hash.h
 #include sys/kernel.h
 #include sys/linker.h
 #include sys/mount.h
@@ -138,6 +139,8 @@ vfs_register(struct vfsconf *vfc)
struct sysctl_oid *oidp;
struct vfsops *vfsops;
static int once;
+   struct vfsconf *tvfc;
+   uint32_t hashval;
 
if (!once) {
vattr_null(va_null);
@@ -152,7 +155,27 @@ vfs_register(struct vfsconf *vfc)
if (vfs_byname(vfc-vfc_name) != NULL)
return EEXIST;
 
-   vfc-vfc_typenum = maxvfsconf++;
+   /*
+* Calculate a hash on vfc_name to use for vfc_typenum. Unless
+* a collision occurs, it is limited to 8bits since that is
+* what ZFS uses from vfc_typenum and that also limits how sparsely
+* distributed vfc_typenum becomes.
+*/
+   hashval = hash32_str(vfc-vfc_name, 0);
+   hashval = ((hashval  0xff) + ((hashval  8)  0xff) +
+   ((hashval  16)  0xff) + (hashval  24))  0xff;
+   do {
+   /* Look for and fix any collision. */
+   TAILQ_FOREACH(tvfc, vfsconf, vfc_list) {
+   if (hashval == tvfc-vfc_typenum) {
+   hashval++; /* Can exceed 8bits, if needed. */
+   break;
+   }
+   }
+   } while (tvfc != NULL);
+   vfc-vfc_typenum = hashval;
+   if (vfc-vfc_typenum = maxvfsconf)
+   maxvfsconf = vfc-vfc_typenum + 1;
TAILQ_INSERT_TAIL(vfsconf, vfc, vfc_list);
 
/*
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-23 Thread Pawel Jakub Dawidek

On Tue, Aug 23, 2011 at 04:11:20PM -0400, Rick Macklem wrote:
 Pawel Jakub Dawidek wrote:
  On Tue, Aug 23, 2011 at 10:09:41AM -0400, Rick Macklem wrote:
   Ok, I'll admit I wasn't very fond of a fixed table that would
   inevitably
   get out of date someday, either.
  
   I didn't think hashing for the cases not in the table was worth the
   effort,
   but doing a hash instead of a table seems reasonable.
  
   I see that ZFS only uses the low order 8 bits, so I'll try and come
   up
   with an 8bit hash solution and will post a patch for testing/review
   soon.
  
   I don't think the vfs_sysctl() is that great a concern, given that
   it
   appears to be deprecated already anyhow. (With an 8bit hash,
   vfs_typenum
   won't be that sparse.) I'll also make sure that whatever hash I use
   doesn't collide for the current list of file names (although I will
   include
   code that handles a collision in the patch).
  
  Sounds great. Thanks!
  
 Here's the patch. (Hiroki could you please test this, thanks, rick.)
 ps: If the white space gets trashed, the same patch is at:
http://people.freebsd.org/~rmacklem/fsid.patch

The patch is fine by me. Thanks, Rick!

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://yomoli.com


pgpsuJPptDK8Q.pgp
Description: PGP signature

Re: fsid change of ZFS?

2011-08-22 Thread Rick Macklem

Benjamin Kaduk wrote:
 On Sun, 21 Aug 2011, Rick Macklem wrote:
 
  Benjamin Kaduk wrote:
  On Sat, 20 Aug 2011, Rick Macklem wrote:
 
  If anyone thinks using a fixed table to assign vfc_typenum for
  known
  file system types is a bad idea, please let us know.
 
  Fixed table sounds like a good plan.
  Is there a reason for/against using a hash table for types that are
  not in
  the fixed table? The advantage would be that out-of-tree
  filesystems
  get
  a consistent typenum (modulo hash collisions), but there would be
  more
  overhead in maintaining the table.
 
  Well, the current code maintains maxvfsconf as the largest value of
  vfc_typenum configured.
  (vfs_register() increases it and vfs_unregister() shrinks it back
  down.)
 
 Yes; I assume this is just so that it can keep track of how to
 efficiently
 allocate an id to the next filesystem that is registered. It still
 walks
 the whole list in vfs_unregister, though, so there is some amount of
 overhead involved.
 
 
  Then, vfs_sysctl() returns maxvfsconf for a sysctl. I don't know
  what uses
 
 vfs_sysctl() also starts off with:
 printf(WARNING: userland calling deprecated sysctl, 
 please rebuild world\n);
 I don't know where (if anywhere) it is legitimately used, but it is
 plausible that it could safely be deprecated.
 
If/what might use this sysctl is my main concern w.r.t. the patch.
(Even without using a hash, the table will result in some gaps in
 the assignment of vfc_typenum and the sysctl will return ENOTSUPP
 for those. If a program is looping through them all, from 0-maxvfsconf
 it would need to ignore those.)

I'm going to grep /usr/src for any use(s) of it, but I don't know beyond
that.

  that sysctl, but it seems that doing a hash on vfc_name (which I
  think was
  what you were suggesting?) would result in a large maxvfsconf with
  sparsely
  distributed vfc_typenum values.
  I don't know, but I suspect that wouldn't satisfy the desired
  sysctl() behaviour?
 
 I don't think tracking maxvfsconf would be useful with a hash table
 approach -- fxr seems to indicate that it is only used so as to give a
 unique number to a new filesystem, and a hash table with collision
 detection could perform that role.
 
 But, I don't have a good assessment of the tradeoffs involved. It may
 not
 be worth the code churn just for the sake of a couple of out-of-tree
 filesystems that never get exported over NFS; I just don't know.
 
At this point in the 9.0 release cycle, I'd lean towards less churn and
preserving the sysctl(), except that it would still have some small gaps
in it. (As you say, there aren't a lot of file system types that will
be NFS exported anyhow.)
How do others feel w.r.t. this?

W.r.t what file system types get exported over NFS...
- I think the current list might be: ufs, zfs, msdosfs, cd9660, udf
Does anyone out there NFS export any others?

rick
 -Ben
 ___
 freebsd-current@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to
 freebsd-current-unsubscr...@freebsd.org
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-21 Thread Rick Macklem

Benjamin Kaduk wrote:
 On Sat, 20 Aug 2011, Rick Macklem wrote:
 
  Yes, using vfs_getnewfsid() does not solve the issue.
 
  I noticed that Solaris looked up a fixed array vfssw[] exactly for
  the purpose. I think a table like it is a good solution for fixing
  fsid for each file system.
 
  -- Hiroki
  If anyone thinks using a fixed table to assign vfc_typenum for known
  file system types is a bad idea, please let us know.
 
 Fixed table sounds like a good plan.
 Is there a reason for/against using a hash table for types that are
 not in
 the fixed table? The advantage would be that out-of-tree filesystems
 get
 a consistent typenum (modulo hash collisions), but there would be more
 overhead in maintaining the table.
 
Well, the current code maintains maxvfsconf as the largest value of vfc_typenum 
configured.
(vfs_register() increases it and vfs_unregister() shrinks it back down.)

Then, vfs_sysctl() returns maxvfsconf for a sysctl. I don't know what uses
that sysctl, but it seems that doing a hash on vfc_name (which I think was
what you were suggesting?) would result in a large maxvfsconf with sparsely
distributed vfc_typenum values.
I don't know, but I suspect that wouldn't satisfy the desired sysctl() 
behaviour?

rick
 -Ben Kaduk
 ___
 freebsd-current@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-current
 To unsubscribe, send any mail to
 freebsd-current-unsubscr...@freebsd.org
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-21 Thread Hiroki Sato

Hi Rick,

Rick Macklem rmack...@uoguelph.ca wrote
  in 59520805.118597.1313885734529.javamail.r...@erie.cs.uoguelph.ca:

rm Hiroki, could you please test the attached patch.
rm
rm One problem with this patch is that I don't know how to create a fixed
rm table that matches what systems would already have been getting.
rm (I got the first 6 entries by booting a GENERIC i386 kernel with a
rm  printf in vfs_init(), so I suspect those don't change much, although
rm  I'm not sure if ZFS will usually end up before or after them?)
rm
rm Do you guys know what ZFS gets assigned typically? (I realize that
rm changes w.r.t. when it gets loaded, so the question also becomes
rm do you know how it typically gets loaded so the table can have
rm that vfc_typenum value assigned to it?)
rm Maybe you could boot a system with a printf like:
rm
rm printf(%s, %d\n, vfc-vfc_name, vfc-vfc_typenum);
rm
rm just after vfc-vfc_typenum = maxvfsconf++; in vfs_init() and
rm then look in dmesg after booting, to see what your tables look like?
rm (Without the attached patch installed.)
rm
rm Thanks, rick
rm ps: The patch is also at
rm  http://people.freebsd.org/~rmacklem/fsid.patch for anyone on
rm the list interested (if the attachment doesn't make it through.
rm
rm
rm rick

 Yes, we can think the typical number of zfs is 7 on GENERIC.

 However, I checked vfc_typenum in three boxes around me by using the
 attached patch for lsvfs(1), but the results were not consistent:

 Machine A (8-STABLE)
 |
 |Filesystem   Num  Refs Flags
 | --- - ---
 |msdosfs1 0
 |procfs 2 0 synthetic
 |devfs  3 4 synthetic
 |cd9660 4 0 read-only
 |nfs5 2 network
 |ufs6 6

 Machine B (HEAD with newnfs complied)
 |
 |Filesystem   Num  Refs Flags
 | --- - ---
 |msdosfs1 0
 |ufs2 5
 |nfs3 0 network
 |cd9660 4 0 read-only
 |procfs 5 0 synthetic
 |devfs  6 1 synthetic
 |zfs7 1 jail, delegated-administration
 |

 Machine C (HEAD with oldnfs compiled)
 |
 |Filesystem   Num  Refs Flags
 | --- - ---
 |ufs1 3
 |msdosfs3 0
 |procfs 4 0 synthetic
 |devfs  5 1 synthetic
 |oldnfs 6 0 network
 |cd9660 7 0 read-only
 |zfs8 4 jail, delegated-administration
 |nfs9 1 network

 I guess the ordering is timing-dependent and consistent only in the
 same box.

 The result of the Machine C was changed like the following after
 applying the patch.  It worked as expected:

 |Filesystem   Num  Refs Flags
 | --- - ---
 |ufs2 3
 |oldnfs15 0 network
 |zfs7 4 jail, delegated-administration
 |devfs  4 1 synthetic
 |msdosfs5 0
 |nfs6 0 network
 |cd9660 1 0 read-only
 |procfs 3 0 synthetic

 Hmm, based on these results it looks difficult to assign the same
 number to each file system.  Anyway, not fixing them is worse, so we
 may have to accept this as a small upgrade bump in 9.0R.

-- Hiroki
Index: usr.bin/lsvfs/lsvfs.1
===
--- usr.bin/lsvfs/lsvfs.1	(revision 225066)
+++ usr.bin/lsvfs/lsvfs.1	(working copy)
@@ -2,7 +2,7 @@
 .\ Garrett A. Wollman, September 1994
 .\ This file is in the public domain.
 .\
-.Dd March 16, 1995
+.Dd August 22, 2011
 .Dt LSVFS 1
 .Os
 .Sh NAME
@@ -36,6 +36,8 @@
 .Fl t
 option to
 .Xr mount 8
+.It Num
+the filesystem type number.
 .It Refs
 the number of references to this VFS; i.e., the number of currently
 mounted file systems of this type
Index: usr.bin/lsvfs/lsvfs.c
===
--- usr.bin/lsvfs/lsvfs.c	(revision 225066)
+++ usr.bin/lsvfs/lsvfs.c	(working copy)
@@ -17,9 +17,9 @@
 #include stdlib.h
 #include string.h

-#define FMT %-32.32s %5d %s\n
-#define HDRFMT %-32.32s %5.5s %s\n
-#define DASHES  - ---\n
+#define FMT %-32.32s %3d %5d %s\n
+#define HDRFMT %-32.32s %3s

Re: fsid change of ZFS?

2011-08-21 Thread Benjamin Kaduk


On Sun, 21 Aug 2011, Rick Macklem wrote:


Benjamin Kaduk wrote:

On Sat, 20 Aug 2011, Rick Macklem wrote:


If anyone thinks using a fixed table to assign vfc_typenum for known
file system types is a bad idea, please let us know.


Fixed table sounds like a good plan.
Is there a reason for/against using a hash table for types that are
not in
the fixed table? The advantage would be that out-of-tree filesystems
get
a consistent typenum (modulo hash collisions), but there would be more
overhead in maintaining the table.


Well, the current code maintains maxvfsconf as the largest value of vfc_typenum 
configured.
(vfs_register() increases it and vfs_unregister() shrinks it back down.)


Yes; I assume this is just so that it can keep track of how to efficiently 
allocate an id to the next filesystem that is registered.  It still walks 
the whole list in vfs_unregister, though, so there is some amount of 
overhead involved.




Then, vfs_sysctl() returns maxvfsconf for a sysctl. I don't know what uses


vfs_sysctl() also starts off with:
printf(WARNING: userland calling deprecated sysctl, 
please rebuild world\n);
I don't know where (if anywhere) it is legitimately used, but it is 
plausible that it could safely be deprecated.



that sysctl, but it seems that doing a hash on vfc_name (which I think was
what you were suggesting?) would result in a large maxvfsconf with sparsely
distributed vfc_typenum values.
I don't know, but I suspect that wouldn't satisfy the desired sysctl() 
behaviour?


I don't think tracking maxvfsconf would be useful with a hash table 
approach -- fxr seems to indicate that it is only used so as to give a 
unique number to a new filesystem, and a hash table with collision 
detection could perform that role.


But, I don't have a good assessment of the tradeoffs involved.  It may not 
be worth the code churn just for the sake of a couple of out-of-tree 
filesystems that never get exported over NFS; I just don't know.


-Ben
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-20 Thread Rick Macklem

Hiroki Sato wrote:
 Rick Macklem rmack...@uoguelph.ca wrote
 in 1565511281.69213.1313764157732.javamail.r...@erie.cs.uoguelph.ca:
 
 rm Hiroki Sato wrote:
 rm  fsid_guid = dmu_objset_fsid_guid(zfsvfs-z_os);
 rm  ASSERT((fsid_guid  ~((1ULL56)-1)) == 0);
 rm  vfsp-vfs_fsid.val[0] = fsid_guid;
 rm  vfsp-vfs_fsid.val[1] = ((fsid_guid32)  8) |
 rm  vfsp-mnt_vfc-vfc_typenum  0xFF;
 rm 
 rm  Since the vfc_typenum variable is incremented every time a new
 vfs is
 rm  installed, loading order of modules that call vfs_register()
 affects
 rm  ZFS's fsid.
 rm 
 rm  Anyway, possibility of fsid change is troublesome especially for
 an
 rm  NFS server with a lot of clients running. Can zeroing or setting
 a
 rm  fixed value to the lowest 8-bit of vfs_fsid.val[1] be harmful?
 rm 
 rm  -- Hiroki
 rm Well, the problem is that the fsid needs to be unique among all
 mounts.
 rm The vfs_typenum field is used to try and ensure that it does not
 end up
 rm the same value as a non-ZFS file system.
 rm
 rm (A) I think making that field a fixed constant should be ok, if
 the function
 rm checks for a conflict by calling vfs_getvfs() to check for one.
 rm See vfs_getnewfsid() for how this is done. (There is a mutex lock
 that
 rm needs to be held while doing it.) Alternately, if ZFS can call
 vfs_getnewfsid()
 rm instead of doing its own, that might be nicer?
 rm
 rm (B) Another way to fix this would be to modify vfs_register() to
 look up
 rm file systems in a table (by vfc_name) and used a fixed, assigned
 value
 rm from the table for vfc_typenum for entries found in the table.
 Only do
 rm the maxvfsconf++ when there isn't an entry for the fstype in the
 table.
 rm (VFS_GENERIC can be set to the size of the table. That's what
 happened
 rm in the bad old days when vfsconf was a table built at kernel
 config time.)
 rm
 rm If you guys think (B) is preferred, I could come up with a patch.
 I don't
 rm know enough about ZFS to do (A).
 
 rm Oh, and I think other fs types will suffer the same fate, except
 that
 rm they usually avoid it, because they are compiled into the kernel
 and
 rm the assignment of vfs_typenum happens in the same order--same
 value.
 
 Yes, using vfs_getnewfsid() does not solve the issue.
 
 I noticed that Solaris looked up a fixed array vfssw[] exactly for
 the purpose. I think a table like it is a good solution for fixing
 fsid for each file system.
 
 -- Hiroki
If anyone thinks using a fixed table to assign vfc_typenum for known
file system types is a bad idea, please let us know.

Otherwise I will make up a patch, rick

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-20 Thread Rick Macklem

Hiroki Sato wrote:
 Rick Macklem rmack...@uoguelph.ca wrote
 in 1565511281.69213.1313764157732.javamail.r...@erie.cs.uoguelph.ca:
 
 rm Hiroki Sato wrote:
 rm  fsid_guid = dmu_objset_fsid_guid(zfsvfs-z_os);
 rm  ASSERT((fsid_guid  ~((1ULL56)-1)) == 0);
 rm  vfsp-vfs_fsid.val[0] = fsid_guid;
 rm  vfsp-vfs_fsid.val[1] = ((fsid_guid32)  8) |
 rm  vfsp-mnt_vfc-vfc_typenum  0xFF;
 rm 
 rm  Since the vfc_typenum variable is incremented every time a new
 vfs is
 rm  installed, loading order of modules that call vfs_register()
 affects
 rm  ZFS's fsid.
 rm 
 rm  Anyway, possibility of fsid change is troublesome especially for
 an
 rm  NFS server with a lot of clients running. Can zeroing or setting
 a
 rm  fixed value to the lowest 8-bit of vfs_fsid.val[1] be harmful?
 rm 
 rm  -- Hiroki
 rm Well, the problem is that the fsid needs to be unique among all
 mounts.
 rm The vfs_typenum field is used to try and ensure that it does not
 end up
 rm the same value as a non-ZFS file system.
 rm
 rm (A) I think making that field a fixed constant should be ok, if
 the function
 rm checks for a conflict by calling vfs_getvfs() to check for one.
 rm See vfs_getnewfsid() for how this is done. (There is a mutex lock
 that
 rm needs to be held while doing it.) Alternately, if ZFS can call
 vfs_getnewfsid()
 rm instead of doing its own, that might be nicer?
 rm
 rm (B) Another way to fix this would be to modify vfs_register() to
 look up
 rm file systems in a table (by vfc_name) and used a fixed, assigned
 value
 rm from the table for vfc_typenum for entries found in the table.
 Only do
 rm the maxvfsconf++ when there isn't an entry for the fstype in the
 table.
 rm (VFS_GENERIC can be set to the size of the table. That's what
 happened
 rm in the bad old days when vfsconf was a table built at kernel
 config time.)
 rm
 rm If you guys think (B) is preferred, I could come up with a patch.
 I don't
 rm know enough about ZFS to do (A).
 
 rm Oh, and I think other fs types will suffer the same fate, except
 that
 rm they usually avoid it, because they are compiled into the kernel
 and
 rm the assignment of vfs_typenum happens in the same order--same
 value.
 
 Yes, using vfs_getnewfsid() does not solve the issue.
 
 I noticed that Solaris looked up a fixed array vfssw[] exactly for
 the purpose. I think a table like it is a good solution for fixing
 fsid for each file system.
 
 -- Hiroki
Hiroki, could you please test the attached patch.

One problem with this patch is that I don't know how to create a fixed
table that matches what systems would already have been getting.
(I got the first 6 entries by booting a GENERIC i386 kernel with a
 printf in vfs_init(), so I suspect those don't change much, although
 I'm not sure if ZFS will usually end up before or after them?)

Do you guys know what ZFS gets assigned typically? (I realize that
changes w.r.t. when it gets loaded, so the question also becomes
do you know how it typically gets loaded so the table can have
that vfc_typenum value assigned to it?)
Maybe you could boot a system with a printf like:

printf(%s, %d\n, vfc-vfc_name, vfc-vfc_typenum);

just after vfc-vfc_typenum = maxvfsconf++; in vfs_init() and
then look in dmesg after booting, to see what your tables look like?
(Without the attached patch installed.)

Thanks, rick
ps: The patch is also at
 http://people.freebsd.org/~rmacklem/fsid.patch for anyone on
the list interested (if the attachment doesn't make it through.


rick--- kern/vfs_init.c.sav	2011-06-11 18:58:33.0 -0400
+++ kern/vfs_init.c	2011-08-20 20:01:31.0 -0400
@@ -54,9 +54,38 @@ static int	vfs_unregister(struct vfsconf
 MALLOC_DEFINE(M_VNODE, vnodes, Dynamically allocated vnodes);
 
 /*
+ * This table assigns static vfc_typenum values for file systems that
+ * are well known to the system. This avoids the problem of a file system's
+ * fsid and, therefore, its NFS file handle, from changing based on
+ * when the file system is loaded into the kernel.
+ * File system types not in this list will be assigned values based
+ * on maxvfsconf.
+ */
+static struct vfstypenums {
+	char	*typename;
+	int	typenum;
+} vfstypenums[] = {
+	{ cd9660,	1	},
+	{ ufs,	2	},
+	{ procfs,	3	},
+	{ devfs,	4	},
+	{ msdosfs,	5	},
+	{ nfs,	6	},
+	{ zfs,	7	},
+	{ reiserfs,	8	},
+	{ tmpfs,	9	},
+	{ hpfs,	10	},
+	{ ntfs,	11	},
+	{ ext2fs,	12	},
+	{ udf,	13	},
+	{ xfs,	14	},
+	{ ,		0	}	/* Must be last. */
+};
+
+/*
  * The highest defined VFS number.
  */
-int maxvfsconf = VFS_GENERIC + 1;
+int maxvfsconf = sizeof(vfstypenums) / sizeof(struct vfstypenums);
 
 /*
  * Single-linked list of configured VFSes.
@@ -138,6 +167,7 @@ vfs_register(struct vfsconf *vfc)
 	struct sysctl_oid *oidp;
 	struct vfsops *vfsops;
 	static int once;
+	int i;
 
 	if (!once) {
 		vattr_null(va_null);
@@ -152,7 +182,18 @@ vfs_register(struct vfsconf *vfc)
 	if (vfs_byname(vfc-vfc_name) != NULL)
 		return EEXIST;
 
-	vfc-vfc_typenum = maxvfsconf++;
+	/* First, search for a match in

Re: fsid change of ZFS?

2011-08-20 Thread Benjamin Kaduk


On Sat, 20 Aug 2011, Rick Macklem wrote:


Yes, using vfs_getnewfsid() does not solve the issue.

I noticed that Solaris looked up a fixed array vfssw[] exactly for
the purpose. I think a table like it is a good solution for fixing
fsid for each file system.

-- Hiroki

If anyone thinks using a fixed table to assign vfc_typenum for known
file system types is a bad idea, please let us know.


Fixed table sounds like a good plan.
Is there a reason for/against using a hash table for types that are not in 
the fixed table?  The advantage would be that out-of-tree filesystems get 
a consistent typenum (modulo hash collisions), but there would be more 
overhead in maintaining the table.


-Ben Kaduk
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-19 Thread Hiroki Sato

Hiroki Sato h...@freebsd.org wrote
  in 20110819.002046.908756241495481148@allbsd.org:

hr Hi,
hr
hr  I have experienced Stale NFS file handle issue when switching
hr  between oldnfs and newnfs on a CURRENT box (NFS server exporting ZFS
hr  mountpoints).  The cause was that fsid was changed in the following
hr  conditions and not in the NFS subsystem itself, but I am wondering if
hr  these are expected behavior...
hr
hr  First, I tried the following configurations of NFS and ZFS, and saw
hr  if fsid of the same mountpoint (a mounted ZFS dataset) changed or
hr  not by using statfs(2):
hr
hr  compile opts   kld module  fsid[0:1]   kld 
loaded by
hr  

hr  NFSSERVER+NFSCLIENTzfs 865798fa:8346ef02   loader
hr
hr  NFSSERVER+NFSCLIENTzfs 865798fa:8346ef07   
kldload(8)
hr
hr  NFSSERVER+NFSCLIENT+
hr  NFSD+NFSCL zfs 865798fa:8346ef03   loader
hr
hr  NFSSERVER+NFSCLIENT+
hr  NFSD+NFSCL zfs 865798fa:8346ef08   kldload(8)
hr
hr  NFSSERVER+NFSCLIENTnfsd+nfscl+zfs  865798fa:8346ef08   loader
hr  


 Ah, I found why this happened:

   /*
* The fsid is 64 bits, composed of an 8-bit fs type, which
* separates our fsid from any other filesystem types, and a
* 56-bit objset unique ID.  The objset unique ID is unique to
* all objsets open on this system, provided by unique_create().
* The 8-bit fs type must be put in the low bits of fsid[1]
* because that's where other Solaris filesystems put it.
*/
   fsid_guid = dmu_objset_fsid_guid(zfsvfs-z_os);
   ASSERT((fsid_guid  ~((1ULL56)-1)) == 0);
   vfsp-vfs_fsid.val[0] = fsid_guid;
   vfsp-vfs_fsid.val[1] = ((fsid_guid32)  8) |
   vfsp-mnt_vfc-vfc_typenum  0xFF;

 Since the vfc_typenum variable is incremented every time a new vfs is
 installed, loading order of modules that call vfs_register() affects
 ZFS's fsid.

 Anyway, possibility of fsid change is troublesome especially for an
 NFS server with a lot of clients running.  Can zeroing or setting a
 fixed value to the lowest 8-bit of vfs_fsid.val[1] be harmful?

-- Hiroki


pgpV3vmW4Frcl.pgp
Description: PGP signature

Re: fsid change of ZFS?

2011-08-19 Thread Rick Macklem

Hiroki Sato wrote:
 Hiroki Sato h...@freebsd.org wrote
 in 20110819.002046.908756241495481148@allbsd.org:
 
 hr Hi,
 hr
 hr I have experienced Stale NFS file handle issue when switching
 hr between oldnfs and newnfs on a CURRENT box (NFS server exporting
 ZFS
 hr mountpoints). The cause was that fsid was changed in the following
 hr conditions and not in the NFS subsystem itself, but I am wondering
 if
 hr these are expected behavior...
 hr
 hr First, I tried the following configurations of NFS and ZFS, and
 saw
 hr if fsid of the same mountpoint (a mounted ZFS dataset) changed or
 hr not by using statfs(2):
 hr
 hr compile opts kld module fsid[0:1] kld loaded by
 hr
 
 hr NFSSERVER+NFSCLIENT zfs 865798fa:8346ef02 loader
 hr
 hr NFSSERVER+NFSCLIENT zfs 865798fa:8346ef07 kldload(8)
 hr
 hr NFSSERVER+NFSCLIENT+
 hr NFSD+NFSCL zfs 865798fa:8346ef03 loader
 hr
 hr NFSSERVER+NFSCLIENT+
 hr NFSD+NFSCL zfs 865798fa:8346ef08 kldload(8)
 hr
 hr NFSSERVER+NFSCLIENT nfsd+nfscl+zfs 865798fa:8346ef08 loader
 hr
 
 
 Ah, I found why this happened:
 
 /*
 * The fsid is 64 bits, composed of an 8-bit fs type, which
 * separates our fsid from any other filesystem types, and a
 * 56-bit objset unique ID. The objset unique ID is unique to
 * all objsets open on this system, provided by unique_create().
 * The 8-bit fs type must be put in the low bits of fsid[1]
 * because that's where other Solaris filesystems put it.
 */
 fsid_guid = dmu_objset_fsid_guid(zfsvfs-z_os);
 ASSERT((fsid_guid  ~((1ULL56)-1)) == 0);
 vfsp-vfs_fsid.val[0] = fsid_guid;
 vfsp-vfs_fsid.val[1] = ((fsid_guid32)  8) |
 vfsp-mnt_vfc-vfc_typenum  0xFF;
 
 Since the vfc_typenum variable is incremented every time a new vfs is
 installed, loading order of modules that call vfs_register() affects
 ZFS's fsid.
 
 Anyway, possibility of fsid change is troublesome especially for an
 NFS server with a lot of clients running. Can zeroing or setting a
 fixed value to the lowest 8-bit of vfs_fsid.val[1] be harmful?
 
 -- Hiroki
Oh, and I think other fs types will suffer the same fate, except that
they usually avoid it, because they are compiled into the kernel and
the assignment of vfs_typenum happens in the same order--same value.

My (B) suggestion would avoid this for all file system types in the
fixed table.

rick
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-19 Thread Rick Macklem

Hiroki Sato wrote:
 Hiroki Sato h...@freebsd.org wrote
 in 20110819.002046.908756241495481148@allbsd.org:
 
 hr Hi,
 hr
 hr I have experienced Stale NFS file handle issue when switching
 hr between oldnfs and newnfs on a CURRENT box (NFS server exporting
 ZFS
 hr mountpoints). The cause was that fsid was changed in the following
 hr conditions and not in the NFS subsystem itself, but I am wondering
 if
 hr these are expected behavior...
 hr
 hr First, I tried the following configurations of NFS and ZFS, and
 saw
 hr if fsid of the same mountpoint (a mounted ZFS dataset) changed or
 hr not by using statfs(2):
 hr
 hr compile opts kld module fsid[0:1] kld loaded by
 hr
 
 hr NFSSERVER+NFSCLIENT zfs 865798fa:8346ef02 loader
 hr
 hr NFSSERVER+NFSCLIENT zfs 865798fa:8346ef07 kldload(8)
 hr
 hr NFSSERVER+NFSCLIENT+
 hr NFSD+NFSCL zfs 865798fa:8346ef03 loader
 hr
 hr NFSSERVER+NFSCLIENT+
 hr NFSD+NFSCL zfs 865798fa:8346ef08 kldload(8)
 hr
 hr NFSSERVER+NFSCLIENT nfsd+nfscl+zfs 865798fa:8346ef08 loader
 hr
 
 
 Ah, I found why this happened:
 
 /*
 * The fsid is 64 bits, composed of an 8-bit fs type, which
 * separates our fsid from any other filesystem types, and a
 * 56-bit objset unique ID. The objset unique ID is unique to
 * all objsets open on this system, provided by unique_create().
 * The 8-bit fs type must be put in the low bits of fsid[1]
 * because that's where other Solaris filesystems put it.
 */
 fsid_guid = dmu_objset_fsid_guid(zfsvfs-z_os);
 ASSERT((fsid_guid  ~((1ULL56)-1)) == 0);
 vfsp-vfs_fsid.val[0] = fsid_guid;
 vfsp-vfs_fsid.val[1] = ((fsid_guid32)  8) |
 vfsp-mnt_vfc-vfc_typenum  0xFF;
 
 Since the vfc_typenum variable is incremented every time a new vfs is
 installed, loading order of modules that call vfs_register() affects
 ZFS's fsid.
 
 Anyway, possibility of fsid change is troublesome especially for an
 NFS server with a lot of clients running. Can zeroing or setting a
 fixed value to the lowest 8-bit of vfs_fsid.val[1] be harmful?
 
 -- Hiroki
Well, the problem is that the fsid needs to be unique among all mounts.
The vfs_typenum field is used to try and ensure that it does not end up
the same value as a non-ZFS file system.

(A) I think making that field a fixed constant should be ok, if the function
checks for a conflict by calling vfs_getvfs() to check for one.
See vfs_getnewfsid() for how this is done. (There is a mutex lock that
needs to be held while doing it.) Alternately, if ZFS can call vfs_getnewfsid()
instead of doing its own, that might be nicer?

(B) Another way to fix this would be to modify vfs_register() to look up
file systems in a table (by vfc_name) and used a fixed, assigned value
from the table for vfc_typenum for entries found in the table. Only do
the maxvfsconf++ when there isn't an entry for the fstype in the table.
(VFS_GENERIC can be set to the size of the table. That's what happened
 in the bad old days when vfsconf was a table built at kernel config time.)

If you guys think (B) is preferred, I could come up with a patch. I don't
know enough about ZFS to do (A).

rick
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: fsid change of ZFS?

2011-08-19 Thread Hiroki Sato

Rick Macklem rmack...@uoguelph.ca wrote
  in 1565511281.69213.1313764157732.javamail.r...@erie.cs.uoguelph.ca:

rm Hiroki Sato wrote:
rm  fsid_guid = dmu_objset_fsid_guid(zfsvfs-z_os);
rm  ASSERT((fsid_guid  ~((1ULL56)-1)) == 0);
rm  vfsp-vfs_fsid.val[0] = fsid_guid;
rm  vfsp-vfs_fsid.val[1] = ((fsid_guid32)  8) |
rm  vfsp-mnt_vfc-vfc_typenum  0xFF;
rm 
rm  Since the vfc_typenum variable is incremented every time a new vfs is
rm  installed, loading order of modules that call vfs_register() affects
rm  ZFS's fsid.
rm 
rm  Anyway, possibility of fsid change is troublesome especially for an
rm  NFS server with a lot of clients running. Can zeroing or setting a
rm  fixed value to the lowest 8-bit of vfs_fsid.val[1] be harmful?
rm 
rm  -- Hiroki
rm Well, the problem is that the fsid needs to be unique among all mounts.
rm The vfs_typenum field is used to try and ensure that it does not end up
rm the same value as a non-ZFS file system.
rm
rm (A) I think making that field a fixed constant should be ok, if the function
rm checks for a conflict by calling vfs_getvfs() to check for one.
rm See vfs_getnewfsid() for how this is done. (There is a mutex lock that
rm needs to be held while doing it.) Alternately, if ZFS can call 
vfs_getnewfsid()
rm instead of doing its own, that might be nicer?
rm
rm (B) Another way to fix this would be to modify vfs_register() to look up
rm file systems in a table (by vfc_name) and used a fixed, assigned value
rm from the table for vfc_typenum for entries found in the table. Only do
rm the maxvfsconf++ when there isn't an entry for the fstype in the table.
rm (VFS_GENERIC can be set to the size of the table. That's what happened
rm  in the bad old days when vfsconf was a table built at kernel config time.)
rm
rm If you guys think (B) is preferred, I could come up with a patch. I don't
rm know enough about ZFS to do (A).

rm Oh, and I think other fs types will suffer the same fate, except that
rm they usually avoid it, because they are compiled into the kernel and
rm the assignment of vfs_typenum happens in the same order--same value.

 Yes, using vfs_getnewfsid() does not solve the issue.

 I noticed that Solaris looked up a fixed array vfssw[] exactly for
 the purpose.  I think a table like it is a good solution for fixing
 fsid for each file system.

-- Hiroki


pgpn7hZbYSoTB.pgp
Description: PGP signature

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

Re: fsid change of ZFS?

35 matches

Site Navigation

Mail list logo

Footer information