Re: panic: vputx: missed vn_close

2013-01-10 Thread Peter Holm
On Thu, Jan 10, 2013 at 01:40:07AM +0200, Konstantin Belousov wrote:
 On Wed, Jan 09, 2013 at 07:52:43PM +0100, Florian Smeets wrote:
  Hi,
  
  I got this while building packages with poudriere. I'm running r245188.
  
  Let me know if you need anything else from the dump.
  
  Florian
  
  VNASSERT failed
  0xfe04fda5bba0: tag zfs, type VREG
  usecount 1, writecount 1, refcount 1 mountedhere 0
  flags (VI_ACTIVE)
   VI_LOCKed v_object 0xfe062f6479f8 ref 0 pages 0
  lock type zfs: EXCL by thread 0xfe00bd683480 (pid 34602, umount,
  tid 100578)
  panic: vputx: missed vn_close
  cpuid = 3
  Uptime: 9h25m23s
  Dumping 13255 out of 32647
  MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
  
  [...]
  
  (kgdb) where
  #0  doadump (textdump=1) at pcpu.h:229
  #1  0x804c4ab7 in kern_reboot (howto=260) at
  /usr/home/flo/dev/checkouts/svn-src/sys/kern/kern_shutdown.c:446
  #2  0x804c4fc6 in vpanic (fmt=<value optimized out>,
  ap=<value optimized out>) at
  /usr/home/flo/dev/checkouts/svn-src/sys/kern/kern_shutdown.c:753
  #3  0x804c4e56 in kassert_panic (fmt=<value optimized out>) at
  /usr/home/flo/dev/checkouts/svn-src/sys/kern/kern_shutdown.c:641
  #4  0x8055714d in vputx (vp=0xfe04fda5bba0, func=2) at
  /usr/home/flo/dev/checkouts/svn-src/sys/kern/vfs_subr.c:2243
  #5  0x80d6b42f in null_reclaim (ap=<value optimized out>) at
  /usr/home/flo/dev/checkouts/svn-src/sys/modules/nullfs/../../fs/nullfs/null_vnops.c:743
  #6  0x8070aee8 in VOP_RECLAIM_APV (vop=<value optimized out>,
  a=<value optimized out>) at vnode_if.c:1959
  #7  0x8055844c in vgonel (vp=0xfe04fda5b7c0) at vnode_if.h:830
  #8  0x80557a7f in vflush (mp=0xfe0533ce3cc0, rootrefs=1,
  flags=2, td=0xfe00bd683480) at
  /usr/home/flo/dev/checkouts/svn-src/sys/kern/vfs_subr.c:2625
  #9  0x80d6aa4e in nullfs_unmount (mp=0xfe0533ce3cc0,
  mntflags=<value optimized out>)
  at
  /usr/home/flo/dev/checkouts/svn-src/sys/modules/nullfs/../../fs/nullfs/null_vfsops.c:250
  #10 0x805502cf in dounmount (mp=0xfe0533ce3cc0,
  flags=134742016, td=<value optimized out>) at
  /usr/home/flo/dev/checkouts/svn-src/sys/kern/vfs_mount.c:1314
  #11 0x8054ff8b in sys_unmount (td=0xfe00bd683480,
  uap=0xff90d2c87a40) at
  /usr/home/flo/dev/checkouts/svn-src/sys/kern/vfs_mount.c:1211
  #12 0x806b4845 in amd64_syscall (td=0xfe00bd683480,
  traced=0) at subr_syscall.c:134
  #13 0x8069d04b in Xfast_syscall () at exception.S:387
  #14 0x000800882ffa in ?? ()
  Previous frame inner to this frame (corrupt stack?)
  
 
 I was able to reproduce it locally. I think that you need to have a file
 open for write on the nullfs mount, and then do a forced unmount of
 that mount while the file is still open.
 
 The patch below fixed it for me.
 
 diff --git a/sys/fs/nullfs/null_vnops.c b/sys/fs/nullfs/null_vnops.c
 index cc35d81..3be7366 100644
 --- a/sys/fs/nullfs/null_vnops.c

I've verified the scenario and am now testing with your patch.

- Peter
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: loopback interface broken on current

2013-01-10 Thread O. Hartmann
On 01/09/13 09:59, Gleb Smirnoff wrote:
 On Wed, Jan 09, 2013 at 09:56:25AM +0100, Hartmann, O. wrote:
 H Same here.
 H The OS: FreeBSD 10.0-CURRENT/amd64 r245218. Since I have three boxes
 H running approximately the same configurations (I share my configs
 H between lab and home), but different hardware, I'm confused.
 H 
 H The symptoms in my case are:
 H 
 H Booting the box is all right until it comes to the start of nfsuserd.
 H Prior to that, ntp adjusts the clock properly with an external time
 H server  - so this implies network connectivity. Start of nfsuserd is
 H stuck forever.
 H Interrupting the start of nfsuserd restarts several other services, but
 H winbindd and slapd (OpenLDAP) get stuck again. In case I also interrupt
 H them, there are other services which will not start.
 H 
 H Trying to log in as root on the console fails - I never get a password
 H prompt after entering the root login name. Since this machine is bound
 H to a local and remote OpenLDAP backend, I keep an emergency
 H local user which usually works - but this time, neither root nor this
 H user can log in!
 H 
 H Bringing up the box in single-user mode allows me to log in. Investigating
 H the network interfaces with ifconfig reveals that the loopback did not get
 H assigned the inet 127.0.0.1 address. Sometimes there is only an inet6
 H link-local address and some nd6 options, but sometimes even the IPv6
 H assignments do not show up.
 H 
 H In a desperate move I tried to recompile the kernel. Via /etc/src.conf, I
 H also recompile the kernel module for the most recent VirtualBox.
 H While the kernel and module (*.ko) get installed properly, the
 H recompilation of the VirtualBox port gets stuck when the system unpacks
 H the source tarball. Hitting Ctrl-T shows sbwait for the process. Other
 H processes seem to have trouble getting proper ownership or a UID for a
 H file - this is my naive interpretation of what I see on the surface.
 H 
 H The funny thing is that after several reboots, the box comes up as normal.
 H 
 H I noticed this issue approx. two weeks ago when all of a sudden the
 H amd automounter stopped working and the NFSv4 network drives didn't
 H attach properly, which left the whole box stuck.
 H 
 H Sorry for the more superficial description of the problem ...
 H 
 H Has the problem been identified? Is there a solution? It affects
 H only my very modern hardware (i7-3930, 32GB RAM, ASUS P9X79 WS
 H mainboard), while the very same setup on older hardware (our local server
 H is an Intel Q6600 with 8GB RAM and an oldish Intel P45-chipset based
 H mainboard) is not affected. Both systems have Intel NICs, so I'm a bit
 H confused.
 
 This looks unrelated to the problem discussed, because r245218 is
 later than r244989 which backed out my change.
 
 Can you do a binary search to identify which revision broke things?
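
The binary search over revisions requested here could be sketched roughly as follows. This is a minimal sketch, not an existing FreeBSD tool: `first_bad_revision` and the `is_broken` callback are hypothetical names, and the callback stands in for the manual step of checking out, building, booting, and testing one revision. It assumes the lower bound is known to work, the upper bound is known to fail, and every revision after the breaking one fails too. The revision numbers in the demo are placeholders borrowed from this thread; nothing here claims that either bound was actually tested.

```python
from typing import Callable


def first_bad_revision(good: int, bad: int,
                       is_broken: Callable[[int], bool]) -> int:
    """Return the first revision in (good, bad] for which is_broken() is true.

    Assumes `good` is known to work, `bad` is known to fail, and the
    breakage persists in every revision after it was introduced.
    """
    if good >= bad:
        raise ValueError("good must be lower than bad")
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_broken(mid):
            bad = mid    # the breakage was introduced at mid or earlier
        else:
            good = mid   # the breakage was introduced after mid
    return bad


if __name__ == "__main__":
    # Hypothetical scenario: pretend the regression landed in r245100.
    tested = []

    def is_broken(rev: int) -> bool:
        tested.append(rev)          # record which revisions were "built"
        return rev >= 245100

    print(first_bad_revision(244989, 245218, is_broken), len(tested))
```

Each step halves the remaining range, so a range of 229 revisions like the placeholder one above needs at most 8 build-and-test cycles.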
 

Sorry for the delay.

Today I realized that the problems I described occur because
I use jumbo frames (mtu 9100 or mtu 6150) on the em0 device:

em0@pci0:0:25:0: class=0x02 card=0x844e1043 chip=0x15038086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82579V Gigabit Network Connection'
    class      = network
    subclass   = ethernet

With the default of mtu 1500 the problem does not occur.

My time constraints do not allow further or deeper investigation, at
least until the end of next week; I'm sorry. The problems arose around
Christmas, possibly even earlier, since I had not accessed the machine
since 10th December.

I have other boxes with Intel NICs, with different chip types. They do not
show this problem.

Oliver





Re: gptzfsboot error using HP Smart Array P410i Controller

2013-01-10 Thread John Baldwin
On Wednesday, January 09, 2013 05:57:06 PM Palle Girgensohn wrote:
 Palle Girgensohn wrote:
  Hi!
  
  This is still happening with FreeBSD 9.0-RELEASE, as I have just
  discovered. The hack works like a charm, but seems kind of odd... :)
  
  Any progress in getting a real fix into the repository? Any risks
  with the hack - is it likely that it will suddenly or
  sporadically fail?
  
  Cheers, Palle
  
  Christoph Hoffmann wrote:
  Hello Daniel,
  
  Last time I checked up on the issue was on the 23rd of September,
  it was not fixed then. I was able to boot from drive 0x80 after
  adding:
  
  *** zfsboot.c.orig  Fri Sep 23 18:03:26 2011
  --- zfsboot.c       Fri Sep 23 18:47:44 2011
  ***************
  *** 459,464 ****
  --- 459,465 ----
        heap_end = (char *) PTOV(bios_basemem);
        }
  
  +     printf("Hello! I am a hack.\n");
        dsk = malloc(sizeof(struct dsk));
        dsk->drive = *(uint8_t *)PTOV(ARGS);
        dsk->type = dsk->drive & DRV_HARD ? TYPE_AD : TYPE_FD;
  
  I am inclined to think that this is related to the way how we
  compile this code, especially when run on the following particular
  processor:
  
  1 Processor(s) detected, 4 total cores enabled, Hyperthreading is
  enabled Proc 1: Intel(R) Xeon(R) CPU E5630 @ 2.53GHz QPI Speed: 5.8
  GT/s.
  
  
  Regards,
  
  Christoph
  
  On Oct 11, 2011, at 3:16 PM, Daniel Kalchev wrote:
  Has this issue been resolved somehow? Sane method to build
  gptzfsboot that will run on HP's P410i?
 
 Hi,
 
 This is still happening with 9.2-RELEASE on a HP DL 380 G5:

Presumably 9.1?

 gptzfsboot: error 1 lba 32
 gptzfsboot: error 1 lba 1
 gptzfsboot: No ZFS pools located, can't boot
 
 Andriy suggested the latest sys/boot/i386/common/edd.h@243024 from head,
 but unfortunately it makes no difference.
 
 The printf hack above still works fine though.

Do you have avg's most recent commit to edd.h to pack various structures?  I'm 
not sure that made it into 9.1.

-- 
John Baldwin


Re: gptzfsboot error using HP Smart Array P410i Controller

2013-01-10 Thread Palle Girgensohn


On 10 Jan 2013, at 18:15, John Baldwin j...@freebsd.org wrote:

 On Wednesday, January 09, 2013 05:57:06 PM Palle Girgensohn wrote:
 This is still happening with 9.2-RELEASE on a HP DL 380 G5:
 
 Presumably 9.1?
 
 gptzfsboot: error 1 lba 32
 gptzfsboot: error 1 lba 1
 gptzfsboot: No ZFS pools located, can't boot
 
 Andriy suggested the latest sys/boot/i386/common/edd.h@243024 from head,
 but unfortunately it makes no difference.
 
 The printf hack above still works fine though.
 
 Do you have avg's most recent commit to edd.h to pack various structures?
 I'm not sure that made it into 9.1.
 

9.1, of course, sorry! :-)

Yes, I've built a fresh gptzfsboot using 9.1 + edd.h from head (with the
__packed keywords added).


Re: sysctl -a causes kernel trap 12

2013-01-10 Thread Xin Li
To all: this has become harder and harder to reproduce lately.  I've
tried these options, and the most important progress is that it's
possible to get a crash dump when debug.debugger_on_panic=0.  I
managed to get a backtrace which indicates that the panic occurs when
trying to do mtx_lock(&Giant) -> __mtx_lock_sleep -> turnstile_wait ->
propagate_priority, but after I added some instrumentation to the
surrounding code and enabled INVARIANTS and/or WITNESS, it mysteriously
went away.

Reverting my instrumentation and updating to the latest svn made the
issue disappear for one day.  I hit it again today but unfortunately
didn't get a successful dump, and after a reboot I can't reproduce it
again :(

Still trying...

Cheers,
-- 
Xin LI delp...@delphij.net    https://www.delphij.net/
FreeBSD - The Power to Serve!   Live free or die


Re: clang 3.2 RC2 miscompiles libgcc?

2013-01-10 Thread Dimitry Andric

On 2013-01-08 09:58, Stefan Farfeleder wrote:

On Tue, Jan 08, 2013 at 12:21:12AM +0100, Dimitry Andric wrote:

...

After a lot of splitting up of unwind-dw2.c, I arrived at _Unwind_Resume
which when compiled by clang caused the crashes, but when compiled by
gcc ran OK.

your patch seems to work just fine. No crashes whatsoever so far. Thank
you.


I have committed a slightly cleaned-up version of this hack in r245272,
so until this is fixed upstream, everybody will at least have a
correctly functioning libgcc on amd64.

Since this issue can potentially also occur on stable/9, I will MFC the
fix too, after a few days.


Circular port dependency

2013-01-10 Thread George Mitchell

I grabbed the ports tree as of 308518, the RELEASE_9_1_0 tag.
devel/libtool won't build, because it requires autom4te during the
configure phase.  So I put BUILD_DEPENDS= autom4te:devel/autoconf
in the Makefile.  But autoconf depends on gmake, which depends on
gettext, which depends on libiconv, which depends on libtool.
What to do?

I'm running on a CURRENT build on my Raspberry Pi.
-- George Mitchell
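
The loop described in this message can be made explicit with a small depth-first search. This is only an illustrative sketch: `find_cycle` is a made-up helper, not part of the ports infrastructure, and the `deps` table contains just the edges named in the message (including the hand-added BUILD_DEPENDS line), not the real dependency lists of these ports.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]], start: str) -> Optional[List[str]]:
    """Depth-first search; return one dependency cycle reachable from start."""
    path: List[str] = []   # current chain of ports being followed
    on_path = set()        # the same ports as a set, for fast membership tests
    done = set()           # ports fully explored without finding a cycle

    def visit(port: str) -> Optional[List[str]]:
        if port in on_path:
            # Reached a port that is still on the chain: slice out the loop.
            return path[path.index(port):] + [port]
        if port in done:
            return None
        path.append(port)
        on_path.add(port)
        for dep in deps.get(port, []):
            cycle = visit(dep)
            if cycle is not None:
                return cycle
        path.pop()
        on_path.discard(port)
        done.add(port)
        return None

    return visit(start)


# Only the edges mentioned in the message; real ports have more dependencies.
deps = {
    "libtool": ["autoconf"],   # the BUILD_DEPENDS line added by hand
    "autoconf": ["gmake"],
    "gmake": ["gettext"],
    "gettext": ["libiconv"],
    "libiconv": ["libtool"],
}

if __name__ == "__main__":
    print(" -> ".join(find_cycle(deps, "libtool")))
```

With these edges the search walks from libtool through autoconf, gmake, gettext, and libiconv and arrives back at libtool, which is the loop reported here.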


setlocale() for base system utilities

2013-01-10 Thread Xin Li
Hi,

I just noticed that many base system utilities, like rm, cat, etc.,
do not call setlocale() at startup.  Is this intentional, or has
nobody gotten around to it yet?

Cheers,
-- 
Xin LI delp...@delphij.net    https://www.delphij.net/
FreeBSD - The Power to Serve!   Live free or die


Re: setlocale() for base system utilities

2013-01-10 Thread Zhihao Yuan
On Thu, Jan 10, 2013 at 8:13 PM, Xin Li delp...@delphij.net wrote:
 I just noticed that many base system utilities, like rm, cat, etc.,
 do not call setlocale() at startup.  Is this intentional, or has
 nobody gotten around to it yet?

Enabling locale support in utilities that are not wide-char aware only
makes a difference for 8-bit locales, like ISO8859-*, not for multibyte
ones.  From a user's point of view, this is an inconsistency.
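
The difference between 8-bit and multibyte encodings can be illustrated with a small sketch. It is written in Python rather than C and does not call setlocale() at all; it only shows why byte-at-a-time processing, which is what a narrow-char utility does, happens to be right for an 8-bit encoding and wrong for a multibyte one. The function names are invented for the example.

```python
def count_narrow(data: bytes) -> int:
    """Count 'characters' the way a narrow-char utility does: one per byte."""
    return len(data)


def count_decoded(data: bytes, encoding: str) -> int:
    """Count characters after decoding the bytes with the given encoding."""
    return len(data.decode(encoding))


if __name__ == "__main__":
    word = "na\u00efve"  # five characters, one of them outside ASCII
    for encoding in ("iso8859-1", "utf-8"):
        raw = word.encode(encoding)
        # Print the byte-based count next to the decoded count.
        print(encoding, count_narrow(raw), count_decoded(raw, encoding))
```

In ISO8859-1 every character is one byte, so both counts agree; in UTF-8 the non-ASCII character takes two bytes, so the byte-based count is one too high.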

--
Zhihao Yuan, ID lichray
The best way to predict the future is to invent it.
___
4BSD -- http://4bsd.biz/


Re: setlocale() for base system utilities

2013-01-10 Thread Xin LI
On Thursday, January 10, 2013, Zhihao Yuan wrote:

 On Thu, Jan 10, 2013 at 8:13 PM, Xin Li delp...@delphij.net wrote:
  I just noticed that many base system utilities, like rm, cat, etc.,
  do not call setlocale() at startup.  Is this intentional, or has
  nobody gotten around to it yet?

 Enabling locale support in utilities that are not wide-char aware only
 makes a difference for 8-bit locales, like ISO8859-*, not for multibyte
 ones.  From a user's point of view, this is an inconsistency.


It's also an inconsistency that some utilities use localized messages
while some do not.  So I don't buy that argument.


-- 
Xin LI delp...@delphij.net https://www.delphij.net/
FreeBSD - The Power to Serve! Live free or die


Re: setlocale() for base system utilities

2013-01-10 Thread Zhihao Yuan
On Thu, Jan 10, 2013 at 9:24 PM, Xin LI delp...@gmail.com wrote:
 It's also an inconsistency that some utilities use localized messages
 while some do not.  So I don't buy that argument.

If a utility supports catalogs, then localized messages can be fully
supported by providing catalogs for all the locales.  But if locale
support is enabled in a narrow-char-only utility, only locales with
8-bit encodings are supported, not all of them.  The inconsistency then
exists within the utility itself.

--
Zhihao Yuan, ID lichray
The best way to predict the future is to invent it.


Expanding ZFS RAIDZ on the fly?

2013-01-10 Thread O. Hartmann
My question may sound naive, sorry.

I have already set up a RAIDZ (on FreeBSD 10.0-CURRENT) comprising
three 3 TB disks. I'd like to expand the array with an additional disk -
on the fly.

oh





Re: clang 3.2 RC2 miscompiles libgcc?

2013-01-10 Thread Stefan Farfeleder
On Fri, Jan 11, 2013 at 12:39:44AM +0100, Dimitry Andric wrote:
 On 2013-01-08 09:58, Stefan Farfeleder wrote:
  On Tue, Jan 08, 2013 at 12:21:12AM +0100, Dimitry Andric wrote:
 ...
  After a lot of splitting up of unwind-dw2.c, I arrived at _Unwind_Resume
  which when compiled by clang caused the crashes, but when compiled by
  gcc ran OK.
  your patch seems to work just fine. No crashes whatsoever so far. Thank
  you.
 
 I have committed a slightly cleaned-up version of this hack in r245272,
 so until this is fixed upstream, everybody will at least have a
 correctly functioning libgcc on amd64.
 
 Since this issue can potentially also occur on stable/9, I will MFC the
 fix too, after a few days.

Thanks!