Re: Performance of madvise / msync

2008-06-26 Thread Matthew Dillon
:With madvise() and without msync(), there are high numbers of
:faults, which matches the number of disk I/O operations.  It
:goes through cycles, every once in a while stalling while about
:60MB of data is dumped to disk at 20MB/s or so (buffers flushing?).
:At the beginning of each cycle it's fast, with 140 faults/s or so,
:and slows as the number of faults climbs to 180/s or so before
:stalling and flushing again.  It never gets _really_ slow though.

Yah, without the msync() the dirty pages build up in the kernel's
VM page cache.  A flush should happen automatically every 30-60
seconds, or sooner if the buffer cache builds up too many dirty pages.

The activity you are seeing sounds like the 30-60 second filesystem
sync the kernel does periodically.

Either NetBSD or OpenBSD, I forget which, implemented a partial sync
feature to prevent long stalls when the filesystem syncer hits a file
with a lot of dirty pages.  FreeBSD could borrow that optimization if
they want to reduce stalls from the filesystem sync.  I ported it to DFly
a while back myself.

:With msync() and without madvise(), things are very slow, and
:there are no faults, just writes.
:...
:>  The size_t argument to msync() (0x453b7618) is highly questionable.
:>  It could be ktrace reporting the wrong value, but maybe not.
:
:That's the size of rg2.rrd.  It's 1161524760 bytes long.
:...
:Looks like the source of my problem is very slow msync() on the
:file when the file is over a certain size.  It's still fastest
:without either madvise or msync.
:
:Thanks for your time,
:
:Marcus

The msync() is clearly the problem.  There are numerous optimizations
in the kernel but msync() is frankly a rather nasty critter even with
the optimization work.  Nobody using msync() in real life ever tries
to run it over the entirety of such a large mapping... usually it is
just run on explicit sub-ranges that the program wishes to sync.
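
The sub-range approach described above can be sketched in C.  This is a
minimal illustration under stated assumptions, not rrdtool's code:
page_span() and msync_range() are hypothetical helper names, and the page
size reported by sysconf(_SC_PAGESIZE) is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Compute the page-aligned span covering bytes [off, off + len) of a
 * mapping.  page_size must be a power of two.  (Hypothetical helper.)
 */
static void
page_span(size_t off, size_t len, size_t page_size,
    size_t *start, size_t *span)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	*start = off & ~(page_size - 1);
	*span = ((off + len + page_size - 1) & ~(page_size - 1)) - *start;
}

/*
 * Flush only the pages that cover [off, off + len) instead of the whole
 * mapping.  base must be the page-aligned address returned by mmap().
 */
static int
msync_range(void *base, size_t off, size_t len, int flags)
{
	size_t start, span;

	if (len == 0)
		return 0;	/* nothing touched, nothing to sync */
	page_span(off, len, (size_t)sysconf(_SC_PAGESIZE), &start, &span);
	return msync((char *)base + start, span, flags);
}
```

A caller that modified 0x1950 bytes at offset 0x1c20 would call
msync_range(base, 0x1c20, 0x1950, MS_ASYNC), which with 4096-byte pages
syncs three pages rather than the full 1.1 GB mapping.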

One reason why msync() is so nasty is that the kernel must physically
check the page table(s) to determine whether a page has been marked dirty
by the MMU, so it can't just iterate the pages it knows are dirty in
the VM object.  It's nasty whether it scans the VM object and iterates
the page tables, or scans the page tables and looks up the related VM
pages.   The only way to optimize this is to force write-faults by
mapping clean pages read-only, in order to track whether a page is
actually dirty in real time instead of lazily.  Then msync() would
only have to do a ranged-scan of the VM object's dirty-page list
and would not have to actually check the page tables for clean pages.

A secondary effect of the msync() is that it is initiating asynchronous
I/O for what sounds like hundreds of VM pages, or even more.  All those
pages are locked and busied from the point they are queued to the point
the I/O finishes, which for some of the pages can be a very, very long
time (into the multiples of seconds).  Pages locked that long will
interfere with madvise() calls made after the msync(), and probably
even interfere with the following msync().

It used to be that msync() only synced VM pages to the underlying
file, making them consistent with read()'s and write()'s against
the underlying file.  Since FreeBSD uses a unified VM page cache
this is always true.  However, the Open Group specification now
requires that the dirty pages actually be written out to the underlying
media... i.e. issue real I/O.  So msync() can't be a NOP if you go by
the Open Group specification.

-Matt
Matthew Dillon 
<[EMAIL PROTECTED]>
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Problem with /boot/loader

2008-06-26 Thread Kevin Oberman
> Date: Thu, 26 Jun 2008 20:55:40 -0700
> From: Jeremy Chadwick <[EMAIL PROTECTED]>
> 
> On Thu, Jun 26, 2008 at 08:12:33PM -0700, Kevin Oberman wrote:
> > > Date: Thu, 26 Jun 2008 23:53:44 +0200
> > > From: Volker <[EMAIL PROTECTED]>
> > > Sender: [EMAIL PROTECTED]
> > > 
> > > On 12/23/-58 20:59, Kelly Black wrote:
> > > > Hello,
> > > > 
> > > > I have a problem with loader. I recently upgraded from 6_rel to 7_rel.
> > > > Now when I install world there is a problem booting.
> > > > 
> > > > Here is what I do:
> > > > cd /usr/src
> > > > make buildworld
> > > > make buildkernel KERNCONF=BLACK
> > > > make installkernel KERNCONF=BLACK
> > > > 
> > > > At this point I can reboot and all is good. After boot I install the 
> > > > new world:
> > > > 
> > > > cd /usr/src
> > > > mergemaster -p
> > > > reboot into single user mode
> > > > cd /usr/src
> > > > make installworld
> > > > mergemaster
> > > > 
> > > > Now when I reboot there is a problem. I get an error that the system
> > > > cannot boot. Part of it looks like this:
> > > > Can't work out which disk we are booting from.
> > > > Guessed BIOS device 0x not found by probes, defaulting to disk0:
> > > > 
> > > > If I boot from a live disk and replace /boot/loader with
> > > > /boot/loader.old it boots up fine and everything looks good. A new
> > > > world and a new kernel. I would be grateful for any help or any
> > > > pointers.
> > > > 
> > > > Sincerely,
> > > > Kel
> > > > 
> > > > PS I do not do anything special with my loader config files:
> > > > 
> > > > $ cat loader.conf
> > > >...
> > > 
> > > Kelly,
> > > 
> > > the /boot/loader.conf file does not come into play at that stage. Early
> > > in the loader code, loader needs to figure out, which disk (BIOS device)
> > > has been booted from. Until loader knows which device was booted up,
> > > it's unable to access any files (even loader.conf) on your boot device.
> > > 
> > > As I've never seen such a problem while upgrading any system, I suspect
> > > your problem must be settings specific. Can you show me your kernel
> > > config or are you using a plain vanilla GENERIC? Which arch are we
> > > talking about?
> > > 
> > > As I'm currently investigating another boot problem (but earlier in the
> > > boot chain), I'll check boot logic in the source code and may check for
> > > your issue, too, at that time, so it's just one effort. But please stay
> > > patient for some days, as I'm currently too busy.
> > 
> > We just got hit by this. The loader never loads and nothing boots. But a
> > system admin discovered that the problem disappeared if the /boot.conf
> > file was deleted. It just contained '-P'.
> > 
> > Once this file was removed, the system just booted up as expected. When
> > he changed it to -D or -h, the boot still locked up. 
> 
> I believe you mean /boot.config.  :-)

Fingers faster than brain. Sorry.

> -P sets the console to the serial port assuming no AT/PS2 keyboard is
> connected to the machine.  -D and -h are described in more detail in the
> Handbook.

man boot describes that pretty well, too.
> 
> Even with -P, -D, -h, or -Dh, the system *should* still actually boot
> up, you just won't see anything on the VGA console until the system is
> fully up and getty/login is run against ttyv0 (VGA console login).

The systems in question are servers and have no VGA console. They just
hung without any sign of the loader, but when he used the old loader,
the system booted normally. I thought that /boot.config was read by
boot2, so it should be over and done with before loader is even started.

All I can think is that the result of having set any of these flags, all
of which are related, causes loader(8) to go off the deep end. I think
I'll suggest that he try just -s in /boot.config.

> If Kelly has no /boot.config, then I'd say the issue you're reporting is
> a different bug/problem, but should still be investigated.

It may be different or it may be a different manifestation of the same
one. Rather hard to tell at this point.

I am very busy with preparation for a major upgrade to our Chicago hubs
and may not get much chance to play with these now, but I'll ask the
admin to give it a shot, along with an empty /boot.config.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751




Re: Problem with /boot/loader

2008-06-26 Thread Jeremy Chadwick
On Thu, Jun 26, 2008 at 08:12:33PM -0700, Kevin Oberman wrote:
> > Date: Thu, 26 Jun 2008 23:53:44 +0200
> > From: Volker <[EMAIL PROTECTED]>
> > Sender: [EMAIL PROTECTED]
> > 
> > On 12/23/-58 20:59, Kelly Black wrote:
> > > Hello,
> > > 
> > > I have a problem with loader. I recently upgraded from 6_rel to 7_rel.
> > > Now when I install world there is a problem booting.
> > > 
> > > Here is what I do:
> > > cd /usr/src
> > > make buildworld
> > > make buildkernel KERNCONF=BLACK
> > > make installkernel KERNCONF=BLACK
> > > 
> > > At this point I can reboot and all is good. After boot I install the new 
> > > world:
> > > 
> > > cd /usr/src
> > > mergemaster -p
> > > reboot into single user mode
> > > cd /usr/src
> > > make installworld
> > > mergemaster
> > > 
> > > Now when I reboot there is a problem. I get an error that the system
> > > cannot boot. Part of it looks like this:
> > > Can't work out which disk we are booting from.
> > > Guessed BIOS device 0x not found by probes, defaulting to disk0:
> > > 
> > > If I boot from a live disk and replace /boot/loader with
> > > /boot/loader.old it boots up fine and everything looks good. A new
> > > world and a new kernel. I would be grateful for any help or any
> > > pointers.
> > > 
> > > Sincerely,
> > > Kel
> > > 
> > > PS I do not do anything special with my loader config files:
> > > 
> > > $ cat loader.conf
> > >...
> > 
> > Kelly,
> > 
> > the /boot/loader.conf file does not come into play at that stage. Early
> > in the loader code, loader needs to figure out, which disk (BIOS device)
> > has been booted from. Until loader knows which device was booted up,
> > it's unable to access any files (even loader.conf) on your boot device.
> > 
> > As I've never seen such a problem while upgrading any system, I suspect
> > your problem must be settings specific. Can you show me your kernel
> > config or are you using a plain vanilla GENERIC? Which arch are we
> > talking about?
> > 
> > As I'm currently investigating another boot problem (but earlier in the
> > boot chain), I'll check boot logic in the source code and may check for
> > your issue, too, at that time, so it's just one effort. But please stay
> > patient for some days, as I'm currently too busy.
> 
> We just got hit by this. The loader never loads and nothing boots. But a
> system admin discovered that the problem disappeared if the /boot.conf
> file was deleted. It just contained '-P'.
> 
> Once this file was removed, the system just booted up as expected. When
> he changed it to -D or -h, the boot still locked up. 

I believe you mean /boot.config.  :-)

-P sets the console to the serial port assuming no AT/PS2 keyboard is
connected to the machine.  -D and -h are described in more detail in the
Handbook.

Even with -P, -D, -h, or -Dh, the system *should* still actually boot
up, you just won't see anything on the VGA console until the system is
fully up and getty/login is run against ttyv0 (VGA console login).

If Kelly has no /boot.config, then I'd say the issue you're reporting is
a different bug/problem, but should still be investigated.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



Re: Problem with /boot/loader

2008-06-26 Thread Kevin Oberman
> Date: Thu, 26 Jun 2008 23:53:44 +0200
> From: Volker <[EMAIL PROTECTED]>
> Sender: [EMAIL PROTECTED]
> 
> On 12/23/-58 20:59, Kelly Black wrote:
> > Hello,
> > 
> > I have a problem with loader. I recently upgraded from 6_rel to 7_rel.
> > Now when I install world there is a problem booting.
> > 
> > Here is what I do:
> > cd /usr/src
> > make buildworld
> > make buildkernel KERNCONF=BLACK
> > make installkernel KERNCONF=BLACK
> > 
> > At this point I can reboot and all is good. After boot I install the new 
> > world:
> > 
> > cd /usr/src
> > mergemaster -p
> > reboot into single user mode
> > cd /usr/src
> > make installworld
> > mergemaster
> > 
> > Now when I reboot there is a problem. I get an error that the system
> > cannot boot. Part of it looks like this:
> > Can't work out which disk we are booting from.
> > Guessed BIOS device 0x not found by probes, defaulting to disk0:
> > 
> > If I boot from a live disk and replace /boot/loader with
> > /boot/loader.old it boots up fine and everything looks good. A new
> > world and a new kernel. I would be grateful for any help or any
> > pointers.
> > 
> > Sincerely,
> > Kel
> > 
> > PS I do not do anything special with my loader config files:
> > 
> > $ cat loader.conf
> >...
> 
> Kelly,
> 
> the /boot/loader.conf file does not come into play at that stage. Early
> in the loader code, loader needs to figure out, which disk (BIOS device)
> has been booted from. Until loader knows which device was booted up,
> it's unable to access any files (even loader.conf) on your boot device.
> 
> As I've never seen such a problem while upgrading any system, I suspect
> your problem must be settings specific. Can you show me your kernel
> config or are you using a plain vanilla GENERIC? Which arch are we
> talking about?
> 
> As I'm currently investigating another boot problem (but earlier in the
> boot chain), I'll check boot logic in the source code and may check for
> your issue, too, at that time, so it's just one effort. But please stay
> patient for some days, as I'm currently too busy.

We just got hit by this. The loader never loads and nothing boots. But a
system admin discovered that the problem disappeared if the /boot.conf
file was deleted. It just contained '-P'.

Once this file was removed, the system just booted up as expected. When
he changed it to -D or -h, the boot still locked up. 
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751




Re: Performance of madvise / msync

2008-06-26 Thread Marcus Reid
On Thu, Jun 26, 2008 at 05:48:13PM -0700, Matthew Dillon wrote:
> :   65074 python   0.06 CALL  madvise(0x287c5000,0x70,_MADV_WILLNEED)
> :   65074 python   0.027455 RET   madvise 0
> :   65074 python   0.58 CALL  madvise(0x287c5000,0x1c20,_MADV_WILLNEED)
> :   65074 python   0.016904 RET   madvise 0
> :   65074 python   0.000179 CALL  madvise(0x287c6000,0x1950,_MADV_WILLNEED)
> :   65074 python   0.008629 RET   madvise 0
> :   65074 python   0.40 CALL  madvise(0x287c8000,0x8,_MADV_WILLNEED)
> :   65074 python   0.004173 RET   madvise 0
> :...
> :   65074 python   0.006084 CALL  msync(0x287c5000,0x453b7618,MS_ASYNC)
> :   65074 python   0.106284 RET   msync 0
> :...
> :As you can see, it's quite a bit faster.
> :
> :I know that msync is necessary under Linux but obsolete under FreeBSD, but
> :it's still funny that it takes a tenth of a second to return even with
> :MS_ASYNC specified.
> :
> :Also, why is it that the madvise() calls take so much longer when the
> :program does a couple of its own madvise() calls?  Was madvise() never
> :intended to be run so frequently and is therefore a little slower than
> :it could be?
> :
> :Here's the diff between the code for the first kdump above and the
> :second one.
> 
>  Those times are way way too large, even with other running threads
>  in the system.  madvise() should not take that long unless it is
>  being forced to wait on a busied page, and neither should msync().
>  madvise() doesn't even do any I/O (or shouldn't anyhow).
> 
>  Try removing just the msync() but keep the madvise() calls and see
>  if the madvise() calls continue to take horrendous amounts of time.
>  Then try it vice versa.

Ok, first off, I'm noticing that of the 4 other files that this
is doing the same operations on, sized 569, 940, 116 and 116 MB,
all of the msync() and madvise() calls are nice and fast.  It's
only with the 1161524760 byte file that msync is much, much
slower.  It's not linear -- it hits a wall somewhere between
940 and 1161 million bytes.

With madvise() and without msync(), there are high numbers of
faults, which matches the number of disk I/O operations.  It
goes through cycles, every once in a while stalling while about
60MB of data is dumped to disk at 20MB/s or so (buffers flushing?).
At the beginning of each cycle it's fast, with 140 faults/s or so,
and slows as the number of faults climbs to 180/s or so before
stalling and flushing again.  It never gets _really_ slow though.

   36286 python   0.16 NAMI  "rg2.rrd"
   36286 python   0.25 RET   open 7
   36286 python   0.09 CALL  fstat(0x7,0xbfbfe428)
   36286 python   0.14 RET   fstat 0
   36286 python   0.10 CALL  mmap(0,0x453b7618,PROT_READ|PROT_WRITE,MAP_SHARED,0x7,0,0)
   36286 python   0.20 RET   mmap 679235584/0x287c5000
   36286 python   0.10 CALL  madvise(0x287c5000,0x453b7618,_MADV_RANDOM)
   36286 python   0.10 RET   madvise 0
   36286 python   0.09 CALL  madvise(0x287c5000,0x70,_MADV_WILLNEED)
   36286 python   0.67 RET   madvise 0
   36286 python   0.16 CALL  madvise(0x287c5000,0x1c20,_MADV_WILLNEED)
   36286 python   0.15 RET   madvise 0
   36286 python   0.19 CALL  madvise(0x287c6000,0x1950,_MADV_WILLNEED)
   36286 python   0.13 RET   madvise 0
   36286 python   0.10 CALL  madvise(0x287c8000,0x8,_MADV_WILLNEED)
   36286 python   0.10 RET   madvise 0
   36286 python   0.12 CALL  gettimeofday(0xbfbfe554,0)
   36286 python   0.10 RET   gettimeofday 0
   36286 python   0.14 CALL  fcntl(0x7,,0xbfbfe478)
   36286 python   0.21 RET   fcntl 0
   36286 python   0.040061 CALL  munmap(0x287c5000,0x453b7618)
   36286 python   0.000324 RET   munmap 0
   36286 python   0.16 CALL  close(0x7)
   36286 python   0.44 RET   close 0
   36286 python   0.000113 CALL  __sysctl(0xbfbfe388,0x2,0xbfbfe394,0xbfbfe398,0,0)
   36286 python   0.18 RET   __sysctl 0

With msync() and without madvise(), things are very slow, and
there are no faults, just writes.

   61609 python   0.16 NAMI  "rg2.rrd"
   61609 python   0.24 RET   open 7
   61609 python   0.10 CALL  fstat(0x7,0xbfbfe428)
   61609 python   0.13 RET   fstat 0
   61609 python   0.10 CALL  mmap(0,0x453b7618,PROT_READ|PROT_WRITE,MAP_SHARED,0x7,0,0)
   61609 python   0.21 RET   mmap 679235584/0x287c5000
   61609 python   0.066603 CALL  madvise(0x287c5000,0x1c20,_MADV_WILLNEED)
   61609 python   0.57 RET   madvise 0
   61609 python   0.09 CALL  madvise(0x287c6000,0x1950,_MADV_WILLNEED)
   61609 python   0.10 RET   madvise 0
   61609 python   0.09 CALL  madvise(0x287c8000,0x8,_MADV_WILLNEED)
   61609 python   0.09 RET   madvise 0
   61609 python   0.10 CALL  gettimeofday(0xbfbfe554,0)
   61609 python   0.18 RET   gettimeofday 0
   61609 python   0.14 CALL  fcntl(0x7,,0xbfbfe478)
   61609 python   0.26 RET   fcntl 0
   61609 python   0.004044 CALL  msync(0x287c5000,0x453b7618,MS_ASYNC

Re: Performance of madvise / msync

2008-06-26 Thread Matthew Dillon
:   65074 python   0.06 CALL  madvise(0x287c5000,0x70,_MADV_WILLNEED)
:   65074 python   0.027455 RET   madvise 0
:   65074 python   0.58 CALL  madvise(0x287c5000,0x1c20,_MADV_WILLNEED)
:   65074 python   0.016904 RET   madvise 0
:   65074 python   0.000179 CALL  madvise(0x287c6000,0x1950,_MADV_WILLNEED)
:   65074 python   0.008629 RET   madvise 0
:   65074 python   0.40 CALL  madvise(0x287c8000,0x8,_MADV_WILLNEED)
:   65074 python   0.004173 RET   madvise 0
:...
:   65074 python   0.006084 CALL  msync(0x287c5000,0x453b7618,MS_ASYNC)
:   65074 python   0.106284 RET   msync 0
:...
:As you can see, it's quite a bit faster.
:
:I know that msync is necessary under Linux but obsolete under FreeBSD, but
:it's still funny that it takes a tenth of a second to return even with
:MS_ASYNC specified.
:
:Also, why is it that the madvise() calls take so much longer when the
:program does a couple of its own madvise() calls?  Was madvise() never
:intended to be run so frequently and is therefore a little slower than
:it could be?
:
:Here's the diff between the code for the first kdump above and the
:second one.

 Those times are way way too large, even with other running threads
 in the system.  madvise() should not take that long unless it is
 being forced to wait on a busied page, and neither should msync().
 madvise() doesn't even do any I/O (or shouldn't anyhow).

 Try removing just the msync() but keep the madvise() calls and see
 if the madvise() calls continue to take horrendous amounts of time.
 Then try it vice versa.

 It kinda feels like a prior msync() is initiating physical I/O on
 pages and a later mmap/madvise or page fault is being forced to
 wait on the prior pages for the I/O to finish.

 The size_t argument to msync() (0x453b7618) is highly questionable.
 It could be ktrace reporting the wrong value, but maybe not.
 On any sort of random writing test, particularly if multiple threads
 are involved, specifying a size that large could result in very large
 latencies.
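
For scale, the length in the trace can be worked out directly.  The
sketch below converts 0x453b7618 to decimal and to a page count; the
4096-byte page size is an assumption (typical for i386/amd64), and
pages_covering() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Number of pages needed to cover len bytes, rounding up. */
static uint64_t
pages_covering(uint64_t len, uint64_t page_size)
{
	assert(page_size != 0);
	return (len + page_size - 1) / page_size;
}
```

With these inputs, 0x453b7618 is 1161524760 bytes, and
pages_covering(0x453b7618, 4096) gives 283576 pages that a whole-mapping
msync() has to consider on every update.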

-Matt



Re: TMPFS: File System is Full

2008-06-26 Thread Xin LI

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Ivan Voras wrote:
| Lin Jui-Nan Eric wrote:
|
|> I think there should be a "lower bound" size limit. Does TMPFS use
|> kernel-space memory?
|
| Yes, tmpfs does use kmem and competes with ZFS.

Yes and no.  tmpfs makes use of some kernel memory but most of the data
is stored in swappable memory.

Cheers,
- --
Xin LI <[EMAIL PROTECTED]>    http://www.delphij.net/
FreeBSD - The Power to Serve!
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.9 (FreeBSD)

iEYEARECAAYFAkhkNfsACgkQi+vbBBjt66Ad8ACfYD1Yq09oY7iIjxq353iMFKKp
FGYAnjA3I8V2cN1s1EW2NuwfhilaJBMe
=vnM3
-END PGP SIGNATURE-


Performance of madvise / msync

2008-06-26 Thread Marcus Reid
Hi,

I'm using py-rrdtool 0.2.1 with rrdtool 1.3.0 under 7.0-STABLE, and
there are a couple of things about this new version of rrdtool that
hurt performance under FreeBSD, but apparently help on whatever they
tested on.

For every update, the database file is opened, mapped into memory,
madvise() is called, contents are modified, msync() is called, and
the file is unmapped and closed:

   65074 python   0.09 CALL  open(0x831a1b4,O_RDWR,0)
   65074 python   0.13 NAMI  "rg2.rrd"
   65074 python   0.24 RET   open 7
   65074 python   0.07 CALL  fstat(0x7,0xbfbfe428)
   65074 python   0.11 RET   fstat 0
   65074 python   0.08 CALL  mmap(0,0x453b7618,PROT_READ|PROT_WRITE,MAP_SHARED,0x7,0,0)
   65074 python   0.18 RET   mmap 679235584/0x287c5000
   65074 python   0.07 CALL  madvise(0x287c5000,0x453b7618,_MADV_RANDOM)
   65074 python   0.08 RET   madvise 0
   65074 python   0.06 CALL  madvise(0x287c5000,0x70,_MADV_WILLNEED)
   65074 python   0.027455 RET   madvise 0
   65074 python   0.58 CALL  madvise(0x287c5000,0x1c20,_MADV_WILLNEED)
   65074 python   0.016904 RET   madvise 0
   65074 python   0.000179 CALL  madvise(0x287c6000,0x1950,_MADV_WILLNEED)
   65074 python   0.008629 RET   madvise 0
   65074 python   0.40 CALL  madvise(0x287c8000,0x8,_MADV_WILLNEED)
   65074 python   0.004173 RET   madvise 0
   65074 python   0.48 CALL  gettimeofday(0xbfbfe554,0)
   65074 python   0.09 RET   gettimeofday 0
   65074 python   0.12 CALL  fcntl(0x7,,0xbfbfe478)
   65074 python   0.24 RET   fcntl 0
   65074 python   0.006084 CALL  msync(0x287c5000,0x453b7618,MS_ASYNC)
   65074 python   0.106284 RET   msync 0
   65074 python   0.000483 CALL  munmap(0x287c5000,0x453b7618)
   65074 python   0.000356 RET   munmap 0
   65074 python   0.12 CALL  close(0x7)
   65074 python   0.46 RET   close 0
   65074 python   0.000173 CALL  __sysctl(0xbfbfe388,0x2,0xbfbfe394,0xbfbfe398,0,0)
   65074 python   0.16 RET   __sysctl 0

Here's a similar update without the calls to madvise and msync:

   96372 python   0.11 CALL  open(0x831aa34,O_RDWR,0)
   96372 python   0.16 NAMI  "rg2.rrd"
   96372 python   0.25 RET   open 7
   96372 python   0.09 CALL  fstat(0x7,0xbfbfe428)
   96372 python   0.14 RET   fstat 0
   96372 python   0.10 CALL  mmap(0,0x453b7618,PROT_READ|PROT_WRITE,MAP_SHARED,0x7,0,0)
   96372 python   0.21 RET   mmap 679235584/0x287c5000
   96372 python   0.000101 CALL  madvise(0x287c5000,0x1c20,_MADV_WILLNEED)
   96372 python   0.13 RET   madvise 0
   96372 python   0.10 CALL  madvise(0x287c6000,0x1950,_MADV_WILLNEED)
   96372 python   0.10 RET   madvise 0
   96372 python   0.09 CALL  madvise(0x287c8000,0x8,_MADV_WILLNEED)
   96372 python   0.09 RET   madvise 0
   96372 python   0.10 CALL  gettimeofday(0xbfbfe554,0)
   96372 python   0.09 RET   gettimeofday 0
   96372 python   0.11 CALL  fcntl(0x7,,0xbfbfe478)
   96372 python   0.16 RET   fcntl 0
   96372 python   0.002024 CALL  munmap(0x287c5000,0x453b7618)
   96372 python   0.000366 RET   munmap 0
   96372 python   0.16 CALL  close(0x7)
   96372 python   0.46 RET   close 0
   96372 python   0.000108 CALL  __sysctl(0xbfbfe388,0x2,0xbfbfe394,0xbfbfe398,0,0)
   96372 python   0.17 RET   __sysctl 0

As you can see, it's quite a bit faster.

I know that msync is necessary under Linux but obsolete under FreeBSD, but
it's still funny that it takes a tenth of a second to return even with
MS_ASYNC specified.

Also, why is it that the madvise() calls take so much longer when the
program does a couple of its own madvise() calls?  Was madvise() never
intended to be run so frequently and is therefore a little slower than
it could be?

Here's the diff between the code for the first kdump above and the
second one.

*** src/rrd_open.c.orig Tue Jun 10 23:12:55 2008
--- src/rrd_open.c  Wed Jun 25 21:43:54 2008
***
*** 175,191 
  #endif
  if (rdwr & RRD_CREAT)
  goto out_done;
- #ifdef USE_MADVISE
- if (rdwr & RRD_COPY) {
- /* We will read everything in a moment (copying) */
- madvise(data, rrd_file->file_len, MADV_WILLNEED | MADV_SEQUENTIAL);
- } else {
- /* We do not need to read anything in for the moment */
- madvise(data, rrd_file->file_len, MADV_RANDOM);
- /* the stat_head will be needed soonish, so hint accordingly */
- madvise(data, sizeof(stat_head_t), MADV_WILLNEED | MADV_RANDOM);
- }
- #endif

  __rrd_read(rrd->stat_head, stat_head_t,
 1);
--- 175,180 
***
*** 388,396 
  int   ret;

  #ifdef HAVE_MMAP
- ret = msync(rrd_file->file_start, rrd_file->file_len, MS_ASYNC);
- if (ret != 0)
- rrd_set_error("msync rrd_file: %s", rrd_strerror(errno));
  ret = munmap(rrd_file->file_start, rrd_file->file_len);
  if (ret != 0)
  rrd_set_error("munmap rrd_file: %s", rrd_strerror(errno

Re: gmirror+gjournal: unable to boot after crash

2008-06-26 Thread Volker
On 12/23/-58 20:59, Michael Harris wrote:
> Hi,
> 
> after one month with gmirror and gjournal running on a 7.0-RELEASE #p2 amd64 
> (built from latest CVS source), the box hung a couple of times when on high 
> disk load. Finally, while building some port it won't boot for no reason 
> obvious to me.
> 
> This is what I get with kernel.geom.mirror.debug=2:
> 
> ata2-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
> ad4: 476940MB  at ata2-master SATA300
> ad4: 976773168 sectors [969021C/16H/63S] 16 sectors/interrupt 1 depth queue
> GEOM: new disk ad4
> ad4: nVidia check1 failed
> ad4: Adaptec check1 failed
> ad4: LSI (v3) check1 failed
> GEOM_MIRROR[2]: Tasting ad4.
> ad4: LSI (v2) check1 failed
>  magic: GEOM::MIRROR
>version: 4
>   name: gm0
>mid: 2403671335
>did: 1321347210
>all: 2
>  genid: 0
> syncid: 1
>   priority: 0
>  slice: 4096
>balance: round-robin
>  mediasize: 500107861504
> sectorsize: 512
> syncoffset: 0
> mflags: NONE
> dflags: DIRTY
> hcprovider: 
>   provsize: 500107862016
>   MD5 hash: fd8b1cfa1aeb685da9b4228f5be3dc41
> GEOM_MIRROR[1]: Creating device gm0 (id=2403671335).
> GEOM_MIRROR[1]: Device gm0 created (2 components, id=2403671335).
> GEOM_MIRROR[1]: root_mount_hold 0xff0001318040
> GEOM_MIRROR[1]: Adding disk ad4 to gm0.
> GEOM_MIRROR[2]: Adding disk ad4.
> GEOM_MIRROR[2]: Disk ad4 connected.
> ad4: FreeBSD check1 failed
> GEOM_MIRROR[1]: Disk ad4 state changed from NONE to NEW (device gm0).
> GEOM_MIRROR[1]: Device gm0: provider ad4 detected.
> ata4-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
> ad8: 476940MB  at ata4-master SATA300
> ad8: 976773168 sectors [969021C/16H/63S] 16 sectors/interrupt 1 depth queue
> GEOM_MIRROR[2]: Tasting ad4s1.
> GEOM_MIRROR[2]: Tasting ad4a.
> GEOM_MIRROR[2]: Tasting ad4c.
> GEOM: new disk ad8
> ad8: nVidia check1 failed
> ad8: Adaptec check1 failed
> ad8: LSI (v3) check1 failed
> GEOM_MIRROR[2]: Tasting ad8.
> ad8: LSI (v2) check1 failed
>  magic: GEOM::MIRROR
>version: 4
>   name: gm0
>mid: 2403671335
>did: 3638214596
>all: 2
>  genid: 0
> syncid: 1
>   priority: 0
>  slice: 4096
>balance: round-robin
>  mediasize: 500107861504
> sectorsize: 512
> syncoffset: 0
> mflags: NONE
> dflags: NONE
> hcprovider: 
>   provsize: 500107862016
>   MD5 hash: 6a44a256f5a29312f9632d22785dadce
> GEOM_MIRROR[1]: Adding disk ad8 to gm0.
> GEOM_MIRROR[2]: Adding disk ad8.
> GEOM_MIRROR[2]: Disk ad8 connected.
> GEOM_MIRROR[1]: Disk ad8 state changed from NONE to NEW (device gm0).
> GEOM_MIRROR[1]: Device gm0: provider ad8 detected.
> GEOM_MIRROR[1]: Device gm0 state changed from STARTING to RUNNING.
> GEOM_MIRROR[1]: Disk ad8 state changed from NEW to ACTIVE (device gm0).
> ad8: FreeBSD check1 failed
> GEOM_MIRROR[2]: Metadata on ad8 updated.
> GEOM_MIRROR[1]: Device gm0: provider ad8 activated.
> GEOM_MIRROR[1]: Disk ad4 state changed from NEW to SYNCHRONIZING (device gm0).
> GEOM_MIRROR[0]: Device mirror/gm0 launched (1/2).
> GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e0.
> GEOM_MIRROR[0]: Device gm0: rebuilding provider ad4.
> GEOM_MIRROR[1]: root_mount_rel[2379] 0xff0001318040
> ata5-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
> ad10: 476940MB  at ata5-master SATA300
> ad10: 976773168 sectors [969021C/16H/63S] 16 sectors/interrupt 1 depth queue
> ad10: nVidia check1 failed
> ad10: Adaptec check1 failed
> GEOM_MIRROR[2]: Tasting ad8s1.
> GEOM_MIRROR[2]: Tasting ad8a.
> GEOM_MIRROR[2]: Tasting ad8c.
> GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e0.
> ad10: LSI (v3) check1 failed
> GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w0e0.
> GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e0.
> ad10: LSI (v2) check1 failed
> GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w0e0.
> GEOM_JOURNAL: Journal 2550245011: mirror/gm0 contains data.
> GEOM_JOURNAL: Journal 2550245011: mirror/gm0 contains journal.
> GEOM_MIRROR[2]: Access request for mirror/gm0: r1w1e1.
> GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e0.
> ad10: FreeBSD check1 failed
> GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w0e0.
> GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e0.
> ATA PseudoRAID loaded
> GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w0e0.
> GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e0.
> GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w0e0.
> GEOM: new disk ad10
> GEOM_MIRROR[2]: Tasting mirror/gm0s1.
> GEOM_MIRROR[2]: Tasting mirror/gm0a.
> GEOM_MIRROR[2]: Tasting mirror/gm0c.
> GEOM_MIRROR[2]: Tasting ad10.
> GEOM_MIRROR[2]: Tasting ad10s1.
> GEOM_MIRROR[2]: Tasting ad10s1a.
> GEOM_MIRROR[2]: Tasting ad10s1c.
> Trying to mount root from ufs:/dev/mirror/gm0.journals1a
> 
> Manual root filesystem specification:
>   <fstype>:<device>  Mount <device> using filesystem <fstype>
>                        eg. ufs:da0s1a
>   ?                  List valid disk boot devices
>   <empty line>       Abort manual input
> 
> mountroot> 
> 

T

Re: Problem with /boot/loader

2008-06-26 Thread Volker
On 12/23/-58 20:59, Kelly Black wrote:
> Hello,
> 
> I have a problem with loader. I recently upgraded from 6_rel to 7_rel.
> Now when I install world there is a problem booting.
> 
> Here is what I do:
> cd /usr/src
> make buildworld
> make buildkernel KERNCONF=BLACK
> make installkernel KERNCONF=BLACK
> 
> At this point I can reboot and all is good. After boot I install the new 
> world:
> 
> cd /usr/src
> mergemaster -p
> reboot into single user mode
> cd /usr/src
> make installworld
> mergemaster
> 
> Now when I reboot there is a problem. I get an error that the system
> cannot boot. Part of it looks like this:
> Can't work out which disk we are booting from.
> Guessed BIOS device 0x not found by probes, defaulting to disk0:
> 
> If I boot from a live disk and replace /boot/loader with
> /boot/loader.old it boots up fine and everything looks good. A new
> world and a new kernel. I would be grateful for any help or any
> pointers.
> 
> Sincerely,
> Kel
> 
> PS I do not do anything special with my loader config files:
> 
> $ cat loader.conf
>...

Kelly,

The /boot/loader.conf file does not come into play at that stage. Early
in its startup code, the loader has to figure out which disk (BIOS
device) it was booted from. Until the loader knows that, it cannot
access any files (not even loader.conf) on your boot device.

As I've never seen such a problem while upgrading any system, I suspect
your problem is specific to your settings. Can you show me your kernel
config, or are you using a plain vanilla GENERIC? Which arch are we
talking about?

As I'm currently investigating another boot problem (earlier in the
boot chain), I'll be going through the boot logic in the source code
anyway and can look into your issue at the same time. But please be
patient for a few days, as I'm quite busy at the moment.

Volker
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: jdk16 web plugin on FreeBSD 7 AMD64 Issues

2008-06-26 Thread Jung-uk Kim
On Thursday 26 June 2008 04:23 pm, Sabeeh Baig wrote:
> I recently installed jdk16 from ports on my FreeBSD 7-Stable AMD64
> home server.  When I try to access websites that use the Java web
> plugin, the applets don't load.  Checking on about:plugins in
> Firefox shows the Java plugin listed and enabled.  Does jdk16 web
> plugin work on FreeBSD 7 AMD64?  I saw on previous mailing list
> entries mentioned that the plugins from jdk16 and jdk15 on AMD64
> worked.

Try:

env JAVAVM_DRYRUN=yes java

It will print something like the following:

JAVA_HOME=/usr/local/jdk1.6.0
JAVAVM_CONF=/usr/local/etc/javavms
JAVAVM_OPTS_CONF=/usr/local/etc/javavm_opts.conf
JAVAVM_PROG=/usr/local/jdk1.6.0/bin/java
JAVAVM_OPTS=
JAVAVM_COMMAND=/usr/local/jdk1.6.0/bin/java

If ${JAVA_HOME} does not match the installed plugin, it will not work.  
If it matches, remove ~/.java directory and try again.
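
The comparison above can be sketched in POSIX sh. This is only an illustration: the function names are made up here, and the plugin path used in the usage note is a hypothetical placeholder, since the real location depends on how the browser's plugin link was set up. Only the JAVA_HOME= line format is taken from the dry-run output shown above.

```shell
#!/bin/sh
# Print the JAVA_HOME value found in javavm dry-run output read from stdin.
java_home_from_dryrun() {
    sed -n 's/^JAVA_HOME=//p' | head -n 1
}

# Succeed if the plugin path ($2) lies underneath the given JAVA_HOME ($1).
# Both arguments are plain strings; nothing is looked up on disk.
plugin_under_java_home() {
    case "$2" in
        "$1"/*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Demo using two lines of the sample output quoted above.
sample='JAVA_HOME=/usr/local/jdk1.6.0
JAVAVM_PROG=/usr/local/jdk1.6.0/bin/java'
home=$(printf '%s\n' "$sample" | java_home_from_dryrun)
echo "JAVA_HOME is $home"
```

On a live system the first function would be fed from `env JAVAVM_DRYRUN=yes java`, and the second would be given the resolved target of whatever plugin file the browser actually loads.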

Jung-uk Kim


jdk16 web plugin on FreeBSD 7 AMD64 Issues

2008-06-26 Thread Sabeeh Baig
I recently installed jdk16 from ports on my FreeBSD 7-Stable AMD64
home server.  When I try to access websites that use the Java web
plugin, the applets don't load.  Checking on about:plugins in Firefox
shows the Java plugin listed and enabled.  Does jdk16 web plugin work
on FreeBSD 7 AMD64?  I saw on previous mailing list entries mentioned
that the plugins from jdk16 and jdk15 on AMD64 worked.

Sabeeh Ahmed Baig


Re: Problem with /boot/loader

2008-06-26 Thread Jeremy Chadwick
On Thu, Jun 26, 2008 at 02:34:44PM -0400, Kelly Black wrote:
> >On Wed, Jun 25, 2008 at 7:38 AM, Kelly Black <[EMAIL PROTECTED]> wrote:
> >> I have a problem with loader. I recently upgraded from 6_rel to 7_rel.
> >> Now when I install world there is a problem booting.
> >>
> >> Here is what I do:
> >[snip]
> >> Now when I reboot there is a problem. I get an error that the system
> >> cannot boot. Part of it looks like this:
> >> Can't work out which disk we are booting from.
> >> Guessed BIOS device 0x not found by probes, defaulting to disk0:
> >>
> >> If I boot from a live disk and replace /boot/loader with
> >> /boot/loader.old it boots up fine and everything looks good. A new
> >> world and a new kernel. I would be grateful for any help or any
> >> pointers.
> >
> >What do you have in /etc/make.conf?  I recall there being a point in
> >time where incorrect CFLAGS options could build a broken loader.
> >
> >Try renaming /etc/make.conf (or just commenting out all
> >CFLAGS/CXXFLAGS options) and rebuilding either just the loader or the
> >whole world, and see if that makes a difference.
> 
> Hello,
> 
> Thank you for the reply.  I put my make.conf file back to its default
> when I first did the upgrade to avoid other kinds of problems:
> 
> make.conf
> # added by use.perl 2008-04-07 11:54:35
> PERL_VER=5.8.8
> PERL_VERSION=5.8.8
> 
> And it still produced the loader that does not load.

Kelly,

A couple things:

I'm wondering if you're getting bit by changes made to loader(8) by John
Baldwin last year.  Those changes were positive and increased
compatibility with systems greatly, but there were a couple reports of
users whose systems preferred the old "method" used.  Those changes are
documented here; and yes, I realise you don't get a screen full of
continual register dumps, but different people saw different behaviour:

http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues

Also see these mailing list threads:

http://lists.freebsd.org/pipermail/freebsd-current/2007-October/078755.html
http://lists.freebsd.org/pipermail/freebsd-stable/2007-November/038214.html

Secondly, you said you migrated from "6_rel to 7_rel".  Do you mean
7.0-RELEASE, or are you referring to the RELENG_7 tag?  What tag are you
following when doing csup/cvsup?
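
For reference, a small sketch of how the tag could be read out of a supfile, assuming POSIX sh. The function name is made up for this example, and the sample line is illustrative only (it is not Kelly's actual supfile, which has not been posted).

```shell
#!/bin/sh
# Read a csup/cvsup supfile on stdin and print every tag= value in it.
supfile_tags() {
    # Strip comments, put each whitespace-separated token on its own line,
    # then keep only the tokens of the form tag=VALUE.
    sed 's/#.*//' | tr -s ' \t' '\n\n' | sed -n 's/^tag=//p'
}

# Demo with an illustrative "*default" line.
printf '%s\n' '*default release=cvs tag=RELENG_7' | supfile_tags
```

In practice it would be run as `supfile_tags < your-supfile`, where the file name is whatever is passed to csup.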

CC'ing John as well.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



Re: Problem with /boot/loader

2008-06-26 Thread Kelly Black
>
>On Wed, Jun 25, 2008 at 7:38 AM, Kelly Black <[EMAIL PROTECTED]> wrote:
>> I have a problem with loader. I recently upgraded from 6_rel to 7_rel.
>> Now when I install world there is a problem booting.
>>
>> Here is what I do:
>[snip]
>> Now when I reboot there is a problem. I get an error that the system
>> cannot boot. Part of it looks like this:
>> Can't work out which disk we are booting from.
>> Guessed BIOS device 0x not found by probes, defaulting to disk0:
>>
>> If I boot from a live disk and replace /boot/loader with
>> /boot/loader.old it boots up fine and everything looks good. A new
>> world and a new kernel. I would be grateful for any help or any
>> pointers.
>
>What do you have in /etc/make.conf?  I recall there being a point in
>time where incorrect CFLAGS options could build a broken loader.
>
>Try renaming /etc/make.conf (or just commenting out all
>CFLAGS/CXXFLAGS options) and rebuilding either just the loader or the
>whole world, and see if that makes a difference.
>
>--
>Freddie Cash
>[EMAIL PROTECTED]
>

Hello,

Thank you for the reply.  I put my make.conf file back to its default
when I first did the upgrade to avoid other kinds of problems:

make.conf
# added by use.perl 2008-04-07 11:54:35
PERL_VER=5.8.8
PERL_VERSION=5.8.8

And it still produced the loader that does not load.

Sincerely,
Kel


-- 
___
Kelly Black Phone: (518) 388-8727
Department of Mathematics FAX: (603) 388-6005
Union College e-mail: [EMAIL PROTECTED]
Schenectady NY 12308 (USA) WWW: http://blackk.union.edu/~black


Re: panic in drm

2008-06-26 Thread Gavin Atkinson
On Thu, 2008-06-26 at 16:09 +0200, Ronald Klop wrote:
> Hello,
> 
> At my work I see a panic if I do this.
> Leave my computer on screensaver (don't know if that is necessary) and
> come back the next morning. My monitor is then black, but the light is
> green, so DPMS didn't kick in.
> Keys or mouse don't wake up the system. CTRL-ALT-F1 switches to the
> console and immediately triggers this panic.
> 
> I'm running latest x.org with xf86-video-i810-1.7.4_1. The newer
> xf86-video-intel also crashes my system sometimes when I'm working on it,
> so I downgraded.
> 
> Any ideas, suggestions or fixes available?

This sounds very familiar to me.  I think I was seeing the same panic in
2006...
http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2006-12/msg00344.html

Can you add this to src/sys/dev/drm/i915_irq.c

if (!dev->irqr) {
  return DRM_ERR(EINVAL);
}

just above the DRM_WAIT_ON in i915_driver_vblank_wait() (around line
140), and see if that prevents the panic?  Note that I believe this to
be a workaround, and not really the correct fix.

(sorry for not supplying patches, I'm not in a position to right now)

Gavin




Re: bus_dmamem_alloc failed to align memory properly

2008-06-26 Thread Jeff Blank
On Mon, Jun 23, 2008 at 10:50:40AM -1000, Clifton Royston wrote:
>   I can not be completely sure, but believe this may be a cross-OS
> problem with Xorg support of this specific ATI card family.

Interesting.  Thanks for the tip.

>   If you don't want to spend a lot of time on this, I'd try a different
> video card; I have a strong suspicion that your problems are not
> FreeBSD related and would go away with that change.

Cool, I'll try a Radeon X1550 on Monday.

thanks,
Jeff


panic in drm

2008-06-26 Thread Ronald Klop

Hello,

At my work I see a panic if I do this.
Leave my computer on screensaver (don't know if that is necessary) and
come back the next morning. My monitor is then black, but the light is
green, so DPMS didn't kick in.
Keys or mouse don't wake up the system. CTRL-ALT-F1 switches to the
console and immediately triggers this panic.

I'm running latest x.org with xf86-video-i810-1.7.4_1. The newer
xf86-video-intel also crashes my system sometimes when I'm working on it,
so I downgraded.


Any ideas, suggestions or fixes available?

Cheers,
Ronald.

FreeBSD ronald.office.base.nl 7.0-STABLE FreeBSD 7.0-STABLE #: Mon May 19  
12:12:58 CEST 2008  
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP-RONALD  i386



kgdb kernel.debug /var/crash/vmcore.2
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.

Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x188
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc04faee4
stack pointer           = 0x28:0xe7b25a94
frame pointer           = 0x28:0xe7b25aac
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 58549 (xlock)
trap number             = 12
panic: page fault
cpuid = 1
Uptime: 20d18h31m44s
Physical memory: 2037 MB
Dumping 251 MB: 236 220 204 188 172 156 140 124 108 92 76 60 44 28 12




#0  doadump () at pcpu.h:195
195 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump () at pcpu.h:195
#1  0xc050867f in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xc050893a in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:572
#3  0xc067a080 in trap_fatal (frame=0xe7b25a54, eva=392)
    at /usr/src/sys/i386/i386/trap.c:899
#4  0xc067a2d0 in trap_pfault (frame=0xe7b25a54, usermode=0, eva=392)
    at /usr/src/sys/i386/i386/trap.c:812
#5  0xc067ab81 in trap (frame=0xe7b25a54) at /usr/src/sys/i386/i386/trap.c:490
#6  0xc066277b in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc04faee4 in _mtx_lock_sleep (m=0xc4ddfcc0, tid=3321921536, opts=0,
    file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:335
#8  0xc04fb005 in lock_mtx (lock=0xc4ddfcc0, how=0)
    at /usr/src/sys/kern/kern_mutex.c:141
#9  0xc05102b8 in _sleep (ident=0xc4ddfe64, lock=0xc4ddfcc0, priority=340,
    wmesg=0xc58007e7 "drmwtq", timo=3000) at /usr/src/sys/kern/kern_synch.c:237
#10 0xc57fefa7 in i915_driver_vblank_wait (dev=0xc4ddfc00, sequence=0xe7b25b48)
    at /usr/src/sys/modules/drm/i915/../../../dev/drm/i915_irq.c:141
#11 0xc580bf06 in drm_wait_vblank (kdev=0xc570b600, cmd=Variable "cmd" is not available.
)
    at /usr/src/sys/modules/drm/drm/../../../dev/drm/drm_irq.c:254
#12 0xc580a8e4 in drm_ioctl (kdev=0xc570b600, cmd=399706,
    data=0xc72790b0 "", flags=3, p=0xc6008000)
    at /usr/src/sys/modules/drm/drm/../../../dev/drm/drm_drv.c:911
#13 0xc04d58e0 in giant_ioctl (dev=0xc570b600, cmd=399706,
    data=0xc72790b0 "", fflag=3, td=0xc6008000)
    at /usr/src/sys/kern/kern_conf.c:405
#14 0xc04b0998 in devfs_ioctl_f (fp=0xc87f0c18, com=399706,
    data=0xc72790b0, cred=0xc81ea000, td=0xc6008000)
    at /usr/src/sys/fs/devfs/devfs_vnops.c:494
#15 0xc053bbb5 in kern_ioctl (td=0xc6008000, fd=5, com=399706,
    data=0xc72790b0 "") at file.h:266
#16 0xc053bcfa in ioctl (td=0xc6008000, uap=0xe7b25cfc)
    at /usr/src/sys/kern/sys_generic.c:570
#17 0xc067a5c1 in syscall (frame=0xe7b25d38)
    at /usr/src/sys/i386/i386/trap.c:1035
#18 0xc06627e0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:196
#19 0x0033 in ?? ()
Previous frame inner to this frame (corrupt stack?)



(kgdb) list *0xc04faee4
0xc04faee4 is in _mtx_lock_sleep (/usr/src/sys/kern/kern_mutex.c:337).
332              */
333             v = m->mtx_lock;
334             if (v != MTX_UNOWNED) {
335                     owner = (struct thread *)(v & ~MTX_FLAGMASK);
336     #ifdef ADAPTIVE_GIANT
337                     if (TD_IS_RUNNING(owner)) {
338     #else
339                     if (m != &Giant && TD_IS_RUNNING(owner)) {
340     #endif
341                             if (LOCK_LOG_TEST(&m->lock_object, 0))


--- dmesg output ---
Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
Fre

gmirror+gjournal: unable to boot after crash

2008-06-26 Thread Michael Harris
Hi,

After one month with gmirror and gjournal running on a 7.0-RELEASE #p2 amd64
(built from latest CVS source), the box hung a couple of times under high
disk load. Finally, it went down while building a port, and now it won't
boot, for no reason obvious to me.

This is what I get with kern.geom.mirror.debug=2:

ata2-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad4: 476940MB  at ata2-master SATA300
ad4: 976773168 sectors [969021C/16H/63S] 16 sectors/interrupt 1 depth queue
GEOM: new disk ad4
ad4: nVidia check1 failed
ad4: Adaptec check1 failed
ad4: LSI (v3) check1 failed
GEOM_MIRROR[2]: Tasting ad4.
ad4: LSI (v2) check1 failed
 magic: GEOM::MIRROR
   version: 4
  name: gm0
   mid: 2403671335
   did: 1321347210
   all: 2
 genid: 0
syncid: 1
  priority: 0
 slice: 4096
   balance: round-robin
 mediasize: 500107861504
sectorsize: 512
syncoffset: 0
mflags: NONE
dflags: DIRTY
hcprovider: 
  provsize: 500107862016
  MD5 hash: fd8b1cfa1aeb685da9b4228f5be3dc41
GEOM_MIRROR[1]: Creating device gm0 (id=2403671335).
GEOM_MIRROR[1]: Device gm0 created (2 components, id=2403671335).
GEOM_MIRROR[1]: root_mount_hold 0xff0001318040
GEOM_MIRROR[1]: Adding disk ad4 to gm0.
GEOM_MIRROR[2]: Adding disk ad4.
GEOM_MIRROR[2]: Disk ad4 connected.
ad4: FreeBSD check1 failed
GEOM_MIRROR[1]: Disk ad4 state changed from NONE to NEW (device gm0).
GEOM_MIRROR[1]: Device gm0: provider ad4 detected.
ata4-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad8: 476940MB  at ata4-master SATA300
ad8: 976773168 sectors [969021C/16H/63S] 16 sectors/interrupt 1 depth queue
GEOM_MIRROR[2]: Tasting ad4s1.
GEOM_MIRROR[2]: Tasting ad4a.
GEOM_MIRROR[2]: Tasting ad4c.
GEOM: new disk ad8
ad8: nVidia check1 failed
ad8: Adaptec check1 failed
ad8: LSI (v3) check1 failed
GEOM_MIRROR[2]: Tasting ad8.
ad8: LSI (v2) check1 failed
 magic: GEOM::MIRROR
   version: 4
  name: gm0
   mid: 2403671335
   did: 3638214596
   all: 2
 genid: 0
syncid: 1
  priority: 0
 slice: 4096
   balance: round-robin
 mediasize: 500107861504
sectorsize: 512
syncoffset: 0
mflags: NONE
dflags: NONE
hcprovider: 
  provsize: 500107862016
  MD5 hash: 6a44a256f5a29312f9632d22785dadce
GEOM_MIRROR[1]: Adding disk ad8 to gm0.
GEOM_MIRROR[2]: Adding disk ad8.
GEOM_MIRROR[2]: Disk ad8 connected.
GEOM_MIRROR[1]: Disk ad8 state changed from NONE to NEW (device gm0).
GEOM_MIRROR[1]: Device gm0: provider ad8 detected.
GEOM_MIRROR[1]: Device gm0 state changed from STARTING to RUNNING.
GEOM_MIRROR[1]: Disk ad8 state changed from NEW to ACTIVE (device gm0).
ad8: FreeBSD check1 failed
GEOM_MIRROR[2]: Metadata on ad8 updated.
GEOM_MIRROR[1]: Device gm0: provider ad8 activated.
GEOM_MIRROR[1]: Disk ad4 state changed from NEW to SYNCHRONIZING (device gm0).
GEOM_MIRROR[0]: Device mirror/gm0 launched (1/2).
GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e0.
GEOM_MIRROR[0]: Device gm0: rebuilding provider ad4.
GEOM_MIRROR[1]: root_mount_rel[2379] 0xff0001318040
ata5-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad10: 476940MB  at ata5-master SATA300
ad10: 976773168 sectors [969021C/16H/63S] 16 sectors/interrupt 1 depth queue
ad10: nVidia check1 failed
ad10: Adaptec check1 failed
GEOM_MIRROR[2]: Tasting ad8s1.
GEOM_MIRROR[2]: Tasting ad8a.
GEOM_MIRROR[2]: Tasting ad8c.
GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e0.
ad10: LSI (v3) check1 failed
GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w0e0.
GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e0.
ad10: LSI (v2) check1 failed
GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w0e0.
GEOM_JOURNAL: Journal 2550245011: mirror/gm0 contains data.
GEOM_JOURNAL: Journal 2550245011: mirror/gm0 contains journal.
GEOM_MIRROR[2]: Access request for mirror/gm0: r1w1e1.
GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e0.
ad10: FreeBSD check1 failed
GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w0e0.
GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e0.
ATA PseudoRAID loaded
GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w0e0.
GEOM_MIRROR[2]: Access request for mirror/gm0: r1w0e0.
GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w0e0.
GEOM: new disk ad10
GEOM_MIRROR[2]: Tasting mirror/gm0s1.
GEOM_MIRROR[2]: Tasting mirror/gm0a.
GEOM_MIRROR[2]: Tasting mirror/gm0c.
GEOM_MIRROR[2]: Tasting ad10.
GEOM_MIRROR[2]: Tasting ad10s1.
GEOM_MIRROR[2]: Tasting ad10s1a.
GEOM_MIRROR[2]: Tasting ad10s1c.
Trying to mount root from ufs:/dev/mirror/gm0.journals1a

Manual root filesystem specification:
  <fstype>:<device>  Mount <device> using filesystem <fstype>
                       eg. ufs:da0s1a
  ?                  List valid disk boot devices
  <empty line>       Abort manual input

mountroot> 
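
In the log above, ad8 goes from NEW to ACTIVE while ad4 (whose metadata shows dflags: DIRTY) goes from NEW to SYNCHRONIZING. A rough way to pull just those transitions out of a long debug log is sketched below in POSIX sh. The function name is invented for this example, and the pattern assumes the exact "Disk ... state changed from ... to ..." wording seen in this log.

```shell
#!/bin/sh
# Read GEOM_MIRROR debug output on stdin and print one line per disk state
# change in the form "disk old_state new_state".
mirror_disk_transitions() {
    sed -n 's/^GEOM_MIRROR\[[0-9]*]: Disk \([^ ]*\) state changed from \([A-Z]*\) to \([A-Z]*\) .*/\1 \2 \3/p'
}

# Demo with two lines copied from the log above.
printf '%s\n' \
    'GEOM_MIRROR[1]: Disk ad8 state changed from NEW to ACTIVE (device gm0).' \
    'GEOM_MIRROR[1]: Disk ad4 state changed from NEW to SYNCHRONIZING (device gm0).' |
    mirror_disk_transitions
```

Lines that do not describe a disk state change (tasting, access requests, ATA probes) are simply dropped, which makes it easier to see which component the mirror considers current.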

With kern.geom.journal.debug set to 2 I get:

ata2-master: pio=PIO4 wdma=WDMA2 udma=UDMA133 cable=40 wire
ad4: 476940MB  at ata2-master SATA300
ad4: 976773168 sectors [969021C/16H/63S] 16 sectors/interrupt 1 depth queue
GEOM: new disk ad4
ad4: nVidia check1 failed
ad4: Adaptec check1 failed
ad4: LSI (v3) check1

make buildworld fails on a RELENG_7_0 machine

2008-06-26 Thread Rajkumar S
Hi,

I have a fresh FreeBSD 7.0 box, installed from CD. I have cvsup'd src
with the following supfile:

*default host=cvsup.de.FreeBSD.org
*default base=/var/db
*default prefix=/usr
*default release=cvs tag=RELENG_7_0
*default delete use-rel-suffix
*default compress
src-all

After the csup, when I cd to /usr/src and issue a make buildworld, it
stops with the following error:

===> gnu/usr.bin/cvs/contrib (cleandir)
sed -e 's,@CSH@,/bin/csh,' -e 's,@PERL@,/usr/bin/perl,'
/usr/src/gnu/usr.bin/cvs/contrib/../../../../contrib/cvs/contrib/Makefile.in
> Makefile
"Makefile", line 15: Need an operator
make: fatal errors encountered -- cannot continue
*** Error code 1

Stop in /usr/src/gnu/usr.bin/cvs.
*** Error code 1

Stop in /usr/src/gnu/usr.bin.
*** Error code 1

Stop in /usr/src/gnu.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.

The line in question (/usr/src/contrib/cvs/contrib/Makefile.in:15) has
the following text:

@SET_MAKE@
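
The sed command in the build output only substitutes @CSH@ and @PERL@, so a placeholder like the one above would survive into the generated Makefile. A quick check for leftover placeholders could look like the sketch below (POSIX sh; the function name is made up, and the @NAME@ pattern is an assumption about how such placeholders are spelled). It does not explain why that Makefile is being generated here in the first place.

```shell
#!/bin/sh
# Read a Makefile on stdin and print, with line numbers, every line that
# still contains an unsubstituted @NAME@ placeholder.
unsubstituted_placeholders() {
    grep -n '@[A-Z_][A-Z0-9_]*@'
}

# Demo with a two-line stand-in for a generated Makefile.
printf '%s\n' 'SHELL = /bin/sh' '@SET_MAKE@' | unsubstituted_placeholders
```

Run against the real file it would be `unsubstituted_placeholders < Makefile` in /usr/src/gnu/usr.bin/cvs/contrib.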

If I run csup again after it completes, just one file gets updated every time:

Parsing supfile "src-supfile"
Connecting to cvsup.de.FreeBSD.org
Connected to 212.19.57.134
Server software version: SNAP_16_1h
Negotiating file attribute support
Exchanging collection information
Establishing multiplexed-mode data connection
Running
Updating collection src-all/cvs
 Checkout src/gnu/usr.bin/cvs/contrib/Makefile
Shutting down connection to server
Finished successfully

Any idea what could be wrong here?

raj


Re: tracking -stable in the enterprise

2008-06-26 Thread Peter Wemm
On Wed, Jun 25, 2008 at 12:21 PM, Jo Rhett <[EMAIL PROTECTED]> wrote:
> On Jun 25, 2008, at 3:46 AM, Peter Wemm wrote:
>>
>> Correct.  We roll our own build snapshots periodically, but we also
>> keep a pretty careful eye on what's going on in the -stable branches.
>
> Okay, that makes sense to me ;-)
>
>>> I mean, I guess Yahoo has enough resources to literally run every
>>> commit to -stable through a full test cycle and push it out to every
>>> machine, but my
>
>> No.  Why on earth would we do that?  if we wanted to cause ourselves
>> that much pain for no good reason, we'd go get a pencil and stab
>> ourselves in the eye.
>
> Yes, we are definitely on the same page.   Thanks for the clarification ;-)
>
>> We don't upgrade machines that have been deployed unless there is a
>> good reason to.
>
> Do you deploy machines for longer than 1 year?  How do you deal with
> security patches in the longer term?

I think we still have FreeBSD-3.x machines in production. I know we
have FreeBSD-4.3.  99.9% of security issues don't affect us.  We have
our own package system built on top of FreeBSD's pkg_add format and
have the ability to push packages to machines.  If circumstances
warrant it, we can push a fix for something.  It'll either push a new
binary or be a source patch that is compiled directly on the machines
in question.   The machines run a custom software stack.  More often
we push driver or performance fixes, or things like timezone updates.

The important thing is that we don't disturb machines that are running
happily.  Hardware vendors are constantly messing with firmware, bugs
in silicon, etc etc.  This is an issue for NEW installs, usually not
existing machines.  Usually.

-- 
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; KI6FJV
"All of this is for nothing if we don't go to the stars" - JMS/B5
"If Java had true garbage collection, most programs would delete
themselves upon execution." -- Robert Sewell