Re: FreeBSD 10 prognostication...

2012-05-17 Thread Hamell, Rick (SPARQ)
What, 8bit color ANSI isn't GUI enough?

But seriously, it feels like it works even worse than it did a decade ago.

Rick Hamell
Sent from my iPhone

On May 17, 2012, at 4:20 PM, "Vance Siemens"  wrote:

> Eh, sorry. I got excited at the prospect of downloading FreeBSD from
> the App Store and having the installer "just work" in a modern GUI.
> You have to admit, FreeBSD is lacking in this area. It would be a
> boon.
> 
> On Wed, May 16, 2012 at 7:18 AM, Dag-Erling Smørgrav  wrote:
>> Vance Siemens  writes:
>>> Can you share a brief overview of what's wrong with it?
>> 
>> Umm, it's about as factual as The Onion, except not as funny.  FreeBSD
>> never had to "jettison two thirds of its code base and start from
>> scratch".  Apple is not involved in FreeBSD development.  No Mac OS X or
>> Darwin version "includes" FreeBSD.  FreeBSD and Mac OS X will never
>> merge.  FreeBSD was never acquired by WinDriver Systems or by anyone
>> else, although a company named WindRiver Systems (makers of the embedded
>> operating system VxWorks, not of Windows video drivers) did at one point
>> acquire BSDI, which had previously acquired Walnut Creek CD-ROM, which
>> was heavily involved in the early history of both FreeBSD and Slackware
>> Linux.  The remains of Walnut Creek CD-ROM and BSDI are now known as
>> FreeBSD Mall and iXsystems (of PC-BSD and FreeNAS fame).
>> 
>> DES
>> --
>> Dag-Erling Smørgrav - d...@des.no
> ___
> freebsd-c...@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-chat
> To unsubscribe, send any mail to "freebsd-chat-unsubscr...@freebsd.org"
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: FreeBSD 10 prognostication...

2012-05-17 Thread Vance Siemens
Eh, sorry. I got excited at the prospect of downloading FreeBSD from
the App Store and having the installer "just work" in a modern GUI.
You have to admit, FreeBSD is lacking in this area. It would be a
boon.

On Wed, May 16, 2012 at 7:18 AM, Dag-Erling Smørgrav  wrote:
> Vance Siemens  writes:
>> Can you share a brief overview of what's wrong with it?
>
> Umm, it's about as factual as The Onion, except not as funny.  FreeBSD
> never had to "jettison two thirds of its code base and start from
> scratch".  Apple is not involved in FreeBSD development.  No Mac OS X or
> Darwin version "includes" FreeBSD.  FreeBSD and Mac OS X will never
> merge.  FreeBSD was never acquired by WinDriver Systems or by anyone
> else, although a company named WindRiver Systems (makers of the embedded
> operating system VxWorks, not of Windows video drivers) did at one point
> acquire BSDI, which had previously acquired Walnut Creek CD-ROM, which
> was heavily involved in the early history of both FreeBSD and Slackware
> Linux.  The remains of Walnut Creek CD-ROM and BSDI are now known as
> FreeBSD Mall and iXsystems (of PC-BSD and FreeNAS fame).
>
> DES
> --
> Dag-Erling Smørgrav - d...@des.no


Re: Chrome crashing system (amd64-10.0-CURRENT)

2012-05-17 Thread Evan Martin
On Thu, May 17, 2012 at 4:08 PM, Conrad J. Sabatier  wrote:
> Thanks.  I tried those, and it still locked up.
>
> I finally just moved away ~/.config/chromium, and it started up OK.
> Luckily, I was able to restore pretty much everything from my synced
> data.

It's a little surprising to me that a userspace app is able to nuke
your system, but perhaps the bug is just something mundane like out of
control memory allocations and it's just swapping.


Re: Chrome crashing system (amd64-10.0-CURRENT)

2012-05-17 Thread Conrad J. Sabatier
On Thu, 17 May 2012 07:20:51 -0500
Chuck Burns  wrote:

> On 5/17/2012 2:11 AM, John Hixson wrote:
> > On Thu, May 17, 2012 at 01:15:54AM -0500, Conrad J. Sabatier wrote:
> >> For the last week or so, I've been unable to run chrome.  Any
> >> attempt to start it up will cause the system either to freeze up
> >> or reboot.
> >
> > To add to this, I've had the same problem on 10-CURRENT for several
> > months now.
> 
> Are you guys building ports with clang? There's a known bug with 
> google-perftools, when it's built with clang, chrome will crash upon
> launch.
> 
> 
> chrome itself can be built with any compiler, but if google-perftools
> is built with clang, crash!
> 
> http://code.google.com/p/gperftools/issues/detail?id=394
> 

Ah, yes, I remember you mentioning this a month or two ago (at least, I
think it was you).

Thanks for the reminder.  I'm gonna make sure my /etc/make.conf
specifies gcc for that port.

-- 
Conrad J. Sabatier
conr...@cox.net


Re: Chrome crashing system (amd64-10.0-CURRENT)

2012-05-17 Thread Conrad J. Sabatier
On Thu, 17 May 2012 16:12:15 -0700
Evan Martin  wrote:

> On Thu, May 17, 2012 at 4:08 PM, Conrad J. Sabatier 
> wrote:
> > Thanks.  I tried those, and it still locked up.
> >
> > I finally just moved away ~/.config/chromium, and it started up OK.
> > Luckily, I was able to restore pretty much everything from my synced
> > data.
> 
> It's a little surprising to me that a userspace app is able to nuke
> your system, but perhaps the bug is just something mundane like out of
> control memory allocations and it's just swapping.

Yes, that *is* a little troubling.  I'm always touting FreeBSD to
people as being a rock-solid platform, so I was slightly embarrassed
when this happened several times recently when I had a friend over.  :-)

I'm looking into some sysctl settings now that do seem to have the
ability to trigger odd behavior with certain apps, e.g., kern.maxfiles,
kern.maxfilesperproc, various shared mem settings, etc.  Some apps will
either mysteriously refuse to run, or crash just after startup,
depending on the settings of these and similar.
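
A minimal sketch for recording the current values of the two limits named above before changing anything, assuming a stock sysctl(8). The helper name show_limits is made up for illustration, and a value that cannot be read is reported as unavailable rather than guessed:

```sh
#!/bin/sh
# Print name=value for the two file-descriptor limits mentioned in the thread.
# show_limits is a hypothetical helper name, not part of any FreeBSD tool.
show_limits() {
    for name in kern.maxfiles kern.maxfilesperproc; do
        # sysctl -n prints only the value; fall back if the OID is missing.
        value=$(sysctl -n "$name" 2>/dev/null) || value="(unavailable)"
        printf '%s=%s\n' "$name" "$value"
    done
}
```

Saving the output of show_limits before and after a tuning change makes it easier to tell which setting an application is reacting to.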

My chrome problem was no doubt related to my recently having tinkered
with some chrome://flags settings.  Conservatism and caution definitely
seem to be called for with these!  :-)

-- 
Conrad J. Sabatier
conr...@cox.net


Re: Chrome crashing system (amd64-10.0-CURRENT)

2012-05-17 Thread Conrad J. Sabatier
On Thu, 17 May 2012 08:55:49 -0700
Evan Martin  wrote:

> These kinds of hard locks often point at graphics driver problems, but
> normally Chrome relies on a driver whitelist that likely doesn't
> include any FreeBSD drivers.  Did you perhaps set a flag somewhere to
> bypass a blacklist?
> 
> You could try some command line flags like
> --blacklist-accelerated-compositing
> --blacklist-webgl
> to see if they help.
> 
> (I found those on
> http://peter.sh/experiments/chromium-command-line-switches/ , not
> certain if they do what you need.)
> 
> Another idea is to use strace/ktrace/truss into a log file to see what
> it was doing around the time of dying.

Thanks.  I tried those, and it still locked up.

I finally just moved away ~/.config/chromium, and it started up OK.
Luckily, I was able to restore pretty much everything from my synced
data.

Happy ending.  :-)

-- 
Conrad J. Sabatier
conr...@cox.net


Re: [review request] usr.sbin/service - make showing files configurable

2012-05-17 Thread Doug Barton
On 05/17/2012 02:51 PM, Bryan Drewery wrote:
> Yeah it's what I get for mashing a pseudo example up and not testing it!

S'ok, I screwed up ${service##*/} in mine. :)
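
For reference, a sketch of the loop discussed in this thread with the expansion written as ${service##*/}, which strips everything up to the last slash. The function names service_name and check_enabled_services are made up for illustration, and the loop assumes `service -e` prints one rc script path per line, as described in the thread:

```sh
#!/bin/sh
# Reduce an rc script path to its base name, e.g. /etc/rc.d/sshd -> sshd.
# service_name is a hypothetical helper, not part of service(8).
service_name() {
    script=$1
    printf '%s\n' "${script##*/}"
}

# Check every enabled service and start it if the status check fails.
# "service -e" lists full paths, so each path is reduced to a name first.
check_enabled_services() {
    for script in $(service -e); do
        name=$(service_name "$script")
        service "$name" status || service "$name" start
    done
}
```

Only service_name is safe to try anywhere; check_enabled_services will actually start services, so it is best read as an outline.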


Re: ports/bash4 --enable-static fails

2012-05-17 Thread Sean Bruno
On Thu, 2012-05-10 at 05:56 -0700, Chet Ramey wrote:
> On 5/10/12 12:20 AM, Craig Rodrigues wrote:
> 
> > Bash is trying to override the malloc() functions in libc with its own
> > implementation in lib/malloc/malloc.c .
> > I have seen this type of trick before in 3rd party code that tries to
> > override the libc implementation of malloc() / free() with its own.
> > 
> > kan@ explained this to me before, but I don't know if I can explain it
> > as well as him, because it has to do
> > with how static linking works. :)
> > 
> > Basically, the malloc.o object from bash, *must* have implementations of
> > *all* the relevant functions in jemalloc_jemalloc.o in order for
> > malloc.o to properly override jemalloc_jemalloc.o.
> > 
> > If you have something like:
> > jemalloc_jemalloc.o (libc)    malloc.o (Bash)
> > ==========================    ===============
> > malloc()                      malloc()
> > free()                        free()
> > calloc()
> > realloc()
> > 
> > 
> > the static linker will not be able to replace jemalloc_jemalloc.o from
> > libc with malloc.o from Bash,
> > because calloc() and realloc() symbols in jemalloc_jemalloc.o (libc)
> > do not exist in malloc.o (Bash).
> > 
> > Since the linker can only deal with whole objects (.o files), it will
> > try to pull in both
> > jemalloc_jemalloc.o and malloc.o when doing static linking.
> > 
> > I may have got some of the details/explanation wrong, but I have fixed
> > something similar
> > to this in 3rd party code, when the layout of malloc() functions in
> > libc changed between FreeBSD 4 and FreeBSD 6.
> 
> This explanation is substantially correct.
> 
> > 
> > What you need to do is:
> >(1)  run nm or readelf on jemalloc_jemalloc.o,   then run nm or
> > readelf on malloc.o
> >(2)  Look at the symbols in both
> >(3)  Add the missing symbols to malloc.c in Bash
> 
> The bash malloc includes definitions for malloc/free/realloc/calloc/cfree/
> valloc/memalign.  I'd be interested in knowing what other global symbols
> jemalloc exports.  I'd also be interested in seeing how someone managed to
> compile the bash malloc and leave out realloc.
> 
> Chet

Just to kind of close the loop here, we went ahead and changed our local
build of bash to do:
./configure --enable-static-link --without-bash-malloc

This matches the ports implementation, so we have moved on here.

Sean
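
The symbol comparison suggested in steps (1) to (3) above can be sketched roughly as follows. This assumes an nm that accepts -g and --defined-only; the helper names and the .syms file names are made up for illustration, and the object paths in the usage comment are placeholders:

```sh
#!/bin/sh
# List the global symbols an object file defines, one per line, sorted.
# defined_symbols is a hypothetical helper name.
defined_symbols() {
    nm -g --defined-only "$1" | awk '{ print $NF }' | sort -u
}

# Print names found in the first sorted list but missing from the second.
only_in_first() {
    comm -23 "$1" "$2"
}

# Usage (paths are placeholders):
#   defined_symbols jemalloc_jemalloc.o > libc.syms
#   defined_symbols malloc.o            > bash.syms
#   only_in_first libc.syms bash.syms
```

Any name printed by the last command is defined by the libc object but not by the replacement, which is the situation the quoted explanation describes.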




Re: [review request] usr.sbin/service - make showing files configurable

2012-05-17 Thread Bryan Drewery


On 5/17/2012 4:37 PM, Doug Barton wrote:
> On 05/14/2012 06:35, Bryan Drewery wrote:
> 
> 
>> On 5/13/2012 6:15 PM, Doug Barton wrote:
>>> On 5/12/2012 8:23 PM, Bryan Drewery wrote:
>>>> Hi,
>>>>
>>>> I found service(8) to be inconsistent that it listed files with
>>>> `service -e`, but plain services with `service -l`
> 
>>> That behavior is by design.
> 
> 
> 
>> Could you please elaborate on the design decision?
> 
> For services that are enabled (IOW, a tiny subset of the overall
> number) I thought it was useful to indicate to the user where those
> services come from. The -l option dumps everything in the directories,
> even if it's not a service. Users interested in differentiating
> /etc/rc.d from $local_startup can use ls.

Thanks for explaining.

> 
>> I did of course look in base for uses of service -e and service
>> -l, before considering this patch. The only case I can find is in a
>> cshrc example, which my patch does not affect.
> 
> That's not relevant, as you cannot possibly know what other uses
> service(8) is being put to. Also, it's bad form to change the default
> output of a tool (and/or the semantics of its command line options)
> years after its introduction.

True.

> 
>> I had expected service -e to behave like service -l, so I could
>> for example, put it into a loop and check all services, using the
>> service(8) script itself.
> 
>> for service_name in `service -e`; do service status $service_name
>> || service start $service_name; done
> 
> for service in `service -e` ; do
>   service ${##*/service} status || service ${##*/service} start
> done

Yes, I resorted to that before the patch. I just think consistency is
better.

> 
> (Note, your syntax for the service command is wrong above.)

Yeah it's what I get for mashing a pseudo example up and not testing it!

> 
> 
> hth,
> 
> Doug
> 

Thank you,
Bryan Drewery


Re: [review request] usr.sbin/service - make showing files configurable

2012-05-17 Thread Doug Barton
On 05/14/2012 06:35, Bryan Drewery wrote:
> 
> 
> On 5/13/2012 6:15 PM, Doug Barton wrote:
>> On 5/12/2012 8:23 PM, Bryan Drewery wrote:
>>> Hi,
>>> 
>>> I found service(8) to be inconsistent that it listed files with
>>> `service -e`, but plain services with `service -l`
> 
>> That behavior is by design.
> 
> 
> 
> Could you please elaborate on the design decision?

For services that are enabled (IOW, a tiny subset of the overall
number) I thought it was useful to indicate to the user where those
services come from. The -l option dumps everything in the directories,
even if it's not a service. Users interested in differentiating
/etc/rc.d from $local_startup can use ls.

> I did of course look in base for uses of service -e and service
> -l, before considering this patch. The only case I can find is in a
> cshrc example, which my patch does not affect.

That's not relevant, as you cannot possibly know what other uses
service(8) is being put to. Also, it's bad form to change the default
output of a tool (and/or the semantics of its command line options)
years after its introduction.

> I had expected service -e to behave like service -l, so I could
> for example, put it into a loop and check all services, using the
> service(8) script itself.
> 
> for service_name in `service -e`; do service status $service_name
> || service start $service_name; done

for service in `service -e` ; do
service ${##*/service} status || service ${##*/service} start
done

(Note, your syntax for the service command is wrong above.)


hth,

Doug

- -- 

This .signature sanitized for your protection


Re: FreeBSD and LDAP users, bug or feature?

2012-05-17 Thread Mark Felder

On Thu, 17 May 2012 13:41:19 -0500, Joel Dahl  wrote:

> Thanks, setting uidstart to 1000 indeed works around the problem. :)
>
> However, I would still like to know if this is intended behaviour.

I'm not sure but hopefully someone here can answer that for you.


Re: "random device not loaded; using insecure entropy" during boot

2012-05-17 Thread Andriy Gapon
on 14/05/2012 21:17 Bruce Cran said the following:
> While booting -current I noticed a new warning introduced in r230230
> (though it's not in 'dmesg' once booted):
> 
> FreeBSD/SMP: 1 package(s) x 2 core(s) x 2 SMT threads
>  cpu0 (BSP): APIC ID:  0
>  cpu1 (AP): APIC ID: 1
>  cpu2 (AP): APIC ID: 4
>  cpu3 (AP): APIC ID: 5
> random device not loaded; using insecure entropy
> 
> I guess something's wanting random data before it's been initialized?   Once
> booted kern.random shows that it is loaded and working.

It seems that the message is triggered by __stack_chk_init.  I am not sure if we
really need a "secure" random value there.

-- 
Andriy Gapon


[panic] zfs_zget() panic during 'svn update'

2012-05-17 Thread Glen Barber
Hi,

I'm running -CURRENT from a few weeks ago (r234559M), and during 'svn
update', had the machine panic.

I have kgdb output available here, and am happy to provide additional
information if needed:

http://people.freebsd.org/~gjb/zfs_zget-panic.kgdb.txt

Regards,

Glen



Re: FreeBSD 10 prognostication...

2012-05-17 Thread Christer Solskogen
On Thu, May 17, 2012 at 4:57 AM, Chuck Burns  wrote:
> You guys DO realize that's a troll website, right? And you're being
> seriously trolled.. right?
>

The URL is legit! This is noes trollz!

-- 
chs,


Re: ACPI 'driver bug: Unable to set devclass'

2012-05-17 Thread John Baldwin
On Thursday, May 17, 2012 11:33:56 am Andriy Gapon wrote:
> on 17/05/2012 17:05 John Baldwin said the following:
> > On Wednesday, May 16, 2012 4:07:43 pm John Baldwin wrote:
> >> Oh, whoops.  Actually, the right way to do this I think is 
> >> bus_hint_device_unit()
> >> (and/or, not make the unit number in cpuX mean anything at all, but use a 
> >> separate
> >> ivar to track what PCPU_GET(cpuid) a given cpuX device corresponds to).  I 
> >> think
> >> the last approach is really the right way to fix this.
> > 
> > I haven't been able to try this yet, but I have a first cut at
> > www.freebsd.org/~jhb/patches/acpi_cpu.patch
> > 
> 
> The patch has not been compile-tested? :)

Not yet.  I'll try to test it later today unless someone beats me to it. :-P

-- 
John Baldwin


Re: GCC update for testing

2012-05-17 Thread Pedro Giffuni

Hi Dimitry;

On 05/17/12 11:44, Dimitry Andric wrote:
> On 2012-05-17 17:44, Pedro Giffuni wrote:
>> Hi;
>>
>> I took a bunch of patches that were merged into the GCC 4.1 branch
>> (under GPLv2) and prepared a patch for merging them into our base
>> gcc. These are supposed to be bug fixes only.
>>
>> You can get the patch here:
>> http://people.freebsd.org/~pfg/patches/patch-contrib-gcc
>> And, for those really interested, the log is here:
>> http://people.freebsd.org/~pfg/patches/gcc41-pr-log
>>
>> It may be my imagination but the patch seems to be causing more
>> warnings than usual and it breaks the kernel here:
> ...
>> ../../../cam/scsi/scsi_sa.c: In function 'samount':
>> ../../../cam/scsi/scsi_sa.c:1887: warning: 'comp_supported' may be used
>> uninitialized in this function
>> ../../../cam/scsi/scsi_sa.c:1888: warning: 'write_protect' may be used
>> uninitialized in this function
>
> These warnings seem wrong, upon casual inspection of the code.  In any
> case, if you compile the same file with gcc 4.7, they don't appear. :)
>
> Any idea which particular gcc fix is responsible for them?
>
> As a workaround, we could set all those variables to some reasonable
> value, but that feels like a cheap trick to me...
>
> But I'd rather just not import the fix that causes those warnings.

I will have to dig deeper into the changes to see what causes this.
In any case I do agree and the patch will not be committed.
Ultimately I can just leave the list around and we bring them
in only as needed.

Pedro.



VNET jails freebsd 9.0

2012-05-17 Thread Krzysztof Kowalski
Hi,
I was trying to set up a jail under VirtualBox 4.1.14r77440 (with Windows 7
as the host), following these guides:
http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/jails-build.html
http://wiki.polymorf.fr/index.php/Howto:FreeBSD_jail_vnet
I set everything up as the wiki says, recompiled the kernel and patched
/etc/rc.d/jail, but when I try to run:
/etc/rc.d/netif cloneup
I get:
ifconfig: create: bad value
At startup I get:
Starting jails:epair0a: Ethernet address: 02:c0:84:00:05:0a
epair0b: Ethernet address: 02:c0:84:00:06:0b
epair0a
 cannot start jail "misc"
tail: /tmp/jail.qqSQ0EB4/jail.17: No such file or directory
.
/etc/rc: WARNING: Ignoring scratch file /etc/rc.d/jail.orig
Sorry if some information is missing, but I don't know what to attach.
/etc/rc.conf: http://wklej.to/VUy3l
/etc/jails/misc.conf: http://wklej.to/vQVxW
dmesg: http://wklej.to/jJiqO
Can someone tell me what's wrong?
Best regards,
Krzysztof K.


Re: FreeBSD and LDAP users, bug or feature?

2012-05-17 Thread Joel Dahl
On 17-05-2012  8:24, Mark Felder wrote:
> Check man adduser.conf(5)
> 
> There is an option for "uidstart" which should do what you want. If you  
> set it to 1000 every time you run "adduser" it will show:

Thanks, setting uidstart to 1000 indeed works around the problem. :)

However, I would still like to know if this is intended behaviour.
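
For reference, the workaround amounts to a single line in adduser.conf(5). A sketch, assuming the file is /etc/adduser.conf and using the value from this thread:

```sh
# /etc/adduser.conf
# First UID that adduser(8) offers for new accounts.
uidstart=1000
```

The file uses plain shell variable assignments, so no quoting is needed for a numeric value.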

-- 
Joel


Re: GCC update for testing

2012-05-17 Thread Dimitry Andric
On 2012-05-17 17:44, Pedro Giffuni wrote:
> Hi;
> I took a bunch of patches that were merged into the GCC 4.1 branch
> (under GPLv2) and prepared a patch for merging them into our base
> gcc. These are supposed to be bug fixes only.
> 
> You can get the patch here:
> http://people.freebsd.org/~pfg/patches/patch-contrib-gcc
> And, for those really interested, the log is here:
> http://people.freebsd.org/~pfg/patches/gcc41-pr-log
> 
> It may be my imagination but the patch seems to be causing more
> warnings than usual and it breaks the kernel here:
...
> ../../../cam/scsi/scsi_sa.c: In function 'samount':
> ../../../cam/scsi/scsi_sa.c:1887: warning: 'comp_supported' may be used 
> uninitialized in this function
> ../../../cam/scsi/scsi_sa.c:1888: warning: 'write_protect' may be used 
> uninitialized in this function

These warnings seem wrong, upon casual inspection of the code.  In any
case, if you compile the same file with gcc 4.7, they don't appear. :)

Any idea which particular gcc fix is responsible for them?

As a workaround, we could set all those variables to some reasonable
value, but that feels like a cheap trick to me...

But I'd rather just not import the fix that causes those warnings.


Re: Chrome crashing system (amd64-10.0-CURRENT)

2012-05-17 Thread Evan Martin
These kinds of hard locks often point at graphics driver problems, but
normally Chrome relies on a driver whitelist that likely doesn't
include any FreeBSD drivers.  Did you perhaps set a flag somewhere to
bypass a blacklist?

You could try some command line flags like
--blacklist-accelerated-compositing
--blacklist-webgl
to see if they help.

(I found those on
http://peter.sh/experiments/chromium-command-line-switches/ , not
certain if they do what you need.)

Another idea is to use strace/ktrace/truss into a log file to see what
it was doing around the time of dying.

On Wed, May 16, 2012 at 11:15 PM, Conrad J. Sabatier  wrote:
> For the last week or so, I've been unable to run chrome.  Any attempt
> to start it up will cause the system either to freeze up or reboot.
>
> To make matters worse, no trace of what's happening is anywhere to be
> found.  Nothing in any log files.  The system doesn't drop into the
> kernel debugger, either.  It's either a hard freeze or sudden reboot.
>
> I've tried rebuilding the chromium port, with both clang and gcc 4.6,
> to no avail.  I've also updated the system sources several times this
> week and remade world/kernel.  Nothing seems to help.
>
> I'm totally stumped as to how to determine what's going on here.  Any
> suggestions as to how to obtain some useful info?
>
> Thanks!
>
> --
> Conrad J. Sabatier
> conr...@cox.net
> ___
> freebsd-chrom...@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-chromium
> To unsubscribe, send any mail to "freebsd-chromium-unsubscr...@freebsd.org"


GCC update for testing

2012-05-17 Thread Pedro Giffuni

Hi;

I took a bunch of patches that were merged into the GCC 4.1 branch
(under GPLv2) and prepared a patch for merging them into our base
gcc. These are supposed to be bug fixes only.

You can get the patch here:
http://people.freebsd.org/~pfg/patches/patch-contrib-gcc
And, for those really interested, the log is here:
http://people.freebsd.org/~pfg/patches/gcc41-pr-log

It may be my imagination but the patch seems to be causing more
warnings than usual and it breaks the kernel here:


$ make
cc -c -O2 -Os -pipe -fno-strict-aliasing -march=athlon64 -std=c99 -g 
-Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  
-Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -Wundef 
-Wno-pointer-sign -fformat-extensions  -Wmissing-include-dirs 
-fdiagnostics-show-option   -nostdinc  -I. -I../../.. 
-I../../../contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include 
opt_global.h -fno-common -finline-limit=8000 --param 
inline-unit-growth=100 --param large-function-growth=1000  
-fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse 
-msoft-float  -fno-asynchronous-unwind-tables -ffreestanding 
-fstack-protector -Werror  ../../../cam/scsi/scsi_sa.c

cc1: warnings being treated as errors
../../../cam/scsi/scsi_sa.c: In function 'samount':
../../../cam/scsi/scsi_sa.c:1887: warning: 'comp_supported' may be used 
uninitialized in this function
../../../cam/scsi/scsi_sa.c:1888: warning: 'write_protect' may be used 
uninitialized in this function
../../../cam/scsi/scsi_sa.c:1887: warning: 'comp_enabled' may be used 
uninitialized in this function
../../../cam/scsi/scsi_sa.c:2728: warning: 'current_speed' may be used 
uninitialized in this function

../../../cam/scsi/scsi_sa.c:2728: note: 'current_speed' was declared here
../../../cam/scsi/scsi_sa.c:2727: warning: 'current_density' may be used 
uninitialized in this function

../../../cam/scsi/scsi_sa.c:2727: note: 'current_density' was declared here
../../../cam/scsi/scsi_sa.c:2725: warning: 'current_blocksize' may be 
used uninitialized in this function
../../../cam/scsi/scsi_sa.c:2725: note: 'current_blocksize' was declared 
here
../../../cam/scsi/scsi_sa.c:2728: warning: 'current_speed' may be used 
uninitialized in this function

../../../cam/scsi/scsi_sa.c:2728: note: 'current_speed' was declared here
../../../cam/scsi/scsi_sa.c:2725: warning: 'current_blocksize' may be 
used uninitialized in this function
../../../cam/scsi/scsi_sa.c:2725: note: 'current_blocksize' was declared 
here
../../../cam/scsi/scsi_sa.c:2728: warning: 'current_speed' may be used 
uninitialized in this function

../../../cam/scsi/scsi_sa.c:2728: note: 'current_speed' was declared here
../../../cam/scsi/scsi_sa.c:2725: warning: 'current_blocksize' may be 
used uninitialized in this function
../../../cam/scsi/scsi_sa.c:2725: note: 'current_blocksize' was declared 
here
../../../cam/scsi/scsi_sa.c:2728: warning: 'current_speed' may be used 
uninitialized in this function

../../../cam/scsi/scsi_sa.c:2728: note: 'current_speed' was declared here
../../../cam/scsi/scsi_sa.c:2725: warning: 'current_blocksize' may be 
used uninitialized in this function
../../../cam/scsi/scsi_sa.c:2725: note: 'current_blocksize' was declared 
here

*** Error code 1
...


Other stuff I tested (Apache OpenOffice) builds fine.

cheers,

Pedro.


Re: "make delete-old" performance.

2012-05-17 Thread Jilles Tjoelker
On Thu, May 17, 2012 at 02:13:40PM +0200, Dimitry Andric wrote:
> On 2012-05-17 05:18, b. f. wrote:...
> > The slowdown is probably due - at least in part - to two factors:

> > - the list of files to be checked for removal has grown substantially,
> > because missing entries for old knobs and new entries for new knobs
> > have been added; and

> > - a new (and slower) method of checking was added in:
> > http://svnweb.FreeBSD.org/base?view=revision&revision=220255
> > because the old method broke down with the size of the new lists of files.

> Hm, maybe it would have been better to fix make, so it can accept
> arbitrarily long lists, without segfaulting?  There's such a thing as
> malloc(), and I simply don't believe any of those lists could be larger
> than a few hundred kilobytes.

Alternatively, make could be fixed so that the original code works.
Although an invocation like
  sh -c 'for file in VERY_LONG_LIST; do something; done'
will bump into {ARG_MAX}, the shell itself does not have a fixed
limitation so longer command lines can be written to a temporary file
and passed to sh that way.

In some cases (such as with -j), make always uses a temporary file,
slowing things down and obscuring ps output.

At the cost of keeping the temporary file's name around a bit longer, it is
better to pass the pathname to sh rather than feeding the script on
standard input: this avoids interfering with terminal input and is
potentially faster.

The code currently in Makefile.inc1 can probably be sped up by passing
the output of the make -V command to something like
  xargs sh -c 'for file do rm -i "${DESTDIR}/${file}"; done' sh
instead of the xargs -n1 | while read file; do ...; done loop.

(Note the second "sh" at the end, which serves as a value for $0 so all
strings from xargs become positional parameters.)
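
A small, harmless sketch of the xargs idiom described above. It prints the paths instead of running `rm -i`, and the file names, the helper names list_files and show_batch, and the DESTDIR value are all made up for illustration:

```sh
#!/bin/sh
# Stand-in for the real list of old files; these names are invented.
list_files() {
    printf '%s\n' usr/lib/libold.so.1 usr/share/man/man1/old.1.gz
}

# Run one sh per batch of names rather than one loop iteration per line.
# The trailing "sh" becomes $0, so every name from xargs lands in "$@",
# and "for file do" with no "in" clause iterates over "$@".
show_batch() {
    list_files |
        DESTDIR="$1" xargs sh -c 'for file do echo "${DESTDIR}/${file}"; done' sh
}
```

Without the trailing `sh`, the first file name would be consumed as $0 and silently skipped by the loop.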

-- 
Jilles Tjoelker


Re: ACPI 'driver bug: Unable to set devclass'

2012-05-17 Thread Andriy Gapon
on 17/05/2012 17:05 John Baldwin said the following:
> On Wednesday, May 16, 2012 4:07:43 pm John Baldwin wrote:
>> Oh, whoops.  Actually, the right way to do this I think is 
>> bus_hint_device_unit()
>> (and/or, not make the unit number in cpuX mean anything at all, but use a 
>> separate
>> ivar to track what PCPU_GET(cpuid) a given cpuX device corresponds to).  I 
>> think
>> the last approach is really the right way to fix this.
> 
> I haven't been able to try this yet, but I have a first cut at
> www.freebsd.org/~jhb/patches/acpi_cpu.patch
> 

The patch has not been compile-tested? :)

-- 
Andriy Gapon


Re: Chrome crashing system (amd64-10.0-CURRENT)

2012-05-17 Thread Dimitry Andric
On 2012-05-17 14:20, Chuck Burns wrote:
> On 5/17/2012 2:11 AM, John Hixson wrote:
>> On Thu, May 17, 2012 at 01:15:54AM -0500, Conrad J. Sabatier wrote:
>>> For the last week or so, I've been unable to run chrome.  Any attempt
>>> to start it up will cause the system either to freeze up or reboot.
>>
>> To add to this, I've had the same problem on 10-CURRENT for several months
>> now.
> 
> Are you guys building ports with clang? There's a known bug with 
> google-perftools, when it's built with clang, chrome will crash upon launch.

Please note the OP is talking about a complete system crash and/or
restart, not just chrome itself crashing.


> chrome itself can be built with any compiler, but if google-perftools is 
> built with clang, crash!
> 
> http://code.google.com/p/gperftools/issues/detail?id=394

There seem to be several problems with gperftools; compiled with gcc
4.2.1, there are at least 3 failures in its test suite (of 40 tests).
Compiled with gcc 4.7 it doesn't even compile, since it erroneously
tries to use backtrace_symbols(), which we don't provide.

Compiled with clang 3.1, there are 12 failures. I assume this is because
it is doing all kinds of tricky things with threads and re-implementing
Yet Another Malloc, which seems to be very hard, as recent experience
with head has shown. :)

In any case, a good start would be to attempt to diagnose all the test
failures that occur with clang only, to see if they indicate a problem
in gperftools or clang itself.


Re: Ethernet Drivers: Question on ifp->if_ioctl invocation for SIOCADDMULTI and SIOCDELMULTI

2012-05-17 Thread John Baldwin
On Wednesday, May 16, 2012 2:41:25 pm David Somayajulu wrote:
> Hi All,
> When ifp->if_ioctl() is invoked for the ioctl cmd SIOCADDMULTI,
> IN_MULTI_LOCK() is called in in_joingroup(), one of the functions in the
> caller stack.
>
> From netinet/in_var.h, line 357:
>   #define IN_MULTI_LOCK() mtx_lock(&in_multi_mtx)
>
> From netinet/in_mcast.c:
>   in_joingroup(struct ifnet *ifp, const struct in_addr *gina,
>       /*const*/ struct in_mfilter *imf, struct in_multi **pinm)
>   {
>           int error;
>
>           IN_MULTI_LOCK();
>           error = in_joingroup_locked(ifp, gina, imf, pinm);
>           IN_MULTI_UNLOCK();
>
> This is also the case for SIOCDELMULTI, where the function holding the
> "in_multi_mtx" lock is in_leavegroup().
>
> This poses a problem in the driver in that the hardware-dependent function
> performing it is not allowed to sleep() in case it needs to poll some state.
>
> Question:
>
> 1. If I want to implement any delays - for the above case - in the driver
>    using the DELAY(usec) macro, is there a maximum amount of time that the
>    driver is allowed to take to complete this function? I am concerned that
>    if it takes too long I might run into a soft_lockup() situation.
>
> 2. Is it OK to defer the processing to a separate thread which can sleep()?

You can always queue a task to update the MAC table if you need to use a 
sleep.

-- 
John Baldwin


Re: ACPI 'driver bug: Unable to set devclass'

2012-05-17 Thread John Baldwin
On Wednesday, May 16, 2012 4:07:43 pm John Baldwin wrote:
> On Wednesday, May 16, 2012 12:18:25 pm Andriy Gapon wrote:
> > on 16/05/2012 17:50 John Baldwin said the following:
> > > On Tuesday, May 15, 2012 12:35:12 pm Andriy Gapon wrote:
> > >> Not sure what you disagree with...
> > >> First, the wildcard device is added to the child list during the walk.
> > >> Then, the unit 0 device is added to the list when acpi_timer identify
> > >> is executed.
> > >> Then, the wildcard device is probed and gets unit number of zero.
> > >> Then, the fixed device is being probed and the unit number conflict
> > >> arises.
> > >>
> > >> Am I misunderstanding something?
> > >
> > > Yes.  The third step will see that unit 0 is already in use and shouldn't
> > > reuse unit 0.
> > >
> >
> > Looks like I missed the call to devclass_add_device() in make_device().
> >
> > Your guess:
> > > I wonder if this is related to the recent changes to set the unit number
> > > for CPUs?
> >
> > seems to be true.
> >
> > The device_t-s created for CPUs have NULL driver name / devclass, but a
> > non-wildcard unit number.  So when such a device with unit number 0 is
> > probed by acpi_timer we get a unit number conflict with acpi_timer0
> > pre-created via the identify.
> > Similarly we get conflicts for acpi_sysresource driver, because we do an
> > early probe-and-attach for this driver and the attached devices get some
> > unit numbers (0, 1, etc).  So when during the normal probe pass the "CPU"
> > devices with matching unit numbers are passed to the driver the conflict
> > results.
> >
> > I guess that it is an unorthodox use of newbus to specify a unit number
> > without specifying a driver name...  It's like saying "this device must be
> > unit N whatever driver claims it (be it kbdN or diskN) just because I say
> > so".  Not sure if this ever makes sense and maybe we should prohibit such
> > a combination (reject it earlier).
> > I guess that in this particular case we already know that the devices are
> > really CPU devices and are going to be claimed by acpi cpu driver.  So we
> > should pass "cpu" as the name.
>
> Oh, whoops.  Actually, the right way to do this I think is
> bus_hint_device_unit() (and/or, not make the unit number in cpuX mean
> anything at all, but use a separate ivar to track what PCPU_GET(cpuid) a
> given cpuX device corresponds to).  I think the last approach is really
> the right way to fix this.

I haven't been able to try this yet, but I have a first cut at
www.freebsd.org/~jhb/patches/acpi_cpu.patch

-- 
John Baldwin


Re: new panic in cpu_reset() with WITNESS

2012-05-17 Thread Attilio Rao
2012/5/17, Andriy Gapon :
> on 25/01/2012 23:52 Andriy Gapon said the following:
>> on 24/01/2012 14:32 Gleb Smirnoff said the following:
>>> Yes, now:
>>>
>>> Rebooting...
>>> lock order reversal:
>>>  1st 0x80937140 smp rendezvous (smp rendezvous) @
>>> /usr/src/head/sys/kern/kern_shutdown.c:542
>>>  2nd 0xfe0001f5d838 uart_hwmtx (uart_hwmtx) @
>>> /usr/src/head/sys/dev/uart/uart_cpu.h:92
>>> panic: mtx_lock_spin: recursed on non-recursive mutex cnputs_mtx @
>>> /usr/src/head/sys/kern/kern_cons.c:500
>>
>> OK, so it's just a plain LOR between smp rendezvous and uart_hwmtx, no new
>> details...
>>
>> It's still intriguing to me why the LOR *doesn't* happen [*] with
>> stop_scheduler_on_panic=0.  But I don't see a productive way to pursue
>> this investigation further.
>
> Salve Glebius!
> After your recent nudging I took yet another look at this issue and it
> seems that I have some findings.
>
> For those who might be interested, here is a convenience reference to the
> whole thread on gmane:
> http://thread.gmane.org/gmane.os.freebsd.current/139307
>
> A short summary.
> Prerequisites: an SMP x86 system, its kernel is configured with WITNESS &&
> !WITNESS_SKIPSPIN (this is important) and the system uses a serial console
> via uart.
> Then, if stop_scheduler_on_panic is set to zero the system can be rebooted
> without a problem.  On the other hand, if stop_scheduler_on_panic is
> enabled, then the system first runs into a LOR when calling printf() in
> cpu_reset() and then it runs into a panic when printf is recursively called
> from witness(9) to report the LOR.  The panic happens because of the
> recursion on cnputs_mtx, which should ensure that cnputs() output is not
> intermingled and which is not flagged to support recursion.
>
> There are two things about this report that greatly confused and puzzled
> me:
> 1. The stop_scheduler_on_panic variable is used _only_ in panic(9).  So how
> could it be possible that changing its value affects the behavior of the
> system when panic(9) is not called?!
>
> 2. The LOR in question happens between the "smp rendezvous" (smp_ipi_mtx)
> and "uart_hwmtx" (sc_hwmtx_s in uart core) spin locks.  The order of these
> locks is actually predefined in witness order_lists[] such that uart_hwmtx
> must come before smp_ipi_mtx.  But in the reboot path we first take
> smp_ipi_mtx in shutdown_reset(), then we call cpu_reset, then it calls
> printf and from there we get to uart_hwmtx for serial console output.  So
> the order between these spinlocks is always violated in the x86 SMP reboot
> path.
> How come witness(9) doesn't _always_ detect this LOR?
> How come it didn't detect this LOR before any "stop scheduler" commits?!
>
> [Spoiler alert :-)]
>
> Turns out that the two puzzles above are closely related.
> Let's first list all the things that change when stop_scheduler_on_panic is
> enabled and a panic happens:
> - other CPUs are stopped (forced to spin)
> - interrupts on the current CPU are disabled
> - by virtue of the above the current thread should be the only thread
>   running (unless it executes a voluntary switch)
> - all locks are "busted", they are completely ignored / bypassed
> - by virtue of the above no lock invariants and witness checks are
>   performed, so no lock order checking, no recursion checking, etc
>
> So, what I observe is this: when stop_scheduler_on_panic is disabled, the
> LOR is actually detected as it should be.  witness(9) works properly here.
> Once the LOR is detected witness(9) wants to report it using printf(9).
> That's where we run into the cnputs_mtx recursion panic.  It's all exactly
> as with stop_scheduler_on_panic enabled.  Then panic(9) wants to report the
> panic using printf(9), which goes to cnputs() again, where
> _mtx_lock_spin_flags() detects lock recursion again (this is independent of
> witness(9)) and calls panic(9) again.  Then panic(9) wants to report the
> panic using printf(9)...
> I assume that when the stack is exhausted we run into a double fault and
> dblfault_handler wants to print something again...  Likely we eventually
> run into a triple fault which resets the machine.  Although, I recall some
> old reports of machines hanging during reboot in a place suspiciously close
> to where the described ordeal happens...
> But if the machine doesn't hang we get a full appearance of the reset
> successfully happening (modulo some last messages missing).
>
> With stop_scheduler_on_panic enabled and all the locks being ignored we, of
> course, do not run into any secondary lock recursions and resulting panics.
> So the system is able to at least panic "gracefully" (still we should have
> just reported the LOR gracefully, no panic is required).
>
> Some obvious conclusions:
> - the "smp rendezvous" and "uart_hwmtx" LOR is real and it is the true
>   cause of the problem; it should be fixed one way or another - either by
>   correcting witness order_lists[] or by avoiding the

Re: FreeBSD and LDAP users, bug or feature?

2012-05-17 Thread Mark Felder

Check man adduser.conf(5)

There is an option for "uidstart" which should do what you want. If you set
it to 1000, every time you run "adduser" it will show:


#  adduser
Username: foo
Full name: bar
Uid [1000]:

Don't worry -- it's just showing you the start of the range. If a UID of
1000 is already in use, it will choose the next available one. It's pretty
nice because it even fills in holes if you remove users and then add new
ones. However, that might be undesirable if you happen to leave files
around from previous users and the new user ends up with the UID that owns
those files.
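
For reference, a minimal sketch of the setting itself. The file is read as
shell variable assignments, and the value 1001 is only an assumption chosen
to match local UIDs that start at 1001.

```shell
# /etc/adduser.conf (sketch; 1001 is an assumed starting value)
uidstart=1001
```

Whether this alone keeps adduser away from UIDs that come from LDAP is not
something this fragment demonstrates; it only sets the start of the range.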



hth


Re: Chrome crashing system (amd64-10.0-CURRENT)

2012-05-17 Thread Chuck Burns

On 5/17/2012 2:11 AM, John Hixson wrote:
> On Thu, May 17, 2012 at 01:15:54AM -0500, Conrad J. Sabatier wrote:
>> For the last week or so, I've been unable to run chrome.  Any attempt
>> to start it up will cause the system either to freeze up or reboot.
>
> To add to this, I've had the same problem on 10-CURRENT for several months
> now.

Are you guys building ports with clang? There's a known bug with
google-perftools: when it's built with clang, chrome will crash upon launch.

chrome itself can be built with any compiler, but if google-perftools is
built with clang, crash!

http://code.google.com/p/gperftools/issues/detail?id=394

--

Chuck Burns


Re: "make delete-old" performance.

2012-05-17 Thread Dimitry Andric
On 2012-05-17 05:18, b. f. wrote:...
> The slowdown is probably due - at least in part - to two factors:
> 
> - the list of files to be checked for removal has grown substantially,
> because missing entries for old knobs and new entries for new knobs
> have been added; and
> 
> - a new (and slower) method of checking was added in:
> http://svnweb.FreeBSD.org/base?view=revision&revision=220255
> because the old method broke down with the size of the new lists of files.

Hm, maybe it would have been better to fix make, so it can accept
arbitrarily long lists, without segfaulting?  There's such a thing as
malloc(), and I simply don't believe any of those lists could be larger
than a few hundred kilobytes.


Re: new panic in cpu_reset() with WITNESS

2012-05-17 Thread Andriy Gapon
on 25/01/2012 23:52 Andriy Gapon said the following:
> on 24/01/2012 14:32 Gleb Smirnoff said the following:
>> Yes, now:
>>
>> Rebooting...
>> lock order reversal:
>>  1st 0x80937140 smp rendezvous (smp rendezvous) @ 
>> /usr/src/head/sys/kern/kern_shutdown.c:542
>>  2nd 0xfe0001f5d838 uart_hwmtx (uart_hwmtx) @ 
>> /usr/src/head/sys/dev/uart/uart_cpu.h:92
>> panic: mtx_lock_spin: recursed on non-recursive mutex cnputs_mtx @ 
>> /usr/src/head/sys/kern/kern_cons.c:500
> 
> OK, so it's just a plain LOR between smp rendezvous and uart_hwmtx, no new
> details...
> 
> It's still intriguing to me why the LOR *doesn't* happen [*] with
> stop_scheduler_on_panic=0.  But I don't see a productive way to pursue this
> investigation further.

Salve Glebius!
After your recent nudging I took yet another look at this issue and it
seems that I have some findings.

For those who might be interested, here is a convenience reference to the
whole thread on gmane:
http://thread.gmane.org/gmane.os.freebsd.current/139307

A short summary.
Prerequisites: an SMP x86 system, its kernel is configured with WITNESS &&
!WITNESS_SKIPSPIN (this is important) and the system uses a serial console
via uart.
Then, if stop_scheduler_on_panic is set to zero the system can be rebooted
without a problem.  On the other hand, if stop_scheduler_on_panic is
enabled, then the system first runs into a LOR when calling printf() in
cpu_reset() and then it runs into a panic when printf is recursively called
from witness(9) to report the LOR.  The panic happens because of the
recursion on cnputs_mtx, which should ensure that cnputs() output is not
intermingled and which is not flagged to support recursion.

There are two things about this report that greatly confused and puzzled
me:
1. The stop_scheduler_on_panic variable is used _only_ in panic(9).  So how
could it be possible that changing its value affects the behavior of the
system when panic(9) is not called?!

2. The LOR in question happens between the "smp rendezvous" (smp_ipi_mtx)
and "uart_hwmtx" (sc_hwmtx_s in uart core) spin locks.  The order of these
locks is actually predefined in witness order_lists[] such that uart_hwmtx
must come before smp_ipi_mtx.  But in the reboot path we first take
smp_ipi_mtx in shutdown_reset(), then we call cpu_reset, then it calls
printf and from there we get to uart_hwmtx for serial console output.  So
the order between these spinlocks is always violated in the x86 SMP reboot
path.
How come witness(9) doesn't _always_ detect this LOR?
How come it didn't detect this LOR before any "stop scheduler" commits?!

[Spoiler alert :-)]

Turns out that the two puzzles above are closely related.
Let's first list all the things that change when stop_scheduler_on_panic is
enabled and a panic happens:
- other CPUs are stopped (forced to spin)
- interrupts on the current CPU are disabled
- by virtue of the above the current thread should be the only thread
  running (unless it executes a voluntary switch)
- all locks are "busted", they are completely ignored / bypassed
- by virtue of the above no lock invariants and witness checks are
  performed, so no lock order checking, no recursion checking, etc

So, what I observe is this: when stop_scheduler_on_panic is disabled, the
LOR is actually detected as it should be.  witness(9) works properly here.
Once the LOR is detected witness(9) wants to report it using printf(9).
That's where we run into the cnputs_mtx recursion panic.  It's all exactly
as with stop_scheduler_on_panic enabled.  Then panic(9) wants to report the
panic using printf(9), which goes to cnputs() again, where
_mtx_lock_spin_flags() detects lock recursion again (this is independent of
witness(9)) and calls panic(9) again.  Then panic(9) wants to report the
panic using printf(9)...
I assume that when the stack is exhausted we run into a double fault and
dblfault_handler wants to print something again...  Likely we eventually
run into a triple fault which resets the machine.  Although, I recall some
old reports of machines hanging during reboot in a place suspiciously close
to where the described ordeal happens...
But if the machine doesn't hang we get a full appearance of the reset
successfully happening (modulo some last messages missing).

With stop_scheduler_on_panic enabled and all the locks being ignored we, of
course, do not run into any secondary lock recursions and resulting panics.
So the system is able to at least panic "gracefully" (still we should have
just reported the LOR gracefully, no panic is required).

Some obvious conclusions:
- the "smp rendezvous" and "uart_hwmtx" LOR is real and it is the true
  cause of the problem; it should be fixed one way or another - either by
  correcting witness order_lists[] or by avoiding the LOR in
  shutdown_reset/cpu_reset;
- because witness(9) uses printf(9) to report problems, it is very fragile
  to use witness with any locks that can be acquired under printf(9)
- stop_scheduler_on_panic just uncovers the true bug
- stop_scheduler_on_panic just uncovers the true bug

The

FreeBSD and LDAP users, bug or feature?

2012-05-17 Thread Joel Dahl
Hi,

I have a machine running FreeBSD and openldap24-server, and several client
machines running FreeBSD and openldap24-client and I'm experiencing a weird
behaviour with adduser/pw. I create my LDAP users on the LDAP server, with
UIDs starting at 5001. Local users on the server and clients should start
at UID 1001, but this does not really work. If I use adduser to create a new
local user on one of the client machines, it'll automatically be assigned
with UID 5002 - which I find very confusing. This also breaks my LDAP setup,
because when I add an LDAP user on the server, it'll also get UID 5002.

Running pw usernext on one of the client machines confirms this behaviour:

root@crashbox [~] pw usernext
5002:5002

But looking inside my /etc/passwd on the same machine reveals that the next
free UID should be 1002.

So pw is obviously getting information from LDAP and tries to be friendly
and automatically gives me the next free UID from LDAP - which would make
sense if pw could create LDAP users in addition to local users, but it can't.

So right now I'm forced to check /etc/passwd on my machines each time I
add a new local user and manually use that UID whenever I run adduser or pw.
It works, but it's easy to shoot myself in the foot.
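
The manual check described above could be scripted roughly as follows. This
is only a sketch: next_local_uid is a hypothetical helper, not part of pw or
adduser, and the demo runs on a scratch file rather than the real
/etc/passwd.

```shell
#!/bin/sh
# Sketch: lowest unused UID >= a starting value, looking only at a
# passwd(5)-format file and never at LDAP via nsswitch.

next_local_uid() {
    # $1 = passwd-format file, $2 = first UID to consider
    awk -F: -v start="$2" '
        $3 ~ /^[0-9]+$/ { used[$3 + 0] = 1 }    # skips comments and bad lines
        END {
            uid = start + 0
            while (uid in used)
                uid++
            print uid
        }
    ' "$1"
}

# Demo on a scratch file standing in for /etc/passwd.
demo=$(mktemp) || exit 1
cat > "$demo" <<'EOF'
# $FreeBSD$
root:*:0:0:Charlie &:/root:/bin/csh
joel:*:1001:1001:Joel:/home/joel:/bin/sh
EOF
next=$(next_local_uid "$demo" 1001)
rm -f "$demo"
echo "next free local UID: $next"
```

On a real machine the result could then be handed to pw by hand, for
example with pw useradd and its -u option, which at least removes the
copy-and-paste step.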

Is this intended behaviour, or a bug? Or perhaps a misconfiguration on my
part?

I can provide configuration examples from my environment, but there
really isn't much to see - I haven't made many changes besides installing
the required applications from ports (openldap,nss_ldap,pam_ldap), changed
my nsswitch.conf and a couple of files in /etc/pam.d/.

-- 
Joel


Re: buildkernel fails

2012-05-17 Thread Joel Dahl
On 16-05-2012 23:41, Dimitry Andric wrote:
> On 2012-05-16 23:18, Joel Dahl wrote:
> > Hi,
> > I did a buildworld+buildkernel on my workstation today and buildkernel
> > fails with:
> > 
> > cc -c -O2 -frename-registers -pipe -fno-strict-aliasing  -std=c99 -g -Wall
> > -Wredundant-decls -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes
> > -Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign
> > -fformat-extensions  -Wmissing-include-dirs -fdiagnostics-show-option
> > -nostdinc  -I. -I/usr/src/sys -I/usr/src/sys/contrib/altq -D_KERNEL
> > -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common
> > -finline-limit=8000 --param inline-unit-growth=100 --param
> > large-function-growth=1000  -fno-omit-frame-pointer -mcmodel=kernel
> > -mno-red-zone -mno-mmx -mno-sse -msoft-float
> > -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector -Werror
> > /usr/src/sys/dev/et/if_et.c
> > Bus error (core dumped)
> > *** [isci.ko.debug] Error code 138
> > 1 error
> > *** [all] Error code 2
> > 1 error
> > *** [modules-all] Error code 2
> > ctfconvert -L VERSION -g if_et.o
> > 1 error
> > *** [buildkernel] Error code 2
> > 1 error
> > *** [buildkernel] Error code 2
> > 1 error
> > 
> > My src tree is at the latest rev. No /usr/obj. I'm currently running
> > CURRENT from May 5th.
> 
> I think you may be hitting the libthr issue that was introduced in
> r234947 (Thu May 3 09:17:31 2012 UTC) and fixed in r235068 (Sat May 5
> 23:51:24 2012 UTC).  This caused some programs to randomly bomb out with
> bus errors or other weirdness.
> 
> Please try building and installing lib/libthr (from your updated source
> tree) before running the rest of the world/kernel build.

That fixed it. Thanks!

-- 
Joel


Re: Chrome crashing system (amd64-10.0-CURRENT)

2012-05-17 Thread John Hixson
On Thu, May 17, 2012 at 01:15:54AM -0500, Conrad J. Sabatier wrote:
> For the last week or so, I've been unable to run chrome.  Any attempt
> to start it up will cause the system either to freeze up or reboot.
> 
> To make matters worse, no trace of what's happening is anywhere to be
> found.  Nothing in any log files.  The system doesn't drop into the
> kernel debugger, either.  It's either a hard freeze or sudden reboot.
> 
> I've tried rebuilding the chromium port, with both clang and gcc 4.6,
> to no avail.  I've also updated the system sources several times this
> week and remade world/kernel.  Nothing seems to help.
> 
> I'm totally stumped as to how to determine what's going on here.  Any
> suggestions as to how to obtain some useful info?
> 

To add to this, I've had the same problem on 10-CURRENT for several months
now.

-John