Re: kernel constructor

2014-11-10 Thread Matt Thomas

> On Nov 8, 2014, at 11:16 PM, Masao Uebayashi  wrote:
> 
> Ideally the long hardcoded sequence of init functions in init_main:main() is
> converted to a single vector whose order is resolved by modular dependency.
> But for the moment such a hardcoded priority should be good enough to improve
> modularity.
> 
> Question - where to put the declarations (typedef, __link_set_decl())?

No more link sets please.
Can’t we use __attribute__((__constructor__))
and __attribute__((__destructor__))?


[PATCH] UFS1 extended attributes support for GENERIC

2014-11-10 Thread Emmanuel Dreyfus
Hi

I plan to commit the patch below to enable support for UFS1 extended
attributes in GENERIC and GENERIC-like kernels.

This change only adds UFS1 extended attribute *support* to the kernel;
extended attributes are not enabled unless three conditions are met:
1) filesystem is UFS1 (newfs -O1)
2) .attribute/system and .attribute/user directories are created at fs root
3) filesystem is mounted with -o extattr

Although extended attributes are an MI feature, it is not obvious
which kernels should be built with them, since some GENERIC kernels
are memory-constrained. My rule of thumb was to enable the
extended attribute kernel options wherever QUOTA/QUOTA2 were
enabled, and otherwise to add them commented out. Machine-dependent
advice is welcome here.

Index: sys/arch/acorn26/conf/GENERIC
===
RCS file: /cvsroot/src/sys/arch/acorn26/conf/GENERIC,v
retrieving revision 1.79
diff -U 4 -r1.79 GENERIC
--- sys/arch/acorn26/conf/GENERIC   23 Aug 2014 20:26:56 -  1.79
+++ sys/arch/acorn26/conf/GENERIC   10 Nov 2014 05:00:32 -
@@ -71,8 +71,11 @@
 options    FFS_EI  # FFS endianness-independence support
 options    WAPBL   # File system journaling support
 #options   UFS_DIRHASH # UFS Large Directory Hashing - Experimental
 #options   FFS_NO_SNAPSHOT # No FFS snapshot support
+options    UFS_EXTATTR # Extended attribute support for UFS1
+options    UFS_EXTATTR_AUTOSTART
+options    UFS_EXTATTR_AUTOCREATE=1024
 
 # Executable format options
 options    EXEC_ELF32
 options    EXEC_SCRIPT
Index: sys/arch/acorn32/conf/GENERIC
===
RCS file: /cvsroot/src/sys/arch/acorn32/conf/GENERIC,v
retrieving revision 1.114
diff -U 4 -r1.114 GENERIC
--- sys/arch/acorn32/conf/GENERIC   23 Aug 2014 20:26:56 -  1.114
+++ sys/arch/acorn32/conf/GENERIC   10 Nov 2014 05:00:33 -
@@ -72,8 +72,11 @@
 options    WAPBL   # File system journaling support
 #options   UFS_DIRHASH # UFS Large Directory Hashing - Experimental
 options    NFSSERVER
 #options   FFS_NO_SNAPSHOT # No FFS snapshot support
+options    UFS_EXTATTR # Extended attribute support for UFS1
+options    UFS_EXTATTR_AUTOSTART
+options    UFS_EXTATTR_AUTOCREATE=1024
 
 # Networking options
 
 options    GATEWAY # packet forwarding
Index: sys/arch/alpha/conf/GENERIC
===
RCS file: /cvsroot/src/sys/arch/alpha/conf/GENERIC,v
retrieving revision 1.360
diff -U 4 -r1.360 GENERIC
--- sys/arch/alpha/conf/GENERIC 23 Aug 2014 20:26:56 -  1.360
+++ sys/arch/alpha/conf/GENERIC 10 Nov 2014 05:00:34 -
@@ -97,8 +97,11 @@
 options    WAPBL   # File system journaling support
 #options   UFS_DIRHASH # UFS Large Directory Hashing - Experimental
 options    NFSSERVER   # Sun NFS-compatible file system server
 #options   FFS_NO_SNAPSHOT # No FFS snapshot support
+options    UFS_EXTATTR # Extended attribute support for UFS1
+options    UFS_EXTATTR_AUTOSTART
+options    UFS_EXTATTR_AUTOCREATE=1024
 
 # Networking options
 #options   GATEWAY # packet forwarding
 options    INET    # IP + ICMP + TCP + UDP
Index: sys/arch/amd64/conf/GENERIC
===
RCS file: /cvsroot/src/sys/arch/amd64/conf/GENERIC,v
retrieving revision 1.402
diff -U 4 -r1.402 GENERIC
--- sys/arch/amd64/conf/GENERIC 2 Nov 2014 23:08:40 -   1.402
+++ sys/arch/amd64/conf/GENERIC 10 Nov 2014 05:00:35 -
@@ -172,8 +172,11 @@
 options    NFSSERVER   # Network File System server
 #options   EXT2FS_SYSTEM_FLAGS # makes ext2fs file flags (append and
                                # immutable) behave as system flags.
 #options   FFS_NO_SNAPSHOT # No FFS snapshot support
+options    UFS_EXTATTR # Extended attribute support for UFS1
+options    UFS_EXTATTR_AUTOSTART
+options    UFS_EXTATTR_AUTOCREATE=1024
 
 # Networking options
 #options   GATEWAY # packet forwarding
 options    INET    # IP + ICMP + TCP + UDP
Index: sys/arch/amd64/conf/XEN3_DOM0
===
RCS file: /cvsroot/src/sys/arch/amd64/conf/XEN3_DOM0,v
retrieving revision 1.110
diff -U 4 -r1.110 XEN3_DOM0
--- sys/arch/amd64/conf/XEN3_DOM0   18 Oct 2014 16:56:51 -  1.110
+++ sys/arch/amd64/conf/XEN3_DOM0   10 Nov 2014 05:00:36 -
@@ -114,8 +114,11 @@
 options    WAPBL   # File system journaling support
 #options   UFS_DIRHASH # UFS Large Directory Hashing - Experimental
 options    NFSSERVER   # Network File System server
 #options   FFS_NO_SNAPSHOT # No FFS snapshot support
+options    UFS_EXTATTR # Extended attribut

Re: kernel constructor

2014-11-10 Thread Martin Husemann
On Sun, Nov 09, 2014 at 05:46:21PM -0800, Matt Thomas wrote:
> No more link sets please.

I agree.

> Can't we use __attribute__((__constructor__))
> and __attribute__((__destructor__))?

How about splitting the $subsystem_init() function from the function
marked as __constructor__: let the constructor function register the
$subsystem_init() function as a callback, passing either a simple
integral priority or (even better, but not sure if this is
overengineering) some representation of the dependencies - then
(topologically) sort all registered callbacks and call them one after
the other?

Martin
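
A minimal userland sketch of the register-then-sort idea above, assuming the
simple integral priority variant rather than a dependency graph. Every name
here (init_register, init_run_all, the example subsystems) is hypothetical and
not an existing NetBSD interface; the constructors only record a callback, and
a single later call sorts and runs them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical registry; not an existing kernel interface. */
#define INIT_MAX 32

struct init_entry {
    int prio;               /* lower value runs first */
    void (*fn)(void);       /* the $subsystem_init() callback */
};

static struct init_entry init_table[INIT_MAX];
static size_t init_count;

/* Called from constructors: records the callback, runs nothing yet. */
static void
init_register(int prio, void (*fn)(void))
{
    assert(init_count < INIT_MAX);
    init_table[init_count].prio = prio;
    init_table[init_count].fn = fn;
    init_count++;
}

static int
init_cmp(const void *a, const void *b)
{
    const struct init_entry *x = a;
    const struct init_entry *y = b;

    return (x->prio > y->prio) - (x->prio < y->prio);
}

/* Sort the registered callbacks by priority and call each one once. */
static size_t
init_run_all(void)
{
    qsort(init_table, init_count, sizeof(init_table[0]), init_cmp);
    for (size_t i = 0; i < init_count; i++)
        init_table[i].fn();
    return init_count;
}

/* Two made-up subsystems that log the order in which they were run. */
static char init_log[8];
static size_t init_log_len;

static void
log_step(char c)
{
    if (init_log_len < sizeof(init_log) - 1)
        init_log[init_log_len++] = c;
}

static void vfs_init(void) { log_step('v'); }
static void net_init(void) { log_step('n'); }

/* Registration order is deliberately the reverse of the run order. */
__attribute__((constructor)) static void
net_ctor(void)
{
    init_register(20, net_init);
}

__attribute__((constructor)) static void
vfs_ctor(void)
{
    init_register(10, vfs_init);
}
```

Calling init_run_all() once from the startup path would run vfs_init before
net_init regardless of link order. A dependency-based variant would replace
the qsort() call with a topological sort over declared dependencies.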


Re: kernel constructor

2014-11-10 Thread Masao Uebayashi
On Mon, Nov 10, 2014 at 5:25 PM, Martin Husemann  wrote:
> On Sun, Nov 09, 2014 at 05:46:21PM -0800, Matt Thomas wrote:
>> No more link sets please.
>
> I agree.

I agreed one week ago.  But now that I have an MI linker script that merges
"link_set_*" into .rodata, I can live with link sets. :)

>> Can't we use __attribute__((__constructor__))
>> and __attribute__((__destructor__))?
>
> How about splitting the $subsystem_init() function from the function
> marked as __constructor__: let the constructor function register the
> $subsystem_init() function as a callback passing either a simple
> integral priority or (even better, but not sure if this is
> overengineering) some representation of the dependencies - then
> (topological) sort all registered callbacks and call them one after the
> other.

__attribute__((constructor(n))), where n is the priority, can do the
ordering (hint from pooka@).

The question is how to provide symbols equivalent to __CTOR_LIST__ and
__CTOR_LIST_END__.

(It is super easy if the MI linker script is there. :)
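
As a rough userland illustration of the priority argument, assuming GCC or
Clang: a constructor with a smaller priority number runs before one with a
larger number, and priorities up to 100 are reserved for the implementation.
The function and variable names are made up for the example. In a hosted
program the C runtime walks the constructor list before main(); that walk is
the part a kernel would have to supply itself.

```c
#include <assert.h>
#include <stddef.h>

/* Records the order in which the constructors below were invoked. */
static int ctor_order[4];
static size_t ctor_count;

static void
record(int id)
{
    assert(ctor_count < sizeof(ctor_order) / sizeof(ctor_order[0]));
    ctor_order[ctor_count++] = id;
}

/*
 * Declared in reverse on purpose: the run order follows the priority
 * argument, not the order of appearance in the source file.
 */
__attribute__((constructor(103))) static void
third_ctor(void)
{
    record(3);
}

__attribute__((constructor(102))) static void
second_ctor(void)
{
    record(2);
}

__attribute__((constructor(101))) static void
first_ctor(void)
{
    record(1);
}
```

By the time main() starts, ctor_order holds the ids in priority order.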


Re: kernel constructor

2014-11-10 Thread Justin Cormack
On Nov 10, 2014 10:02 AM, "Masao Uebayashi"  wrote:
>
> __attribute__((constructor(n))), where n being priority, can do
> ordering (hint from pooka@).
>
> Question is, how to provide __CTOR_LIST__, __CTOR_LIST_END__ equivalent
> symbols.
>
> (It is super easy if MI linker script is there. :)

Constructors have priorities but it is a single global ordering which is a
bit ugly to use. But it is a nice idea to use this mechanism.

Justin


Re: kernel constructor

2014-11-10 Thread Masao Uebayashi
On Mon, Nov 10, 2014 at 7:46 PM, Justin Cormack wrote:
>
> On Nov 10, 2014 10:02 AM, "Masao Uebayashi"  wrote:
>>
>> __attribute__((constructor(n))), where n being priority, can do
>> ordering (hint from pooka@).
>>
>> Question is, how to provide __CTOR_LIST__, __CTOR_LIST_END__ equivalent
>> symbols.
>>
>> (It is super easy if MI linker script is there. :)
>
> Constructors have priorities but it is a single global ordering which is a
> bit ugly to use. But it is a nice idea to use this mechanism.

Hard to be uglier than how init_main.c looks now...


Re: Proposal: kmem_valloc [was: Re: raspberry pi panic 7.0_BETA after install fs resize]

2014-11-10 Thread Steffen Nurpmeso
Maxime Villard  wrote:
 |  - kmem_intr_zalloc
 |After all, kmem_intr_zalloc is not needed since the caller can simply do
 | ptr = kmem_intr_alloc(...);
 | memset(ptr, 0, ...);

Well, I never looked into the source, but in my own code it
was a major benefit to pass a "zero" flag down to the bottom, since
that allowed zeroing the real range, which always started on an
aligned boundary and had an aligned length (a multiple of eight or
sixteen, or a multiple of the page size for larger allocations).
The zeroing therefore took the fastest memset path, with no need for
pointer alignment and no need for a do-the-remaining-bytes step,
which resulted in measurable boosts (and that on x86, but in
artificial use cases, of course).

--steffen
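
A small sketch of that idea, assuming a 16-byte granule and using C11
aligned_alloc() as a stand-in for the real backing allocator. sketch_alloc and
sketch_roundup are hypothetical names, not kmem(9) functions. Because the
allocator knows the rounded-up size and the alignment, it can zero whole words
with no head or tail handling. Overflow of very large sizes is not handled
here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_ALIGN 16u    /* assumed allocation granule */

/* Round a request up to the next multiple of the granule. */
static size_t
sketch_roundup(size_t size)
{
    return (size + SKETCH_ALIGN - 1) & ~(size_t)(SKETCH_ALIGN - 1);
}

/*
 * Allocate at least `size` bytes. When `zero` is set, clear the whole
 * rounded-up range using aligned 64-bit stores only.
 */
static void *
sketch_alloc(size_t size, bool zero)
{
    size_t real = sketch_roundup(size != 0 ? size : 1);
    void *p = aligned_alloc(SKETCH_ALIGN, real);

    if (p == NULL)
        return NULL;
    assert((uintptr_t)p % SKETCH_ALIGN == 0);

    if (zero) {
        uint64_t *w = p;

        /* real is a multiple of 16, so no trailing bytes remain. */
        for (size_t i = 0; i < real / sizeof(*w); i++)
            w[i] = 0;
    }
    return p;
}
```

A caller doing alloc plus memset(ptr, 0, size) only knows the requested size
and cannot assume either property, which is the difference being described.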


Re: .eh_frame

2014-11-10 Thread Andrew Cagney
On 9 November 2014 17:12, Joerg Sonnenberger  wrote:
>> >> o .eh_frame in kernel is not used yet, and safely removed from /netbsd
>> >
>> > Please do not.
>>
>> o Is it correct that .eh_frame is not used by anything at all at the moment?
>
> gdb should in principle, haven't tried. libunwind is not hooked into ddb
> (yet).

Can you be more specific?

A remote debugger will call on either .debug_frame or .eh_frame when
generating a back-trace - what it uses depends on what it chooses to
look for first at each address. In fact, ignoring the potential for
bugs, you could:
- strip .eh_frame
- strip all debug info except .debug_frame
and still have good back-traces without weighing down the kernel's
text segment with .eh_frame info.

Andrew


Re: .eh_frame

2014-11-10 Thread Joerg Sonnenberger
On Mon, Nov 10, 2014 at 06:06:54PM +, Andrew Cagney wrote:
> On 9 November 2014 17:12, Joerg Sonnenberger  wrote:
> >> >> o .eh_frame in kernel is not used yet, and safely removed from /netbsd
> >> >
> >> > Please do not.
> >>
> >> o Is it correct that .eh_frame is not used by anything at all at the 
> >> moment?
> >
> > gdb should in principle, haven't tried. libunwind is not hooked into ddb
> > (yet).
> 
> Can you be more specific?
> 
> A remote debugger will call on either .debug_frame or .eh_frame when
> generating a back-trace - what it uses depends on what it chooses to
> look for first at each address. In fact, ignoring the potential for
> bugs, you could:
> - strip .eh_frame
> - strip all debug info except .debug_frame
> and still have good back-traces without weighing down the kernel's
> text segment with .eh_frame info.

Consider x86_64 where you can't do reliable stack unwinding without also
disabling -fomit-frame-pointer. The question is not about .debug_frame
vs .eh_frame, you don't get the former at all without explicitly asking
for debug data.

Joerg


Re: kernel constructor

2014-11-10 Thread Justin Cormack
On Mon, Nov 10, 2014 at 11:27 AM, Masao Uebayashi  wrote:
> Hard to be uglier than how init_main.c looks now...

Can't disagree there.

Justin


Re: Proposal: kmem_valloc [was: Re: raspberry pi panic 7.0_BETA after install fs resize]

2014-11-10 Thread Thor Lancelot Simon
On Sun, Nov 09, 2014 at 09:06:40AM +0100, Maxime Villard wrote:
> On 08/11/2014 23:28, Jean-Yves Migeon wrote:
> 
> Yes, but still, what's the point of giving a size then? If the kernel
> knows the size, we can remove kmem_free's size argument, otherwise it
> is inconsistent.
> 
> And then, as I said, we lose the memory optimisation: one more kmem page
> is allocated to hold the size.

I'm not in favor of that.

Also, my experience porting a great deal of code from a very old 4.4BSD kernel
to NetBSD several years ago was that in converting the code from malloc to
kmem_alloc, I found a _lot_ of bugs -- use after frees, double frees, etc. --
because you had to keep much more careful track of what exactly you were
freeing, and when, in order to know its size.  I don't think we should give
that up (though I think the pool allocators are even better in this regard
and that perhaps it is use of any variable-size allocator that should be
discouraged).

Thor
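
To make the kind of mistake concrete, here is a debug-style sketch with
hypothetical names (dbg_alloc, dbg_free). It is not how kmem(9) works: it
deliberately stores the size in a header, which is exactly the overhead the
thread argues against, purely so that a caller passing the wrong size can be
detected and shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of each allocation; the union keeps alignment. */
union dbg_hdr {
    size_t size;
    max_align_t align;
};

/* Allocate `size` bytes and remember the size for later checking. */
static void *
dbg_alloc(size_t size)
{
    union dbg_hdr *h = malloc(sizeof(*h) + size);

    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;
}

/*
 * Free only if the caller's idea of the size matches what was allocated.
 * Returns false, leaving the memory allocated, on a mismatch.
 */
static bool
dbg_free(void *p, size_t size)
{
    union dbg_hdr *h;

    assert(p != NULL);
    h = (union dbg_hdr *)p - 1;
    if (h->size != size)
        return false;
    free(h);
    return true;
}
```

A caller that has lost track of which object it is freeing tends to pass a
size that no longer matches, and a check like this would flag it.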


Re: kernel constructor

2014-11-10 Thread Thor Lancelot Simon
On Sun, Nov 09, 2014 at 04:16:13PM +0900, Masao Uebayashi wrote:
> Ideally the long hardcoded sequence of init functions in init_main:main() is
> converted to a single vector whose order is resolved by modular dependency.
> But for the moment such a hardcoded priority should be good enough to improve
> modularity.

I'm in favor of *any* way we do this so long as we get rid of the second copy
of this code in rump.

In fact, I'm in favor of *any config modification whatsoever* if we can get
rid of the secret special version-7-unix kernel configuration "framework" of
rump.

Thor


Re: .eh_frame

2014-11-10 Thread Andrew Cagney
On 10 November 2014 18:21, Joerg Sonnenberger  wrote:
> Consider x86_64 where you can't do reliable stack unwinding without also
> disabling -fomit-frame-pointer. The question is not about .debug_frame
> vs .eh_frame, you don't get the former at all without explicitly asking
> for debug data.

You've lost me.  I agree that without the help of CFI (call frame
information), amd64's stack is impenetrable goop.  That just means
that we need to ensure that CFI is on hand should someone need it -
in the .debug_frame section instead of .eh_frame.

(small technical nit, currently to get just the .debug_frame section,
gcc seems to require: -fno-unwind-tables
-fno-asynchronous-unwind-tables -g; which seems a tad excessive given
that gas's .cfi_sections pretty much does all we need)

As an aside, even with no CFI, it's still possible to get a feel for
what went down - just scan the stack for what look like code
addresses.
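
A sketch of that fallback, assuming the caller already has a copy of the
stack words and knows the bounds of the text segment. The function name and
parameters are hypothetical. The result is only a list of candidates: stale
return addresses and plain data that happens to fall in the range will show
up as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Scan `nwords` stack words and copy every value that falls inside
 * [text_start, text_end) into `out`, up to `max_out` entries.
 * Returns the number of candidate code addresses found.
 */
static size_t
scan_stack_for_text(const uintptr_t *stack, size_t nwords,
    uintptr_t text_start, uintptr_t text_end,
    uintptr_t *out, size_t max_out)
{
    size_t found = 0;

    assert(text_start <= text_end);
    for (size_t i = 0; i < nwords && found < max_out; i++) {
        if (stack[i] >= text_start && stack[i] < text_end)
            out[found++] = stack[i];
    }
    return found;
}
```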


Re: .eh_frame

2014-11-10 Thread Andrew Cagney
On 10 November 2014 06:53, Masao Uebayashi  wrote:
>
>> So, does your kernel contain C++ code using exceptions, hand written
>> eh-frame code, or an in kernel consumer? :-)
>
> According to joerg@, there is nothing (yet).

BTW, looks like arm (sys/arch/arm/include/profile.h) may contain
hand-written eh-frame code.  That shouldn't be a problem as, from gas:

7.11 `.cfi_sections SECTION_LIST'
=================================

`.cfi_sections' may be used to specify whether CFI directives should
emit `.eh_frame' section and/or `.debug_frame' section.  If
SECTION_LIST is `.eh_frame', `.eh_frame' is emitted, if SECTION_LIST is
`.debug_frame', `.debug_frame' is emitted.  To emit both use
`.eh_frame, .debug_frame'.  The default if this directive is not used
is `.cfi_sections .eh_frame'.

it is easy to change.

It's just a shame that more ports aren't exploiting this feature.
Makes unwinding less debugger-tool centric.


Re: kernel constructor

2014-11-10 Thread Masao Uebayashi
On Tue, Nov 11, 2014 at 11:48 AM, Thor Lancelot Simon  wrote:
> On Sun, Nov 09, 2014 at 04:16:13PM +0900, Masao Uebayashi wrote:
>> Ideally the long hardcoded sequence of init functions in init_main:main() is
>> converted to a single vector whose order is resolved by modular dependency.
>> But for the moment such a hardcoded priority should be good enough to improve
>> modularity.
>
> I'm in favor of *any* way we do this so long as we get rid of the second copy
> of this code in rump.
>
> In fact, I'm in favor of *any config modification whatsoever* if we can get
> rid of the secret special version-7-unix kernel configuration "framework" of
> rump.

You say that as if rump did something wrong. :)  I think rump only
exposed existing problems; they are not rump's fault.

I guess .ctors should not be defined for rump.  init_main.c is not
shared by rump, which is good (for me).

Speaking of config(1), rump proved that partial (definition-only) use
of config works.