Re: zero-length symlinks

2013-11-05 Thread Christos Zoulas
In article <20131105220754.gb...@snowdrop.l8s.co.uk>,
David Laight wrote:
>On Sun, Nov 03, 2013 at 04:35:19PM -0800, John Nemeth wrote:
>> 
>>  It has to do with the fact that historically mkdir(2) was
>> actually mkdir(3), it wasn't an atomic syscall and was a sequence
>> of operations performed by a library routine...
>
>Actually I think you'll find that mkdir was always a system call.
>It was directory rename that was done with a series of link and
>unlink system calls.

Nope, on 4.1BSD and I believe SVR1 (please correct me),
it was a setuid binary that did:

mknod("foo", 04, 0);
chown("foo", getuid());
link("foo", "foo/.");
link(".", "foo/..");

>Also, if you look at any current fs code the processing of "." and
>".." is special - they will be treated as requests for the current
>and parent directories regardless of the inodes they reference.
>Doing otherwise is a complete locking nightmare!

I think that this also came much later, I believe with 4.4BSD.

christos



Re: Changing __USING_TOPDOWN_VM to a runtime decision

2013-11-05 Thread Christos Zoulas
In article <20131105144023.gc17...@mail.duskware.de>,
Martin Husemann wrote:
>Hey folks,
>
>I would like to change the current (mostly) compile time decision
>whether we will use top-down VA layout for userland processes to a
>runtime check.
>
>This allows emulations to disable it, and also allows MD code to recognize
>binaries not suitable for topdown VM layout and give those binaries the
>old layout.
>
>The latter point is what I actually need: on sparc64 we have compiled most
>code in the "medlow" code model, which does not allow big addresses. I am
>about to commit changes that switch this default and properly mark new
>binaries. To still allow running old binaries, I need something like the
>attached patch.
>
>The patch is mostly straightforward: I define a new flag EXEC_TOPDOWN_VM,
>initialized by default according to __USING_TOPDOWN_VM, but overridable
>by a MD function. This way the exec_package carries over the information,
>whether we will use topdown-vm for the to-be-loaded binary.
>
>Most other changes are mechanical, like passing this information through
>a few uvm layers.
>
>For architectures already using topdown-VM, no change is intended.
>
>Comments?

I don't like the !!(expr) syntax. I'd prefer to hide the ugliness in a macro
that does (expr != 0).

christos
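
The suggestion above can be sketched roughly as follows. This is only an
illustration: DEMO_ISSET, DEMO_FLAG_TOPDOWN and its value 0x0080 are
hypothetical names made up for this sketch, not taken from sys/exec.h or
from the patch. It shows a macro that spells the test as (expr != 0) and
yields the same 0/1 result as the !!(expr) form.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bit, for illustration only (not the real EXEC_TOPDOWN_VM). */
#define DEMO_FLAG_TOPDOWN	0x0080u

/* Hypothetical helper: evaluates to 1 if any bit of `mask` is set in `flags`. */
#define DEMO_ISSET(flags, mask)	(((flags) & (mask)) != 0)

/* Returns whether the illustrative top-down bit is set in `flags`. */
static bool
demo_use_topdown(unsigned int flags)
{
	return DEMO_ISSET(flags, DEMO_FLAG_TOPDOWN);
}

/* Self-check: the macro and the !!(expr) spelling agree for set and clear bits. */
static void
demo_selfcheck(void)
{
	assert(DEMO_ISSET(0x00c0u, DEMO_FLAG_TOPDOWN) ==
	    !!(0x00c0u & DEMO_FLAG_TOPDOWN));
	assert(DEMO_ISSET(0x0040u, DEMO_FLAG_TOPDOWN) ==
	    !!(0x0040u & DEMO_FLAG_TOPDOWN));
	assert(demo_use_topdown(0x00c0u));
	assert(!demo_use_topdown(0x0040u));
}
```

With a macro of this shape, a call site would read something like
DEMO_ISSET(flags, DEMO_FLAG_TOPDOWN) instead of repeating the double
negation at every use.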



Re: zero-length symlinks

2013-11-05 Thread Michael van Elst
da...@l8s.co.uk (David Laight) writes:

>Actually I think you'll find that mkdir was always a system call.
>It was directory rename that was done with a series of link and
>unlink system calls.

mkdir(1) did a sequence of mknod,chown,link,link if you believe
the public sys3 sources. According to pubs.opengroup.org, the
mkdir system call originated in 4.2BSD.



Re: zero-length symlinks

2013-11-05 Thread David Laight
On Sun, Nov 03, 2013 at 04:35:19PM -0800, John Nemeth wrote:
> 
>  It has to do with the fact that historically mkdir(2) was
> actually mkdir(3), it wasn't an atomic syscall and was a sequence
> of operations performed by a library routine...

Actually I think you'll find that mkdir was always a system call.
It was directory rename that was done with a series of link and
unlink system calls.

Also, if you look at any current fs code the processing of "." and
".." is special - they will be treated as requests for the current
and parent directories regardless of the inodes they reference.
Doing otherwise is a complete locking nightmare!

David

-- 
David Laight: da...@l8s.co.uk


Re: pulse-per-second API status

2013-11-05 Thread Warner Losh

On Nov 1, 2013, at 12:19 PM, paul_kon...@dell.com wrote:

> 
> On Nov 1, 2013, at 2:04 PM, Mouse wrote:
> 
>> ...
>> But it still may not work in the sense of living up to the expectations
>> people have come to have for PPS on serial ports.
>> 
>> My worry is not that it's not the best time available in some
>> circumstances.  My worry is that putting it into the tree will lead to
>> its getting used as if it were as good as PPS on anything else, leading
>> both to timeservers that claim stratum 1 but give bad chime and to
>> people blaming NetBSD for its crappy PPS support when the real problem
>> is that they don't understand the USB issues and it _looks_ like any
>> other PPS support until you test the resulting time carefully.
> 
> Not just PPS on serial ports, but PPS on other hardware.
> 
> I don't know this API.  But my first reaction when I saw the designation 
> "PPS" is to think of GPS timekeeping boxes and other precision frequency 
> sources that have a PPS output.  On those devices, the PPS output is divided 
> down from the main oscillator frequency, i.e., you can expect accuracies of 
> 10^-9 for modest price crystal oscillators, 10^-10 to 10^-12 for higher end 
> stuff -- and jitter in the nanosecond range or better.
> 
> It seems rather confusing to have another interface that goes by the same 
> name but has specs 6 or more orders of magnitude worse.  How about a 
> different name that avoids this confusion?

Just because the signal has an Allan variance of 10^-10 doesn't mean that 
you'll be able to measure each pulse with that precision, or that the tau of 
that figure is 1s... Most common time counter hardware in SoCs and the like is 
good to anywhere from hundreds of microseconds to tens of nanoseconds. 
Hundreds of microseconds isn't much worse than the millisecondish USB accuracy. 
The PPS API even allows for an estimate of the accuracy of the measurements, 
IIRC, but that may be a higher-level facility of NTP (it has been a few years 
since I've done this stuff professionally). I don't think there will be any 
confusion at all, especially if the measured accuracy and variance of this 
facility is documented.

1ms is quite accurate enough for NTP though. NTP has trouble on the network 
getting below 1ms of accuracy, especially when there are any hops at all in the 
topology. It won't be the best NTP server in the world, but it will be accurate 
enough for most things. If you need more accuracy, get better hardware.

To those saying 'fix NMEA mode to be better': You can't. The characters that 
spit this code out aren't guaranteed to be at top of second any more than 
approximately... The exact timing varies from receiver to receiver, and if USB 
is involved, the same silly delays are present there too, only worse because 
the message spans USB packets (or likely would since it is just short of 100 
characters long IIRC)... And even if you get those issues out of the way, I 
also believe there's ambiguity in the NMEA standard about the 'on time' point 
for the NMEA messages. Is it the start of the message, the end? Is it the first 
transition of the first bit of the message, or the end of the first character?  
Since it isn't considered a precision signal, nobody times it exactly (or 
didn't a few years ago). It is useful, at best, for knowing what time the 
external PPS is about to be or just was...

So adding support to ucom isn't a horrible idea, as long as expectations are 
managed...

Warner

Re: MACHINE_ARCH on NetBSD/evbearmv6hf-el current

2013-11-05 Thread Ryo ONODERA
From: Warner Losh, Date: Tue, 5 Nov 2013 09:05:01 -0700

> 
> On Oct 26, 2013, at 12:24 PM, Alistair Crooks wrote:
> 
>> On Sat, Oct 26, 2013 at 11:10:52AM -0700, Matt Thomas wrote:
>>> 
>>> On Oct 26, 2013, at 10:54 AM, Izumi Tsutsui wrote:
>>> 
>>>>>> By static MACHINE_ARCH, or dynamic sysctl(3)?
>>>>>> If dynamic sysctl(3) is preferred, which node?
>>>>> 
>>>>> hw.machine_arch
>>>>> 
>>>>> which has been defined for a long long time.
>>>> 
>>>> Yes, defined before the sf vs hf issue arose, and
>>>> you have changed the definition (i.e. made it dynamic)
>>>> without public discussion.  That's the problem.
>>> 
>>> It was already dynamic (it changes for compat_netbsd32).
>> 
>> Whether or when it's dynamic or not, it would be great if you could
>> fix it so that binary packages can be used.
>> 
>> And Tsutsui-san is right - public discussion needs to take place, and
>> consumers made aware, before these kinds of changes are made.
> 
> I don't see any further emails on this thread. Was there ever a resolution, 
> or just crickets?

Hi,

It seems that this commit solves the problem.
http://mail-index.netbsd.org/source-changes/2013/10/26/msg048721.html

But there has been no explanation or feedback yet.

--
Ryo ONODERA // ryo...@yk.rim.or.jp
PGP fingerprint = 82A2 DC91 76E0 A10A 8ABB  FD1B F404 27FA C7D1 15F3


Re: pulse-per-second API status

2013-11-05 Thread Warner Losh

On Nov 2, 2013, at 1:33 AM, Alan Barrett wrote:

> On Fri, 01 Nov 2013, Greg Troxel wrote:
>>> But if NetBSD enables PPS on ucom, there's going to be an expectation that 
>>> it is good enough for stratum-1 timekeeping, like PPS on real serial ports.
>> 
>> I don't think there's any such expectation created.
>> [...]
>> People who expect the same as serial PPS are confused, and we are not 
>> responsible for that.
> 
> I think that PPS on a device with very high "interrupt" latency is 
> sufficiently similar to PPS on a device with low interrupt latency that it 
> deserves to have the same API.  I don't think it even needs a sysctl to 
> enable it.
> 
> I think that it just needs careful documentation, in ucom(4) and wherever we 
> document the PPS API.  Maybe the documentation for applications like ntpd 
> should also warn against using PPS on USB interfaces.

It isn't even the interrupt latency that's the problem. A 2ms latency 
that has a variance of 10ns is much, much better for timekeeping than a 10us 
latency with a 1us variance. Variance of the interrupt latency is the killer, 
since the on-time point can be calibrated and systemic delays can be 
compensated for rather easily.

Warner
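
As a rough illustration of the point above, here is a minimal sketch that
assumes latency samples are available as an array of doubles in seconds.
The names demo_mean and demo_residual_after_calibration are hypothetical
and not part of the PPS API or of ntpd. The constant part of the latency
(the mean) is treated as the calibrated offset and subtracted; what is left
is the jitter, which is what actually limits the timestamps.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of n latency samples, in seconds. Requires n > 0. */
static double
demo_mean(const double *lat, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += lat[i];
	return sum / (double)n;
}

/*
 * Worst-case timestamp error left after subtracting the mean latency
 * (the systematic part that calibration can remove) from every sample.
 */
static double
demo_residual_after_calibration(const double *lat, size_t n)
{
	double mean = demo_mean(lat, n);
	double worst = 0.0;

	for (size_t i = 0; i < n; i++) {
		double err = lat[i] - mean;

		if (err < 0.0)
			err = -err;
		if (err > worst)
			worst = err;
	}
	return worst;
}
```

Fed with samples clustered tightly around 2ms, the residual stays in the
range of that cluster's spread, while samples around 10us that wander by a
microsecond leave a residual of about a microsecond, even though their
absolute latency is far smaller.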



Re: MACHINE_ARCH on NetBSD/evbearmv6hf-el current

2013-11-05 Thread Warner Losh

On Oct 26, 2013, at 12:24 PM, Alistair Crooks wrote:

> On Sat, Oct 26, 2013 at 11:10:52AM -0700, Matt Thomas wrote:
>> 
>> On Oct 26, 2013, at 10:54 AM, Izumi Tsutsui wrote:
>> 
>>>>> By static MACHINE_ARCH, or dynamic sysctl(3)?
>>>>> If dynamic sysctl(3) is preferred, which node?
>>>> 
>>>> hw.machine_arch
>>>> 
>>>> which has been defined for a long long time.
>>> 
>>> Yes, defined before the sf vs hf issue arose, and
>>> you have changed the definition (i.e. made it dynamic)
>>> without public discussion.  That's the problem.
>> 
>> It was already dynamic (it changes for compat_netbsd32).
> 
> Whether or when it's dynamic or not, it would be great if you could
> fix it so that binary packages can be used.
> 
> And Tsutsui-san is right - public discussion needs to take place, and
> consumers made aware, before these kinds of changes are made.

I don't see any further emails on this thread. Was there ever a resolution, or 
just crickets?

Warner



Changing __USING_TOPDOWN_VM to a runtime decision

2013-11-05 Thread Martin Husemann
Hey folks,

I would like to change the current (mostly) compile time decision
whether we will use top-down VA layout for userland processes to a
runtime check.

This allows emulations to disable it, and also allows MD code to recognize
binaries not suitable for topdown VM layout and give those binaries the
old layout.

The latter point is what I actually need: on sparc64 we have compiled most
code in the "medlow" code model, which does not allow big addresses. I am
about to commit changes that switch this default and properly mark new
binaries. To still allow running old binaries, I need something like the
attached patch.

The patch is mostly straightforward: I define a new flag EXEC_TOPDOWN_VM,
initialized by default according to __USING_TOPDOWN_VM, but overridable
by a MD function. This way the exec_package carries over the information,
whether we will use topdown-vm for the to-be-loaded binary.

Most other changes are mechanical, like passing this information through
a few uvm layers.

For architectures already using topdown-VM, no change is intended.

Comments?

Martin
Index: kern/exec_elf.c
===
RCS file: /cvsroot/src/sys/kern/exec_elf.c,v
retrieving revision 1.49
diff -u -p -r1.49 exec_elf.c
--- kern/exec_elf.c 5 Nov 2013 14:26:19 -   1.49
+++ kern/exec_elf.c 5 Nov 2013 14:27:22 -
@@ -422,14 +422,15 @@ elf_load_file(struct lwp *l, struct exec
p = l->l_proc;
 
KASSERT(p->p_vmspace);
-   if (__predict_true(p->p_vmspace != proc0.p_vmspace))
+   if (__predict_true(p->p_vmspace != proc0.p_vmspace)) {
use_topdown = p->p_vmspace->vm_map.flags & VM_MAP_TOPDOWN;
-   else
+   } else {
 #ifdef __USING_TOPDOWN_VM
-   use_topdown = true;
+   use_topdown = !!(epp->ep_flags & EXEC_TOPDOWN_VM);
 #else
use_topdown = false;
 #endif
+   }
 
/*
 * 1. open file
Index: kern/kern_exec.c
===
RCS file: /cvsroot/src/sys/kern/kern_exec.c,v
retrieving revision 1.363
diff -u -p -r1.363 kern_exec.c
--- kern/kern_exec.c 12 Sep 2013 19:01:38 -  1.363
+++ kern/kern_exec.c 5 Nov 2013 14:27:22 -
@@ -112,6 +112,15 @@ __KERNEL_RCSID(0, "$NetBSD: kern_exec.c,
 
 #include 
 
+#ifndef MD_TOPDOWN_INIT
+#error MD_TOPDOWN_INIT is missing
+#ifdef __USING_TOPDOWN_VM
+#define MD_TOPDOWN_INIT(epp) (epp)->ep_flags |= EXEC_TOPDOWN_VM
+#else
+#define MD_TOPDOWN_INIT(epp)
+#endif
+#endif
+
 static int exec_sigcode_map(struct proc *, const struct emul *);
 
 #ifdef DEBUG_EXEC
@@ -653,6 +662,7 @@ execve_loadvm(struct lwp *l, const char 
data->ed_pack.ep_vmcmds.evs_used = 0;
data->ed_pack.ep_vap = &data->ed_attr;
data->ed_pack.ep_flags = 0;
+   MD_TOPDOWN_INIT(&data->ed_pack);
data->ed_pack.ep_emul_root = NULL;
data->ed_pack.ep_interp = NULL;
data->ed_pack.ep_esch = NULL;
@@ -933,10 +943,12 @@ execve_runproc(struct lwp *l, struct exe
 */
if (is_spawn)
uvmspace_spawn(l, data->ed_pack.ep_vm_minaddr,
-   data->ed_pack.ep_vm_maxaddr);
+   data->ed_pack.ep_vm_maxaddr,
+   !!(data->ed_pack.ep_flags & EXEC_TOPDOWN_VM));
else
uvmspace_exec(l, data->ed_pack.ep_vm_minaddr,
-   data->ed_pack.ep_vm_maxaddr);
+   data->ed_pack.ep_vm_maxaddr,
+   !!(data->ed_pack.ep_flags & EXEC_TOPDOWN_VM));
 
/* record proc's vnode, for use by procfs and others */
 if (p->p_textvp)
Index: kern/kern_proc.c
===
RCS file: /cvsroot/src/sys/kern/kern_proc.c,v
retrieving revision 1.189
diff -u -p -r1.189 kern_proc.c
--- kern/kern_proc.c 25 Oct 2013 15:52:57 -  1.189
+++ kern/kern_proc.c 5 Nov 2013 14:27:22 -
@@ -483,7 +483,13 @@ proc0_init(void)
 * share proc0's vmspace, and thus, the kernel pmap.
 */
uvmspace_init(&vmspace0, pmap_kernel(), round_page(VM_MIN_ADDRESS),
-   trunc_page(VM_MAX_ADDRESS));
+   trunc_page(VM_MAX_ADDRESS),
+#ifdef __USING_TOPDOWN_VM
+   true
+#else
+   false
+#endif
+   );
 
/* Initialize signal state for proc0. XXX IPL_SCHED */
mutex_init(&p->p_sigacts->sa_mutex, MUTEX_DEFAULT, IPL_SCHED);
Index: sys/exec.h
===
RCS file: /cvsroot/src/sys/sys/exec.h,v
retrieving revision 1.141
diff -u -p -r1.141 exec.h
--- sys/exec.h  30 Oct 2013 23:32:30 -  1.141
+++ sys/exec.h  5 Nov 2013 14:27:23 -
@@ -226,6 +226,7 @@ struct exec_package {
 #define EXEC_DESTR  0x0010  /* destructive ops performed */
 #define EXEC_32 0x0020  /* 32-bit binary emulation */
 #define EXEC_FORCEAUX   0x0040  /