Re: Heads up: Recent status: emacs24/25 FTBFS since a long time on GNU/Hurd

2016-12-10 Thread Samuel Thibault
Hello,

Svante Signell, on Sat 10 Dec 2016 20:52:20 +0100, wrote:
> On Thu, 2016-12-08 at 16:32 +0100, Richard Braun wrote:
> > On Thu, Dec 08, 2016 at 03:40:34PM +0100, Svante Signell wrote:
> > 
> > > OK! Then maybe the sbrk() feature should be flagged as not available
> > > in order not to fool configure and the compiler. In fact FreeBSD/arm64
> > > did exactly that, see
> > > https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24892 So that platform
> > > is on the same par as GNU/Hurd then. On all other supported platforms
> > > emacs builds and runs perfectly, though.
> 
> Samuel. It seems like sbrk() is still needed for elf/dl-sysdep.c, and
> cannot be removed easily. Why, when malloc() is the preferred usage?

See the comment. The allocation there *has* to be exactly at that
address.

> Otherwise sbrk for Hurd should be removed completely, since it does not
> work as expected.

No need to remove it completely: you could just remove the sbrk/brk
aliases:

weak_alias (__brk, brk)

and

weak_alias (__sbrk, sbrk)

dl-sysdep.c will still be able to use __sbrk.

I'm however concerned with breaking all applications which make use of
brk/sbrk that way.  Couldn't emacs be made to know that it shouldn't use
sbrk on GNU/Hurd?

> I saw your patch wrt PIE builds, that feels like a brown paper bag
> fix!

Which patch?  What is the relation with the problem at stake?

> > Then find out how they do it, and see if we can do the same.
> 
> Richard: Any ideas on where to start? I patched brk/sbrk to return
> proper error codes, but to no avail.

emacs' configure probably doesn't check the actual value being
returned. Just try removing the aliases. But really, I don't think we
want to do that; I'd rather hardcode in emacs that sbrk shouldn't be
used on GNU/Hurd.
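
To illustrate the difference between checking that sbrk exists and
checking the value it returns, here is a minimal sketch in Python. It is
not Emacs's configure logic, which is not shown in this thread. The names
sbrk_usable and FakeBreak are hypothetical, and -1 stands in for C's
(void *) -1 failure value.

```python
SBRK_FAILED = -1  # stands in for C's (void *) -1


def sbrk_usable(sbrk, increment=4096):
    """Return True only if sbrk really moves the break by `increment`."""
    before = sbrk(0)
    if before == SBRK_FAILED:
        return False
    old = sbrk(increment)
    if old == SBRK_FAILED:
        return False
    grew = old == before and sbrk(0) == before + increment
    sbrk(-increment)  # give the memory back
    return grew


class FakeBreak:
    """Hypothetical stand-in for a process break, for illustration only."""

    def __init__(self, start, limit):
        self.brk = start
        self.limit = limit  # first address the break may not cross

    def sbrk(self, increment):
        if self.brk + increment > self.limit:
            return SBRK_FAILED  # something else is mapped at the limit
        old = self.brk
        self.brk += increment
        return old


roomy = FakeBreak(0x9000000, 0x9100000)    # free space after the break
blocked = FakeBreak(0x9000000, 0x9000000)  # a mapping directly follows it
```

Both fake objects have an sbrk symbol, so a test that only checks for its
presence passes in both cases. Only the check on the returned value tells
them apart: sbrk_usable(roomy.sbrk) is true, sbrk_usable(blocked.sbrk) is
false.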

> And, BTW, linux and kfreebsd use
> the implementations in misc, but not Hurd, why?

See the code, it's just a wrapper around __brk.

Samuel



Re: Heads up: Recent status: emacs24/25 FTBFS since a long time on GNU/Hurd

2016-12-10 Thread Svante Signell
On Thu, 2016-12-08 at 16:32 +0100, Richard Braun wrote:
> On Thu, Dec 08, 2016 at 03:40:34PM +0100, Svante Signell wrote:
> 
> > OK! Then maybe the sbrk() feature should be flagged as not available
> > in order not to fool configure and the compiler. In fact FreeBSD/arm64
> > did exactly that, see
> > https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24892 So that platform
> > is on the same par as GNU/Hurd then. On all other supported platforms
> > emacs builds and runs perfectly, though.

Samuel. It seems like sbrk() is still needed for elf/dl-sysdep.c, and
cannot be removed easily. Why, when malloc() is the preferred usage?

Otherwise sbrk for Hurd should be removed completely, since it does not
work as expected. And it definitely does not work in the emacs case. I
saw your patch wrt PIE builds, that feels like a brown paper bag fix!

> Then find out how they do it, and see if we can do the same.

Richard: Any ideas on where to start? I patched brk/sbrk to return
proper error codes, but to no avail. And, BTW, linux and kfreebsd use
the implementations in misc, but not Hurd, why?




Re: Heads up: Recent status: emacs24/25 FTBFS since a long time on GNU/Hurd

2016-12-08 Thread Svante Signell
On Thu, 2016-12-08 at 14:47 +0100, Richard Braun wrote:
> On Thu, Dec 08, 2016 at 10:44:09AM +0100, Svante Signell wrote:
> > Since a long time emacs FTBFS due to unknown reasons. The latest version
> > building was Debian 24.5+1-5, from 28 Nov 2015.
> 
> As already mentioned, the real issue is in Emacs. See the relevant
> LWN article [1] for details.

I've read it, thanks! I think emacs is in a similar situation as Hurd with
respect to the still missing mlockall/munlockall functions.

> > Even before successful builds were by pure luck. One suspicious issue is
> > that emacs use sbrk() for memory allocation, right? Notably sbrk() is not
> > fool-proof as implemented for Hurd in glibc: Is it or is it not?
> 
> No it's not. And this is the real point I want to make in this message.
> 
> For performance reasons, the implementation of virtual memory maps
> was changed in GNU Mach [2]. 

> GNU Mach
> used to implement a bottom-up policy prior to this change, on which
> the sbrk() implementation relied. This is no longer true. Mappings can
> be created anywhere in the map, and it turns out that some are created
> immediately following the heap, preventing sbrk() from doing its job.

OK! Then maybe the sbrk() feature should be flagged as not available in order
not to fool configure and the compiler. In fact FreeBSD/arm64 did exactly that,
see https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24892 So that platform is on
the same par as GNU/Hurd then. On all other supported platforms emacs builds and
runs perfectly, though.

> Now, we could fix this by doing the same thing Linux does, but as usual,
> someone has to do it, and the benefits aren't that big. Our virtual
> spaces have never been as tidy as on other systems, and it doesn't
> prevent most programs from working just fine. It usually matters for
> things like emulators, such as Wine,

Wine already built and worked on Hurd when I last tested it. Do you mean it no
longer works? It's been some time since I tried it.

> or debuggers like Valgrind,

There are efforts to port Valgrind to Hurd too; it's even in the TODO project
list. What should we do, remove that task from the list?

> which
> have strong requirements regarding address values, but we currently
> don't build them because other dependencies are missing.

See above.

> In other words, we can't go back for now, especially not just to make
> an obsolete interface used by only one piece of software that is
> likely to fix the issue soon. Unless you really want to work on those
> red-black trees, stop pinging about this issue.

The current discussion on emacs-devel reveals that unexec will not be replaced
in the near future. The main discussion is about whether the dumper should be
written in C or elisp. Until then, unexec will probably not be removed from
emacs upstream. Maybe not until the needed features are removed from glibc (a
situation similar to mlockall/munlockall on Hurd).

And consider: if vi* had failed to build for more than a year for the same
reason, what would you have done then? Not everybody uses vi* as their editor.




Re: Heads up: Recent status: emacs24/25 FTBFS since a long time on GNU/Hurd

2016-12-08 Thread Richard Braun
On Thu, Dec 08, 2016 at 10:44:09AM +0100, Svante Signell wrote:
> Since a long time emacs FTBFS due to unknown reasons. The latest version
> building was Debian 24.5+1-5, from 28 Nov 2015.

As already mentioned, the real issue is in Emacs. See the relevant
LWN article [1] for details.

> Even before successful builds were by pure luck. One suspicious issue is that
> emacs use sbrk() for memory allocation, right? Notably sbrk() is not fool-proof
> as implemented for Hurd in glibc: Is it or is it not?

No it's not. And this is the real point I want to make in this message.

For performance reasons, the implementation of virtual memory maps
was changed in GNU Mach [2]. Basically, VM maps maintain a red-black
tree of holes for O(log(n)) allocations of virtual memory. This is what
Linux does too, except they have "augmented" their red-black tree to
use the one that already stores entries, instead of adding one for
holes.
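
As a rough illustration of why an index of holes makes allocation a
logarithmic lookup, here is a toy sketch in Python. It is not GNU Mach
code. A sorted list searched with bisect stands in for the red-black
tree, and keying the holes by size is an assumption of this sketch. The
name HoleIndex and all addresses are made up.

```python
import bisect


class HoleIndex:
    """Free address ranges ("holes") kept sorted by (size, start).

    A sorted list searched with bisect stands in for a red-black tree:
    the lookup is O(log n), although inserting into a list is O(n).
    """

    def __init__(self, holes):
        # holes: iterable of (start, size) pairs, assumed non-overlapping
        self._by_size = sorted((size, start) for start, size in holes)

    def allocate(self, size):
        """Carve `size` bytes out of the smallest hole that fits."""
        i = bisect.bisect_left(self._by_size, (size, 0))
        if i == len(self._by_size):
            return None  # no hole is large enough
        hole_size, start = self._by_size.pop(i)
        if hole_size > size:
            # Put the unused tail of the hole back into the index.
            bisect.insort(self._by_size, (hole_size - size, start + size))
        return start


holes = HoleIndex([(0x1000, 0x3000), (0x9000, 0x1000), (0x20000, 0x10000)])
first = holes.allocate(0x1000)   # exact fit: the hole at 0x9000
second = holes.allocate(0x2000)  # smallest fit: the hole at 0x1000
```

Note that the first allocation lands at 0x9000 even though a lower hole
exists: with a size-keyed lookup, the address returned follows from hole
sizes, not from address order.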

It also works a bit differently because it still allows
bottom-up / top-down allocations. That's the big difference. GNU Mach
used to implement a bottom-up policy prior to this change, on which
the sbrk() implementation relied. This is no longer true. Mappings can
be created anywhere in the map, and it turns out that some are created
immediately following the heap, preventing sbrk() from doing its job.
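
The failure mode can be shown with a toy model in Python. It is not the
glibc or GNU Mach implementation. It only checks whether the range right
after the current break is still free. The function name can_extend_heap
and all addresses are made up for illustration.

```python
PAGE = 0x1000


def can_extend_heap(mappings, heap_end, increment):
    """Return True if [heap_end, heap_end + increment) overlaps no mapping.

    mappings: list of (start, end) half-open ranges already in the map.
    """
    new_end = heap_end + increment
    return all(end <= heap_end or start >= new_end for start, end in mappings)


heap_end = 0x09000000  # made-up break address

# Layout where the space after the heap was left alone.
tidy = [(0x08000000, heap_end), (0x20000000, 0x20000000 + PAGE)]

# Layout where an unrelated mapping landed right after the heap.
crowded = [(0x08000000, heap_end), (heap_end, heap_end + PAGE)]

tidy_ok = can_extend_heap(tidy, heap_end, 16 * PAGE)
crowded_ok = can_extend_heap(crowded, heap_end, PAGE)
```

In the first layout the heap can grow by 16 pages. In the second, even a
one-page extension is impossible, because the break can only move into
contiguous free space.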

Now, we could fix this by doing the same thing Linux does, but as usual,
someone has to do it, and the benefits aren't that big. Our virtual
spaces have never been as tidy as on other systems, and it doesn't
prevent most programs from working just fine. It usually matters for
things like emulators, such as Wine, or debuggers like Valgrind, which
have strong requirements regarding address values, but we currently
don't build them because other dependencies are missing.

On the other hand, the performance improvement is really, really needed.
It is now common, even on monolithic systems, to have processes with
thousands of entries in their address space. But keep in mind that the
Hurd is a multi-server system, which basically means that what would be
in the kernel for Linux is in a userspace process on the Hurd. With the
improvements to the page cache, it's actually normal for an ext2fs
server to have between 1k and 10k map entries, sometimes more. Here
is an example, run on one of our buildd systems:

# vminfo -v 410 | wc -l
20157

And that's one file system instance. At the time, the system was using
a total of around 56k map entries.

In other words, we can't go back for now, especially not just to make
an obsolete interface used by only one piece of software that is
likely to fix the issue soon. Unless you really want to work on those
red-black trees, stop pinging about this issue.

-- 
Richard Braun

[1] https://lwn.net/Articles/673724/
[2] https://git.sceen.net/hurd/gnumach.git/commit/?id=1db202eb8406785500f1bc3ceef7868566e416a1



Heads up: Recent status: emacs24/25 FTBFS since a long time on GNU/Hurd

2016-12-08 Thread Svante Signell
Hello bug-hurd ML,

Since a long time emacs FTBFS due to unknown reasons. The latest version
building was Debian 24.5+1-5, from 28 Nov 2015.

Even before successful builds were by pure luck. One suspicious issue is that
emacs use sbrk() for memory allocation, right? Notably sbrk() is not fool-proof
as implemented for Hurd in glibc: Is it or is it not?
Use of sbrk is found in files alloc.c, unexelf.c and gmalloc.c, which are
all compiled. Avoiding compilation of ralloc.c with
0001-Default-REL_ALLOC-to-no.patch did not improve the situation.

The first time I compiled emacs 25.1 from upstream it passed, the second time
it did not. Compiling Debian versions always fails now (almost always a year
ago, see below). Mostly the build fails with temacs failing to execute: Killed
or dumping core, depending on crash server settings. In my opinion it's a real
loss not to have a modern version of emacs25 available for use on GNU/Hurd (not
everybody uses vi).

As written on IRC yesterday:
(23:17:22) srs: braunr: I'd like to add a few facts about emacs building on
GNU/Hurd: Latest successful build was 24.1+1-5 on 28 Nov 2015.
(23:17:22) srs: Installed then was: glibc-2.19-22, hurd-0.7-1, gnumach-1.6-1.
Trying to build that version now fails miserably,
(23:17:22) srs: (I just did that as well as on Nov 11 2016) so it is not only
not yet supported causing the build failure :(
(23:17:22) srs: The missing functions were not implemented  a year ago either.

And here is the latest gdb trace using the core file:
(unfortunately the rpctrace output is too large to include here)

emacs25-25.1+1/debian/build-x$ gdb ./src/bootstrap-emacs -c lisp/core
...
warning: core file may not match specified executable file.
[New process 24043]
[New process 1]

warning: Unexpected size of section `.reg2/24043' in core file.
Core was generated by `../src/bootstrap-emacs -batch --no-site-lisp --no-site-file --eval (setq max-lis'.
Program terminated with signal SIGSEGV, Segmentation fault.

warning: Unexpected size of section `.reg2/24043' in core file.
#0  0x081a5c32 in backtrace_top () at eval.c:184
184 eval.c: No such file or directory.
[Current thread is 1 (process 24043)]
(gdb) info thre
  Id   Target Id Frame 
* 1    process 24043 0x081a5c32 in backtrace_top () at eval.c:184
  2    process 1 warning: Unexpected size of section `.reg2/1' in core file.
0x0520c9ac in ?? ()
(gdb) thread apply all bt full

Thread 2 (process 1):
#0  0x0520c9ac in ?? ()
No symbol table info available.

Thread 1 (process 24043):
#0  0x081a5c32 in backtrace_top () at eval.c:184
pdl = 0x358d78
#1  near_C_stack_top () at eval.c:202
No locals.
#2  0x0814aaed in stack_overflow (siginfo=0x84d573c )
at sysdep.c:1659
addr = 0x8d31cd8 
bot = 0x98128c7 ""
top = 
#3  handle_sigsegv (sig=11, siginfo=0x84d573c , 
arg=0x84d5548 ) at sysdep.c:1691
fatal = false
#4  0x0522da72 in ?? ()
No symbol table info available.
#5  0x000b in ?? ()
No symbol table info available.
#6  0x084d573c in sigsegv_stack ()
No symbol table info available.
#7  0x084d5548 in sigsegv_stack ()
No symbol table info available.
#8  0x05257600 in ?? ()
No symbol table info available.
#9  0x0522da76 in ?? ()
No symbol table info available.
#10 0x084d5460 in sigsegv_stack ()
No symbol table info available.
#11 0x0001 in ?? ()
No symbol table info available.
#12 0x in ?? ()
No symbol table info available.
#0  0x0520c9ac in ?? ()
  Id   Target Id Frame 
  1    process 24043 0x081a5c32 in backtrace_top () at eval.c:184
* 2    process 1 0x0520c9ac in ?? ()