Re: Page fault outside of application

2018-01-23 Thread Geraldo Netto
Hello Rick, could you please provide the full output with -V? E.g.: scripts/run.py -V I may be wrong, but erlexec may not work in OSv because OSv does not provide fork(), execXX(), ... Also, if I'm not mistaken, ELF support is incomplete, which means you can only load native software in

Re: Page fault outside of application

2018-01-23 Thread Nadav Har'El
On Tue, Jan 23, 2018 at 12:40 PM, Rick Payne wrote: > > A few moving parts, so not sure what is causing this - but trying to start > an erlang application I'm seeing this: > I don't have any bright ideas, but just a few small comments below, hopefully (?) they will help something... > eth0: 19

Re: Page fault outside of application

2018-01-24 Thread Rick Payne
Hi, On 23/01/18 20:16, Nadav Har'El wrote: I don't have any bright ideas, but just a few small comments below, hopefully (?) they will help something... Appreciated... This writes in "addr", which seems a reasonable address (doesn't seem like junk). In object::resolve_pltgot() you can see th

Re: Page fault outside of application

2018-01-24 Thread Rick Payne
Hi Geraldo, On 23/01/18 19:58, Geraldo Netto wrote: Hello Rick, Rick, could you please, provide the full output with the -V ? eg: scripts/run.py -V It's a custom build; I'm running it directly via qemu. I may be wrong but erlexec may not work in OSv because OSv does not provide fork(), execX

Re: Page fault outside of application

2018-01-24 Thread Rick Payne
On 24/01/18 17:09, Rick Payne wrote: Hi Geraldo, On 23/01/18 19:58, Geraldo Netto wrote: Hello Rick, Rick, could you please, provide the full output with the -V ? eg: scripts/run.py -V It's a custom build; I'm running it directly via qemu. Here it is: qemu-system-x86_64: -mon chardev=stdio,

Re: Page fault outside of application

2018-01-28 Thread Rick Payne
> On 24 Jan 2018, at 22:07, Rick Payne wrote: > > I don't believe so. I think this is right where erlexec is being started. > I'll work on verifying that now. I fixed the problem by recompiling my Erlang ERTS system using gcc 6.2. Ubuntu 17.10 has 7.2 which seems to be the issue. I did try 6

Re: Page fault outside of application

2018-01-29 Thread Nadav Har'El
On Wed, Jan 24, 2018 at 11:07 AM, Rick Payne wrote: > Hi, > > On 23/01/18 20:16, Nadav Har'El wrote: > >> I don't have any bright ideas, but just a few small comments below, >> hopefully (?) they will help something... >> > > Appreciated... > > This writes in "addr", which seems a reasonable addr

Re: Page fault outside of application

2018-01-29 Thread Rick Payne
On Mon, 2018-01-29 at 10:54 +0200, Nadav Har'El wrote: > This all seems reasonable. > Maybe we somehow got the PLT becoming read-only, so we are getting a > pagefault trying to write to it? > Can you please try in gdb "osv mmap" and look at the mapping which > includes the faulting address (0x1

Re: Page fault outside of application

2018-01-29 Thread Nadav Har'El
On Mon, Jan 29, 2018 at 11:20 AM, Rick Payne wrote: > On Mon, 2018-01-29 at 10:54 +0200, Nadav Har'El wrote: > > This all seems reasonable. > Maybe we somehow got the PLT becoming read-only, so we are getting a > pagefault trying to write to it? > Can you please try in gdb "osv mmap" and look at

Re: Page fault outside of application

2018-01-29 Thread Rick Payne
On Mon, 2018-01-29 at 11:43 +0200, Nadav Har'El wrote: > > Hmm, I don't know, I wasn't aware anything like that changed. > We usually change parts of the object marked by PT_GNU_RELRO to read- > only in object::fix_permissions(), I'm guessing (but didn't check) > this is what caused the read-only pag

Re: Page fault outside of application

2018-01-29 Thread Nadav Har'El
On Mon, Jan 29, 2018 at 12:16 PM, Rick Payne wrote: > > > Only when "-z now" is used during linking (DT_BIND_NOW object flag) > > do we do all the function lookups on startup (see > > object::relocate_pltgot()) and then, it's ok that the .GOT.PLT is > > also marked RELRO and made read-only. > > >
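For readers who want to check a binary for the properties discussed above (a PT_GNU_RELRO segment, and the "-z now" / DT_BIND_NOW marking), here is a minimal Python sketch. It is not OSv code and not anything posted in the thread. The function name `elf_bind_info` is hypothetical, and it assumes a little-endian ELF64 file. Besides the DT_BIND_NOW tag, it also accepts the DF_BIND_NOW bit in DT_FLAGS and the DF_1_NOW bit in DT_FLAGS_1, which are other ways a linker may record immediate binding.

```python
import struct

# Constants from the ELF specification / GNU extensions.
PT_DYNAMIC = 2
PT_GNU_RELRO = 0x6474E552
DT_NULL = 0
DT_BIND_NOW = 24
DT_FLAGS = 30
DT_FLAGS_1 = 0x6FFFFFFB
DF_BIND_NOW = 0x8
DF_1_NOW = 0x1


def elf_bind_info(data):
    """Return (has_relro, bind_now) for a little-endian ELF64 image (bytes).

    Hypothetical helper for illustration only.
    """
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")

    # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)

    has_relro = False
    bind_now = False
    for i in range(e_phnum):
        # Elf64_Phdr: type, flags, offset, vaddr, paddr, filesz, memsz, align.
        p_type, _, p_offset, _, _, p_filesz, _, _ = struct.unpack_from(
            "<IIQQQQQQ", data, e_phoff + i * e_phentsize
        )
        if p_type == PT_GNU_RELRO:
            has_relro = True
        elif p_type == PT_DYNAMIC:
            # Elf64_Dyn entries are 16 bytes each: d_tag, d_val.
            for off in range(p_offset, p_offset + p_filesz, 16):
                tag, val = struct.unpack_from("<qQ", data, off)
                if tag == DT_NULL:
                    break
                if (
                    tag == DT_BIND_NOW
                    or (tag == DT_FLAGS and val & DF_BIND_NOW)
                    or (tag == DT_FLAGS_1 and val & DF_1_NOW)
                ):
                    bind_now = True
    return has_relro, bind_now
```

It could be called as `elf_bind_info(open(path, "rb").read())`, where `path` points at the executable under test. A result of `(True, False)` would mean the file has a RELRO segment but no immediate-binding marker, which is the combination the quoted text describes as problematic if the .GOT.PLT is also made read-only.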

Re: Page fault outside of application

2018-01-29 Thread Rick Payne
On Mon, 2018-01-29 at 12:27 +0200, Nadav Har'El wrote: > Both versions used "-pie", not "-shared"? Should be, yes. It's exactly the same build setup and the Makefile shows '-pie' for LDFLAGS. I don't think gcc 7.2 contains any of the -mindirect-branch changes, so that's a red herring. I'll continue

Re: Page fault outside of application

2018-01-29 Thread Nadav Har'El
On Mon, Jan 29, 2018 at 12:16 PM, Rick Payne wrote: > > Maybe I'm not following. The GNU_RELRO sections look the same between > the 2 versions of erlexec. First one (-ubuntu17.10) fails, second one > is fine: > > rickp@mo:~$ readelf --headers /usr/local/packages/OTP-20.0.5-OSv- > ubuntu17.10/erts-

Re: Page fault outside of application

2018-01-29 Thread Rick Payne
On Mon, 2018-01-29 at 11:43 +0200, Nadav Har'El wrote: > 1. Your compiler defaults to "full relro" (-Wl,-z,now -Wl,-z,relro) > but for some reason object::relocate_pltgot() doesn't recognize the > bind_now. FWIW, on both working and non-working builds, I see '-pie -z now -z relro' being passed to

Re: Page fault outside of application

2018-01-30 Thread Nadav Har'El
On Mon, Jan 29, 2018 at 3:51 PM, Nadav Har'El wrote: > > On Mon, Jan 29, 2018 at 12:16 PM, Rick Payne wrote: > >> >> Maybe I'm not following. The GNU_RELRO sections look the same between >> the 2 versions of erlexec. First one (-ubuntu17.10) fails, second one >> is fine: >> >> rickp@mo:~$ readelf

Re: Page fault outside of application

2018-01-30 Thread Rick Payne
On Tue, 2018-01-30 at 11:47 +0200, Nadav Har'El wrote: > I have a vague feeling that fix_permissions() cannot just work on the > whole object; it needs to know which of the PT_LOAD segments (see > file::load_segment()) the RELRO falls in, but I'm hazy on the > details. Maybe even file::load_segment(