Re: panic: mutex pmap not owned at ... efirt_machdep.c:255

2018-08-06 Thread Eitan Adler
On Mon, 6 Aug 2018 at 11:27, Kyle Evans  wrote:
>
> On Sun, Aug 5, 2018 at 5:43 AM, Konstantin Belousov  
> wrote:
> > On Sat, Aug 04, 2018 at 09:46:39PM -0500, Kyle Evans wrote:
> >>
> >> He now gets a little further, but ends up with the same panic due to
> >> efirtc_probe trying to get time to verify the rtc's actually
> >> implemented. What kind of approach must we take to ensure curcpu is
> >> synced?
> >
> > It does not panic for me, when I load efirt.ko from the loader prompt.
> > Anyway, try this
>
> Right, I also don't get a panic on any of my machines from this.
> Hopefully he'll have a chance to try this soon.

This change has no impact: it still panics in the same way as without the patch.


-- 
Eitan Adler
Source, Ports, Doc committer
Bugmeister, Ports Security teams
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: programs like gdb core dump

2018-08-06 Thread Allan Jude
On 2018-08-06 23:11, Erich Dollansky wrote:
> Hi,
> 
> On Mon, 6 Aug 2018 15:57:53 -0700
> John Baldwin  wrote:
> 
>> On 8/4/18 4:38 PM, Erich Dollansky wrote:
>>> Hi,
>>>
>>> I compiled me yesterday this system:
>>>
>>> 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r337285:
>>>
>>> When restarting fortune core dumps. When trying to load the core
>>> dump, gdb core dumps.
>>>
>>> The message is always:
>>>
>>> Bad system call (core dumped)
>>>
>>> Trying to install ports results in the same effect.
>>>
>>> Erich  
>>
>> Did you upgrade from stable/11 with a world that is still stable/11?
>> If so, did you make sure your kernel config includes COMPAT_FREEBSD11?
>> (GENERIC should include this)
>>
> 
> I never have had a machine running 11. This machine is on 12 since 2 or
> 3 years. I will check if this configuration was properly set on that
> machine.
> 
> Thanks!
> 
> Erich

compare the output of: `uname -K` and `uname -U`

-- 
Allan Jude





Re: programs like gdb core dump

2018-08-06 Thread Erich Dollansky
Hi,

On Mon, 6 Aug 2018 15:57:53 -0700
John Baldwin  wrote:

> On 8/4/18 4:38 PM, Erich Dollansky wrote:
> > Hi,
> > 
> > I compiled me yesterday this system:
> > 
> > 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r337285:
> > 
> > When restarting fortune core dumps. When trying to load the core
> > dump, gdb core dumps.
> > 
> > The message is always:
> > 
> > Bad system call (core dumped)
> > 
> > Trying to install ports results in the same effect.
> > 
> > Erich  
> 
> Did you upgrade from stable/11 with a world that is still stable/11?
> If so, did you make sure your kernel config includes COMPAT_FREEBSD11?
> (GENERIC should include this)
> 

I have never had a machine running 11. This machine has been on 12 for 2
or 3 years. I will check whether this option was set properly on that
machine.

Thanks!

Erich


Re: A head buildworld race visible in the ci.freebsd.org build history

2018-08-06 Thread Li-Wen Hsu
On Thu, Jun 21, 2018 at 10:49 PM Mark Millard  wrote:
> Has the range r328278 < PROBLEM_START <= r330304 been narrowed down
> some more?
>
> (I'm just curious were the problem started.)

After several rounds of binary search, I found that it might have
something to do with r329625.

The only connection I can see between this commit and our situation is
that it touched the unmount code.  But I cannot confirm that it is the
cause.

It is a bit tricky to reproduce.  I will try to keep it concise.

We do builds for head in a jail (11.2-RELEASE) on a -CURRENT host.
The jail is on a dedicated zfs, and a daemon doing jail/zfs cleanup
runs outside of the jail.

In some edge cases, that cleanup daemon wants to destroy the zfs of
the jail in which a build is still running.  With an earlier -CURRENT,
it would just get "cannot unmount '/jenkins/jails/test-ranlib': Device
busy" and nothing serious would happen.  Recently, although it still
does not destroy the busy zfs, it has started causing the build to
error out with "ranlib: fatal: Failed to open 'libXXX.a'".

To reproduce this, create a zfs and use that as the root of a jail,
run this build script under /usr/src inside the jail:

https://gist.github.com/lwhsu/ae3b8b1f0c856837f93984ab2493f629#file-build-sh

Run this cleanup script on the host:

https://gist.github.com/lwhsu/ae3b8b1f0c856837f93984ab2493f629#file-clean-test-ranlib-sh
(need to modify the zfs path)

I use powerpcspe as TARGET_ARCH here because one iteration takes less
time.  The problem should not be architecture-specific.

I am not sure what the next step is; maybe modifying ranlib to log
more about why it gets "fatal: Failed to open 'libxxx.a'".  Any good
ideas for debugging this?


Li-wen

--
Li-Wen Hsu 
https://lwhsu.org


Re: programs like gdb core dump

2018-08-06 Thread John Baldwin
On 8/4/18 4:38 PM, Erich Dollansky wrote:
> Hi,
> 
> I compiled me yesterday this system:
> 
> 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r337285:
> 
> When restarting fortune core dumps. When trying to load the core dump,
> gdb core dumps.
> 
> The message is always:
> 
> Bad system call (core dumped)
> 
> Trying to install ports results in the same effect.
> 
> Erich

Did you upgrade from stable/11 with a world that is still stable/11?
If so, did you make sure your kernel config includes COMPAT_FREEBSD11?
(GENERIC should include this)

-- 
John Baldwin


Re: OpenVPN produces garbage on TAP on -current

2018-08-06 Thread Lars Schotte
OK, so now I verified that it does happen on:
 CentOS 7 - NetworkManager with OpenVPN plugin
 openSUSE tumbleweed - NetworkManager with OpenVPN plugin

however, I am sure that it also happened at least once with FreeBSD -
FreeBSD bare OpenVPN.

On Mon, 6 Aug 2018 23:06:55 +0200
Lars Schotte  wrote:

> So, now in fact I was able to make a stable connection between 2
> FreeBSD's one of them server running 12-current.
> 
> Now I do not know why the problem did not occur, could be just
> accident or sth. I am starting to suspect that I ran into some
> strange case.
> 
> I will continue looking into it.
> 
> On Mon, 6 Aug 2018 20:01:58 +0200
> Lars Schotte  wrote:
> 
> > Yes, I also have very recent 12-current without change on what SSL
> > library it uses.
> > 
> > However, the problem does not seem to be the SSL library, since it
> > connects and reports no problems at all. The issue here is only that
> > it does not transfer packages correctly.
> > 
> > And I am not ruling out other problems yet. It may also be a problem
> > on the client's side (Linux) with NetworkManager. However, that also
> > uses OpenVPN and I also tested it with FreeBSD <-> FreeBSD both
> > OpenVPN and the issue was the same. Something is wrong here. I am
> > staying in touch with this, but I am not testing 24/7 as I have also
> > other things to do.
> > 
> > On Mon, 6 Aug 2018 16:31:43 +0200 (CEST)
> > Ronald Klop  wrote:
> >   
> > > I'm running a very recent 12-current too and latest openvpn from
> > > pkgs. No problems. I did not change anything in the defaults for
> > > the SSL-library it uses.
> > > 
> > > Ronald.
> > >  
> > > > From: Gleb Popov
> > > > Date: Monday, 6 August 2018 10:30
> > > > To: Lars Schotte
> > > > CC: FreeBSD Ports , FreeBSD current
> > > > Subject: Re: OpenVPN produces garbage on TAP on -current
> > > > 
> > > > On Sun, Aug 5, 2018 at 11:51 PM, Lars Schotte 
> > > > wrote: 
> > > > > Here a bit of paste:
> > > > > https://paste.fedoraproject.org/paste/Hn4M2JqZ~5xccLWOVD1xUw/raw
> > > > > just to illustrate how it does not work.
> > > > >
> > > > > TAP device works good inside OS (FreeBSD current) however,
> > > > > everything that comes over OpenVPN is just garbage.
> > > > >  
> > > > 
> > > > I'm using CURRENT from June 10 and tap device works fine for me
> > > > with OpenVPN 2.4.6_1
> > > > 
> > > >   
> > > > > --
> > > > >  Lars Schotte
> > > > >  Mudro?ova 13
> > > > > 92101 Pie??any
> > > > > ___
> > > > > freebsd-po...@freebsd.org mailing list
> > > > > https://lists.freebsd.org/mailman/listinfo/freebsd-ports
> > > > > To unsubscribe, send any mail to
> > > > > "freebsd-ports-unsubscr...@freebsd.org" 


Re: Linux process causes kernel panic

2018-08-06 Thread Vladimir Kondratyev
On 8/6/18 11:41 PM, Konstantin Belousov wrote:
>>> linux_sys_futex(0x33b0fac,0x85,0x1,0x1,0x33b0fa8,0x401)
>>> -- here it stops --
>> Can you fix your mail client ?

Unfortunately, it did all those dumb wraps at send time, not while editing. Sorry.

>>> ddb also shows that process is looping somewhere inside linux_sys_futex()
>> There are two bugs.  One is that ifuncs handling for relocations against
>> local symbols in elf obj modules was missed.  Patch below fixed it for me.
>>
>> Second bug is that futexes seems to not handle accesses to the CoW
>> mappings which are not yet copied.  I think that the second bug is
>> irrelevant for your case, since it worked before.
>>
>> Try this patch in addition to the linux/ patches I sent before.

It fixed skype for me too! Thank you!




Re: OpenVPN produces garbage on TAP on -current

2018-08-06 Thread Lars Schotte
So, now I was in fact able to make a stable connection between two
FreeBSD machines, one of them a server running 12-current.

Now I do not know why the problem did not occur; it could be just an
accident or something. I am starting to suspect that I ran into some
strange case.

I will continue looking into it.

On Mon, 6 Aug 2018 20:01:58 +0200
Lars Schotte  wrote:

> Yes, I also have very recent 12-current without change on what SSL
> library it uses.
> 
> However, the problem does not seem to be the SSL library, since it
> connects and reports no problems at all. The issue here is only that
> it does not transfer packages correctly.
> 
> And I am not ruling out other problems yet. It may also be a problem
> on the client's side (Linux) with NetworkManager. However, that also
> uses OpenVPN and I also tested it with FreeBSD <-> FreeBSD both
> OpenVPN and the issue was the same. Something is wrong here. I am
> staying in touch with this, but I am not testing 24/7 as I have also
> other things to do.
> 
> On Mon, 6 Aug 2018 16:31:43 +0200 (CEST)
> Ronald Klop  wrote:
> 
> > I'm running a very recent 12-current too and latest openvpn from
> > pkgs. No problems. I did not change anything in the defaults for the
> > SSL-library it uses.
> > 
> > Ronald.
> >  
> > From: Gleb Popov
> > Date: Monday, 6 August 2018 10:30
> > To: Lars Schotte
> > CC: FreeBSD Ports , FreeBSD current
> > Subject: Re: OpenVPN produces garbage on TAP on -current
> > > 
> > > On Sun, Aug 5, 2018 at 11:51 PM, Lars Schotte 
> > > wrote:   
> > > > Here a bit of paste:
> > > > https://paste.fedoraproject.org/paste/Hn4M2JqZ~5xccLWOVD1xUw/raw
> > > > just to illustrate how it does not work.
> > > >
> > > > TAP device works good inside OS (FreeBSD current) however,
> > > > everything that comes over OpenVPN is just garbage.
> > > >
> > > 
> > > I'm using CURRENT from June 10 and tap device works fine for me
> > > with OpenVPN 2.4.6_1
> > > 
> > > 
> > > > --
> > > >  Lars Schotte
> > > >  Mudro?ova 13
> > > > 92101 Pie??any


Re: Linux process causes kernel panic

2018-08-06 Thread Konstantin Belousov
On Mon, Aug 06, 2018 at 11:37:38PM +0300, Konstantin Belousov wrote:
> On Mon, Aug 06, 2018 at 06:24:43PM +0300, Vladimir Kondratyev wrote:
> > I've got similar panic right after skype start
> > 
> > Disabling of SMAP via loader tunable workarounded the panic for me.
> > 
> > Applying of the patch make skype eating 100%CPU in unkillable state.
> > 
> > tail of ktrace dump
> > 
> >   1238 skype    CALL  linux_gettid
> >   1238 skype    RET   linux_gettid 101123/0x18b03
> >   1238 skype    CALL 
> > linux_sys_futex(0x3301edc,0x84,0x1,0x7fff,0x3301ec0,0x2)
> >   1238 skype    RET   linux_sys_futex 0
> >   1238 skype    CALL  linux_sys_futex(0x33b0fac,0x80,0x1,0,0x33b0f90,0x1)
> >   1238 skype    CALL  linux_sys_futex(0x3301edc,0x80,0x1,0,0x3301ec0,0x1)
> >   1238 skype    RET   linux_sys_futex -1 errno -11 Resource temporarily
> > unavailable
> >   1238 skype    CALL 
> > linux_sys_futex(0x3301ec0,0x81,0x1,0x3301ec0,0x33b02c8,0xc168)
> >   1238 skype    RET   linux_sys_futex 0
> >   1238 skype    CALL 
> > linux_sys_futex(0x33b0fac,0x85,0x1,0x1,0x33b0fa8,0x401)
> > -- here it stops --
> Can you fix your mail client ?
> 
> > ddb also shows that process is looping somewhere inside linux_sys_futex()
> 
> There are two bugs.  One is that ifuncs handling for relocations against
> local symbols in elf obj modules was missed.  Patch below fixed it for me.
> 
> Second bug is that futexes seems to not handle accesses to the CoW
> mappings which are not yet copied.  I think that the second bug is
> irrelevant for your case, since it worked before.
> 
> Try this patch in addition to the linux/ patches I sent before.
Wrong patch, I forgot to commit part of the changes.

diff --git a/sys/kern/link_elf_obj.c b/sys/kern/link_elf_obj.c
index 43f85bd17c9..94d29769142 100644
--- a/sys/kern/link_elf_obj.c
+++ b/sys/kern/link_elf_obj.c
@@ -142,7 +142,7 @@ static int  link_elf_each_function_name(linker_file_t,
 static int link_elf_each_function_nameval(linker_file_t,
linker_function_nameval_callback_t,
void *);
-static int link_elf_reloc_local(linker_file_t);
+static int link_elf_reloc_local(linker_file_t, bool);
 static long link_elf_symtab_get(linker_file_t, const Elf_Sym **);
 static long link_elf_strtab_get(linker_file_t, caddr_t *);
 
@@ -441,10 +441,9 @@ link_elf_link_preload(linker_class_t cls, const char 
*filename,
}
 
/* Local intra-module relocations */
-   error = link_elf_reloc_local(lf);
+   error = link_elf_reloc_local(lf, false);
if (error != 0)
goto out;
-
*result = lf;
return (0);
 
@@ -479,13 +478,18 @@ link_elf_link_preload_finish(linker_file_t lf)
ef = (elf_file_t)lf;
error = relocate_file(ef);
if (error)
-   return error;
+   return (error);
 
/* Notify MD code that a module is being loaded. */
error = elf_cpu_load_file(lf);
if (error)
return (error);
 
+   /* Now ifuncs. */
+   error = link_elf_reloc_local(lf, true);
+   if (error != 0)
+   return (error);
+
/* Invoke .ctors */
link_elf_invoke_ctors(lf->ctors_addr, lf->ctors_size);
return (0);
@@ -969,7 +973,7 @@ link_elf_load_file(linker_class_t cls, const char *filename,
}
 
/* Local intra-module relocations */
-   error = link_elf_reloc_local(lf);
+   error = link_elf_reloc_local(lf, false);
if (error != 0)
goto out;
 
@@ -990,6 +994,11 @@ link_elf_load_file(linker_class_t cls, const char 
*filename,
if (error)
goto out;
 
+   /* Now ifuncs. */
+   error = link_elf_reloc_local(lf, true);
+   if (error != 0)
+   goto out;
+
/* Invoke .ctors */
link_elf_invoke_ctors(lf->ctors_addr, lf->ctors_size);
 
@@ -1374,7 +1383,10 @@ elf_obj_lookup(linker_file_t lf, Elf_Size symidx, int 
deps, Elf_Addr *res)
 
/* Quick answer if there is a definition included. */
if (sym->st_shndx != SHN_UNDEF) {
-   *res = sym->st_value;
+   res1 = (Elf_Addr)sym->st_value;
+   if (ELF_ST_TYPE(sym->st_info) == STT_GNU_IFUNC)
+   res1 = ((Elf_Addr (*)(void))res1)();
+   *res = res1;
return (0);
}
 
@@ -1470,7 +1482,7 @@ link_elf_fix_link_set(elf_file_t ef)
 }
 
 static int
-link_elf_reloc_local(linker_file_t lf)
+link_elf_reloc_local(linker_file_t lf, bool ifuncs)
 {
elf_file_t ef = (elf_file_t)lf;
const Elf_Rel *rellim;
@@ -1505,8 +1517,13 @@ link_elf_reloc_local(linker_file_t lf)
/* Only do local relocs */
if (ELF_ST_BIND(sym->st_info) != STB_LOCAL)
continue;
-   elf_reloc_local(lf, base, rel, ELF_RELOC_REL,
-   elf_obj_lookup);
+   
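
The two-pass idea in the patch above can be sketched in userland C. This
is only an illustration under stated assumptions: every name in it
(demo_sym, demo_lookup, demo_reloc_pass, demo_link, demo_cpu_feature) is
hypothetical and none of it is kernel code. It assumes, as the ordering in
the patch suggests, that an ifunc resolver may depend on state set up by
the machine-dependent load step, so ifunc symbols are bound in a second
pass after that step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not kernel interfaces. */
typedef int (*impl_fn)(void);          /* the function callers end up invoking */
typedef impl_fn (*resolver_fn)(void);  /* an ifunc resolver returns the implementation */
typedef void (*generic_fn)(void);      /* models the raw st_value of a symbol */

struct demo_sym {
	bool is_ifunc;      /* models ELF_ST_TYPE(st_info) == STT_GNU_IFUNC */
	generic_fn value;   /* function address, or resolver address for an ifunc */
};

/* Assumption: state that only exists after the simulated MD load step. */
static bool demo_cpu_feature;

static int impl_plain(void)   { return 1; }
static int impl_generic(void) { return 2; }
static int impl_fast(void)    { return 3; }

/* Resolver: chooses an implementation based on the simulated MD state. */
static impl_fn demo_resolver(void)
{
	return demo_cpu_feature ? impl_fast : impl_generic;
}

/* Like the patched lookup: an ifunc symbol's value is called, not used directly. */
static impl_fn demo_lookup(const struct demo_sym *sym)
{
	if (sym->is_ifunc)
		return ((resolver_fn)sym->value)();
	return (impl_fn)sym->value;
}

/* One relocation pass; 'ifuncs' selects which class of symbols is processed. */
static void demo_reloc_pass(const struct demo_sym *syms, size_t n,
    impl_fn *slots, bool ifuncs)
{
	assert(syms != NULL && slots != NULL);
	for (size_t i = 0; i < n; i++) {
		if (syms[i].is_ifunc == ifuncs)
			slots[i] = demo_lookup(&syms[i]);
	}
}

/* Bind two symbols: ordinary pass, simulated MD step, then the ifunc pass. */
void demo_link(impl_fn slots[2])
{
	const struct demo_sym syms[2] = {
		{ false, (generic_fn)impl_plain },
		{ true,  (generic_fn)demo_resolver },
	};

	slots[0] = slots[1] = NULL;
	demo_reloc_pass(syms, 2, slots, false);  /* non-ifunc local symbols */
	demo_cpu_feature = true;                 /* stands in for the MD load step */
	demo_reloc_pass(syms, 2, slots, true);   /* now ifuncs */
}
```

After demo_link(slots), slots[0] is the ordinary function bound in the
first pass and slots[1] is whatever the resolver chose once the simulated
machine-dependent step had run. Running the ifunc pass before that step
would bind the generic implementation instead.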

Re: Linux process causes kernel panic

2018-08-06 Thread Konstantin Belousov
On Mon, Aug 06, 2018 at 06:24:43PM +0300, Vladimir Kondratyev wrote:
> I've got similar panic right after skype start
> 
> Disabling of SMAP via loader tunable workarounded the panic for me.
> 
> Applying of the patch make skype eating 100%CPU in unkillable state.
> 
> tail of ktrace dump
> 
>   1238 skype    CALL  linux_gettid
>   1238 skype    RET   linux_gettid 101123/0x18b03
>   1238 skype    CALL 
> linux_sys_futex(0x3301edc,0x84,0x1,0x7fff,0x3301ec0,0x2)
>   1238 skype    RET   linux_sys_futex 0
>   1238 skype    CALL  linux_sys_futex(0x33b0fac,0x80,0x1,0,0x33b0f90,0x1)
>   1238 skype    CALL  linux_sys_futex(0x3301edc,0x80,0x1,0,0x3301ec0,0x1)
>   1238 skype    RET   linux_sys_futex -1 errno -11 Resource temporarily
> unavailable
>   1238 skype    CALL 
> linux_sys_futex(0x3301ec0,0x81,0x1,0x3301ec0,0x33b02c8,0xc168)
>   1238 skype    RET   linux_sys_futex 0
>   1238 skype    CALL 
> linux_sys_futex(0x33b0fac,0x85,0x1,0x1,0x33b0fa8,0x401)
> -- here it stops --
Can you fix your mail client ?

> ddb also shows that process is looping somewhere inside linux_sys_futex()

There are two bugs.  One is that ifunc handling for relocations against
local symbols in ELF object modules was missing.  The patch below fixed
it for me.

The second bug is that futexes seem not to handle accesses to CoW
mappings that have not yet been copied.  I think the second bug is
irrelevant to your case, since it worked before.

Try this patch in addition to the linux/ patches I sent before.

diff --git a/sys/kern/link_elf_obj.c b/sys/kern/link_elf_obj.c
index 43f85bd17c9..872cb79f38b 100644
--- a/sys/kern/link_elf_obj.c
+++ b/sys/kern/link_elf_obj.c
@@ -142,7 +142,7 @@ static int  link_elf_each_function_name(linker_file_t,
 static int link_elf_each_function_nameval(linker_file_t,
linker_function_nameval_callback_t,
void *);
-static int link_elf_reloc_local(linker_file_t);
+static int link_elf_reloc_local(linker_file_t, bool);
 static long link_elf_symtab_get(linker_file_t, const Elf_Sym **);
 static long link_elf_strtab_get(linker_file_t, caddr_t *);
 
@@ -441,7 +441,10 @@ link_elf_link_preload(linker_class_t cls, const char 
*filename,
}
 
/* Local intra-module relocations */
-   error = link_elf_reloc_local(lf);
+   error = link_elf_reloc_local(lf, false);
+   if (error != 0)
+   goto out;
+   error = link_elf_reloc_local(lf, true);
if (error != 0)
goto out;
 
@@ -969,7 +972,7 @@ link_elf_load_file(linker_class_t cls, const char *filename,
}
 
/* Local intra-module relocations */
-   error = link_elf_reloc_local(lf);
+   error = link_elf_reloc_local(lf, false);
if (error != 0)
goto out;
 
@@ -985,6 +988,11 @@ link_elf_load_file(linker_class_t cls, const char 
*filename,
if (error)
goto out;
 
+   /* Now ifuncs. */
+   error = link_elf_reloc_local(lf, true);
+   if (error != 0)
+   goto out;
+
/* Notify MD code that a module is being loaded. */
error = elf_cpu_load_file(lf);
if (error)
@@ -1374,7 +1382,10 @@ elf_obj_lookup(linker_file_t lf, Elf_Size symidx, int 
deps, Elf_Addr *res)
 
/* Quick answer if there is a definition included. */
if (sym->st_shndx != SHN_UNDEF) {
-   *res = sym->st_value;
+   res1 = (Elf_Addr)sym->st_value;
+   if (ELF_ST_TYPE(sym->st_info) == STT_GNU_IFUNC)
+   res1 = ((Elf_Addr (*)(void))res1)();
+   *res = res1;
return (0);
}
 
@@ -1470,7 +1481,7 @@ link_elf_fix_link_set(elf_file_t ef)
 }
 
 static int
-link_elf_reloc_local(linker_file_t lf)
+link_elf_reloc_local(linker_file_t lf, bool ifuncs)
 {
elf_file_t ef = (elf_file_t)lf;
const Elf_Rel *rellim;
@@ -1505,8 +1516,13 @@ link_elf_reloc_local(linker_file_t lf)
/* Only do local relocs */
if (ELF_ST_BIND(sym->st_info) != STB_LOCAL)
continue;
-   elf_reloc_local(lf, base, rel, ELF_RELOC_REL,
-   elf_obj_lookup);
+   if ((ELF_ST_TYPE(sym->st_info) == STT_GNU_IFUNC) ==
+   ifuncs)
+   elf_reloc_local(lf, base, rel, ELF_RELOC_REL,
+   elf_obj_lookup);
+   else if (ifuncs)
+   elf_reloc_ifunc(lf, base, rel, ELF_RELOC_REL,
+   elf_obj_lookup);
}
}
 
@@ -1531,8 +1547,13 @@ link_elf_reloc_local(linker_file_t lf)
/* Only do local relocs */
if (ELF_ST_BIND(sym->st_info) != STB_LOCAL)
continue;
-   elf_reloc_local(lf, base, rela, 

Re: panic after ifioctl/if_clone_destroy

2018-08-06 Thread Matthew Macy
The struct thread is typesafe. The problem is that the link is no longer
typesafe now that it’s not part of the thread. Thanks for pointing this
out. I’ll commit a fix later today.

-M



On Mon, Aug 6, 2018 at 02:39 Hans Petter Selasky  wrote:

> Hi Matthew,
>
> On 08/06/18 10:02, Hans Petter Selasky wrote:
> > - if ((tdwait = TAILQ_FIRST(>er_tdlist)) != NULL &&
> > - TD_IS_RUNNING(tdwait->et_td)) {
>
> At least the TD_IS_RUNNING() check is invalid. The "tdwait" structure is
> in the control of the other CPU and "tdwait->et_td" might be invalid at
> any time, so accessing any members here is not a good idea.
>
> It is pretty clear that the epoch was exited during the loop:
>
>  etd->et_td = (void*)0xDEADBEEF;
>
> fault virtual address   = 0xdeadc2ff
> fault code  = supervisor read data, page not present
>
>
> If you remove the TD_IS_RUNNING() check I'm not sure how useful this
> loop will be ...
>
> --HPS
>


Re: em0 link fail

2018-08-06 Thread R. Tyler Croy
(replies inline)

On Tue, 31 Jul 2018, R. Tyler Croy wrote:

> On Wed, 25 Jul 2018, R. Tyler Croy wrote:
> 
> > On Sun, 15 Jul 2018, Michael Butler wrote:
> > 
> > > On 07/05/18 09:54, I wrote:
> > > > On 07/05/18 09:27, tech-lists wrote:
> > > >> On 03/07/2018 19:47, Michael Butler wrote:
> > > >>> That would've been ..
> > > >>>
> > > >>> Jun  1 09:56:15 toshi kernel: FreeBSD 12.0-CURRENT #35 r334484: Fri 
> > > >>> Jun
> > > >>> 1 08:25:58 EDT 2018
> > > >>>
> > > >>> I'm going to build one with SVN r334862 reverted to see if that works,
> > > >>
> > > >> Is it working now? Am asking because a system I'd like to take from
> > > >> 11-stable to 12 uses the em driver.
> > > > 
> > > > No :-( I haven't had the chance yet to revisit it,
> > > 
> > > As it turns out, SVN r336313 (committed today) solves the issue I was
> > > having with the hardware stalling,
> > 
> > I'll give r336313 a try as soon as possible and corroborate the fixes!
> 
> After a couple days with this new build, it looks like i can corroborate the
> fix referenced by Michael. :D



Regrettably I spoke too soon. I've had two failures thus far today
unfortunately :(

It appears to be correlated either with my link state changing rapidly due to
upstream fluctuations from my ISP, or a new DHCP lease being offered.

Some relevant snippets from syslog around the time of the link loss:

Aug  1 16:17:10 strawberry kernel: em1: link state changed to DOWN
Aug  1 16:17:20 strawberry kernel: em1: link state changed to UP
Aug  1 16:17:26 strawberry kernel: em1: link state changed to DOWN
Aug  1 16:17:28 strawberry kernel: em1: link state changed to UP
Aug  1 16:17:32 strawberry kernel: em1: link state changed to DOWN
Aug  1 16:17:34 strawberry kernel: em1: link state changed to UP
Aug  1 16:17:41 strawberry kernel: em1: link state changed to DOWN
Aug  1 16:17:43 strawberry kernel: em1: link state changed to UP
Aug  1 16:18:04 strawberry dhclient: New IP Address (em1): 173.228.82.91
Aug  1 16:18:04 strawberry dhclient: New Subnet Mask (em1): 255.255.255.0
Aug  1 16:18:04 strawberry dhclient: New Broadcast Address (em1):
173.228.82.255
Aug  1 16:18:04 strawberry dhclient: New Routers (em1): 173.228.82.1

Aug  5 22:32:32 strawberry syslogd: last message repeated 1 times
Aug  5 23:44:53 strawberry kernel: em1: Watchdog timeout Queue[0]-- 
resetting
Aug  5 23:44:53 strawberry kernel: Interface is RUNNING and ACTIVE
Aug  5 23:44:53 strawberry kernel: em1: TX Queue 0 --
Aug  5 23:44:53 strawberry kernel: em1: hw tdh = 282, hw tdt = 176
Aug  5 23:44:53 strawberry kernel: em1: Tx Queue Status = -2147483648
Aug  5 23:44:53 strawberry kernel: em1: TX descriptors avail = 106
Aug  5 23:44:53 strawberry kernel: em1: Tx Descriptors avail failure = 0
Aug  5 23:44:53 strawberry kernel: em1: RX Queue 0 --
Aug  5 23:44:53 strawberry kernel: em1: hw rdh = 176, hw rdt = 174
Aug  5 23:44:53 strawberry kernel: em1: RX discarded packets = 0
Aug  5 23:44:53 strawberry kernel: em1: RX Next to Check = 175
Aug  5 23:44:53 strawberry kernel: em1: RX Next to Refresh = 174
Aug  5 23:44:53 strawberry kernel: em1: link state changed to DOWN
Aug  5 23:44:55 strawberry kernel: em1: link state changed to UP
Aug  5 23:46:18 strawberry dhclient: New IP Address (em1): 173.228.82.91
Aug  5 23:46:18 strawberry dhclient: New Subnet Mask (em1): 255.255.255.0
Aug  5 23:46:18 strawberry dhclient: New Broadcast Address (em1):
173.228.82.255
Aug  5 23:46:18 strawberry dhclient: New Routers (em1): 173.228.82.1
Aug  5 23:46:19 strawberry syslogd: last message repeated 1 times
Aug  5 23:49:35 strawberry dhclient[12645]: send_packet: No route to host


At the tail end of the syslog I was trying to get a new lease but was then
unable to get a lease or restore functionality to the em1 device.


This is rather perplexing to me, but I'm not savvy enough to figure out how to
further be a useful debugger here :-/

Any suggestions would be appreciated!


Cheers
-R Tyler Croy





Re: panic: mutex pmap not owned at ... efirt_machdep.c:255

2018-08-06 Thread Kyle Evans
On Sun, Aug 5, 2018 at 5:43 AM, Konstantin Belousov  wrote:
> On Sat, Aug 04, 2018 at 09:46:39PM -0500, Kyle Evans wrote:
>>
>> He now gets a little further, but ends up with the same panic due to
>> efirtc_probe trying to get time to verify the rtc's actually
>> implemented. What kind of approach must we take to ensure curcpu is
>> synced?
>
> It does not panic for me, when I load efirt.ko from the loader prompt.
> Anyway, try this

Right, I also don't get a panic on any of my machines from this.
Hopefully he'll have a chance to try this soon.

> diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
> index 572b2197453..f84f56b98e2 100644
> --- a/sys/amd64/amd64/pmap.c
> +++ b/sys/amd64/amd64/pmap.c
> @@ -2655,7 +2655,7 @@ pmap_pinit0(pmap_t pmap)
> __pcpu[i].pc_ucr3 = PMAP_NO_CR3;
> }
> }
> -   PCPU_SET(curpmap, kernel_pmap);
> +   PCPU_SET(curpmap, pmap);
> pmap_activate(curthread);
> CPU_FILL(&kernel_pmap->pm_active);
>  }


Re: OpenVPN produces garbage on TAP on -current

2018-08-06 Thread Lars Schotte
12.0-CURRENT FreeBSD 12.0-CURRENT #10 r337347: Sun Aug  5 14:28:37 CEST
2018 here.

On Mon, 6 Aug 2018 11:30:57 +0300
Gleb Popov  wrote:

> On Sun, Aug 5, 2018 at 11:51 PM, Lars Schotte  wrote:
> 
> > Here a bit of paste:
> > https://paste.fedoraproject.org/paste/Hn4M2JqZ~5xccLWOVD1xUw/raw
> > just to illustrate how it does not work.
> >
> > TAP device works good inside OS (FreeBSD current) however,
> > everything that comes over OpenVPN is just garbage.
> >  
> 
> I'm using CURRENT from June 10 and tap device works fine for me with
> OpenVPN 2.4.6_1
> 
> 
> > --
> >  Lars Schotte
> >  Mudroňova 13
> > 92101 Piešťany


Re: OpenVPN produces garbage on TAP on -current

2018-08-06 Thread Lars Schotte
Yes, I also have very recent 12-current without change on what SSL
library it uses.

However, the problem does not seem to be the SSL library, since it
connects and reports no problems at all. The issue here is only that it
does not transfer packets correctly.

And I am not ruling out other problems yet. It may also be a problem on
the client's side (Linux) with NetworkManager. However, that also uses
OpenVPN and I also tested it with FreeBSD <-> FreeBSD both OpenVPN and
the issue was the same. Something is wrong here. I am staying in touch
with this, but I am not testing 24/7 as I have also other things to do.

On Mon, 6 Aug 2018 16:31:43 +0200 (CEST)
Ronald Klop  wrote:

> I'm running a very recent 12-current too and latest openvpn from
> pkgs. No problems. I did not change anything in the defaults for the
> SSL-library it uses.
> 
> Ronald.
>  
> From: Gleb Popov
> Date: Monday, 6 August 2018 10:30
> To: Lars Schotte
> CC: FreeBSD Ports , FreeBSD current
> Subject: Re: OpenVPN produces garbage on TAP on -current
> > 
> > On Sun, Aug 5, 2018 at 11:51 PM, Lars Schotte 
> > wrote: 
> > > Here a bit of paste:
> > > https://paste.fedoraproject.org/paste/Hn4M2JqZ~5xccLWOVD1xUw/raw
> > > just to illustrate how it does not work.
> > >
> > > TAP device works good inside OS (FreeBSD current) however,
> > > everything that comes over OpenVPN is just garbage.
> > >  
> > 
> > I'm using CURRENT from June 10 and tap device works fine for me with
> > OpenVPN 2.4.6_1
> > 
> >   
> > > --
> > >  Lars Schotte
> > >  Mudro?ova 13
> > > 92101 Pie??any


Re: panic after ifioctl/if_clone_destroy

2018-08-06 Thread Roman Bogorodskiy
  Hans Petter Selasky wrote:

> Hi Roman,
> 
> Can you try the attached patch?
> 
> --HPS

Thanks for the patch, works fine so far.

I'll give it more testing in the next few days.

Roman Bogorodskiy




Linux process causes kernel panic

2018-08-06 Thread Vladimir Kondratyev
I got a similar panic right after starting skype.

Disabling SMAP via a loader tunable worked around the panic for me.

Applying the patch leaves skype eating 100% CPU in an unkillable state.

Tail of the ktrace dump:

  1238 skype    CALL  linux_gettid
  1238 skype    RET   linux_gettid 101123/0x18b03
  1238 skype    CALL  linux_sys_futex(0x3301edc,0x84,0x1,0x7fff,0x3301ec0,0x2)
  1238 skype    RET   linux_sys_futex 0
  1238 skype    CALL  linux_sys_futex(0x33b0fac,0x80,0x1,0,0x33b0f90,0x1)
  1238 skype    CALL  linux_sys_futex(0x3301edc,0x80,0x1,0,0x3301ec0,0x1)
  1238 skype    RET   linux_sys_futex -1 errno -11 Resource temporarily unavailable
  1238 skype    CALL  linux_sys_futex(0x3301ec0,0x81,0x1,0x3301ec0,0x33b02c8,0xc168)
  1238 skype    RET   linux_sys_futex 0
  1238 skype    CALL  linux_sys_futex(0x33b0fac,0x85,0x1,0x1,0x33b0fa8,0x401)
-- here it stops --
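
For reading the trace above: the second argument of each linux_sys_futex
call is the futex op word. Here is a minimal sketch in C of how it can be
decoded, assuming the standard Linux futex command numbers and
FUTEX_PRIVATE_FLAG (0x80). The helper names are made up for illustration
and are not part of the linuxulator.

```c
#include <assert.h>
#include <stddef.h>

/* Values from the Linux futex ABI. */
#define FUTEX_PRIVATE_FLAG	0x80
#define FUTEX_CLOCK_REALTIME	0x100
#define FUTEX_CMD_MASK		(~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME))

/* Hypothetical helper: strip the flag bits and return the command number. */
static int
futex_cmd(int op)
{
	assert(op >= 0);
	return (op & FUTEX_CMD_MASK);
}

/* Hypothetical helper: map a futex op word to its command name. */
static const char *
futex_cmd_name(int op)
{
	static const char *const names[] = {
		"FUTEX_WAIT",		/* 0 */
		"FUTEX_WAKE",		/* 1 */
		"FUTEX_FD",		/* 2 */
		"FUTEX_REQUEUE",	/* 3 */
		"FUTEX_CMP_REQUEUE",	/* 4 */
		"FUTEX_WAKE_OP",	/* 5 */
	};
	int cmd = futex_cmd(op);

	if ((size_t)cmd >= sizeof(names) / sizeof(names[0]))
		return ("unknown");
	return (names[cmd]);
}

/* Nonzero if the op word carries FUTEX_PRIVATE_FLAG. */
static int
futex_is_private(int op)
{
	return ((op & FUTEX_PRIVATE_FLAG) != 0);
}
```

Under those assumptions, 0x84, 0x80, 0x81 and 0x85 decode to
FUTEX_CMP_REQUEUE, FUTEX_WAIT, FUTEX_WAKE and FUTEX_WAKE_OP, all with the
private flag set, so the call that never returns would be a FUTEX_WAKE_OP.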


ddb also shows that the process is looping somewhere inside linux_sys_futex():

KDB: enter: manual escape to debugger
[ thread pid 11 tid 100014 ]
Stopped at  kdb_enter+0x3b: movq    $0,kdb_why
db> bt 1238
Tracing pid 1238 tid 101049 td 0xf80157a64000
cpustop_handler() at cpustop_handler+0x28/frame 0xfe9d6df0
ipi_nmi_handler() at ipi_nmi_handler+0x44/frame 0xfe9d6e10
trap() at trap+0x49/frame 0xfe9d6f20
nmi_calltrap() at nmi_calltrap+0x8/frame 0xfe9d6f20
--- trap 0x13, rip = 0x80709219, rsp = 0xfe00a8c906d0, rbp = 0xfe00a8c90750 ---
witness_unlock() at witness_unlock+0x139/frame 0xfe00a8c90750
__mtx_unlock_flags() at __mtx_unlock_flags+0x5d/frame 0xfe00a8c90790
futex_put() at futex_put+0x134/frame 0xfe00a8c907c0
linux_sys_futex() at linux_sys_futex+0x609/frame 0xfe00a8c90880
ia32_syscall() at ia32_syscall+0x282/frame 0xfe00a8c909b0
int0x80_syscall_common() at int0x80_syscall_common+0x9c/frame 0x401



On 06.08.2018 15:03, Johannes Lundberg wrote:
> On Sat, Aug 4, 2018 at 3:22 PM Konstantin Belousov 
> wrote:
>
>> On Sat, Aug 04, 2018 at 01:12:17PM +0100, Johannes Lundberg wrote:
>>> No panic over night with that tunable so it seems you're on the right
>>> track.
>> Please try this, on top of r337316.
>>
> Been running boinc client now with 4 linux processes at 100% cpu load with
> this patch for a while. So far so good.
>
>
>> diff --git a/sys/amd64/linux/linux_machdep.c
>> b/sys/amd64/linux/linux_machdep.c
>> index 6c5b014853f..434ea0eac07 100644
>> --- a/sys/amd64/linux/linux_machdep.c
>> +++ b/sys/amd64/linux/linux_machdep.c
>> @@ -78,6 +78,9 @@ __FBSDID("$FreeBSD$");
>>  #include 
>>  #include 
>>
>> +#include 
>> +#include 
>> +
>>  #include 
>>  #include 
>>  #include 
>> @@ -88,8 +91,6 @@ __FBSDID("$FreeBSD$");
>>  #include 
>>  #include 
>>
>> -#include 
>> -
>>  int
>>  linux_execve(struct thread *td, struct linux_execve_args *args)
>>  {
>> @@ -276,3 +277,48 @@ linux_set_cloned_tls(struct thread *td, void *desc)
>>
>> return (0);
>>  }
>> +
>> +int futex_xchgl_nosmap(int oparg, uint32_t *uaddr, int *oldval);
>> +int futex_xchgl_smap(int oparg, uint32_t *uaddr, int *oldval);
>> +DEFINE_IFUNC(, int, futex_xchgl, (int, uint32_t *, int *), static)
>> +{
>> +
>> +   return ((cpu_stdext_feature & CPUID_STDEXT_SMAP) != 0 ?
>> +   futex_xchgl_smap : futex_xchgl_nosmap);
>> +}
>> +
>> +int 

Re: OpenVPN produces garbage on TAP on -current

2018-08-06 Thread Ronald Klop

I'm running a very recent 12-current too and latest openvpn from pkgs. No 
problems.
I did not change anything in the defaults for the SSL-library it uses.

Ronald.

From: Gleb Popov 
Date: Monday, 6 August 2018 10:30
To: Lars Schotte 
CC: FreeBSD Ports , FreeBSD current 
Subject: Re: OpenVPN produces garbage on TAP on -current


On Sun, Aug 5, 2018 at 11:51 PM, Lars Schotte  wrote:

> Here a bit of paste:
> https://paste.fedoraproject.org/paste/Hn4M2JqZ~5xccLWOVD1xUw/raw
> just to illustrate how it does not work.
>
> TAP device works good inside OS (FreeBSD current) however, everything
> that comes over OpenVPN is just garbage.
>

I'm using CURRENT from June 10 and tap device works fine for me with
OpenVPN 2.4.6_1


> --
>  Lars Schotte
>  Mudroňova 13
> 92101 Piešťany
> ___
> freebsd-po...@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-ports
> To unsubscribe, send any mail to "freebsd-ports-unsubscr...@freebsd.org"
>


Re: Linux process causes kernel panic

2018-08-06 Thread Johannes Lundberg
On Sat, Aug 4, 2018 at 3:22 PM Konstantin Belousov 
wrote:

> On Sat, Aug 04, 2018 at 01:12:17PM +0100, Johannes Lundberg wrote:
> > No panic over night with that tunable so it seems you're on the right
> > track.
>
> Please try this, on top of r337316.
>

Been running boinc client now with 4 linux processes at 100% cpu load with
this patch for a while. So far so good.


> diff --git a/sys/amd64/linux/linux_machdep.c
> b/sys/amd64/linux/linux_machdep.c
> index 6c5b014853f..434ea0eac07 100644
> --- a/sys/amd64/linux/linux_machdep.c
> +++ b/sys/amd64/linux/linux_machdep.c
> @@ -78,6 +78,9 @@ __FBSDID("$FreeBSD$");
>  #include 
>  #include 
>
> +#include 
> +#include 
> +
>  #include 
>  #include 
>  #include 
> @@ -88,8 +91,6 @@ __FBSDID("$FreeBSD$");
>  #include 
>  #include 
>
> -#include 
> -
>  int
>  linux_execve(struct thread *td, struct linux_execve_args *args)
>  {
> @@ -276,3 +277,48 @@ linux_set_cloned_tls(struct thread *td, void *desc)
>
> return (0);
>  }
> +
> +int futex_xchgl_nosmap(int oparg, uint32_t *uaddr, int *oldval);
> +int futex_xchgl_smap(int oparg, uint32_t *uaddr, int *oldval);
> +DEFINE_IFUNC(, int, futex_xchgl, (int, uint32_t *, int *), static)
> +{
> +
> +   return ((cpu_stdext_feature & CPUID_STDEXT_SMAP) != 0 ?
> +   futex_xchgl_smap : futex_xchgl_nosmap);
> +}
> +
> +int futex_addl_nosmap(int oparg, uint32_t *uaddr, int *oldval);
> +int futex_addl_smap(int oparg, uint32_t *uaddr, int *oldval);
> +DEFINE_IFUNC(, int, futex_addl, (int, uint32_t *, int *), static)
> +{
> +
> +   return ((cpu_stdext_feature & CPUID_STDEXT_SMAP) != 0 ?
> +   futex_addl_smap : futex_addl_nosmap);
> +}
> +
> +int futex_orl_nosmap(int oparg, uint32_t *uaddr, int *oldval);
> +int futex_orl_smap(int oparg, uint32_t *uaddr, int *oldval);
> +DEFINE_IFUNC(, int, futex_orl, (int, uint32_t *, int *), static)
> +{
> +
> +   return ((cpu_stdext_feature & CPUID_STDEXT_SMAP) != 0 ?
> +   futex_orl_smap : futex_orl_nosmap);
> +}
> +
> +int futex_andl_nosmap(int oparg, uint32_t *uaddr, int *oldval);
> +int futex_andl_smap(int oparg, uint32_t *uaddr, int *oldval);
> +DEFINE_IFUNC(, int, futex_andl, (int, uint32_t *, int *), static)
> +{
> +
> +   return ((cpu_stdext_feature & CPUID_STDEXT_SMAP) != 0 ?
> +   futex_andl_smap : futex_andl_nosmap);
> +}
> +
> +int futex_xorl_nosmap(int oparg, uint32_t *uaddr, int *oldval);
> +int futex_xorl_smap(int oparg, uint32_t *uaddr, int *oldval);
> +DEFINE_IFUNC(, int, futex_xorl, (int, uint32_t *, int *), static)
> +{
> +
> +   return ((cpu_stdext_feature & CPUID_STDEXT_SMAP) != 0 ?
> +   futex_xorl_smap : futex_xorl_nosmap);
> +}
> diff --git a/sys/amd64/linux/linux_support.s
> b/sys/amd64/linux/linux_support.s
> index a9f02160be2..391f76414f2 100644
> --- a/sys/amd64/linux/linux_support.s
> +++ b/sys/amd64/linux/linux_support.s
> @@ -38,7 +38,7 @@ futex_fault:
>     movl    $-EFAULT,%eax
>     ret
>
> -ENTRY(futex_xchgl)
> +ENTRY(futex_xchgl_nosmap)
>     movq    PCPU(CURPCB),%r8
>     movq    $futex_fault,PCB_ONFAULT(%r8)
>     movq    $VM_MAXUSER_ADDRESS-4,%rax
> @@ -49,25 +49,58 @@ ENTRY(futex_xchgl)
>     xorl    %eax,%eax
>     movq    %rax,PCB_ONFAULT(%r8)
>     ret
> -END(futex_xchgl)
> +END(futex_xchgl_nosmap)
>
> -ENTRY(futex_addl)
> +ENTRY(futex_xchgl_smap)
>     movq    PCPU(CURPCB),%r8
>     movq    $futex_fault,PCB_ONFAULT(%r8)
>     movq    $VM_MAXUSER_ADDRESS-4,%rax
>     cmpq    %rax,%rsi
>     ja      futex_fault
> +   stac
> +   xchgl   %edi,(%rsi)
> +   clac
> +   movl    %edi,(%rdx)
> +   xorl    %eax,%eax
> +   movq    %rax,PCB_ONFAULT(%r8)
> +   ret
> +END(futex_xchgl_smap)
> +
> +ENTRY(futex_addl_nosmap)
> +   movq    PCPU(CURPCB),%r8
> +   movq    $futex_fault,PCB_ONFAULT(%r8)
> +   movq    $VM_MAXUSER_ADDRESS-4,%rax
> +   cmpq    %rax,%rsi
> +   ja      futex_fault
> +#ifdef SMP
> +   lock
> +#endif
> +   xaddl   %edi,(%rsi)
> +   movl    %edi,(%rdx)
> +   xorl    %eax,%eax
> +   movq    %rax,PCB_ONFAULT(%r8)
> +   ret
> +END(futex_addl_nosmap)
> +
> +ENTRY(futex_addl_smap)
> +   movq    PCPU(CURPCB),%r8
> +   movq    $futex_fault,PCB_ONFAULT(%r8)
> +   movq    $VM_MAXUSER_ADDRESS-4,%rax
> +   cmpq    %rax,%rsi
> +   ja      futex_fault
> +   stac
>  #ifdef SMP
>     lock
>  #endif
>     xaddl   %edi,(%rsi)
> +   clac
>     movl    %edi,(%rdx)
>     xorl    %eax,%eax
>     movq    %rax,PCB_ONFAULT(%r8)
>     ret
> -END(futex_addl)
> +END(futex_addl_smap)
>
> -ENTRY(futex_orl)
> +ENTRY(futex_orl_nosmap)
>     movq    PCPU(CURPCB),%r8
>     movq    $futex_fault,PCB_ONFAULT(%r8)
>     movq    $VM_MAXUSER_ADDRESS-4,%rax
> @@ -85,9 +118,31 @@ ENTRY(futex_orl)
>     xorl    %eax,%eax
>     movq    %rax,PCB_ONFAULT(%r8)
>     ret
> -END(futex_orl)
> +END(futex_orl_nosmap)
>
> -ENTRY(futex_andl)
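
For readers unfamiliar with the pattern in the C part of the quoted patch:
each DEFINE_IFUNC resolver returns one of two implementations depending on
whether the SMAP bit is set in cpu_stdext_feature. A minimal userspace
sketch of the same selection logic with a plain function pointer follows.
The sketch_ names are invented here, the bodies are plain non-atomic C
stand-ins for the assembly routines, and the bit value assumes SMAP is
CPUID leaf 7 EBX bit 20.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match the kernel's CPUID_STDEXT_SMAP (CPUID.7:EBX bit 20). */
#define SKETCH_STDEXT_SMAP	0x00100000u

typedef int (*sketch_futex_fn)(int oparg, uint32_t *uaddr, int *oldval);

/* Stand-in for futex_xchgl_nosmap: swap oparg into *uaddr. Not atomic. */
static int
sketch_xchgl_nosmap(int oparg, uint32_t *uaddr, int *oldval)
{
	assert(uaddr != NULL && oldval != NULL);
	*oldval = (int)*uaddr;
	*uaddr = (uint32_t)oparg;
	return (0);
}

/*
 * Stand-in for futex_xchgl_smap. The real routine brackets the user
 * memory access with stac/clac; plain C has nothing to bracket.
 */
static int
sketch_xchgl_smap(int oparg, uint32_t *uaddr, int *oldval)
{
	return (sketch_xchgl_nosmap(oparg, uaddr, oldval));
}

/* Same shape as the resolver body in the patch, taking the feature word. */
static sketch_futex_fn
sketch_resolve_xchgl(uint32_t stdext_feature)
{
	return ((stdext_feature & SKETCH_STDEXT_SMAP) != 0 ?
	    sketch_xchgl_smap : sketch_xchgl_nosmap);
}
```

With SMAP enabled, a kernel access to user memory outside a stac/clac
window faults, which would be consistent with the panic going away when
SMAP is disabled via the tunable.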

Re: panic after ifioctl/if_clone_destroy

2018-08-06 Thread Hans Petter Selasky

Hi Matthew,

On 08/06/18 10:02, Hans Petter Selasky wrote:

-   if ((tdwait = TAILQ_FIRST(&record->er_tdlist)) != NULL &&
-   TD_IS_RUNNING(tdwait->et_td)) {


At least the TD_IS_RUNNING() check is invalid. The "tdwait" structure is 
in the control of the other CPU and "tdwait->et_td" might be invalid at 
any time, so accessing any members here is not a good idea.


It is pretty clear that the epoch was exited during the loop:

etd->et_td = (void*)0xDEADBEEF;

fault virtual address   = 0xdeadc2ff
fault code  = supervisor read data, page not present


If you remove the TD_IS_RUNNING() check I'm not sure how useful this 
loop will be ...
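
As a quick arithmetic check on the two values quoted above: the fault
address is a small offset past the 0xDEADBEEF poison, which is what a
member read through a poisoned et_td pointer would look like. A minimal
sketch in C follows. The helper name is hypothetical, and which struct thread
member sits at that offset is not checked here.

```c
#include <assert.h>
#include <stdint.h>

/* Poison value quoted in the message above. */
#define ET_TD_POISON	0xdeadbeefULL

/*
 * Distance between a faulting address and the poison pointer.
 * A small positive result suggests the fault was a member read
 * through the poisoned pointer rather than an unrelated wild access.
 */
static uint64_t
fault_offset(uint64_t fault_va, uint64_t poison)
{
	assert(fault_va >= poison);
	return (fault_va - poison);
}
```

fault_offset(0xdeadc2ff, ET_TD_POISON) gives 0x410, and the
eva=3735929599 in the original panic report is the same address in decimal.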


--HPS


Re: OpenVPN produces garbage on TAP on -current

2018-08-06 Thread Gleb Popov
On Sun, Aug 5, 2018 at 11:51 PM, Lars Schotte  wrote:

> Here a bit of paste:
> https://paste.fedoraproject.org/paste/Hn4M2JqZ~5xccLWOVD1xUw/raw
> just to illustrate how it does not work.
>
> TAP device works good inside OS (FreeBSD current) however, everything
> that comes over OpenVPN is just garbage.
>

I'm using CURRENT from June 10 and tap device works fine for me with
OpenVPN 2.4.6_1


> --
>  Lars Schotte
>  Mudroňova 13
> 92101 Piešťany
> ___
> freebsd-po...@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-ports
> To unsubscribe, send any mail to "freebsd-ports-unsubscr...@freebsd.org"
>


Re: panic after ifioctl/if_clone_destroy

2018-08-06 Thread Hans Petter Selasky

Hi Roman,

Can you try the attached patch?

--HPS
Index: sys/kern/subr_epoch.c
===
--- sys/kern/subr_epoch.c	(revision 336962)
+++ sys/kern/subr_epoch.c	(working copy)
@@ -232,33 +232,14 @@
 	struct epoch_thread *tdwait;
 	struct turnstile *ts;
 	struct lock_object *lock;
-	int spincount, gen;
 	int locksheld __unused;
 
 	record = __containerof(cr, struct epoch_record, er_record);
 	td = curthread;
 	locksheld = td->td_locks;
-	spincount = 0;
 	counter_u64_add(block_count, 1);
 	if (record->er_cpuid != curcpu) {
 		/*
-		 * If the head of the list is running, we can wait for it
-		 * to remove itself from the list and thus save us the
-		 * overhead of a migration
-		 */
-		if ((tdwait = TAILQ_FIRST(&record->er_tdlist)) != NULL &&
-		    TD_IS_RUNNING(tdwait->et_td)) {
-			gen = record->er_gen;
-			thread_unlock(td);
-			do {
-				cpu_spinwait();
-			} while (tdwait == TAILQ_FIRST(&record->er_tdlist) &&
-			    gen == record->er_gen && TD_IS_RUNNING(tdwait->et_td) &&
-			    spincount++ < MAX_ADAPTIVE_SPIN);
-			thread_lock(td);
-			return;
-		}
-		/*
 		 * Being on the same CPU as that of the record on which
 		 * we need to wait allows us access to the thread
 		 * list associated with that CPU. We can then examine the


Re: panic after ifioctl/if_clone_destroy

2018-08-06 Thread Hans Petter Selasky

Hi,

I think the problem is that the thread pointed to by tdwait has exited. I
would say it is not allowed to peek into another record's threads, because
they may change under the hood and are not protected by the current context.




if (record->er_cpuid != curcpu) {


This optimisation is invalid or needs to be revisited:

	/*
	 * If the head of the list is running, we can wait for it
	 * to remove itself from the list and thus save us the
	 * overhead of a migration
	 */
	if ((tdwait = TAILQ_FIRST(&record->er_tdlist)) != NULL &&
	    TD_IS_RUNNING(tdwait->et_td)) {
		gen = record->er_gen;
		thread_unlock(td);
		do {
			cpu_spinwait();
		} while (tdwait == TAILQ_FIRST(&record->er_tdlist) &&
		    gen == record->er_gen && TD_IS_RUNNING(tdwait->et_td) &&
		    spincount++ < MAX_ADAPTIVE_SPIN);
		thread_lock(td);
		return;
	}



--HPS

On 08/05/18 22:01, Matthew Macy wrote:

If you could give me a self-contained reproducer that would expedite a fix.

Thanks.
-M

On Sun, Aug 5, 2018 at 08:36 Roman Bogorodskiy  wrote:


Running -CURRENT r336863 on amd64. Get the following panic right after
(or during) boot:

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 04
fault virtual address   = 0xdeadc2ff
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x80bd7858
stack pointer   = 0x28:0xfe008b445580
frame pointer   = 0x28:0xfe008b4455c0
code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 903 (libvirtd)

Traceback is:

(kgdb) #0  doadump (textdump=0) at pcpu.h:230
#1  0x8043dc7b in db_dump (dummy=,
 dummy2=, dummy3=,
 dummy4=) at /usr/src/sys/ddb/db_command.c:574
#2  0x8043da49 in db_command (cmd_table=)
 at /usr/src/sys/ddb/db_command.c:481
#3  0x8043d7c4 in db_command_loop ()
 at /usr/src/sys/ddb/db_command.c:534
#4  0x804409ef in db_trap (type=,
 code=) at /usr/src/sys/ddb/db_main.c:252
#5  0x80bdd513 in kdb_trap (type=12, code=0, tf=)
 at /usr/src/sys/kern/subr_kdb.c:693
#6  0x810769f1 in trap_fatal (frame=0xfe008b4454c0, eva=3735929599)
 at /usr/src/sys/amd64/amd64/trap.c:884
#7  0x81076b12 in trap_pfault (frame=0xfe008b4454c0,
 usermode=) at pcpu.h:230
#8  0x8107611a in trap (frame=0xfe008b4454c0)
 at /usr/src/sys/amd64/amd64/trap.c:427
#9  0x810518ac in calltrap ()
 at /usr/src/sys/amd64/amd64/exception.S:230
#10 0x80bd7858 in epoch_block_handler_preempt (
 global=, cr=0xfe00760c3a00,
 arg=) at /usr/src/sys/kern/subr_epoch.c:256
#11 0x803994fd in ck_epoch_synchronize_wait (
 global=0xf800030c5680,
 cb=0x80bd77a0 , ct=0x0)
 at /usr/src/sys/contrib/ck/src/ck_epoch.c:407
#12 0x80bd7630 in epoch_wait_preempt (epoch=0xf800030c5680)
 at /usr/src/sys/kern/subr_epoch.c:389
#13 0x80c983bf in if_delgroup (ifp=0xf80003aab800,
 groupname=0xf80005ff5e00 "bridge") at /usr/src/sys/net/if.c:1514
#14 0x80c9f2b2 in if_clone_destroyif (ifc=0xf80005ff5e00,
 ifp=0xf80003aab800) at /usr/src/sys/net/if_clone.c:325
#15 0x80c9f0d5 in if_clone_destroy (name=0xfe008b4458d0 "virbr0")
 at /usr/src/sys/net/if_clone.c:288
#16 0x80c9a2c3 in ifioctl (so=0xf80007edca38, cmd=2149607801,
 data=, td=)
 at /usr/src/sys/net/if.c:3053
#17 0x80c04259 in kern_ioctl (td=0xf80007c1a580,
 fd=, com=,
 data=) at file.h:330
#18 0x80c03f2e in sys_ioctl (td=0xf80007c1a580,
 uap=0xf80007c1a940) at /usr/src/sys/kern/sys_generic.c:712
#19 0x81077401 in amd64_syscall (td=0xf80007c1a580, traced=0)
 at subr_syscall.c:135
#20 0x8105218d in fast_syscall_common ()
 at /usr/src/sys/amd64/amd64/exception.S:500
#21 0x0008028f4c0a in ?? ()


Previous frame inner to this frame (corrupt stack?)


Current language:  auto; currently minimal


(kgdb)

It looks like panic happens during network interfaces related
operations. Couple of dmesg lines before panic:

Aug  5 19:02:42 romashka rtsold[585]:  interface bridge0 removed
Aug  5 19:02:42 romashka kernel: bridge0: Ethernet address: 02:af:41:48:c7:00
Aug  5 19:02:42 romashka kernel: bridge0: changing name to 'virbr-ab'
Aug  5 19:02:42 romashka kernel: tap0: Ethernet address: 00:bd:8d:11:f7:00
Aug  5 19:02:42 romashka kernel: tap0: link state changed to UP
Aug  5 19:02:42 romashka kernel: tap0: changing name to 'virbr-ab-nic'
Aug