Re: [lfs-support] Booting LFS with systemd

2018-07-13 Thread Hazel Russman
On Fri, 13 Jul 2018 08:26:01 +0100
Ken Moffat  wrote:

> On Fri, Jul 13, 2018 at 12:47:50AM -0400, Michael Shell wrote:
> > I should have realized that since Hazel was working with the full git tree,
> > that he could easily revert any commit within git. Duh!
> > 
> > H ... I think what I was thinking was that I didn't trust the results
> > of the bisection within git (because the identified problem code does not
> > even seem to be active within Hazel's config, and Hazel said he was new to
> > the bisection process). So, I'm not so sure the commit that is believed
> > to be the offender really is the guilty party. I was thinking that Hazel
> > would take the 4.15 source tree (from the official release tarball) and
> > then manually back out the suspect commit - if that works/boots, then it
> > is virtually certain we found the problem area (as we didn't depend on
> > git).
> >   
Gentlemen, I have a confession to make. Over the past two days I have repeated 
the bisect (for safety) and it turns out that the original one was terminated 
prematurely. That's what happens when you have people blindly following 
instructions and not really knowing what they are doing!

The actual "guilty party" is the *next* commit -- which is why the one I 
reported before didn't seem relevant. Here is the final end:

# good: [872cbefd2d9c52bd0b1e2c7942c4369e98a5a5ae] x86/cpu/AMD: Add the Secure Memory Encryption CPU feature
git bisect good 872cbefd2d9c52bd0b1e2c7942c4369e98a5a5ae
# good: [7744ccdbc16f0ac4adae21b3678af93775b3a386] x86/mm: Add Secure Memory Encryption (SME) support
git bisect good 7744ccdbc16f0ac4adae21b3678af93775b3a386
# bad: [33c2b803edd13487518a2c7d5002d84d7e9c878f] x86/mm: Remove phys_to_virt() usage in ioremap()
git bisect bad 33c2b803edd13487518a2c7d5002d84d7e9c878f
# first bad commit: [33c2b803edd13487518a2c7d5002d84d7e9c878f] x86/mm: Remove phys_to_virt() usage in ioremap()
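
A bisection has only finished once the log contains a "# first bad commit:" line like the one above. As a rough sketch (the helper name first_bad_commit is made up for illustration, and the log is assumed to have been saved with "git bisect log > file"), a saved log can be checked like this:

```shell
#!/bin/sh
# first_bad_commit LOGFILE   (hypothetical helper, not a git command)
# Prints the hash from the "# first bad commit: [hash]" line of a saved
# bisect log. Fails if the log has no such line, i.e. the bisection
# was stopped before git could name a single commit.
first_bad_commit() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]\{40\}\)\].*/\1/p' "$1" | grep .
}

# Sample log, copied from the tail of the bisection shown above.
log=$(mktemp)
cat > "$log" <<'EOF'
# good: [7744ccdbc16f0ac4adae21b3678af93775b3a386] x86/mm: Add Secure Memory Encryption (SME) support
git bisect good 7744ccdbc16f0ac4adae21b3678af93775b3a386
# bad: [33c2b803edd13487518a2c7d5002d84d7e9c878f] x86/mm: Remove phys_to_virt() usage in ioremap()
git bisect bad 33c2b803edd13487518a2c7d5002d84d7e9c878f
# first bad commit: [33c2b803edd13487518a2c7d5002d84d7e9c878f] x86/mm: Remove phys_to_virt() usage in ioremap()
EOF

if culprit=$(first_bad_commit "$log"); then
    echo "bisection finished, first bad commit: $culprit"
else
    echo "bisection not finished; keep marking commits good or bad"
fi
```

If the helper fails, the last commit marked "bad" is only an upper bound, not yet the culprit.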

When I do "bisect show" I get:
 "Currently there is a check if the address being mapped is in the ISA range 
(is_ISA_range()), and if it is, then phys_to_virt() is used to perform the 
mapping. When SME is active, the default is to add pagetable mappings with the 
encryption bit set unless specifically overridden. The resulting pagetable 
mapping from phys_to_virt() will result in a mapping that has the encryption 
bit set. With SME, the use of ioremap() is intended to generate pagetable 
mappings that do not have the encryption bit set through the use of the 
PAGE_KERNEL_IO protection value.

Rather than special case the SME scenario, remove the ISA range check and usage 
of phys_to_virt() and have ISA range mappings continue through the remaining 
ioremap() path."

I gather that some remapping that used to be done isn't done any more and 
that's what my machine doesn't like.

I suppose I now need to find the patch and revert it by hand and see what that 
does. I plan to do that today. Thank you for all your help so far.
-- 
-- 
http://lists.linuxfromscratch.org/listinfo/lfs-support
FAQ: http://www.linuxfromscratch.org/blfs/faq.html
Unsubscribe: See the above information page

Do not top post on this list.

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

http://en.wikipedia.org/wiki/Posting_style


Re: [lfs-support] Booting LFS with systemd

2018-07-13 Thread Frans de Boer

On 13-07-18 16:24, Michael Shell wrote:
> On Fri, 13 Jul 2018 09:35:24 -0400
> Michael Shell  wrote:
>
> > what exactly did gdb say about systemd's crash?
>
> And FWIW, command output can be logged to a file as well as displayed
> on the screen at the same time via the use of tee:
>
> gdb /bin/program | tee gdb_log.txt
>
> Actually, from
>
> https://www.linuxquestions.org/questions/linux-software-2/bash-how-to-redirect-output-to-file-and-still-have-it-on-screen-412611/
>
> it is even better to also redirect stderr and use a subshell to avoid
> order problems due to buffering:
>
> (gdb /bin/program 2>&1) | tee gdb_log.txt
>
> Then you can interact with gdb as needed and a copy of the
> "conversation" will be in gdb_log.txt.
>
>   Cheers,
>
>   Mike

In order to use gdb, I need to compile it in. However, I am now stuck at
glibc not compiling when following the LFS instructions in chapter six
exactly.


So, I need that to be fixed first, then I need tcl, expect, dejagnu and
gdb to be compiled in to even load it.


--- Frans.


Re: [lfs-support] Booting LFS with systemd

2018-07-13 Thread Michael Shell
On Fri, 13 Jul 2018 09:35:24 -0400
Michael Shell  wrote:

> what exactly did gdb say about systemd's crash?


And FWIW, command output can be logged to a file as well as displayed
on the screen at the same time via the use of tee:

gdb /bin/program | tee gdb_log.txt

Actually, from

https://www.linuxquestions.org/questions/linux-software-2/bash-how-to-redirect-output-to-file-and-still-have-it-on-screen-412611/

it is even better to also redirect stderr and use a subshell to avoid
order problems due to buffering:

(gdb /bin/program 2>&1) | tee gdb_log.txt

Then you can interact with gdb as needed and a copy of the
"conversation" will be in gdb_log.txt.

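To see the effect of the 2>&1 redirection without involving gdb, here is a small sketch. The function noisy_program is a made-up stand-in that writes one line to stdout and one to stderr; with the redirection inside the subshell, both lines end up in the log as well as on the screen:

```shell
#!/bin/sh
# Hypothetical stand-in for an interactive program such as gdb:
# one line goes to stdout, one to stderr.
noisy_program() {
    echo "normal output"
    echo "error output" >&2
}

log_file=$(mktemp)

# Merge stderr into stdout inside the subshell, then let tee copy the
# combined stream to both the terminal and the log file.
(noisy_program 2>&1) | tee "$log_file"

echo "log written to $log_file"
```

Without the 2>&1, the "error output" line would still appear on the terminal but would be missing from the log file.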

 Cheers,

 Mike




Re: [lfs-support] Booting LFS with systemd

2018-07-13 Thread Michael Shell
On Fri, 13 Jul 2018 09:35:24 -0400
Michael Shell  wrote:

> In any case, either by changing the m4/binutils build order, or
> adding the symlink, you can compile glibc successfully, right?


I read Bruce's old post too quickly. That symlink fix was for building
the newer binutils, not glibc. Actually, looking over those posts,
I'm still not sure what caused Bruce's glibc build problem.

Anyway, I think Frans' glibc build problem is due to not setting up
the chroot environment properly:

http://www.linuxfromscratch.org/lfs/view/development/chapter06/chroot.html

At the time glibc is built, what does

echo $PATH

say? It should contain /tools/bin and so m4 will be found.
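
For a less eyeball-dependent check, something along these lines could be run inside the chroot just before building glibc. It is only a sketch: path_contains is a helper name invented here, not part of the LFS book.

```shell
#!/bin/sh
# path_contains DIR [PATHSTRING]   (hypothetical helper)
# Succeeds if DIR is one of the colon-separated entries of PATHSTRING,
# which defaults to the current $PATH.
path_contains() {
    case ":${2-$PATH}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if path_contains /tools/bin; then
    echo "/tools/bin is in PATH"
else
    echo "/tools/bin is NOT in PATH"
fi

# command -v prints the path the shell would actually run for m4, if any.
command -v m4 || echo "m4 not found in PATH"
```

If the last line reports that m4 is not found, bison will fail in the same way as in the glibc build error.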


  Cheers,

  Mike


Re: [lfs-support] Booting LFS with systemd

2018-07-13 Thread Michael Shell
On Fri, 13 Jul 2018 14:28:12 +0200
Frans de Boer  wrote:

> In an effort to understand why systemd crashes and having a message
> that there is a segfault in glibc while booting, I tried to
> recompile all again. Now I can't even compile glibc.


  Frans,

We kind of leaped over some steps/info here. What exactly did gdb reveal
about glibc when systemd crashed?

As to why you can't recompile glibc, I found this:

http://lfs-dev.linuxfromscratch.narkive.com/EqIzQ6w0/glibc-2-27

where Bruce said, 
  "That was a fix for binutils. We moved the build order to have m4
   before binutils. That change should not be needed. So, we have
   to (now) build m4 before binutils"

That m4 is listed before binutils can be seen in the development tree:
http://www.linuxfromscratch.org/lfs/view/development/

In any case, either by changing the m4/binutils build order, or
adding the symlink, you can compile glibc successfully, right?

Again, what exactly did gdb say about systemd's crash?


  Cheers,

  Mike



Re: [lfs-support] Booting LFS with systemd

2018-07-13 Thread Frans de Boer

On 06-07-18 08:23, Frans de Boer wrote:
> On 07/06/2018 05:32 AM, Michael Shell wrote:
> > On Thu, 5 Jul 2018 21:48:16 +0200
> > Frans de Boer  wrote:
> >
> > > I had even rebuilt everything with systemd-232, and that worked as
> > > before. But after 232, things started to behave strangely. No way to
> > > debug systemd, whatever I do
> >
> >     Frans,
> >
> > That's the whole point of being able to start the system with a shell
> > - so that systemd's startup, or failure thereof, can then be debugged
> > manually. What happened when you booted to shell and then tried to
> > start systemd manually?
> >
> > init=/bin/bash
> > mount -o remount,rw /
> >
> > Then, at the bash prompt, you want to try to start systemd manually.
> > You'll also want to first make sure you get a core file if/when it
> > crashes:
> >
> > echo "core" > /proc/sys/kernel/core_pattern
> > ulimit -c unlimited
> >
> > /usr/lib/systemd/systemd
> >
> > With the above, does systemd crash and yield a core file?
> >
> > Does
> >
> > dmesg
> >
> > show any relevant error messages?
> >
> > If you get a core file, you can run gdb on systemd using the core
> > file:
> >
> > gdb -c core /usr/lib/systemd/systemd
> >
> > then what does the gdb backtrace reveal:
> >
> > (gdb) bt
> >
> > You can also try gdb on systemd without the core:
> >
> > gdb /usr/lib/systemd/systemd
> > (gdb) run
> > (gdb) bt
> >
> > If I had to bet at this point, my money would go on the theory that
> > your kernel is lacking support for something systemd (now) needs.
> > You can find a current list of systemd kernel config requirements
> > here:
> >
> > https://cgit.freedesktop.org/systemd/systemd/tree/README
> >
> > Note also, some kernel features must be *disabled*, e.g.,
> > CONFIG_SYSFS_DEPRECATED=n
> >
> > Also, "systemd requires that the /run mount point exists.
> >     systemd also requires that /var/run is a symlink to
> >     /run "
> >
> >     Cheers,
> >
> >     Mike
>
> Hi Mike,
> I will follow your suggestions, of which few are new to me, and will
> come back with a report.
>
> --- Frans
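
The kernel-config and /run requirements quoted above lend themselves to a quick scripted check. This is only a sketch under some assumptions: kconfig_disabled is a helper name invented here, and the config file location /boot/config-$(uname -r) is a guess that can be overridden through the (equally made-up) KCONFIG variable.

```shell
#!/bin/sh
# kconfig_disabled FILE OPTION   (hypothetical helper)
# Succeeds when OPTION is not enabled in the kernel config FILE,
# i.e. there is no "OPTION=y" or "OPTION=m" line.
kconfig_disabled() {
    ! grep -Eq "^$2=(y|m)\$" "$1"
}

# Assumed location of the running kernel's config; adjust as needed.
config=${KCONFIG:-/boot/config-$(uname -r)}

if [ -r "$config" ]; then
    if kconfig_disabled "$config" CONFIG_SYSFS_DEPRECATED; then
        echo "CONFIG_SYSFS_DEPRECATED is disabled (good)"
    else
        echo "CONFIG_SYSFS_DEPRECATED is enabled; it must be disabled"
    fi
else
    echo "no readable kernel config at $config"
fi

# /run must exist, and /var/run should be a symlink to it.
if [ -d /run ]; then echo "/run exists"; else echo "/run is missing"; fi
if [ -L /var/run ]; then
    echo "/var/run is a symlink -> $(readlink /var/run)"
else
    echo "/var/run is not a symlink"
fi
```

The options that must be enabled are listed in the systemd README linked above; the script deliberately checks only the one example named in the quote.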

I get the following error:

...
bison --yacc --name-prefix=__gettext --output /sources-lfs/glibc-2.27/glibc-build/intl/plural.c plural.y
bison: m4 subprocess failed: No such file or directory
make[2]: *** [Makefile:46: /sources-lfs/glibc-2.27/glibc-build/intl/plural.c] Error 1

make[2]: Leaving directory '/sources-lfs/glibc-2.27/intl'
make[1]: *** [Makefile:215: intl/subdir_lib] Error 2
make[1]: Leaving directory '/sources-lfs/glibc-2.27'
make: *** [Makefile:9: all] Error 2

If I include 'ln -sfv /tools/bin/m4 /usr/bin' as suggested some time
ago, I can compile glibc. In an effort to understand why systemd crashes
and having seen a message that there is a segfault in glibc while booting,
I tried to recompile everything again. Now I can't even compile glibc.


Is this a result of some modification in the tool chain, or is the
documentation not up to date?


--- Frans.


Re: [lfs-support] Booting LFS with systemd

2018-07-13 Thread Ken Moffat
On Fri, Jul 13, 2018 at 12:47:50AM -0400, Michael Shell wrote:
> On Thu, 12 Jul 2018 08:41:23 +0100
> Ken Moffat  wrote:
> 
> > And the generic command is probably 'git revert 7744ccdbc16f' but
> > since I'm not currently bisecting, I'm not sure what state that
> > would leave things in.
> 
> 
>   Ken,
> 
> I should have realized that since Hazel was working with the full git tree,
> that he could easily revert any commit within git. Duh!
> 
> H ... I think what I was thinking was that I didn't trust the results
> of the bisection within git (because the identified problem code does not
> even seem to be active within Hazel's config, and Hazel said he was new to
> the bisection process). So, I'm not so sure the commit that is believed
> to be the offender really is the guilty party. I was thinking that Hazel
> would take the 4.15 source tree (from the official release tarball) and
> then manually back out the suspect commit - if that works/boots, then it
> is virtually certain we found the problem area (as we didn't depend on
> git).
> 

At the risk of teaching people to suck eggs ...

If someone has the git tree (either Greg's stable tree for a
particular minor version, or Linus's tree for .0) the suspected
commit can be viewed with

 git show 7744ccdbc16f

(in stable trees, backported commits will have a different hash).

Assuming that command does show the suspected commit,

 git show 7744ccdbc16f >/tmp/suspect1

then view /tmp/suspect1 to confirm it, and

 cat /tmp/suspect1 | patch -p1 -R --dry-run

(other invocations are available, but I like pipes)

to see if it will revert cleanly.  If it does, revert it, rebuild
the kernel, see if it fixes the problem.
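
For anyone who wants to try the dry-run-then-revert sequence without risking a kernel tree, here is a throwaway sketch on a scratch directory. The file names and contents are invented for the demonstration; only the patch options (-p1, -R, --dry-run) are the ones used above.

```shell
#!/bin/sh
set -e

# Skip quietly when the required tools are absent.
for tool in diff patch; do
    command -v "$tool" >/dev/null 2>&1 || { echo "$tool not found, skipping"; exit 0; }
done

work=$(mktemp -d)
mkdir -p "$work/a/src" "$work/b/src" "$work/tree/src"
cd "$work"

# "a" holds the original file, "b" the changed one (made-up content).
printf 'line one\nline two\nline three\n'         > a/src/demo.txt
printf 'line one\nline two changed\nline three\n' > b/src/demo.txt

# diff exits with status 1 when the files differ, which is expected here.
diff -u a/src/demo.txt b/src/demo.txt > suspect.patch || true

# The working tree starts in the "patched" state, like a kernel tree
# that already contains the suspect commit.
cp b/src/demo.txt tree/src/demo.txt
cd tree

# Step 1: check that the reversal would apply, without touching files.
patch -p1 -R --dry-run < ../suspect.patch

# Step 2: reached only if the dry run succeeded (set -e); really reverse it.
patch -p1 -R < ../suspect.patch

# The tree should now match the original file again.
cmp src/demo.txt ../a/src/demo.txt && echo "reverted cleanly"
```

In a real kernel tree the patch file would come from "git show" as described above, and a failing dry run is the signal to fix things up by hand.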

> And, if we find the problem commit, the next step might be to manually
> change things in the source until we home in on the offending *line*,
> if possible. So, we'd be manually tweaking the source at some point
> anyway.
> 
> 
> With regard to reversing a patch (without any use of git), is it for
> certain that the -R option of patch can be used to reverse *any*
> patch file?
> 
> https://www.drupal.org/patch/reverse
> 

No.  For context diffs in the kernel, there have been several
instances where a hunk (of an update, but the principle is the same)
gets applied to similar code elsewhere in the file.  Those cases
were with git, but I'm sure patch will do the same.  But things like
that are uncommon.

Backing up the file before reversing (modern patch will often do
this, creating a .orig version - I think it depends on the amount of
fuzz) and then diffing the two and comparing to the patch, and
perhaps looking at more details in the before and after files, should
be able to catch this.

Also, if too much has changed since the patch was created then it
cannot be reversed.  But I'm sure you knew that.  In those cases
(e.g. a function got changed, or extra code added in the middle of
what is being changed) it needs to be fixed up manually.
Unfortunately, that is fairly common when trying to apply our own,
or distro, patches in BLFS or beyond-BLFS.

> If it is not universally possible to create a reverse patch using only
> the information in a patch file, then I'd say that that is an oversight
> in the design of diff.
> 
> 
> There is a lot more info on the topic of reversing patches here:
> 
> https://stackoverflow.com/questions/3902388/permanently-reversing-a-patch-file
> 
> 
> The interdiff utility and the patchutils package are new to me:
> 
> http://cyberelk.net/tim/software/patchutils/
> 
> ( simple standard install: ./configure --prefix=/usr
>   configure looks for perl and xmlto )
> 
> Interdiff can create an "inverse" patch file via:
> 
> interdiff file.patch /dev/null > reversed.patch
> 
> The resulting reverse patch looks good to me, but when I tried a dry
> run:
> 
> patch --dry-run -p1 -i 
> ../tip-x86-mm-x86-mm-Add-Secure-Memory-Encryption-SME-support_reverse.patch
> 
> I got:
> 
> checking file arch/x86/Kconfig
> Reversed (or previously applied) patch detected!  Assume -R? [n]
> 
> I don't understand why it would think the patch had been already applied
> as the patch is supposed to *delete* code that is indeed in my Kconfig file.
> I think the problem might be because the specific kernel tree I tried it on
> has the context lines at 1436 rather than the 1415 specified by the patch.
> 

Dunno, but in beyond-BLFS, or trying to apply a BLFS patch to a
newer version, I often get that.  Sometimes I can see differences
which make me go "yes, changed code", other times it all looks
unaltered but still rejects.  In either case, manually apply the
.rej items with any necessary changes, then remove the .orig and
.rej, rediff, and test to see if it builds (and if it does, test to
see if it works).

In theory, just moving a block of code by a few lines should only
increase the fuzz.  But recent versions of patch seem to be more
sensitive.

> I've attached a copy of the interdiff created
> 
> tip-x86-mm-x86-mm-Add-Secure-Me