Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-06-01 Thread Matthew Garrett
On Sat, Jun 01, 2013 at 08:01:59PM +0900, Linus Torvalds wrote:
> On Fri, 31 May 2013, Matthew Garrett wrote:
> > 
> > I agree that a revert is probably the right thing to do here, but the 
> > original patch was there to permit a more accurate calculation of the 
> > amount of nvram in use, not to provide additional debug information. 
> > Reverting it is going to differently break a different set of systems
> 
> So "differently break" doesn't matter, if it's old breakage, and people 
> thus don't really expect it to work. We need to fix bugs without *new* 
> breakage, and quite frankly, I have been distressed by hearing the EFI 
> "specifications" mentioned so many times in this thread.

The old breakage has only ever been present in -rcs, not in released 
kernels. We worked around the worst of it with the patchset that's 
causing the problems. We can avoid that old breakage by reverting the 
patches that introduced it, but that gets us back to bricking some 
machines.

> Anyway, I'm traveling with absolutely horrendous internet access, so can 
> somebody please send a description of the revert with the relevant 
> information, because I literally have a hard time extracting it all from 
> this thread because my email access is so slow and flaky... Make it easy 
> for me to do the revert with a good explanation message, please,

I've finally got some information from Samsung that *should* let us fix 
this without anything else breaking in the process, but since we're 
dealing with firmware there's obviously a risk that that won't actually 
be true. I'll send that later today.

-- 
Matthew Garrett | mj...@srcf.ucam.org
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-06-01 Thread Linus Torvalds


On Fri, 31 May 2013, Matthew Garrett wrote:
> 
> I agree that a revert is probably the right thing to do here, but the 
> original patch was there to permit a more accurate calculation of the 
> amount of nvram in use, not to provide additional debug information. 
> Reverting it is going to differently break a different set of systems

So "differently break" doesn't matter, if it's old breakage, and people 
thus don't really expect it to work. We need to fix bugs without *new* 
breakage, and quite frankly, I have been distressed by hearing the EFI 
"specifications" mentioned so many times in this thread.

Firmware specs are pure and utter garbage. They are irrelevant. Firmware 
is buggy, and will always be buggy. The "spec" doesn't matter. We should 
use firmware for loading the kernel, and as little else as humanly 
possible.

I'm very disappointed in how the EFI code doesn't seem to understand that. 
There's tons of these stupid EFI variable crap that simply shouldn't 
matter. Quite frankly, we'd be better off ignoring as much of it by 
default as at all possible. Exactly because the more of an EFI interface 
we have, the more we open us up to th einevitable firmware bugs.

Anyway, I'm traveling with absolutely horrendous internet access, so can 
somebody please send a description of the revert with the relevant 
information, because I literally have a hard time extracting it all from 
this thread because my email access is so slow and flaky... Make it easy 
for me to do the revert with a good explanation message, please,

   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-06-01 Thread Linus Torvalds


On Fri, 31 May 2013, Matthew Garrett wrote:
 
 I agree that a revert is probably the right thing to do here, but the 
 original patch was there to permit a more accurate calculation of the 
 amount of nvram in use, not to provide additional debug information. 
 Reverting it is going to differently break a different set of systems

So differently break doesn't matter, if it's old breakage, and people 
thus don't really expect it to work. We need to fix bugs without *new* 
breakage, and quite frankly, I have been distressed by hearing the EFI 
specifications mentioned so many times in this thread.

Firmware specs are pure and utter garbage. They are irrelevant. Firmware 
is buggy, and will always be buggy. The spec doesn't matter. We should 
use firmware for loading the kernel, and as little else as humanly 
possible.

I'm very disappointed in how the EFI code doesn't seem to understand that. 
There's tons of these stupid EFI variable crap that simply shouldn't 
matter. Quite frankly, we'd be better off ignoring as much of it by 
default as at all possible. Exactly because the more of an EFI interface 
we have, the more we open us up to th einevitable firmware bugs.

Anyway, I'm traveling with absolutely horrendous internet access, so can 
somebody please send a description of the revert with the relevant 
information, because I literally have a hard time extracting it all from 
this thread because my email access is so slow and flaky... Make it easy 
for me to do the revert with a good explanation message, please,

   Linus
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-06-01 Thread Matthew Garrett
On Sat, Jun 01, 2013 at 08:01:59PM +0900, Linus Torvalds wrote:
 On Fri, 31 May 2013, Matthew Garrett wrote:
  
  I agree that a revert is probably the right thing to do here, but the 
  original patch was there to permit a more accurate calculation of the 
  amount of nvram in use, not to provide additional debug information. 
  Reverting it is going to differently break a different set of systems
 
 So differently break doesn't matter, if it's old breakage, and people 
 thus don't really expect it to work. We need to fix bugs without *new* 
 breakage, and quite frankly, I have been distressed by hearing the EFI 
 specifications mentioned so many times in this thread.

The old breakage has only ever been present in -rcs, not in released 
kernels. We worked around the worst of it with the patchset that's 
causing the problems. We can avoid that old breakage by reverting the 
patches that introduced it, but that gets us back to bricking some 
machines.

 Anyway, I'm traveling with absolutely horrendous internet access, so can 
 somebody please send a description of the revert with the relevant 
 information, because I literally have a hard time extracting it all from 
 this thread because my email access is so slow and flaky... Make it easy 
 for me to do the revert with a good explanation message, please,

I've finally got some information from Samsung that *should* let us fix 
this without anything else breaking in the process, but since we're 
dealing with firmware there's obviously a risk that that won't actually 
be true. I'll send that later today.

-- 
Matthew Garrett | mj...@srcf.ucam.org
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Matthew Garrett
On Fri, May 31, 2013 at 11:20:59PM -0500, Russ Anderson wrote:

> And when QueryVariableInfo reports insufficient space, don't write
> and everything is fine.

And then installs fail, which is why we implemented this additional 
code.

> But when QueryVariableInfo reports insufficient space and you
> write anyway, the system can get bricked.  Is that the problem?

No.

> Some bios implementation may be "*including garbage*", but can
> you definitively say ALL do?  The spec explicitly says "the
> implementation of variable storage is not defined in this
> specification".  You can't assume any form of garbage collection.

No. We can't assume that. But nor can we rely upon the spec, because 
behaving according to the spec results in dead computers. So we're 
forced to write code that behaves according to reality rather than 
paper, which is unfortunate, but if *your* firmware behaved according to 
the specification then you wouldn't be seeing any problems with the 
current code, so I think there's a lesson for us all here.

> I don't see anything in here about the OS being free to use
> more nvram than QueryVariableInfo() reported as being available.
> What happened to following the spec?

That would be the same spec that tells you not to use physical addresses 
after SetVirtualAddressMap() has been called, right? Regardless, we 
don't use more space than QueryVariableInfo() reports as being 
available. We would expect that any attempt to do so would fail. But nor 
can we assume that we're free to use as much space as 
QueryVariableInfo() *does* report, because doing so results in dead 
computers. So instead we're forced to rely on workarounds that happen to 
break your broken firmware, so now we're going to have to find a 
different set of workarounds and hope that they don't break someone 
else's computer instead.

> > There's a limited number of situations where the firmware can write to 
> > nvram, but they're basically uninteresting. Practically speaking, it's 
> > always via the kernel once ExitBootServices() has been called.
> 
> How can you say "they're basically uninteresting"???
> Do you know what EVERY bios implementation does???

We have to assume that firmware doesn't behave in a pathological manner, 
because the alternative is to give up entirely and declare that Linux 
doesn't run on UEFI systems. I suspect that that would be problematic 
for your employers.

-- 
Matthew Garrett | mj...@srcf.ucam.org
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Russ Anderson
On Sat, Jun 01, 2013 at 01:03:11AM +0100, Matthew Garrett wrote:
> On Fri, May 31, 2013 at 05:57:31PM -0500, Russ Anderson wrote:
> > On Fri, May 31, 2013 at 05:28:16PM +0100, Matthew Garrett wrote:
> > >   If nvram becaomes full, some 
> > > systems crash during firmware initialisation. So we can't let nvram 
> > > become full. The obvious thing to do here is to look at the values from 
> > > QueryVariableInfo, but many systems won't perform any garbage collection 
> > > until they're almost out of space and so variables that have been 
> > > deleted still show up as used space.
> > 
> > OK.  I get nvram looks full due to lack of garbage collection
> > on some systems.  Does QueryVariableInfo (at runtime) tell you
> > it is full?  Is the problem that it says it is full when it
> > is not, or does not tell you it is full when it is?
> 
> QueryVariableInfo reports the amount used *including garbage*. This 
> makes it useless for determining how much space is available for the 
> firmware to use.

So when QueryVariableInfo reports there is space available, the
write works and everything is fine.

And when QueryVariableInfo reports insufficient space, don't write
and everything is fine.

But when QueryVariableInfo reports insufficient space and you
write anyway, the system can get bricked.  Is that the problem?



According to the UEFI spec for QueryVariableInfo()

  RemainingVariableStorageSize
  Returns the remaining size of the storage
  space available for EFI variables associated
  with the attributes specified.

It also says:

   The returned MaximumVariableStorageSize, RemainingVariableStorageSize,
   MaximumVariableSize information may change immediately after
   the call based on other runtime activities including asynchronous
   error events. Also, these values associated with different
   attributes are not additive in nature.

Note that "other runtime activities including asynchronous error
events."  That means runtime services may be using nvram without
the OS knowing about it.


Some bios implementation may be "*including garbage*", but can
you definitively say ALL do?  The spec explicitly says "the
implementation of variable storage is not defined in this
specification".  You can't assume any form of garbage collection.

   7.2 Variable Services

   Although the implementation of variable storage is not defined
   in this specification, variables must be persistent in most cases.
   This implies that the EFI implementation on a platform must arrange
   it so that variables passed in for storage are retained and
   available for use each time the system boots, at least until they
   are explicitly deleted or overwritten. Provision of this type of
   nonvolatile storage may be very limited on some platforms, so
   variables should be used sparingly in cases where other means of
   communicating information cannot be used.

I don't see anything in here about the OS being free to use
more nvram than QueryVariableInfo() reported as being available.
What happened to following the spec?


> > >   We can work around that by adding 
> > > up the size of the variables ourselves, but that only gives us the value 
> > > for runtime-visible variables. We also need to know how much space is 
> > > used by variables that are only visible during boot,
> > 
> > Is it valid to assume that only the kernel writes to nvram at
> > runtime?  Can bios log error information to nvram on an SMM
> > interrupt (for example)?
> 
> There's a limited number of situations where the firmware can write to 
> nvram, but they're basically uninteresting. Practically speaking, it's 
> always via the kernel once ExitBootServices() has been called.

How can you say "they're basically uninteresting"???
Do you know what EVERY bios implementation does???


> > >  hence calling 
> > > QueryVariableInfo before ExitBootServices.
> > 
> > Correct me if I am wrong, but that is called from EFI stubs,
> > which is only used by some bootloaders (sometimes grub2).
> > Does that mean EFI/grub and EFI/elilo will not hit the problem,
> > or will not have your fix?
> 
> Will not have the fix. Boot EFI systems via the EFI stub.

Thanks for that clarification.

> Boot EFI systems via the EFI stub.

Fortunately all our shipping systems are EFI/grub and EFI/elilo
right now, so they boot.

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Matthew Garrett
On Fri, May 31, 2013 at 05:57:31PM -0500, Russ Anderson wrote:
> On Fri, May 31, 2013 at 05:28:16PM +0100, Matthew Garrett wrote:
> >   If nvram becaomes full, some 
> > systems crash during firmware initialisation. So we can't let nvram 
> > become full. The obvious thing to do here is to look at the values from 
> > QueryVariableInfo, but many systems won't perform any garbage collection 
> > until they're almost out of space and so variables that have been 
> > deleted still show up as used space.
> 
> OK.  I get nvram looks full due to lack of garbage collection
> on some systems.  Does QueryVariableInfo (at runtime) tell you
> it is full?  Is the problem that it says it is full when it
> is not, or does not tell you it is full when it is?

QueryVariableInfo reports the amount used *including garbage*. This 
makes it useless for determining how much space is available for the 
firmware to use.

> >   We can work around that by adding 
> > up the size of the variables ourselves, but that only gives us the value 
> > for runtime-visible variables. We also need to know how much space is 
> > used by variables that are only visible during boot,
> 
> Is it valid to assume that only the kernel writes to nvram at
> runtime?  Can bios log error information to nvram on an SMM
> interrupt (for example)?

There's a limited number of situations where the firmware can write to 
nvram, but they're basically uninteresting. Practically speaking, it's 
always via the kernel once ExitBootServices() has been called.

> >  hence calling 
> > QueryVariableInfo before ExitBootServices.
> 
> Correct me if I am wrong, but that is called from EFI stubs,
> which is only used by some bootloaders (sometimes grub2).
> Does that mean EFI/grub and EFI/elilo will not hit the problem,
> or will not have your fix?

Will not have the fix. Boot EFI systems via the EFI stub.

-- 
Matthew Garrett | mj...@srcf.ucam.org
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Jiri Kosina
On Fri, 31 May 2013, Russ Anderson wrote:

> OK.  I get nvram looks full due to lack of garbage collection
> on some systems.  Does QueryVariableInfo (at runtime) tell you
> it is full?  Is the problem that it says it is full when it
> is not, or does not tell you it is full when it is?

We are trying to count the size occupied by existing UEFI variables. 
QueryVariableInfo() called from runtime environment doesn't give a full 
picture, as it doesn't provide information about boot-only variables.

Hence the patch.

> > up the size of the variables ourselves, but that only gives us the value 
> > for runtime-visible variables. We also need to know how much space is 
> > used by variables that are only visible during boot,
> 
> Is it valid to assume that only the kernel writes to nvram at
> runtime?  

Well, mostly yes. But if the kernel doesn't know what the initial state 
is, it can't count the size properly.

> > > Which means the previous patch(es) that caused the bricking should
> > > get pulled, too.
> > 
> > There are no patches that cause the bricking.
> 
> I thought that was the problem you were trying to avoid.

The problem that needs to be avoided is machines turning into bricks if 
EFI storage nvram is getting full.

-- 
Jiri Kosina
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread H. Peter Anvin
On 05/31/2013 03:57 PM, Russ Anderson wrote:
> 
>>> Which means the previous patch(es) that caused the bricking should
>>> get pulled, too.
>>
>> There are no patches that cause the bricking.
> 
> I thought that was the problem you were trying to avoid.
> 

There are no *patches* that cause the bricking.  We have been trying to
figure out how to *avoid* the bricking.

-hpa

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Russ Anderson
On Fri, May 31, 2013 at 05:28:16PM +0100, Matthew Garrett wrote:
> On Fri, May 31, 2013 at 10:43:49AM -0500, Russ Anderson wrote:
> 
> > When did writing EFI variables to nvram become necessary to boot on 
> > UEFI? And if it is necessary, why is it that only linux boot loaders 
> > that use EFI stubs (generally grub2) need it?  The current kernel 
> > boots using EFI/grub and EFI/elilo.  It is just when EFI stubs are 
> > used that the boot fails.
> 
> I think you've misunderstood the problem.

The problem for me is the system not booting.  I was not
familiar with the problem you are trying to solve, but
have now looked at the links.

>   If nvram becaomes full, some 
> systems crash during firmware initialisation. So we can't let nvram 
> become full. The obvious thing to do here is to look at the values from 
> QueryVariableInfo, but many systems won't perform any garbage collection 
> until they're almost out of space and so variables that have been 
> deleted still show up as used space.

OK.  I get nvram looks full due to lack of garbage collection
on some systems.  Does QueryVariableInfo (at runtime) tell you
it is full?  Is the problem that it says it is full when it
is not, or does not tell you it is full when it is?

>   We can work around that by adding 
> up the size of the variables ourselves, but that only gives us the value 
> for runtime-visible variables. We also need to know how much space is 
> used by variables that are only visible during boot,

Is it valid to assume that only the kernel writes to nvram at
runtime?  Can bios log error information to nvram on an SMM
interrupt (for example)?

>  hence calling 
> QueryVariableInfo before ExitBootServices.

Correct me if I am wrong, but that is called from EFI stubs,
which is only used by some bootloaders (sometimes grub2).
Does that mean EFI/grub and EFI/elilo will not hit the problem,
or will not have your fix?


> > Which means the previous patch(es) that caused the bricking should
> > get pulled, too.
> 
> There are no patches that cause the bricking.

I thought that was the problem you were trying to avoid.

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread James Bottomley
On Fri, 2013-05-31 at 17:28 +0100, Matthew Garrett wrote:
> On Fri, May 31, 2013 at 10:43:49AM -0500, Russ Anderson wrote:
> 
> > When did writing EFI variables to nvram become necessary to boot on 
> > UEFI? And if it is necessary, why is it that only linux boot loaders 
> > that use EFI stubs (generally grub2) need it?  The current kernel 
> > boots using EFI/grub and EFI/elilo.  It is just when EFI stubs are 
> > used that the boot fails.
> 
> I think you've misunderstood the problem. If nvram becaomes full, some 
> systems crash during firmware initialisation. So we can't let nvram 
> become full. The obvious thing to do here is to look at the values from 
> QueryVariableInfo, but many systems won't perform any garbage collection 
> until they're almost out of space and so variables that have been 
> deleted still show up as used space. We can work around that by adding 
> up the size of the variables ourselves, but that only gives us the value 
> for runtime-visible variables. We also need to know how much space is 
> used by variables that are only visible during boot, hence calling 
> QueryVariableInfo before ExitBootServices.
> 
> > Which means the previous patch(es) that caused the bricking should
> > get pulled, too.
> 
> There are no patches that cause the bricking.

This is the description of the original problem:

http://www.h-online.com/open/news/item/Booting-Linux-using-UEFI-can-brick-Samsung-laptops-1793958.html

And the further investigation:

http://mjg59.dreamwidth.org/22855.html

If you read the latter, it shows you why we have to use
QueryVariableInfo to try to work out how much space we have available.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Matthew Garrett
On Fri, May 31, 2013 at 10:43:49AM -0500, Russ Anderson wrote:

> When did writing EFI variables to nvram become necessary to boot on 
> UEFI? And if it is necessary, why is it that only linux boot loaders 
> that use EFI stubs (generally grub2) need it?  The current kernel 
> boots using EFI/grub and EFI/elilo.  It is just when EFI stubs are 
> used that the boot fails.

I think you've misunderstood the problem. If nvram becaomes full, some 
systems crash during firmware initialisation. So we can't let nvram 
become full. The obvious thing to do here is to look at the values from 
QueryVariableInfo, but many systems won't perform any garbage collection 
until they're almost out of space and so variables that have been 
deleted still show up as used space. We can work around that by adding 
up the size of the variables ourselves, but that only gives us the value 
for runtime-visible variables. We also need to know how much space is 
used by variables that are only visible during boot, hence calling 
QueryVariableInfo before ExitBootServices.

> Which means the previous patch(es) that caused the bricking should
> get pulled, too.

There are no patches that cause the bricking.

-- 
Matthew Garrett | mj...@srcf.ucam.org
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Russ Anderson
On Fri, May 31, 2013 at 03:48:26PM +0100, Matthew Garrett wrote:
> On Fri, May 31, 2013 at 07:42:37AM -0700, James Bottomley wrote:
> > On Fri, 2013-05-31 at 15:34 +0100, Matthew Garrett wrote:
> > > I agree that a revert is probably the right thing to do here, but the 
> > > original patch was there to permit a more accurate calculation of the 
> > > amount of nvram in use, not to provide additional debug information. 
> > > Reverting it is going to differently break a different set of systems
> > 
> > The only ones that are broken are the Samsung ones.  Samsung claims to
> > have fixed their UEFI firmware, so we could refer any problems to them.
> 
> No, reverting this gets us back to the old state of refusing any writes 
> if more than 50% of the variable store *appears* to be used, regardless 
> of whether it's actually used. Which, unfortunately, makes it impossible 
> to install Linux on most UEFI machines.

When did writing EFI variables to nvram become necessary to boot
on UEFI?  And if it is necessary, why is it that only linux boot
loaders that use EFI stubs (generally grub2) need it?  The current
kernel boots using EFI/grub and EFI/elilo.  It is just when
EFI stubs are used that the boot fails.

I'm missing the background on why linux needs to write so many
EFI variables to nvram that it fills up nvram.  What is that
all about?

>  In any case, Samsung clearly 
> haven't fixed this problem on a pile of machines that have already 
> shipped.

Which means the previous patch(es) that caused the bricking should
get pulled, too.


-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Matthew Garrett
On Fri, May 31, 2013 at 07:42:37AM -0700, James Bottomley wrote:
> On Fri, 2013-05-31 at 15:34 +0100, Matthew Garrett wrote:
> > I agree that a revert is probably the right thing to do here, but the 
> > original patch was there to permit a more accurate calculation of the 
> > amount of nvram in use, not to provide additional debug information. 
> > Reverting it is going to differently break a different set of systems
> 
> The only ones that are broken are the Samsung ones.  Samsung claims to
> have fixed their UEFI firmware, so we could refer any problems to them.

No, reverting this gets us back to the old state of refusing any writes 
if more than 50% of the variable store *appears* to be used, regardless 
of whether it's actually used. Which, unfortunately, makes it impossible 
to install Linux on most UEFI machines. In any case, Samsung clearly 
haven't fixed this problem on a pile of machines that have already 
shipped.

> Could we hedge the QueryVariableInfo checks with a test for Samsung in
> the UEFI identity strings?

We could, but apparently some Lenovos also have a similar problem. We 
just don't have the information we need to implement a comprehensive 
blacklist, and if we get it wrong we're back to destroying people's 
hardware.

-- 
Matthew Garrett | mj...@srcf.ucam.org
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread H. Peter Anvin
On 05/31/2013 07:42 AM, James Bottomley wrote:
> 
> The only ones that are broken are the Samsung ones.  Samsung claims to
> have fixed their UEFI firmware, so we could refer any problems to them.
> 
> The signature of the Samsung failure, which this is guarding against is
> that the laptop gets bricked, so it really is a nasty choice of poisons
> we have to pick...
> 

Are we sure it is only Samsung?  I was under the impression there were
other systems with similar failures?

-hpa


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread James Bottomley
On Fri, 2013-05-31 at 15:34 +0100, Matthew Garrett wrote:
> On Fri, May 31, 2013 at 02:43:56PM +0200, Ingo Molnar wrote:
> 
> > 4) The revert is easy, and the functionality the original patch provided
> >was a marginal increase in debug output to begin with...
> 
> I agree that a revert is probably the right thing to do here, but the 
> original patch was there to permit a more accurate calculation of the 
> amount of nvram in use, not to provide additional debug information. 
> Reverting it is going to differently break a different set of systems

The only ones that are broken are the Samsung ones.  Samsung claims to
have fixed their UEFI firmware, so we could refer any problems to them.

The signature of the Samsung failure, which this is guarding against is
that the laptop gets bricked, so it really is a nasty choice of poisons
we have to pick...

Could we hedge the QueryVariableInfo checks with a test for Samsung in
the UEFI identity strings?

James



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread H. Peter Anvin
On 05/29/2013 01:16 PM, Russ Anderson wrote:
> 
> To be more specific (now that I've dug through the code), grub2 (used
> by rhel7) uses EFI boot stubs.  grub and elilo apparently do not use
> EFI boot stubs, so they don't hit the problem (at least on my test
> systems).
> 

Grub2 *can* use the EFI boot stub... as far as I know it doesn't by
default, but the way it is used in RHEL does.

Either way, we need to fix this and fix it soon.

-hpa


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Matthew Garrett
On Fri, May 31, 2013 at 02:43:56PM +0200, Ingo Molnar wrote:

> 4) The revert is easy, and the functionality the original patch provided
>was a marginal increase in debug output to begin with...

I agree that a revert is probably the right thing to do here, but the 
original patch was there to permit a more accurate calculation of the 
amount of nvram in use, not to provide additional debug information. 
Reverting it is going to differently break a different set of systems

-- 
Matthew Garrett | mj...@srcf.ucam.org
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Ingo Molnar

* Borislav Petkov  wrote:

> On Fri, May 31, 2013 at 01:06:09PM +0200, Jiri Kosina wrote:
> >
> > (*) If one would be naive enough, he'd believe that the pressure 
> > should be put on the BIOS writers instead of OS to fix the bug :)
> 
> Oh, definitely pressure on BIOS dudes. If they're in violation of the 
> spec and they still can fix it in time, they better. I'm sick and tired 
> of having to deal with BIOS idiocy in kernel code.

I'm all for some BIOS quality bashing, but the reality is:

1) It's not just about SGI/UV systems but apparently about several 
   different types of x86 laptops produce the same boot crash pattern: 
   most of which come from manufacturers that simply don't care about 
   Linux all that much.

   So by not reverting we'd screw our users, not put any recognizable 
   pressure on any BIOS writers or manufacturers.

2) Obviously Windows does not crash, and that's what most laptops test.
   So our realistic 'spec target' is not some sort of pure 'EFI spec',
   but EFI implementations _tested under Windows_. Consider it an 
   'extended EFI compatibility spec'.

3) There's a better, more robust firmware environment approach being 
   worked on (by you?) that avoids such 1:1 physical mapping assumption 
   crashes. That's something worth doing anyway, so why not delay the 
   early QueryVariableInfo() call change to when that enviroment is 
   properly implemented?

4) The revert is easy, and the functionality the original patch provided
   was a marginal increase in debug output to begin with...

So to me the right approach seems to be:

 A: revert now for v3.10
 B: implement 1:1 mappings environment for firmware, for v3.11
 C: reintroduce the early QueryVariableInfo() call again, in v3.11

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Borislav Petkov
On Fri, May 31, 2013 at 01:06:09PM +0200, Jiri Kosina wrote:
> (*) If one would be naive enough, he'd believe that the pressure
> should be put on the BIOS writers instead of OS to fix the bug :)

Oh, definitely pressure on BIOS dudes. If they're in violation of the
spec and they still can fix it in time, they better. I'm sick and tired
of having to deal with BIOS idiocy in kernel code.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Josh Boyer
On Fri, May 31, 2013 at 7:40 AM, Ingo Molnar  wrote:
>
> * Jiri Kosina  wrote:
>
>> On Fri, 31 May 2013, Ingo Molnar wrote:
>>
>> > > Unfortunately that means that you can as well throw the patch away
>> > > completely.
>> > >
>> > > The sole point is to run the QueryVariableInfo() from the boot
>> > > environment, in order to obtain more accurate information. And it's
>> > > a valid thing to do, according to UEFI specification.
>> >
>> > So this change needs to be reverted or fixed.
>>
>> I don't think anyone is arguing against that.
>
> The statement I replied to above can be fairly read as arguing against it.
>
> Anyway, given that the fix (workaround) from Boris seems elaborate so late
> in the v3.10 cycle, does everyone else agree with a revert as well?

If you go with a revert, please make sure to tag it for stable to it
gets picked up with 3.9.y.

josh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Ingo Molnar

* Jiri Kosina  wrote:

> On Fri, 31 May 2013, Ingo Molnar wrote:
> 
> > > Unfortunately that means that you can as well throw the patch away 
> > > completely.
> > >
> > > The sole point is to run the QueryVariableInfo() from the boot 
> > > environment, in order to obtain more accurate information. And it's 
> > > a valid thing to do, according to UEFI specification.
> >
> > So this change needs to be reverted or fixed.
> 
> I don't think anyone is arguing against that.

The statement I replied to above can be fairly read as arguing against it.

Anyway, given that the fix (workaround) from Boris seems elaborate so late 
in the v3.10 cycle, does everyone else agree with a revert as well?

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Jiri Kosina
On Fri, 31 May 2013, Ingo Molnar wrote:

> So this change needs to be reverted or fixed.

I don't think anyone is arguing against that.

My remark was purely to describe the current status quo and help to 
understand what exactly is happening, i.e.:

- QueryVariableInfo() should be a valid thing to do from inside boot  
  environment, according to the spec
- now we see that at least SGI bios (an probably other incarnations) 
  think otherwise
- if we are not able to fix / work around the bug in BIOS (*), we have to 
  make a choice between two evils -- either increase likelyhood of 
  bricking certain machines due to filling the EFI storage space, or 
  break booting on broken BIOSen
- the theory is that Borislav's 1:1 mapping patches will work this around; 
  one of the supporting arguments being that it's probably what Windows is 
  doing. I believe Borislav is in the process of testing this. But the 
  patches are not ready for mainline yet.

(*) If one would be naive enough, he'd believe that the pressure should be 
put on the BIOS writers instead of OS to fix the bug :)

-- 
Jiri Kosina
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Ingo Molnar

* Jiri Kosina  wrote:

> On Thu, 30 May 2013, Russ Anderson wrote:
> 
> > > > > > Yes, but this call is clearly happening way before 
> > > > > > ExitBootServices() -- 
> > > > > > see the surrounding code, see for example this in efi_main():
> > > > > > 
> > > > > > [ ... snip ... ]
> > > > > > setup_efi_vars(boot_params);
> > > > > > 
> > > > > > setup_efi_pci(boot_params);
> > > > > > 
> > > > > > status = efi_call_phys3(sys_table->boottime->allocate_pool,
> > > > > > EFI_LOADER_DATA, sizeof(*gdt),
> > > > > > (void **));
> > > > > > if (status != EFI_SUCCESS) {
> > > > > > efi_printk("Failed to alloc mem for gdt structure\n");
> > > > > > goto fail;
> > > > > > }
> > > > > > [ ... snip ... ]
> > > > > 
> > > > > Yes.  Note the failing call is sys_table->runtime while all the
> > > > > other calls are sys_table->boottime and seem to work.  Not sure
> > > > > why the sys_table->runtime call has a problem but it may be
> > > > > a clue.  Could something in the runtime path not be set up???
> > > > 
> > > > That was my original idea early today as well. My understanding of the 
> > > > UEFI spec is admittedly limited, but afaics calling runtime method from 
> > > > boot environment should be a valid thing to do ... ?
> > > 
> > > QueryVariableInfo() is a runtime services, all runtime services should
> > > available bother on boot time and runtime:
> > > 
> > > UEFI spec 2.3.1 P.109:
> > >   Runtime Services
> > >   Functions that are available before and after any call to  
> > >   ExitBootServices(). These functions are described in Section 7.
> > 
> > That's a great idea.  This patch moves the QueryVariableInfo()
> > call from bootime to runtime, in efi_late_init().  The attached
> > patch is consistent with the UEFI spec and avoids the problem.
> 
> Unfortunately that means that you can as well throw the patch away 
> completely.
> 
> The sole point is to run the QueryVariableInfo() from the boot 
> environment, in order to obtain more accurate information. And it's a 
> valid thing to do, according to UEFI specification.

Argh!

The 'UEFI specification', if contradicted by reality, has _NO VALUE_ 
whatsoever. Full stop.

Any specification is only valuable as long as it's followed by everyone: 
but once that is not the case, it does not matter much what the written 
wishes on the paper express.

What matters is that a system that booted fine with v3.8 does not boot 
anymore due to a change we did to the EFI code. And the only valid 
response to that is to make it work. Full stop!

And this is not only about SGI/UV systems: in this thread there were 
multiple systems mentioned that _DO NOT BOOT_ due to this change.

So this change needs to be reverted or fixed.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Ingo Molnar

* Russ Anderson  wrote:

> On Thu, May 30, 2013 at 10:32:09PM +, Matthew Garrett wrote:
> > On Thu, 2013-05-30 at 17:28 -0500, Russ Anderson wrote:
> > > On Thu, May 30, 2013 at 10:21:53PM +, Matthew Garrett wrote:
> > > > On Thu, 2013-05-30 at 17:17 -0500, Russ Anderson wrote:
> > > > 
> > > > > That's a great idea.  This patch moves the QueryVariableInfo()
> > > > > call from bootime to runtime, in efi_late_init().  The attached
> > > > > patch is consistent with the UEFI spec and avoids the problem.
> > > > 
> > > > No, that defeats the entire point of the original patch.
> > > 
> > > How so?  It is still calling QueryVariableInfo()
> > > before the data is used.
> > 
> > We want to know how much space is used by variables that aren't visible
> > at runtime.
> 
> We want to boot.  We could boot up through 3.9-rc7.
> 
> Knowing how much space is used by variables that aren't
> visible at runtime it moot if you can't boot.

Exactly - fixing a boot regression is _WAY_ more important than pretty 
much any other concern.

and the boot breakage is not limited to UV systems - the thread mentioned 
a couple of other systems as well.

So it's an absolute no-brainer that this change should be reverted or 
fixed via your patch.

Once a safer mechanism is implemented to call QueryVariableInfo() earlier 
(Boris's patches?) the change can be reintroduced.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Ingo Molnar

* Russ Anderson r...@sgi.com wrote:

 On Thu, May 30, 2013 at 10:32:09PM +, Matthew Garrett wrote:
  On Thu, 2013-05-30 at 17:28 -0500, Russ Anderson wrote:
   On Thu, May 30, 2013 at 10:21:53PM +, Matthew Garrett wrote:
On Thu, 2013-05-30 at 17:17 -0500, Russ Anderson wrote:

 That's a great idea.  This patch moves the QueryVariableInfo()
 call from bootime to runtime, in efi_late_init().  The attached
 patch is consistent with the UEFI spec and avoids the problem.

No, that defeats the entire point of the original patch.
   
   How so?  It is still calling QueryVariableInfo()
   before the data is used.
  
  We want to know how much space is used by variables that aren't visible
  at runtime.
 
 We want to boot.  We could boot up through 3.9-rc7.
 
 Knowing how much space is used by variables that aren't
 visible at runtime it moot if you can't boot.

Exactly - fixing a boot regression is _WAY_ more important than pretty 
much any other concern.

and the boot breakage is not limited to UV systems - the thread mentioned 
a couple of other systems as well.

So it's an absolute no-brainer that this change should be reverted or 
fixed via your patch.

Once a safer mechanism is implemented to call QueryVariableInfo() earlier 
(Boris's patches?) the change can be reintroduced.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Ingo Molnar

* Jiri Kosina jkos...@suse.cz wrote:

 On Thu, 30 May 2013, Russ Anderson wrote:
 
  Yes, but this call is clearly happening way before 
  ExitBootServices() -- 
  see the surrounding code, see for example this in efi_main():
  
  [ ... snip ... ]
  setup_efi_vars(boot_params);
  
  setup_efi_pci(boot_params);
  
  status = efi_call_phys3(sys_table-boottime-allocate_pool,
  EFI_LOADER_DATA, sizeof(*gdt),
  (void **)gdt);
  if (status != EFI_SUCCESS) {
  efi_printk(Failed to alloc mem for gdt structure\n);
  goto fail;
  }
  [ ... snip ... ]
 
 Yes.  Note the failing call is sys_table-runtime while all the
 other calls are sys_table-boottime and seem to work.  Not sure
 why the sys_table-runtime call has a problem but it may be
 a clue.  Could something in the runtime path not be set up???

That was my original idea early today as well. My understanding of the 
UEFI spec is admittedly limited, but afaics calling runtime method from 
boot environment should be a valid thing to do ... ?
   
   QueryVariableInfo() is a runtime services, all runtime services should
   available bother on boot time and runtime:
   
   UEFI spec 2.3.1 P.109:
 Runtime Services
 Functions that are available before and after any call to  
 ExitBootServices(). These functions are described in Section 7.
  
  That's a great idea.  This patch moves the QueryVariableInfo()
  call from bootime to runtime, in efi_late_init().  The attached
  patch is consistent with the UEFI spec and avoids the problem.
 
 Unfortunately that means that you can as well throw the patch away 
 completely.
 
 The sole point is to run the QueryVariableInfo() from the boot 
 environment, in order to obtain more accurate information. And it's a 
 valid thing to do, according to UEFI specification.

Argh!

The 'UEFI specification', if contradicted by reality, has _NO VALUE_ 
whatsoever. Full stop.

Any specification is only valuable as long as it's followed by everyone: 
but once that is not the case, it does not matter much what the written 
wishes on the paper express.

What matters is that a system that booted fine with v3.8 does not boot 
anymore due to a change we did to the EFI code. And the only valid 
response to that is to make it work. Full stop!

And this is not only about SGI/UV systems: in this thread there were 
multiple systems mentioned that _DO NOT BOOT_ due to this change.

So this change needs to be reverted or fixed.

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Jiri Kosina
On Fri, 31 May 2013, Ingo Molnar wrote:

 So this change needs to be reverted or fixed.

I don't think anyone is arguing against that.

My remark was purely to describe the current status quo and help to 
understand what exactly is happening, i.e.:

- QueryVariableInfo() should be a valid thing to do from inside boot  
  environment, according to the spec
- now we see that at least SGI bios (an probably other incarnations) 
  think otherwise
- if we are not able to fix / work around the bug in BIOS (*), we have to 
  make a choice between two evils -- either increase likelyhood of 
  bricking certain machines due to filling the EFI storage space, or 
  break booting on broken BIOSen
- the theory is that Borislav's 1:1 mapping patches will work this around; 
  one of the supporting arguments being that it's probably what Windows is 
  doing. I believe Borislav is in the process of testing this. But the 
  patches are not ready for mainline yet.

(*) If one would be naive enough, he'd believe that the pressure should be 
put on the BIOS writers instead of OS to fix the bug :)

-- 
Jiri Kosina
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Ingo Molnar

* Jiri Kosina jkos...@suse.cz wrote:

 On Fri, 31 May 2013, Ingo Molnar wrote:
 
   Unfortunately that means that you can as well throw the patch away 
   completely.
  
   The sole point is to run the QueryVariableInfo() from the boot 
   environment, in order to obtain more accurate information. And it's 
   a valid thing to do, according to UEFI specification.
 
  So this change needs to be reverted or fixed.
 
 I don't think anyone is arguing against that.

The statement I replied to above can be fairly read as arguing against it.

Anyway, given that the fix (workaround) from Boris seems elaborate so late 
in the v3.10 cycle, does everyone else agree with a revert as well?

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Josh Boyer
On Fri, May 31, 2013 at 7:40 AM, Ingo Molnar mi...@kernel.org wrote:

 * Jiri Kosina jkos...@suse.cz wrote:

 On Fri, 31 May 2013, Ingo Molnar wrote:

   Unfortunately that means that you can as well throw the patch away
   completely.
  
   The sole point is to run the QueryVariableInfo() from the boot
   environment, in order to obtain more accurate information. And it's
   a valid thing to do, according to UEFI specification.
 
  So this change needs to be reverted or fixed.

 I don't think anyone is arguing against that.

 The statement I replied to above can be fairly read as arguing against it.

 Anyway, given that the fix (workaround) from Boris seems elaborate so late
 in the v3.10 cycle, does everyone else agree with a revert as well?

If you go with a revert, please make sure to tag it for stable to it
gets picked up with 3.9.y.

josh
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Borislav Petkov
On Fri, May 31, 2013 at 01:06:09PM +0200, Jiri Kosina wrote:
 (*) If one would be naive enough, he'd believe that the pressure
 should be put on the BIOS writers instead of OS to fix the bug :)

Oh, definitely pressure on BIOS dudes. If they're in violation of the
spec and they still can fix it in time, they better. I'm sick and tired
of having to deal with BIOS idiocy in kernel code.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Ingo Molnar

* Borislav Petkov b...@alien8.de wrote:

 On Fri, May 31, 2013 at 01:06:09PM +0200, Jiri Kosina wrote:
 
  (*) If one would be naive enough, he'd believe that the pressure 
  should be put on the BIOS writers instead of OS to fix the bug :)
 
 Oh, definitely pressure on BIOS dudes. If they're in violation of the 
 spec and they still can fix it in time, they better. I'm sick and tired 
 of having to deal with BIOS idiocy in kernel code.

I'm all for some BIOS quality bashing, but the reality is:

1) It's not just about SGI/UV systems but apparently about several 
   different types of x86 laptops produce the same boot crash pattern: 
   most of which come from manufacturers that simply don't care about 
   Linux all that much.

   So by not reverting we'd screw our users, not put any recognizable 
   pressure on any BIOS writers or manufacturers.

2) Obviously Windows does not crash, and that's what most laptops test.
   So our realistic 'spec target' is not some sort of pure 'EFI spec',
   but EFI implementations _tested under Windows_. Consider it an 
   'extended EFI compatibility spec'.

3) There's a better, more robust firmware environment approach being 
   worked on (by you?) that avoids such 1:1 physical mapping assumption 
   crashes. That's something worth doing anyway, so why not delay the 
   early QueryVariableInfo() call change to when that enviroment is 
   properly implemented?

4) The revert is easy, and the functionality the original patch provided
   was a marginal increase in debug output to begin with...

So to me the right approach seems to be:

 A: revert now for v3.10
 B: implement 1:1 mappings environment for firmware, for v3.11
 C: reintroduce the early QueryVariableInfo() call again, in v3.11

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread H. Peter Anvin
On 05/29/2013 01:16 PM, Russ Anderson wrote:
 
 To be more specific (now that I've dug through the code), grub2 (used
 by rhel7) uses EFI boot stubs.  grub and elilo apparently do not use
 EFI boot stubs, so they don't hit the problem (at least on my test
 systems).
 

Grub2 *can* use the EFI boot stub... as far as I know it doesn't by
default, but the way it is used in RHEL does.

Either way, we need to fix this and fix it soon.

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread James Bottomley
On Fri, 2013-05-31 at 15:34 +0100, Matthew Garrett wrote:
 On Fri, May 31, 2013 at 02:43:56PM +0200, Ingo Molnar wrote:
 
  4) The revert is easy, and the functionality the original patch provided
 was a marginal increase in debug output to begin with...
 
 I agree that a revert is probably the right thing to do here, but the 
 original patch was there to permit a more accurate calculation of the 
 amount of nvram in use, not to provide additional debug information. 
 Reverting it is going to differently break a different set of systems

The only ones that are broken are the Samsung ones.  Samsung claims to
have fixed their UEFI firmware, so we could refer any problems to them.

The signature of the Samsung failure, which this is guarding against is
that the laptop gets bricked, so it really is a nasty choice of poisons
we have to pick...

Could we hedge the QueryVariableInfo checks with a test for Samsung in
the UEFI identity strings?

James



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Matthew Garrett
On Fri, May 31, 2013 at 02:43:56PM +0200, Ingo Molnar wrote:

 4) The revert is easy, and the functionality the original patch provided
was a marginal increase in debug output to begin with...

I agree that a revert is probably the right thing to do here, but the 
original patch was there to permit a more accurate calculation of the 
amount of nvram in use, not to provide additional debug information. 
Reverting it is going to differently break a different set of systems

-- 
Matthew Garrett | mj...@srcf.ucam.org
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread H. Peter Anvin
On 05/31/2013 07:42 AM, James Bottomley wrote:
 
 The only ones that are broken are the Samsung ones.  Samsung claims to
 have fixed their UEFI firmware, so we could refer any problems to them.
 
 The signature of the Samsung failure, which this is guarding against is
 that the laptop gets bricked, so it really is a nasty choice of poisons
 we have to pick...
 

Are we sure it is only Samsung?  I was under the impression there were
other systems with similar failures?

-hpa


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Matthew Garrett
On Fri, May 31, 2013 at 07:42:37AM -0700, James Bottomley wrote:
 On Fri, 2013-05-31 at 15:34 +0100, Matthew Garrett wrote:
  I agree that a revert is probably the right thing to do here, but the 
  original patch was there to permit a more accurate calculation of the 
  amount of nvram in use, not to provide additional debug information. 
  Reverting it is going to differently break a different set of systems
 
 The only ones that are broken are the Samsung ones.  Samsung claims to
 have fixed their UEFI firmware, so we could refer any problems to them.

No, reverting this gets us back to the old state of refusing any writes 
if more than 50% of the variable store *appears* to be used, regardless 
of whether it's actually used. Which, unfortunately, makes it impossible 
to install Linux on most UEFI machines. In any case, Samsung clearly 
haven't fixed this problem on a pile of machines that have already 
shipped.

 Could we hedge the QueryVariableInfo checks with a test for Samsung in
 the UEFI identity strings?

We could, but apparently some Lenovos also have a similar problem. We 
just don't have the information we need to implement a comprehensive 
blacklist, and if we get it wrong we're back to destroying people's 
hardware.

-- 
Matthew Garrett | mj...@srcf.ucam.org
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Russ Anderson
On Fri, May 31, 2013 at 03:48:26PM +0100, Matthew Garrett wrote:
 On Fri, May 31, 2013 at 07:42:37AM -0700, James Bottomley wrote:
  On Fri, 2013-05-31 at 15:34 +0100, Matthew Garrett wrote:
   I agree that a revert is probably the right thing to do here, but the 
   original patch was there to permit a more accurate calculation of the 
   amount of nvram in use, not to provide additional debug information. 
   Reverting it is going to differently break a different set of systems
  
  The only ones that are broken are the Samsung ones.  Samsung claims to
  have fixed their UEFI firmware, so we could refer any problems to them.
 
 No, reverting this gets us back to the old state of refusing any writes 
 if more than 50% of the variable store *appears* to be used, regardless 
 of whether it's actually used. Which, unfortunately, makes it impossible 
 to install Linux on most UEFI machines.

When did writing EFI variables to nvram become necessary to boot
on UEFI?  And if it is necessary, why is it that only linux boot
loaders that use EFI stubs (generally grub2) need it?  The current
kernel boots using EFI/grub and EFI/elilo.  It is just when
EFI stubs are used that the boot fails.

I'm missing the background on why linux needs to write so many
EFI variables to nvram that it fills up nvram.  What is that
all about?

  In any case, Samsung clearly 
 haven't fixed this problem on a pile of machines that have already 
 shipped.

Which means the previous patch(es) that caused the bricking should
get pulled, too.


-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Matthew Garrett
On Fri, May 31, 2013 at 10:43:49AM -0500, Russ Anderson wrote:

 When did writing EFI variables to nvram become necessary to boot on 
 UEFI? And if it is necessary, why is it that only linux boot loaders 
 that use EFI stubs (generally grub2) need it?  The current kernel 
 boots using EFI/grub and EFI/elilo.  It is just when EFI stubs are 
 used that the boot fails.

I think you've misunderstood the problem. If nvram becaomes full, some 
systems crash during firmware initialisation. So we can't let nvram 
become full. The obvious thing to do here is to look at the values from 
QueryVariableInfo, but many systems won't perform any garbage collection 
until they're almost out of space and so variables that have been 
deleted still show up as used space. We can work around that by adding 
up the size of the variables ourselves, but that only gives us the value 
for runtime-visible variables. We also need to know how much space is 
used by variables that are only visible during boot, hence calling 
QueryVariableInfo before ExitBootServices.

 Which means the previous patch(es) that caused the bricking should
 get pulled, too.

There are no patches that cause the bricking.

-- 
Matthew Garrett | mj...@srcf.ucam.org
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread James Bottomley
On Fri, 2013-05-31 at 17:28 +0100, Matthew Garrett wrote:
 On Fri, May 31, 2013 at 10:43:49AM -0500, Russ Anderson wrote:
 
  When did writing EFI variables to nvram become necessary to boot on 
  UEFI? And if it is necessary, why is it that only linux boot loaders 
  that use EFI stubs (generally grub2) need it?  The current kernel 
  boots using EFI/grub and EFI/elilo.  It is just when EFI stubs are 
  used that the boot fails.
 
 I think you've misunderstood the problem. If nvram becaomes full, some 
 systems crash during firmware initialisation. So we can't let nvram 
 become full. The obvious thing to do here is to look at the values from 
 QueryVariableInfo, but many systems won't perform any garbage collection 
 until they're almost out of space and so variables that have been 
 deleted still show up as used space. We can work around that by adding 
 up the size of the variables ourselves, but that only gives us the value 
 for runtime-visible variables. We also need to know how much space is 
 used by variables that are only visible during boot, hence calling 
 QueryVariableInfo before ExitBootServices.
 
  Which means the previous patch(es) that caused the bricking should
  get pulled, too.
 
 There are no patches that cause the bricking.

This is the description of the original problem:

http://www.h-online.com/open/news/item/Booting-Linux-using-UEFI-can-brick-Samsung-laptops-1793958.html

And the further investigation:

http://mjg59.dreamwidth.org/22855.html

If you read the latter, it shows you why we have to use
QueryVariableInfo to try to work out how much space we have available.

James


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Russ Anderson
On Fri, May 31, 2013 at 05:28:16PM +0100, Matthew Garrett wrote:
 On Fri, May 31, 2013 at 10:43:49AM -0500, Russ Anderson wrote:
 
  When did writing EFI variables to nvram become necessary to boot on 
  UEFI? And if it is necessary, why is it that only linux boot loaders 
  that use EFI stubs (generally grub2) need it?  The current kernel 
  boots using EFI/grub and EFI/elilo.  It is just when EFI stubs are 
  used that the boot fails.
 
 I think you've misunderstood the problem.

The problem for me is the system not booting.  I was not
familiar with the problem you are trying to solve, but
have now looked at the links.

   If nvram becaomes full, some 
 systems crash during firmware initialisation. So we can't let nvram 
 become full. The obvious thing to do here is to look at the values from 
 QueryVariableInfo, but many systems won't perform any garbage collection 
 until they're almost out of space and so variables that have been 
 deleted still show up as used space.

OK.  I get nvram looks full due to lack of garbage collection
on some systems.  Does QueryVariableInfo (at runtime) tell you
it is full?  Is the problem that it says it is full when it
is not, or does not tell you it is full when it is?

   We can work around that by adding 
 up the size of the variables ourselves, but that only gives us the value 
 for runtime-visible variables. We also need to know how much space is 
 used by variables that are only visible during boot,

Is it valid to assume that only the kernel writes to nvram at
runtime?  Can bios log error information to nvram on an SMM
interrupt (for example)?

  hence calling 
 QueryVariableInfo before ExitBootServices.

Correct me if I am wrong, but that is called from EFI stubs,
which is only used by some bootloaders (sometimes grub2).
Does that mean EFI/grub and EFI/elilo will not hit the problem,
or will not have your fix?


  Which means the previous patch(es) that caused the bricking should
  get pulled, too.
 
 There are no patches that cause the bricking.

I thought that was the problem you were trying to avoid.

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread H. Peter Anvin
On 05/31/2013 03:57 PM, Russ Anderson wrote:
 
 Which means the previous patch(es) that caused the bricking should
 get pulled, too.

 There are no patches that cause the bricking.
 
 I thought that was the problem you were trying to avoid.
 

There are no *patches* that cause the bricking.  We have been trying to
figure out how to *avoid* the bricking.

-hpa

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Jiri Kosina
On Fri, 31 May 2013, Russ Anderson wrote:

 OK.  I get nvram looks full due to lack of garbage collection
 on some systems.  Does QueryVariableInfo (at runtime) tell you
 it is full?  Is the problem that it says it is full when it
 is not, or does not tell you it is full when it is?

We are trying to count the size occupied by existing UEFI variables. 
QueryVariableInfo() called from runtime environment doesn't give a full 
picture, as it doesn't provide information about boot-only variables.

Hence the patch.

  up the size of the variables ourselves, but that only gives us the value 
  for runtime-visible variables. We also need to know how much space is 
  used by variables that are only visible during boot,
 
 Is it valid to assume that only the kernel writes to nvram at
 runtime?  

Well, mostly yes. But if the kernel doesn't know what the initial state 
is, it can't count the size properly.

   Which means the previous patch(es) that caused the bricking should
   get pulled, too.
  
  There are no patches that cause the bricking.
 
 I thought that was the problem you were trying to avoid.

The problem that needs to be avoided is machines turning into bricks if 
EFI storage nvram is getting full.

-- 
Jiri Kosina
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Matthew Garrett
On Fri, May 31, 2013 at 05:57:31PM -0500, Russ Anderson wrote:
 On Fri, May 31, 2013 at 05:28:16PM +0100, Matthew Garrett wrote:
If nvram becaomes full, some 
  systems crash during firmware initialisation. So we can't let nvram 
  become full. The obvious thing to do here is to look at the values from 
  QueryVariableInfo, but many systems won't perform any garbage collection 
  until they're almost out of space and so variables that have been 
  deleted still show up as used space.
 
 OK.  I get nvram looks full due to lack of garbage collection
 on some systems.  Does QueryVariableInfo (at runtime) tell you
 it is full?  Is the problem that it says it is full when it
 is not, or does not tell you it is full when it is?

QueryVariableInfo reports the amount used *including garbage*. This 
makes it useless for determining how much space is available for the 
firmware to use.

We can work around that by adding 
  up the size of the variables ourselves, but that only gives us the value 
  for runtime-visible variables. We also need to know how much space is 
  used by variables that are only visible during boot,
 
 Is it valid to assume that only the kernel writes to nvram at
 runtime?  Can bios log error information to nvram on an SMM
 interrupt (for example)?

There's a limited number of situations where the firmware can write to 
nvram, but they're basically uninteresting. Practically speaking, it's 
always via the kernel once ExitBootServices() has been called.

   hence calling 
  QueryVariableInfo before ExitBootServices.
 
 Correct me if I am wrong, but that is called from EFI stubs,
 which is only used by some bootloaders (sometimes grub2).
 Does that mean EFI/grub and EFI/elilo will not hit the problem,
 or will not have your fix?

Will not have the fix. Boot EFI systems via the EFI stub.

-- 
Matthew Garrett | mj...@srcf.ucam.org
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Russ Anderson
On Sat, Jun 01, 2013 at 01:03:11AM +0100, Matthew Garrett wrote:
 On Fri, May 31, 2013 at 05:57:31PM -0500, Russ Anderson wrote:
  On Fri, May 31, 2013 at 05:28:16PM +0100, Matthew Garrett wrote:
 If nvram becaomes full, some 
   systems crash during firmware initialisation. So we can't let nvram 
   become full. The obvious thing to do here is to look at the values from 
   QueryVariableInfo, but many systems won't perform any garbage collection 
   until they're almost out of space and so variables that have been 
   deleted still show up as used space.
  
  OK.  I get nvram looks full due to lack of garbage collection
  on some systems.  Does QueryVariableInfo (at runtime) tell you
  it is full?  Is the problem that it says it is full when it
  is not, or does not tell you it is full when it is?
 
 QueryVariableInfo reports the amount used *including garbage*. This 
 makes it useless for determining how much space is available for the 
 firmware to use.

So when QueryVariableInfo reports there is space available, the
write works and everything is fine.

And when QueryVariableInfo reports insufficient space, don't write
and everything is fine.

But when QueryVariableInfo reports insufficient space and you
write anyway, the system can get bricked.  Is that the problem?



According to the UEFI spec for QueryVariableInfo()

  RemainingVariableStorageSize
  Returns the remaining size of the storage
  space available for EFI variables associated
  with the attributes specified.

It also says:

   The returned MaximumVariableStorageSize, RemainingVariableStorageSize,
   MaximumVariableSize information may change immediately after
   the call based on other runtime activities including asynchronous
   error events. Also, these values associated with different
   attributes are not additive in nature.

Note that other runtime activities including asynchronous error
events.  That means runtime services may be using nvram without
the OS knowing about it.


Some bios implementation may be *including garbage*, but can
you definitively say ALL do?  The spec explicitly says the
implementation of variable storage is not defined in this
specification.  You can't assume any form of garbage collection.

   7.2 Variable Services

   Although the implementation of variable storage is not defined
   in this specification, variables must be persistent in most cases.
   This implies that the EFI implementation on a platform must arrange
   it so that variables passed in for storage are retained and
   available for use each time the system boots, at least until they
   are explicitly deleted or overwritten. Provision of this type of
   nonvolatile storage may be very limited on some platforms, so
   variables should be used sparingly in cases where other means of
   communicating information cannot be used.

I don't see anything in here about the OS being free to use
more nvram than QueryVariableInfo() reported as being available.
What happened to following the spec?


 We can work around that by adding 
   up the size of the variables ourselves, but that only gives us the value 
   for runtime-visible variables. We also need to know how much space is 
   used by variables that are only visible during boot,
  
  Is it valid to assume that only the kernel writes to nvram at
  runtime?  Can bios log error information to nvram on an SMM
  interrupt (for example)?
 
 There's a limited number of situations where the firmware can write to 
 nvram, but they're basically uninteresting. Practically speaking, it's 
 always via the kernel once ExitBootServices() has been called.

How can you say they're basically uninteresting???
Do you know what EVERY bios implementation does???


hence calling 
   QueryVariableInfo before ExitBootServices.
  
  Correct me if I am wrong, but that is called from EFI stubs,
  which is only used by some bootloaders (sometimes grub2).
  Does that mean EFI/grub and EFI/elilo will not hit the problem,
  or will not have your fix?
 
 Will not have the fix. Boot EFI systems via the EFI stub.

Thanks for that clarification.

 Boot EFI systems via the EFI stub.

Fortunately all our shipping systems are EFI/grub and EFI/elilo
right now, so they boot.

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-31 Thread Matthew Garrett
On Fri, May 31, 2013 at 11:20:59PM -0500, Russ Anderson wrote:

 And when QueryVariableInfo reports insufficient space, don't write
 and everything is fine.

And then installs fail, which is why we implemented this additional 
code.

 But when QueryVariableInfo reports insufficient space and you
 write anyway, the system can get bricked.  Is that the problem?

No.

 Some bios implementation may be *including garbage*, but can
 you definitively say ALL do?  The spec explicitly says the
 implementation of variable storage is not defined in this
 specification.  You can't assume any form of garbage collection.

No. We can't assume that. But nor can we rely upon the spec, because 
behaving according to the spec results in dead computers. So we're 
forced to write code that behaves according to reality rather than 
paper, which is unfortunate, but if *your* firmware behaved according to 
the specification then you wouldn't be seeing any problems with the 
current code, so I think there's a lesson for us all here.

 I don't see anything in here about the OS being free to use
 more nvram than QueryVariableInfo() reported as being available.
 What happened to following the spec?

That would be the same spec that tells you not to use physical addresses 
after SetVirtualAddressMap() has been called, right? Regardless, we 
don't use more space than QueryVariableInfo() reports as being 
available. We would expect that any attempt to do so would fail. But nor 
can we assume that we're free to use as much space as 
QueryVariableInfo() *does* report, because doing so results in dead 
computers. So instead we're forced to rely on workarounds that happen to 
break your broken firmware, so now we're going to have to find a 
different set of workarounds and hope that they don't break someone 
else's computer instead.

  There's a limited number of situations where the firmware can write to 
  nvram, but they're basically uninteresting. Practically speaking, it's 
  always via the kernel once ExitBootServices() has been called.
 
 How can you say they're basically uninteresting???
 Do you know what EVERY bios implementation does???

We have to assume that firmware doesn't behave in a pathological manner, 
because the alternative is to give up entirely and declare that Linux 
doesn't run on UEFI systems. I suspect that that would be problematic 
for your employers.

-- 
Matthew Garrett | mj...@srcf.ucam.org
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread joeyli
於 四,2013-05-30 於 21:17 -0500,Russ Anderson 提到:
> On Fri, May 31, 2013 at 12:30:43AM +0200, Jiri Kosina wrote:
> > On Thu, 30 May 2013, Russ Anderson wrote:
> > 
> > > > > That's a great idea.  This patch moves the QueryVariableInfo()
> > > > > call from bootime to runtime, in efi_late_init().  The attached
> > > > > patch is consistent with the UEFI spec and avoids the problem.
> > > > 
> > > > No, that defeats the entire point of the original patch.
> > > 
> > > How so?  It is still calling QueryVariableInfo()
> > > before the data is used.
> > 
> > You lose information provided by QueryVariableInfo() about boot-only 
> > variables once the transition boot -> runtime has happened.
> 
> Is that information really more important than the ability
> to boot?
> 
> Correct me if I'm wrong, but linux was able to boot without
> the boottime QueryVariableInfo() call up until 3.9-rc7,
> and it still does on systems that do not use EFI stubs (ie
> grub and elilo).  It is only when linux uses EFI stubs (ie
> grub2) that linux makes the boottime QueryVariableInfo()
> call.  So why is that call, or whatever is dependent on it,
> more important than booting?
> 
> 
> 
> Thanks,

It related to BIOS's garbage collection behavior of UEFI variable
storage.

The used space of non volatile boottime variables is useful to us for
calculate the active_size, please reference 31ff2f2 patch:

https://lkml.org/lkml/2013/4/15/476

An earlier 68d9298 patch to avoid some machines bricked when more than
50% of the system flash is in use, because the garbage collection will
not trigger on those machines.

We need find out the size of system flash space indeed usage for avoid
this problem. So, cc5a080c5 patch call QueryVariableInfo() to grab the
usage information in boot time.

Calling QueryVariableInfo() at boot time should not causes side effect
until your issue show up. Before this issue happen, avoid bricking some
machines is also important.


Thanks a lot!
Joey Lee

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Russ Anderson
On Thu, May 30, 2013 at 10:32:09PM +, Matthew Garrett wrote:
> On Thu, 2013-05-30 at 17:28 -0500, Russ Anderson wrote:
> > On Thu, May 30, 2013 at 10:21:53PM +, Matthew Garrett wrote:
> > > On Thu, 2013-05-30 at 17:17 -0500, Russ Anderson wrote:
> > > 
> > > > That's a great idea.  This patch moves the QueryVariableInfo()
> > > > call from bootime to runtime, in efi_late_init().  The attached
> > > > patch is consistent with the UEFI spec and avoids the problem.
> > > 
> > > No, that defeats the entire point of the original patch.
> > 
> > How so?  It is still calling QueryVariableInfo()
> > before the data is used.
> 
> We want to know how much space is used by variables that aren't visible
> at runtime.

We want to boot.  We could boot up through 3.9-rc7.

Knowing how much space is used by variables that aren't
visible at runtime it moot if you can't boot.


And again, maybe this is a bios bug - we have bios people
looking into it - and maybe that call _should_ work, but
the fact is the kernel booted without that change[1] and does
not boot with it.  


[1] commit cc5a080c5d40c36089bb08a8a16fa3fc7047fe0f

Thanks,
-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Russ Anderson
On Fri, May 31, 2013 at 12:30:43AM +0200, Jiri Kosina wrote:
> On Thu, 30 May 2013, Russ Anderson wrote:
> 
> > > > That's a great idea.  This patch moves the QueryVariableInfo()
> > > > call from bootime to runtime, in efi_late_init().  The attached
> > > > patch is consistent with the UEFI spec and avoids the problem.
> > > 
> > > No, that defeats the entire point of the original patch.
> > 
> > How so?  It is still calling QueryVariableInfo()
> > before the data is used.
> 
> You lose information provided by QueryVariableInfo() about boot-only 
> variables once the transition boot -> runtime has happened.

Is that information really more important than the ability
to boot?

Correct me if I'm wrong, but linux was able to boot without
the boottime QueryVariableInfo() call up until 3.9-rc7,
and it still does on systems that do not use EFI stubs (ie
grub and elilo).  It is only when linux uses EFI stubs (ie
grub2) that linux makes the boottime QueryVariableInfo()
call.  So why is that call, or whatever is dependent on it,
more important than booting?



Thanks,
-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Matthew Garrett
On Thu, 2013-05-30 at 17:28 -0500, Russ Anderson wrote:
> On Thu, May 30, 2013 at 10:21:53PM +, Matthew Garrett wrote:
> > On Thu, 2013-05-30 at 17:17 -0500, Russ Anderson wrote:
> > 
> > > That's a great idea.  This patch moves the QueryVariableInfo()
> > > call from bootime to runtime, in efi_late_init().  The attached
> > > patch is consistent with the UEFI spec and avoids the problem.
> > 
> > No, that defeats the entire point of the original patch.
> 
> How so?  It is still calling QueryVariableInfo()
> before the data is used.

We want to know how much space is used by variables that aren't visible
at runtime.

-- 
Matthew Garrett | mj...@srcf.ucam.org


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Jiri Kosina
On Thu, 30 May 2013, Russ Anderson wrote:

> > > That's a great idea.  This patch moves the QueryVariableInfo()
> > > call from bootime to runtime, in efi_late_init().  The attached
> > > patch is consistent with the UEFI spec and avoids the problem.
> > 
> > No, that defeats the entire point of the original patch.
> 
> How so?  It is still calling QueryVariableInfo()
> before the data is used.

You lose information provided by QueryVariableInfo() about boot-only 
variables once the transition boot -> runtime has happened.

-- 
Jiri Kosina
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Russ Anderson
On Thu, May 30, 2013 at 10:21:53PM +, Matthew Garrett wrote:
> On Thu, 2013-05-30 at 17:17 -0500, Russ Anderson wrote:
> 
> > That's a great idea.  This patch moves the QueryVariableInfo()
> > call from bootime to runtime, in efi_late_init().  The attached
> > patch is consistent with the UEFI spec and avoids the problem.
> 
> No, that defeats the entire point of the original patch.

How so?  It is still calling QueryVariableInfo()
before the data is used.

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Jiri Kosina
On Thu, 30 May 2013, Russ Anderson wrote:

> > > > > Yes, but this call is clearly happening way before ExitBootServices() 
> > > > > -- 
> > > > > see the surrounding code, see for example this in efi_main():
> > > > > 
> > > > > [ ... snip ... ]
> > > > >   setup_efi_vars(boot_params);
> > > > > 
> > > > >   setup_efi_pci(boot_params);
> > > > > 
> > > > >   status = efi_call_phys3(sys_table->boottime->allocate_pool,
> > > > >   EFI_LOADER_DATA, sizeof(*gdt),
> > > > >   (void **));
> > > > >   if (status != EFI_SUCCESS) {
> > > > >   efi_printk("Failed to alloc mem for gdt structure\n");
> > > > >   goto fail;
> > > > >   }
> > > > > [ ... snip ... ]
> > > > 
> > > > Yes.  Note the failing call is sys_table->runtime while all the
> > > > other calls are sys_table->boottime and seem to work.  Not sure
> > > > why the sys_table->runtime call has a problem but it may be
> > > > a clue.  Could something in the runtime path not be set up???
> > > 
> > > That was my original idea early today as well. My understanding of the 
> > > UEFI spec is admittedly limited, but afaics calling runtime method from 
> > > boot environment should be a valid thing to do ... ?
> > 
> > QueryVariableInfo() is a runtime services, all runtime services should
> > available bother on boot time and runtime:
> > 
> > UEFI spec 2.3.1 P.109:
> >   Runtime Services
> >   Functions that are available before and after any call to  
> >   ExitBootServices(). These functions are described in Section 7.
> 
> That's a great idea.  This patch moves the QueryVariableInfo()
> call from bootime to runtime, in efi_late_init().  The attached
> patch is consistent with the UEFI spec and avoids the problem.

Unfortunately that means that you can as well throw the patch away 
completely.

The sole point is to run the QueryVariableInfo() from the boot 
environment, in order to obtain more accurate information.
And it's a valid thing to do, according to UEFI specification.

-- 
Jiri Kosina
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Matthew Garrett
On Thu, 2013-05-30 at 17:17 -0500, Russ Anderson wrote:

> That's a great idea.  This patch moves the QueryVariableInfo()
> call from bootime to runtime, in efi_late_init().  The attached
> patch is consistent with the UEFI spec and avoids the problem.

No, that defeats the entire point of the original patch.

-- 
Matthew Garrett | mj...@srcf.ucam.org


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Russ Anderson
On Thu, May 30, 2013 at 10:16:12AM +0800, joeyli wrote:
> 於 四,2013-05-30 於 00:53 +0200,Jiri Kosina 提到:
> > On Wed, 29 May 2013, Russ Anderson wrote:
> > 
> > > > Yes, but this call is clearly happening way before ExitBootServices() 
> > > > -- 
> > > > see the surrounding code, see for example this in efi_main():
> > > > 
> > > > [ ... snip ... ]
> > > > setup_efi_vars(boot_params);
> > > > 
> > > > setup_efi_pci(boot_params);
> > > > 
> > > > status = efi_call_phys3(sys_table->boottime->allocate_pool,
> > > > EFI_LOADER_DATA, sizeof(*gdt),
> > > > (void **));
> > > > if (status != EFI_SUCCESS) {
> > > > efi_printk("Failed to alloc mem for gdt structure\n");
> > > > goto fail;
> > > > }
> > > > [ ... snip ... ]
> > > 
> > > Yes.  Note the failing call is sys_table->runtime while all the
> > > other calls are sys_table->boottime and seem to work.  Not sure
> > > why the sys_table->runtime call has a problem but it may be
> > > a clue.  Could something in the runtime path not be set up???
> > 
> > That was my original idea early today as well. My understanding of the 
> > UEFI spec is admittedly limited, but afaics calling runtime method from 
> > boot environment should be a valid thing to do ... ?
> 
> QueryVariableInfo() is a runtime services, all runtime services should
> available bother on boot time and runtime:
> 
> UEFI spec 2.3.1 P.109:
>   Runtime Services
>   Functions that are available before and after any call to  
>   ExitBootServices(). These functions are described in Section 7.

That's a great idea.  This patch moves the QueryVariableInfo()
call from bootime to runtime, in efi_late_init().  The attached
patch is consistent with the UEFI spec and avoids the problem.

Thanks,
-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
Move query_variable_info call to runtime to avoid bios issues.

Signed-off-by: Russ Anderson 

---
 arch/x86/boot/compressed/eboot.c |   49 ---
 arch/x86/platform/efi/efi.c  |   35 ---
 2 files changed, 16 insertions(+), 68 deletions(-)

Index: linux/arch/x86/boot/compressed/eboot.c
===
--- linux.orig/arch/x86/boot/compressed/eboot.c	2013-05-30 11:02:19.034914824 -0500
+++ linux/arch/x86/boot/compressed/eboot.c	2013-05-30 16:53:50.512636568 -0500
@@ -251,53 +251,6 @@ static void find_bits(unsigned long mask
 	*size = len;
 }
 
-static efi_status_t setup_efi_vars(struct boot_params *params)
-{
-	struct setup_data *data;
-	struct efi_var_bootdata *efidata;
-	u64 store_size, remaining_size, var_size;
-	efi_status_t status;
-
-	if (sys_table->runtime->hdr.revision < EFI_2_00_SYSTEM_TABLE_REVISION)
-		return EFI_UNSUPPORTED;
-
-	data = (struct setup_data *)(unsigned long)params->hdr.setup_data;
-
-	while (data && data->next)
-		data = (struct setup_data *)(unsigned long)data->next;
-
-	status = efi_call_phys4((void *)sys_table->runtime->query_variable_info,
-EFI_VARIABLE_NON_VOLATILE |
-EFI_VARIABLE_BOOTSERVICE_ACCESS |
-EFI_VARIABLE_RUNTIME_ACCESS, _size,
-_size, _size);
-
-	if (status != EFI_SUCCESS) { // RJA
-		efi_printk("RJA: setup_efi_vars FAILED\n");
-		return status;
-	}
-
-	status = efi_call_phys3(sys_table->boottime->allocate_pool,
-EFI_LOADER_DATA, sizeof(*efidata), );
-
-	if (status != EFI_SUCCESS)
-		return status;
-
-	efidata->data.type = SETUP_EFI_VARS;
-	efidata->data.len = sizeof(struct efi_var_bootdata) -
-		sizeof(struct setup_data);
-	efidata->data.next = 0;
-	efidata->store_size = store_size;
-	efidata->remaining_size = remaining_size;
-	efidata->max_var_size = var_size;
-
-	if (data)
-		data->next = (unsigned long)efidata;
-	else
-		params->hdr.setup_data = (unsigned long)efidata;
-
-}
-
 static efi_status_t setup_efi_pci(struct boot_params *params)
 {
 	efi_pci_io_protocol *pci;
@@ -1204,8 +1157,6 @@ struct boot_params *efi_main(void *handl
 
 	setup_graphics(boot_params);
 
-	setup_efi_vars(boot_params);
-
 	setup_efi_pci(boot_params);
 
 	status = efi_call_phys3(sys_table->boottime->allocate_pool,
Index: linux/arch/x86/platform/efi/efi.c
===
--- linux.orig/arch/x86/platform/efi/efi.c	2013-05-30 11:02:19.034914824 -0500
+++ linux/arch/x86/platform/efi/efi.c	2013-05-30 17:05:38.140039879 -0500
@@ -786,9 +786,6 @@ void __init efi_init(void)
 	char vendor[100] = "unknown";
 	int i = 0;
 	void *tmp;
-	struct setup_data *data;
-	struct efi_var_bootdata *efi_var_data;
-	u64 pa_data;
 
 #ifdef CONFIG_X86_32
 	if (boot_params.efi_info.efi_systab_hi ||
@@ -806,22 +803,6 @@ void __init efi_init(void)
 	if (efi_systab_init(efi_phys.systab))
 		return;
 
-	pa_data = boot_params.hdr.setup_data;
-	while (pa_data) {
-		data = early_ioremap(pa_data, sizeof(*efi_var_data));
-	

Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Russ Anderson
On Thu, May 30, 2013 at 10:16:12AM +0800, joeyli wrote:
 於 四,2013-05-30 於 00:53 +0200,Jiri Kosina 提到:
  On Wed, 29 May 2013, Russ Anderson wrote:
  
Yes, but this call is clearly happening way before ExitBootServices() 
-- 
see the surrounding code, see for example this in efi_main():

[ ... snip ... ]
setup_efi_vars(boot_params);

setup_efi_pci(boot_params);

status = efi_call_phys3(sys_table-boottime-allocate_pool,
EFI_LOADER_DATA, sizeof(*gdt),
(void **)gdt);
if (status != EFI_SUCCESS) {
efi_printk(Failed to alloc mem for gdt structure\n);
goto fail;
}
[ ... snip ... ]
   
   Yes.  Note the failing call is sys_table-runtime while all the
   other calls are sys_table-boottime and seem to work.  Not sure
   why the sys_table-runtime call has a problem but it may be
   a clue.  Could something in the runtime path not be set up???
  
  That was my original idea early today as well. My understanding of the 
  UEFI spec is admittedly limited, but afaics calling runtime method from 
  boot environment should be a valid thing to do ... ?
 
 QueryVariableInfo() is a runtime services, all runtime services should
 available bother on boot time and runtime:
 
 UEFI spec 2.3.1 P.109:
   Runtime Services
   Functions that are available before and after any call to  
   ExitBootServices(). These functions are described in Section 7.

That's a great idea.  This patch moves the QueryVariableInfo()
call from bootime to runtime, in efi_late_init().  The attached
patch is consistent with the UEFI spec and avoids the problem.

Thanks,
-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
Move query_variable_info call to runtime to avoid bios issues.

Signed-off-by: Russ Anderson r...@sgi.com

---
 arch/x86/boot/compressed/eboot.c |   49 ---
 arch/x86/platform/efi/efi.c  |   35 ---
 2 files changed, 16 insertions(+), 68 deletions(-)

Index: linux/arch/x86/boot/compressed/eboot.c
===
--- linux.orig/arch/x86/boot/compressed/eboot.c	2013-05-30 11:02:19.034914824 -0500
+++ linux/arch/x86/boot/compressed/eboot.c	2013-05-30 16:53:50.512636568 -0500
@@ -251,53 +251,6 @@ static void find_bits(unsigned long mask
 	*size = len;
 }
 
-static efi_status_t setup_efi_vars(struct boot_params *params)
-{
-	struct setup_data *data;
-	struct efi_var_bootdata *efidata;
-	u64 store_size, remaining_size, var_size;
-	efi_status_t status;
-
-	if (sys_table-runtime-hdr.revision  EFI_2_00_SYSTEM_TABLE_REVISION)
-		return EFI_UNSUPPORTED;
-
-	data = (struct setup_data *)(unsigned long)params-hdr.setup_data;
-
-	while (data  data-next)
-		data = (struct setup_data *)(unsigned long)data-next;
-
-	status = efi_call_phys4((void *)sys_table-runtime-query_variable_info,
-EFI_VARIABLE_NON_VOLATILE |
-EFI_VARIABLE_BOOTSERVICE_ACCESS |
-EFI_VARIABLE_RUNTIME_ACCESS, store_size,
-remaining_size, var_size);
-
-	if (status != EFI_SUCCESS) { // RJA
-		efi_printk(RJA: setup_efi_vars FAILED\n);
-		return status;
-	}
-
-	status = efi_call_phys3(sys_table-boottime-allocate_pool,
-EFI_LOADER_DATA, sizeof(*efidata), efidata);
-
-	if (status != EFI_SUCCESS)
-		return status;
-
-	efidata-data.type = SETUP_EFI_VARS;
-	efidata-data.len = sizeof(struct efi_var_bootdata) -
-		sizeof(struct setup_data);
-	efidata-data.next = 0;
-	efidata-store_size = store_size;
-	efidata-remaining_size = remaining_size;
-	efidata-max_var_size = var_size;
-
-	if (data)
-		data-next = (unsigned long)efidata;
-	else
-		params-hdr.setup_data = (unsigned long)efidata;
-
-}
-
 static efi_status_t setup_efi_pci(struct boot_params *params)
 {
 	efi_pci_io_protocol *pci;
@@ -1204,8 +1157,6 @@ struct boot_params *efi_main(void *handl
 
 	setup_graphics(boot_params);
 
-	setup_efi_vars(boot_params);
-
 	setup_efi_pci(boot_params);
 
 	status = efi_call_phys3(sys_table-boottime-allocate_pool,
Index: linux/arch/x86/platform/efi/efi.c
===
--- linux.orig/arch/x86/platform/efi/efi.c	2013-05-30 11:02:19.034914824 -0500
+++ linux/arch/x86/platform/efi/efi.c	2013-05-30 17:05:38.140039879 -0500
@@ -786,9 +786,6 @@ void __init efi_init(void)
 	char vendor[100] = unknown;
 	int i = 0;
 	void *tmp;
-	struct setup_data *data;
-	struct efi_var_bootdata *efi_var_data;
-	u64 pa_data;
 
 #ifdef CONFIG_X86_32
 	if (boot_params.efi_info.efi_systab_hi ||
@@ -806,22 +803,6 @@ void __init efi_init(void)
 	if (efi_systab_init(efi_phys.systab))
 		return;
 
-	pa_data = boot_params.hdr.setup_data;
-	while (pa_data) {
-		data = early_ioremap(pa_data, sizeof(*efi_var_data));
-		if (data-type == SETUP_EFI_VARS) {
-			efi_var_data = (struct efi_var_bootdata *)data;
-
-			

Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Matthew Garrett
On Thu, 2013-05-30 at 17:17 -0500, Russ Anderson wrote:

 That's a great idea.  This patch moves the QueryVariableInfo()
 call from bootime to runtime, in efi_late_init().  The attached
 patch is consistent with the UEFI spec and avoids the problem.

No, that defeats the entire point of the original patch.

-- 
Matthew Garrett | mj...@srcf.ucam.org


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Jiri Kosina
On Thu, 30 May 2013, Russ Anderson wrote:

 Yes, but this call is clearly happening way before ExitBootServices() 
 -- 
 see the surrounding code, see for example this in efi_main():
 
 [ ... snip ... ]
   setup_efi_vars(boot_params);
 
   setup_efi_pci(boot_params);
 
   status = efi_call_phys3(sys_table-boottime-allocate_pool,
   EFI_LOADER_DATA, sizeof(*gdt),
   (void **)gdt);
   if (status != EFI_SUCCESS) {
   efi_printk(Failed to alloc mem for gdt structure\n);
   goto fail;
   }
 [ ... snip ... ]

Yes.  Note the failing call is sys_table-runtime while all the
other calls are sys_table-boottime and seem to work.  Not sure
why the sys_table-runtime call has a problem but it may be
a clue.  Could something in the runtime path not be set up???
   
   That was my original idea early today as well. My understanding of the 
   UEFI spec is admittedly limited, but afaics calling runtime method from 
   boot environment should be a valid thing to do ... ?
  
  QueryVariableInfo() is a runtime services, all runtime services should
  available bother on boot time and runtime:
  
  UEFI spec 2.3.1 P.109:
Runtime Services
Functions that are available before and after any call to  
ExitBootServices(). These functions are described in Section 7.
 
 That's a great idea.  This patch moves the QueryVariableInfo()
 call from bootime to runtime, in efi_late_init().  The attached
 patch is consistent with the UEFI spec and avoids the problem.

Unfortunately that means that you can as well throw the patch away 
completely.

The sole point is to run the QueryVariableInfo() from the boot 
environment, in order to obtain more accurate information.
And it's a valid thing to do, according to UEFI specification.

-- 
Jiri Kosina
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Russ Anderson
On Thu, May 30, 2013 at 10:21:53PM +, Matthew Garrett wrote:
 On Thu, 2013-05-30 at 17:17 -0500, Russ Anderson wrote:
 
  That's a great idea.  This patch moves the QueryVariableInfo()
  call from bootime to runtime, in efi_late_init().  The attached
  patch is consistent with the UEFI spec and avoids the problem.
 
 No, that defeats the entire point of the original patch.

How so?  It is still calling QueryVariableInfo()
before the data is used.

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Jiri Kosina
On Thu, 30 May 2013, Russ Anderson wrote:

   That's a great idea.  This patch moves the QueryVariableInfo()
   call from bootime to runtime, in efi_late_init().  The attached
   patch is consistent with the UEFI spec and avoids the problem.
  
  No, that defeats the entire point of the original patch.
 
 How so?  It is still calling QueryVariableInfo()
 before the data is used.

You lose information provided by QueryVariableInfo() about boot-only 
variables once the transition boot - runtime has happened.

-- 
Jiri Kosina
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Matthew Garrett
On Thu, 2013-05-30 at 17:28 -0500, Russ Anderson wrote:
 On Thu, May 30, 2013 at 10:21:53PM +, Matthew Garrett wrote:
  On Thu, 2013-05-30 at 17:17 -0500, Russ Anderson wrote:
  
   That's a great idea.  This patch moves the QueryVariableInfo()
   call from bootime to runtime, in efi_late_init().  The attached
   patch is consistent with the UEFI spec and avoids the problem.
  
  No, that defeats the entire point of the original patch.
 
 How so?  It is still calling QueryVariableInfo()
 before the data is used.

We want to know how much space is used by variables that aren't visible
at runtime.

-- 
Matthew Garrett | mj...@srcf.ucam.org


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Russ Anderson
On Fri, May 31, 2013 at 12:30:43AM +0200, Jiri Kosina wrote:
 On Thu, 30 May 2013, Russ Anderson wrote:
 
That's a great idea.  This patch moves the QueryVariableInfo()
call from bootime to runtime, in efi_late_init().  The attached
patch is consistent with the UEFI spec and avoids the problem.
   
   No, that defeats the entire point of the original patch.
  
  How so?  It is still calling QueryVariableInfo()
  before the data is used.
 
 You lose information provided by QueryVariableInfo() about boot-only 
 variables once the transition boot - runtime has happened.

Is that information really more important than the ability
to boot?

Correct me if I'm wrong, but linux was able to boot without
the boottime QueryVariableInfo() call up until 3.9-rc7,
and it still does on systems that do not use EFI stubs (ie
grub and elilo).  It is only when linux uses EFI stubs (ie
grub2) that linux makes the boottime QueryVariableInfo()
call.  So why is that call, or whatever is dependent on it,
more important than booting?



Thanks,
-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread Russ Anderson
On Thu, May 30, 2013 at 10:32:09PM +, Matthew Garrett wrote:
 On Thu, 2013-05-30 at 17:28 -0500, Russ Anderson wrote:
  On Thu, May 30, 2013 at 10:21:53PM +, Matthew Garrett wrote:
   On Thu, 2013-05-30 at 17:17 -0500, Russ Anderson wrote:
   
That's a great idea.  This patch moves the QueryVariableInfo()
call from bootime to runtime, in efi_late_init().  The attached
patch is consistent with the UEFI spec and avoids the problem.
   
   No, that defeats the entire point of the original patch.
  
  How so?  It is still calling QueryVariableInfo()
  before the data is used.
 
 We want to know how much space is used by variables that aren't visible
 at runtime.

We want to boot.  We could boot up through 3.9-rc7.

Knowing how much space is used by variables that aren't
visible at runtime it moot if you can't boot.


And again, maybe this is a bios bug - we have bios people
looking into it - and maybe that call _should_ work, but
the fact is the kernel booted without that change[1] and does
not boot with it.  


[1] commit cc5a080c5d40c36089bb08a8a16fa3fc7047fe0f

Thanks,
-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-30 Thread joeyli
於 四,2013-05-30 於 21:17 -0500,Russ Anderson 提到:
 On Fri, May 31, 2013 at 12:30:43AM +0200, Jiri Kosina wrote:
  On Thu, 30 May 2013, Russ Anderson wrote:
  
 That's a great idea.  This patch moves the QueryVariableInfo()
 call from bootime to runtime, in efi_late_init().  The attached
 patch is consistent with the UEFI spec and avoids the problem.

No, that defeats the entire point of the original patch.
   
   How so?  It is still calling QueryVariableInfo()
   before the data is used.
  
  You lose information provided by QueryVariableInfo() about boot-only 
  variables once the transition boot - runtime has happened.
 
 Is that information really more important than the ability
 to boot?
 
 Correct me if I'm wrong, but linux was able to boot without
 the boottime QueryVariableInfo() call up until 3.9-rc7,
 and it still does on systems that do not use EFI stubs (ie
 grub and elilo).  It is only when linux uses EFI stubs (ie
 grub2) that linux makes the boottime QueryVariableInfo()
 call.  So why is that call, or whatever is dependent on it,
 more important than booting?
 
 
 
 Thanks,

It related to BIOS's garbage collection behavior of UEFI variable
storage.

The used space of non volatile boottime variables is useful to us for
calculate the active_size, please reference 31ff2f2 patch:

https://lkml.org/lkml/2013/4/15/476

An earlier 68d9298 patch to avoid some machines bricked when more than
50% of the system flash is in use, because the garbage collection will
not trigger on those machines.

We need find out the size of system flash space indeed usage for avoid
this problem. So, cc5a080c5 patch call QueryVariableInfo() to grab the
usage information in boot time.

Calling QueryVariableInfo() at boot time should not causes side effect
until your issue show up. Before this issue happen, avoid bricking some
machines is also important.


Thanks a lot!
Joey Lee

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-29 Thread joeyli
於 三,2013-05-29 於 17:46 -0500,Russ Anderson 提到:
> On Thu, May 30, 2013 at 12:22:13AM +0200, Jiri Kosina wrote:
> > On Wed, 29 May 2013, Russ Anderson wrote:
> > 
> > > > What appears to be happening is that your the EFI runtime services code
> > > > is calling into the EFI boot services code, which is definitely a bug in
> > > > your firmware because we're at runtime, but we've seen other machines
> > > > that do similar things so we usually handle it just fine. However, what
> > > > makes your case different, and the reason you see the above splat, is
> > > > that it's using the physical address of the EFI boot services region,
> > > > not the virtual one we setup with SetVirtualAddressMap(). Which is a
> > > > second firmware bug. Again, we have seen other machines that access
> > > > physical addresses after SetVirtualAddressMap(), but until now we
> > > > haven't had any non-optional code that triggered them.
> > > > 
> > > > The only reason I can see that the offending commit would introduce this
> > > > problem is because it calls QueryVariableInfo() at boot time. I notice
> > > > that your machine is an SGI UV one, is there any chance you could get a
> > > > firmware fix for this? If possible, it would be also good to confirm
> > > > that it's this chunk of code in setup_efi_vars(),
> > > > 
> > > > status = efi_call_phys4(sys_table->runtime->query_variable_info,
> > > > EFI_VARIABLE_NON_VOLATILE |
> > > > EFI_VARIABLE_BOOTSERVICE_ACCESS |
> > > > EFI_VARIABLE_RUNTIME_ACCESS, 
> > > > _size,
> > > > _size, _size);
> > > 
> > > This does trigger the problem.  Note that the definition of
> > > QueryVariableInfo() in the UEFI spec says:
> > > 
> > >   The returned MaximumVariableStorageSize, RemainingVariableStorageSize,
> > >   MaximumVariableSize information may change immediately after the call
> > >   based on other runtime activities including asynchronous error events.
> > >   Also, these values associated with different attributes are not
> > >   additive in nature.
> > > 
> > > Note the values may be accurate at the point in time when returned,
> > > but may not be after that.
> > > 
> > >   After the system has transitioned into runtime (after
> > >   ExitBootServices() is called), an implementation may not be able to
> > >   accurately return information about the Boot Services variable store.
> > >   In such cases, EFI_INVALID_PARAMETER should be returned.
> > > 
> > > It is not clear to me exactly when ExitBootServices() is called.
> > > Our bios is returning a failing indication on the call.
> > 
> > Yes, but this call is clearly happening way before ExitBootServices() -- 
> > see the surrounding code, see for example this in efi_main():
> > 
> > [ ... snip ... ]
> > setup_efi_vars(boot_params);
> > 
> > setup_efi_pci(boot_params);
> > 
> > status = efi_call_phys3(sys_table->boottime->allocate_pool,
> > EFI_LOADER_DATA, sizeof(*gdt),
> > (void **));
> > if (status != EFI_SUCCESS) {
> > efi_printk("Failed to alloc mem for gdt structure\n");
> > goto fail;
> > }
> > [ ... snip ... ]
> 
> Yes.  Note the failing call is sys_table->runtime while all the
> other calls are sys_table->boottime and seem to work.  Not sure
> why the sys_table->runtime call has a problem but it may be
> a clue.  Could something in the runtime path not be set up???
> 

Per UEFI spec Section 6, all runtime services should a available both on
boot time and runtime. And, we query the EFI_VARIABLE_BOOTSERVICE_ACCESS
space information before ExitBootServices(), that means we call it in
boot time, so, QueryVariableInfo() should return information to us.

Back to the kernel oops, the oops happened in runtime environment when
efivar_init running. As Matt's point out as following:

於 五,2013-05-24 於 08:43 +0100,Matt Fleming 提到: 
> On Thu, 23 May, at 03:32:34PM, Russ Anderson wrote:
> >efi: mem127: type=4, attr=0xf, 
> > range=[0x6bb22000-0x7ca9c000) (271MB)
> 
> EFI_BOOT_SERVICES_CODE

Per UEFI 2.3.1 Section 6.2, type 4 is EfiBootServicesCode. This area available 
for
OS using after ExitBootServices():

UEFI 2.3.1 P.133
 EfiBootServicesData Memory available for general use.

So, any runtime services should not access this area, it's the first thing need 
to check
the BIOS code for why.

> >efi: mem133: type=5, attr=0x800f, 
> > range=[0x7daff000-0x7dbff000) (1MB)
> 
> EFI_RUNTIME_SERVICES_CODE
> 
> >EFI Variables Facility v0.08 2004-May-17
> >BUG: unable to handle kernel paging request at 7ca95b10
> >IP: [] 0x88007dbf213f
[...]
What appears to be happening is that your the EFI runtime services code
> is calling into the EFI boot services code, which is definitely a bug in
> your firmware because we're at runtime, but 

Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-29 Thread joeyli
於 四,2013-05-30 於 00:53 +0200,Jiri Kosina 提到:
> On Wed, 29 May 2013, Russ Anderson wrote:
> 
> > > Yes, but this call is clearly happening way before ExitBootServices() -- 
> > > see the surrounding code, see for example this in efi_main():
> > > 
> > > [ ... snip ... ]
> > >   setup_efi_vars(boot_params);
> > > 
> > >   setup_efi_pci(boot_params);
> > > 
> > >   status = efi_call_phys3(sys_table->boottime->allocate_pool,
> > >   EFI_LOADER_DATA, sizeof(*gdt),
> > >   (void **));
> > >   if (status != EFI_SUCCESS) {
> > >   efi_printk("Failed to alloc mem for gdt structure\n");
> > >   goto fail;
> > >   }
> > > [ ... snip ... ]
> > 
> > Yes.  Note the failing call is sys_table->runtime while all the
> > other calls are sys_table->boottime and seem to work.  Not sure
> > why the sys_table->runtime call has a problem but it may be
> > a clue.  Could something in the runtime path not be set up???
> 
> That was my original idea early today as well. My understanding of the 
> UEFI spec is admittedly limited, but afaics calling runtime method from 
> boot environment should be a valid thing to do ... ?

QueryVariableInfo() is a runtime services, all runtime services should
available bother on boot time and runtime:

UEFI spec 2.3.1 P.109:
  Runtime Services
  Functions that are available before and after any call to  
  ExitBootServices(). These functions are described in Section 7.

> 
> > > We are calling QueryVariableInfo() in setup_efi_vars(), and later on 
> > > AllocatePool is being called (through boot table).
> > 
> > On my system the QueryVariableInfo() call fails, so AllocatePool()
> > is not called in setup_efi_vars().
> 
> But it's being called later on coming back to efi_main(). That was just a 
> poor man's demonstration attempt why this code is running before 
> ExitBootServices() has been called.
> 

Yes, I agreed your point, the space information of
EFI_VARIABLE_BOOTSERVICE_ACCESS should still return by
QueryVariableInfo() because we call it before ExitBootServices():

arch/x86/boot/compressed/eboot.c
efi_main(void *handle, efi_system_table_t *_table, struct boot_params 
*boot_params)
..
sys_table = _table;
/* Check if we were booted by the EFI firmware */
if (sys_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE)
goto fail;
boot_params->secure_boot = get_secure_boot(sys_table)   
/* check does BIOS in secure boot mode */
setup_graphics(boot_params);
setup_efi_vars(boot_params);
/* Pass boot services variable info to runtime code, call QueryVariableInfo() */
...
status = exit_boot(boot_params, handle);
/* call ExitBootServices() */
...


Thanks a lot!
Joey Lee

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-29 Thread Jiri Kosina
On Wed, 29 May 2013, Russ Anderson wrote:

> > Yes, but this call is clearly happening way before ExitBootServices() -- 
> > see the surrounding code, see for example this in efi_main():
> > 
> > [ ... snip ... ]
> > setup_efi_vars(boot_params);
> > 
> > setup_efi_pci(boot_params);
> > 
> > status = efi_call_phys3(sys_table->boottime->allocate_pool,
> > EFI_LOADER_DATA, sizeof(*gdt),
> > (void **));
> > if (status != EFI_SUCCESS) {
> > efi_printk("Failed to alloc mem for gdt structure\n");
> > goto fail;
> > }
> > [ ... snip ... ]
> 
> Yes.  Note the failing call is sys_table->runtime while all the
> other calls are sys_table->boottime and seem to work.  Not sure
> why the sys_table->runtime call has a problem but it may be
> a clue.  Could something in the runtime path not be set up???

That was my original idea early today as well. My understanding of the 
UEFI spec is admittedly limited, but afaics calling runtime method from 
boot environment should be a valid thing to do ... ?

> > We are calling QueryVariableInfo() in setup_efi_vars(), and later on 
> > AllocatePool is being called (through boot table).
> 
> On my system the QueryVariableInfo() call fails, so AllocatePool()
> is not called in setup_efi_vars().

But it's being called later on coming back to efi_main(). That was just a 
poor man's demonstration attempt why this code is running before 
ExitBootServices() has been called.

-- 
Jiri Kosina
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-29 Thread Russ Anderson
On Thu, May 30, 2013 at 12:22:13AM +0200, Jiri Kosina wrote:
> On Wed, 29 May 2013, Russ Anderson wrote:
> 
> > > What appears to be happening is that your the EFI runtime services code
> > > is calling into the EFI boot services code, which is definitely a bug in
> > > your firmware because we're at runtime, but we've seen other machines
> > > that do similar things so we usually handle it just fine. However, what
> > > makes your case different, and the reason you see the above splat, is
> > > that it's using the physical address of the EFI boot services region,
> > > not the virtual one we setup with SetVirtualAddressMap(). Which is a
> > > second firmware bug. Again, we have seen other machines that access
> > > physical addresses after SetVirtualAddressMap(), but until now we
> > > haven't had any non-optional code that triggered them.
> > > 
> > > The only reason I can see that the offending commit would introduce this
> > > problem is because it calls QueryVariableInfo() at boot time. I notice
> > > that your machine is an SGI UV one, is there any chance you could get a
> > > firmware fix for this? If possible, it would be also good to confirm
> > > that it's this chunk of code in setup_efi_vars(),
> > > 
> > >   status = efi_call_phys4(sys_table->runtime->query_variable_info,
> > >   EFI_VARIABLE_NON_VOLATILE |
> > >   EFI_VARIABLE_BOOTSERVICE_ACCESS |
> > >   EFI_VARIABLE_RUNTIME_ACCESS, _size,
> > >   _size, _size);
> > 
> > This does trigger the problem.  Note that the definition of
> > QueryVariableInfo() in the UEFI spec says:
> > 
> >   The returned MaximumVariableStorageSize, RemainingVariableStorageSize,
> >   MaximumVariableSize information may change immediately after the call
> >   based on other runtime activities including asynchronous error events.
> >   Also, these values associated with different attributes are not
> >   additive in nature.
> > 
> > Note the values may be accurate at the point in time when returned,
> > but may not be after that.
> > 
> >   After the system has transitioned into runtime (after
> >   ExitBootServices() is called), an implementation may not be able to
> >   accurately return information about the Boot Services variable store.
> >   In such cases, EFI_INVALID_PARAMETER should be returned.
> > 
> > It is not clear to me exactly when ExitBootServices() is called.
> > Our bios is returning a failing indication on the call.
> 
> Yes, but this call is clearly happening way before ExitBootServices() -- 
> see the surrounding code, see for example this in efi_main():
> 
> [ ... snip ... ]
>   setup_efi_vars(boot_params);
> 
>   setup_efi_pci(boot_params);
> 
>   status = efi_call_phys3(sys_table->boottime->allocate_pool,
>   EFI_LOADER_DATA, sizeof(*gdt),
>   (void **));
>   if (status != EFI_SUCCESS) {
>   efi_printk("Failed to alloc mem for gdt structure\n");
>   goto fail;
>   }
> [ ... snip ... ]

Yes.  Note the failing call is sys_table->runtime while all the
other calls are sys_table->boottime and seem to work.  Not sure
why the sys_table->runtime call has a problem but it may be
a clue.  Could something in the runtime path not be set up???

> We are calling QueryVariableInfo() in setup_efi_vars(), and later on 
> AllocatePool is being called (through boot table).

On my system the QueryVariableInfo() call fails, so AllocatePool()
is not called in setup_efi_vars().

>That'd inherently fail 
> if we were calling it after ExitBootServices() has happened, but I believe 
> that call is succeeding on your affected system as well.
> 
> -- 
> Jiri Kosina
> SUSE Labs

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-29 Thread Jiri Kosina
On Wed, 29 May 2013, Russ Anderson wrote:

> > What appears to be happening is that your the EFI runtime services code
> > is calling into the EFI boot services code, which is definitely a bug in
> > your firmware because we're at runtime, but we've seen other machines
> > that do similar things so we usually handle it just fine. However, what
> > makes your case different, and the reason you see the above splat, is
> > that it's using the physical address of the EFI boot services region,
> > not the virtual one we setup with SetVirtualAddressMap(). Which is a
> > second firmware bug. Again, we have seen other machines that access
> > physical addresses after SetVirtualAddressMap(), but until now we
> > haven't had any non-optional code that triggered them.
> > 
> > The only reason I can see that the offending commit would introduce this
> > problem is because it calls QueryVariableInfo() at boot time. I notice
> > that your machine is an SGI UV one, is there any chance you could get a
> > firmware fix for this? If possible, it would be also good to confirm
> > that it's this chunk of code in setup_efi_vars(),
> > 
> > status = efi_call_phys4(sys_table->runtime->query_variable_info,
> > EFI_VARIABLE_NON_VOLATILE |
> > EFI_VARIABLE_BOOTSERVICE_ACCESS |
> > EFI_VARIABLE_RUNTIME_ACCESS, _size,
> > _size, _size);
> 
> This does trigger the problem.  Note that the definition of
> QueryVariableInfo() in the UEFI spec says:
> 
>   The returned MaximumVariableStorageSize, RemainingVariableStorageSize,
>   MaximumVariableSize information may change immediately after the call
>   based on other runtime activities including asynchronous error events.
>   Also, these values associated with different attributes are not
>   additive in nature.
> 
> Note the values may be accurate at the point in time when returned,
> but may not be after that.
> 
>   After the system has transitioned into runtime (after
>   ExitBootServices() is called), an implementation may not be able to
>   accurately return information about the Boot Services variable store.
>   In such cases, EFI_INVALID_PARAMETER should be returned.
> 
> It is not clear to me exactly when ExitBootServices() is called.
> Our bios is returning a failing indication on the call.

Yes, but this call is clearly happening way before ExitBootServices() -- 
see the surrounding code, see for example this in efi_main():

[ ... snip ... ]
setup_efi_vars(boot_params);

setup_efi_pci(boot_params);

status = efi_call_phys3(sys_table->boottime->allocate_pool,
EFI_LOADER_DATA, sizeof(*gdt),
(void **));
if (status != EFI_SUCCESS) {
efi_printk("Failed to alloc mem for gdt structure\n");
goto fail;
}
[ ... snip ... ]

We are calling QueryVariableInfo() in setup_efi_vars(), and later on 
AllocatePool is being called (through boot table). That'd inherently fail 
if we were calling it after ExitBootServices() has happened, but I believe 
that call is succeeding on your affected system as well.

-- 
Jiri Kosina
SUSE Labs

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-29 Thread Russ Anderson
On Fri, May 24, 2013 at 08:43:31AM +0100, Matt Fleming wrote:
> On Thu, 23 May, at 03:32:34PM, Russ Anderson wrote:
> >efi: mem127: type=4, attr=0xf, 
> > range=[0x6bb22000-0x7ca9c000) (271MB)
> 
> EFI_BOOT_SERVICES_CODE
> 
> >efi: mem133: type=5, attr=0x800f, 
> > range=[0x7daff000-0x7dbff000) (1MB)
> 
> EFI_RUNTIME_SERVICES_CODE
> 
> >EFI Variables Facility v0.08 2004-May-17
> >BUG: unable to handle kernel paging request at 7ca95b10
> >IP: [] 0x88007dbf213f
> 
> > [] ?  efi_call3+0x43/0x80
> > [] ?  virt_efi_get_next_variable+0x47/0x1c0
> > [] ?  create_efivars_bin_attributes+0x150/0x150
> > [] ?  efivar_init+0xd5/0x390
> > [] ?  efivar_update_sysfs_entries+0x90/0x90
> > [] ?  kobject_uevent+0xb/0x10
> > [] ?  kset_register+0x5b/0x70
> > [] ?  create_efivars_bin_attributes+0x150/0x150
> > [] ?  efivars_sysfs_init+0x87/0xf0
> > [] ?  do_one_initcall+0x15a/0x1b0
> > [] ?  do_basic_setup+0xad/0xce
> > [] ?  kernel_init_freeable+0x291/0x291
> > [] ?  sched_init_smp+0x15b/0x162
> > [] ?  kernel_init_freeable+0x20d/0x291
> > [] ?  rest_init+0x80/0x80
> > [] ?  kernel_init+0xe/0x180
> > [] ?  ret_from_fork+0x7c/0xb0
> > [] ?  rest_init+0x80/0x80
> 
> Here's the real call stack leading up to the crash.
> 
> What appears to be happening is that your the EFI runtime services code
> is calling into the EFI boot services code, which is definitely a bug in
> your firmware because we're at runtime, but we've seen other machines
> that do similar things so we usually handle it just fine. However, what
> makes your case different, and the reason you see the above splat, is
> that it's using the physical address of the EFI boot services region,
> not the virtual one we setup with SetVirtualAddressMap(). Which is a
> second firmware bug. Again, we have seen other machines that access
> physical addresses after SetVirtualAddressMap(), but until now we
> haven't had any non-optional code that triggered them.
> 
> The only reason I can see that the offending commit would introduce this
> problem is because it calls QueryVariableInfo() at boot time. I notice
> that your machine is an SGI UV one, is there any chance you could get a
> firmware fix for this? If possible, it would be also good to confirm
> that it's this chunk of code in setup_efi_vars(),
> 
>   status = efi_call_phys4(sys_table->runtime->query_variable_info,
>   EFI_VARIABLE_NON_VOLATILE |
>   EFI_VARIABLE_BOOTSERVICE_ACCESS |
>   EFI_VARIABLE_RUNTIME_ACCESS, _size,
>   _size, _size);

This does trigger the problem.  Note that the definition of
QueryVariableInfo() in the UEFI spec says:

  The returned MaximumVariableStorageSize, RemainingVariableStorageSize,
  MaximumVariableSize information may change immediately after the call
  based on other runtime activities including asynchronous error events.
  Also, these values associated with different attributes are not
  additive in nature.

Note the values may be accurate at the point in time when returned,
but may not be after that.

  After the system has transitioned into runtime (after
  ExitBootServices() is called), an implementation may not be able to
  accurately return information about the Boot Services variable store.
  In such cases, EFI_INVALID_PARAMETER should be returned.

It is not clear to me exactly when ExitBootServices() is called.
Our bios is returning a failing indication on the call.


The description from commit cc5a080c5d40c36089bb08a8a16fa3fc7047fe0f is:

  efi: Pass boot services variable info to runtime code

  EFI variables can be flagged as being accessible only within boot
  services. This makes it awkward for us to figure out how much space
  they use at runtime. In theory we could figure this out by simply
  comparing the results from QueryVariableInfo() to the space used by
  all of our variables, but that fails if the platform doesn't garbage
  collect on every boot. Thankfully, calling QueryVariableInfo() while
  still inside boot services gives a more reliable answer. This patch
  passes that information from the EFI boot stub up to the efi platform
  code. 

"more reliable" is a relative answer.  Saving those values for later
use seems inconsistent with the UEFI spec.

> that later makes GetNextVariable() jump to the physical address of the
> EFI Boot Services region. Because if not, we need to do some more
> digging.
> 
> Borislav, how are your 1:1 mapping patches coming along? In theory, once
> those are merged we can gracefully workaround these kinds of issues.
> 
> -- 
> Matt Fleming, Intel Open Source Technology Center

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a 

Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-29 Thread Russ Anderson
On Fri, May 24, 2013 at 08:45:44AM +0100, Matt Fleming wrote:
> On Thu, 23 May, at 05:23:21PM, Russ Anderson wrote:
> > Interesting data point.  The failure is on a rhel7/grub2 root.
> > The identical kernel on a rhel6/grub root boots.  So maybe
> > grub2 brings out the failure?  I suspect Fedora19/grub2 on
> > EFI should hit the problem (for someone looking to reproduce
> > it).
> > 
> > In both cases the kernel boot line options are the same.
> 
> I'll bet that rhel7 is using the EFI handover protocol which uses the
> internal mechanisms of the EFI boot stub.

To be more specific (now that I've dug through the code), grub2 (used
by rhel7) uses EFI boot stubs.  grub and elilo apparently do not use
EFI boot stubs, so they don't hit the problem (at least on my test
systems).



-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-29 Thread Russ Anderson
On Fri, May 24, 2013 at 08:45:44AM +0100, Matt Fleming wrote:
 On Thu, 23 May, at 05:23:21PM, Russ Anderson wrote:
  Interesting data point.  The failure is on a rhel7/grub2 root.
  The identical kernel on a rhel6/grub root boots.  So maybe
  grub2 brings out the failure?  I suspect Fedora19/grub2 on
  EFI should hit the problem (for someone looking to reproduce
  it).
  
  In both cases the kernel boot line options are the same.
 
 I'll bet that rhel7 is using the EFI handover protocol which uses the
 internal mechanisms of the EFI boot stub.

To be more specific (now that I've dug through the code), grub2 (used
by rhel7) uses EFI boot stubs.  grub and elilo apparently do not use
EFI boot stubs, so they don't hit the problem (at least on my test
systems).



-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-29 Thread Russ Anderson
On Fri, May 24, 2013 at 08:43:31AM +0100, Matt Fleming wrote:
 On Thu, 23 May, at 03:32:34PM, Russ Anderson wrote:
 efi: mem127: type=4, attr=0xf, 
  range=[0x6bb22000-0x7ca9c000) (271MB)
 
 EFI_BOOT_SERVICES_CODE
 
 efi: mem133: type=5, attr=0x800f, 
  range=[0x7daff000-0x7dbff000) (1MB)
 
 EFI_RUNTIME_SERVICES_CODE
 
 EFI Variables Facility v0.08 2004-May-17
 BUG: unable to handle kernel paging request at 7ca95b10
 IP: [88007dbf2140] 0x88007dbf213f
 
  [810499b3] ?  efi_call3+0x43/0x80
  [810492a7] ?  virt_efi_get_next_variable+0x47/0x1c0
  [814c8cc0] ?  create_efivars_bin_attributes+0x150/0x150
  [814c7b55] ?  efivar_init+0xd5/0x390
  [814c8ae0] ?  efivar_update_sysfs_entries+0x90/0x90
  [812f906b] ?  kobject_uevent+0xb/0x10
  [812f812b] ?  kset_register+0x5b/0x70
  [814c8cc0] ?  create_efivars_bin_attributes+0x150/0x150
  [814c8d47] ?  efivars_sysfs_init+0x87/0xf0
  [8100032a] ?  do_one_initcall+0x15a/0x1b0
  [81a17831] ?  do_basic_setup+0xad/0xce
  [81a17ae3] ?  kernel_init_freeable+0x291/0x291
  [81a3708a] ?  sched_init_smp+0x15b/0x162
  [81a17a5f] ?  kernel_init_freeable+0x20d/0x291
  [81601eb0] ?  rest_init+0x80/0x80
  [81601ebe] ?  kernel_init+0xe/0x180
  [8162179c] ?  ret_from_fork+0x7c/0xb0
  [81601eb0] ?  rest_init+0x80/0x80
 
 Here's the real call stack leading up to the crash.
 
 What appears to be happening is that your the EFI runtime services code
 is calling into the EFI boot services code, which is definitely a bug in
 your firmware because we're at runtime, but we've seen other machines
 that do similar things so we usually handle it just fine. However, what
 makes your case different, and the reason you see the above splat, is
 that it's using the physical address of the EFI boot services region,
 not the virtual one we setup with SetVirtualAddressMap(). Which is a
 second firmware bug. Again, we have seen other machines that access
 physical addresses after SetVirtualAddressMap(), but until now we
 haven't had any non-optional code that triggered them.
 
 The only reason I can see that the offending commit would introduce this
 problem is because it calls QueryVariableInfo() at boot time. I notice
 that your machine is an SGI UV one, is there any chance you could get a
 firmware fix for this? If possible, it would be also good to confirm
 that it's this chunk of code in setup_efi_vars(),
 
   status = efi_call_phys4(sys_table-runtime-query_variable_info,
   EFI_VARIABLE_NON_VOLATILE |
   EFI_VARIABLE_BOOTSERVICE_ACCESS |
   EFI_VARIABLE_RUNTIME_ACCESS, store_size,
   remaining_size, var_size);

This does trigger the problem.  Note that the definition of
QueryVariableInfo() in the UEFI spec says:

  The returned MaximumVariableStorageSize, RemainingVariableStorageSize,
  MaximumVariableSize information may change immediately after the call
  based on other runtime activities including asynchronous error events.
  Also, these values associated with different attributes are not
  additive in nature.

Note the values may be accurate at the point in time when returned,
but may not be after that.

  After the system has transitioned into runtime (after
  ExitBootServices() is called), an implementation may not be able to
  accurately return information about the Boot Services variable store.
  In such cases, EFI_INVALID_PARAMETER should be returned.

It is not clear to me exactly when ExitBootServices() is called.
Our bios is returning a failing indication on the call.


The description from commit cc5a080c5d40c36089bb08a8a16fa3fc7047fe0f is:

  efi: Pass boot services variable info to runtime code

  EFI variables can be flagged as being accessible only within boot
  services. This makes it awkward for us to figure out how much space
  they use at runtime. In theory we could figure this out by simply
  comparing the results from QueryVariableInfo() to the space used by
  all of our variables, but that fails if the platform doesn't garbage
  collect on every boot. Thankfully, calling QueryVariableInfo() while
  still inside boot services gives a more reliable answer. This patch
  passes that information from the EFI boot stub up to the efi platform
  code. 

more reliable is a relative answer.  Saving those values for later
use seems inconsistent with the UEFI spec.

 that later makes GetNextVariable() jump to the physical address of the
 EFI Boot Services region. Because if not, we need to do some more
 digging.
 
 Borislav, how are your 1:1 mapping patches coming along? In theory, once
 those are merged we can gracefully workaround these kinds of issues.
 
 -- 
 Matt Fleming, Intel 

Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-29 Thread Jiri Kosina
On Wed, 29 May 2013, Russ Anderson wrote:

  What appears to be happening is that your the EFI runtime services code
  is calling into the EFI boot services code, which is definitely a bug in
  your firmware because we're at runtime, but we've seen other machines
  that do similar things so we usually handle it just fine. However, what
  makes your case different, and the reason you see the above splat, is
  that it's using the physical address of the EFI boot services region,
  not the virtual one we setup with SetVirtualAddressMap(). Which is a
  second firmware bug. Again, we have seen other machines that access
  physical addresses after SetVirtualAddressMap(), but until now we
  haven't had any non-optional code that triggered them.
  
  The only reason I can see that the offending commit would introduce this
  problem is because it calls QueryVariableInfo() at boot time. I notice
  that your machine is an SGI UV one, is there any chance you could get a
  firmware fix for this? If possible, it would be also good to confirm
  that it's this chunk of code in setup_efi_vars(),
  
  status = efi_call_phys4(sys_table-runtime-query_variable_info,
  EFI_VARIABLE_NON_VOLATILE |
  EFI_VARIABLE_BOOTSERVICE_ACCESS |
  EFI_VARIABLE_RUNTIME_ACCESS, store_size,
  remaining_size, var_size);
 
 This does trigger the problem.  Note that the definition of
 QueryVariableInfo() in the UEFI spec says:
 
   The returned MaximumVariableStorageSize, RemainingVariableStorageSize,
   MaximumVariableSize information may change immediately after the call
   based on other runtime activities including asynchronous error events.
   Also, these values associated with different attributes are not
   additive in nature.
 
 Note the values may be accurate at the point in time when returned,
 but may not be after that.
 
   After the system has transitioned into runtime (after
   ExitBootServices() is called), an implementation may not be able to
   accurately return information about the Boot Services variable store.
   In such cases, EFI_INVALID_PARAMETER should be returned.
 
 It is not clear to me exactly when ExitBootServices() is called.
 Our bios is returning a failing indication on the call.

Yes, but this call is clearly happening way before ExitBootServices() -- 
see the surrounding code, see for example this in efi_main():

[ ... snip ... ]
setup_efi_vars(boot_params);

setup_efi_pci(boot_params);

status = efi_call_phys3(sys_table-boottime-allocate_pool,
EFI_LOADER_DATA, sizeof(*gdt),
(void **)gdt);
if (status != EFI_SUCCESS) {
efi_printk(Failed to alloc mem for gdt structure\n);
goto fail;
}
[ ... snip ... ]

We are calling QueryVariableInfo() in setup_efi_vars(), and later on 
AllocatePool is being called (through boot table). That'd inherently fail 
if we were calling it after ExitBootServices() has happened, but I believe 
that call is succeeding on your affected system as well.

-- 
Jiri Kosina
SUSE Labs

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-29 Thread Russ Anderson
On Thu, May 30, 2013 at 12:22:13AM +0200, Jiri Kosina wrote:
 On Wed, 29 May 2013, Russ Anderson wrote:
 
   What appears to be happening is that your the EFI runtime services code
   is calling into the EFI boot services code, which is definitely a bug in
   your firmware because we're at runtime, but we've seen other machines
   that do similar things so we usually handle it just fine. However, what
   makes your case different, and the reason you see the above splat, is
   that it's using the physical address of the EFI boot services region,
   not the virtual one we setup with SetVirtualAddressMap(). Which is a
   second firmware bug. Again, we have seen other machines that access
   physical addresses after SetVirtualAddressMap(), but until now we
   haven't had any non-optional code that triggered them.
   
   The only reason I can see that the offending commit would introduce this
   problem is because it calls QueryVariableInfo() at boot time. I notice
   that your machine is an SGI UV one, is there any chance you could get a
   firmware fix for this? If possible, it would be also good to confirm
   that it's this chunk of code in setup_efi_vars(),
   
 status = efi_call_phys4(sys_table-runtime-query_variable_info,
 EFI_VARIABLE_NON_VOLATILE |
 EFI_VARIABLE_BOOTSERVICE_ACCESS |
 EFI_VARIABLE_RUNTIME_ACCESS, store_size,
 remaining_size, var_size);
  
  This does trigger the problem.  Note that the definition of
  QueryVariableInfo() in the UEFI spec says:
  
The returned MaximumVariableStorageSize, RemainingVariableStorageSize,
MaximumVariableSize information may change immediately after the call
based on other runtime activities including asynchronous error events.
Also, these values associated with different attributes are not
additive in nature.
  
  Note the values may be accurate at the point in time when returned,
  but may not be after that.
  
After the system has transitioned into runtime (after
ExitBootServices() is called), an implementation may not be able to
accurately return information about the Boot Services variable store.
In such cases, EFI_INVALID_PARAMETER should be returned.
  
  It is not clear to me exactly when ExitBootServices() is called.
  Our bios is returning a failing indication on the call.
 
 Yes, but this call is clearly happening way before ExitBootServices() -- 
 see the surrounding code, see for example this in efi_main():
 
 [ ... snip ... ]
   setup_efi_vars(boot_params);
 
   setup_efi_pci(boot_params);
 
   status = efi_call_phys3(sys_table-boottime-allocate_pool,
   EFI_LOADER_DATA, sizeof(*gdt),
   (void **)gdt);
   if (status != EFI_SUCCESS) {
   efi_printk(Failed to alloc mem for gdt structure\n);
   goto fail;
   }
 [ ... snip ... ]

Yes.  Note the failing call is sys_table-runtime while all the
other calls are sys_table-boottime and seem to work.  Not sure
why the sys_table-runtime call has a problem but it may be
a clue.  Could something in the runtime path not be set up???

 We are calling QueryVariableInfo() in setup_efi_vars(), and later on 
 AllocatePool is being called (through boot table).

On my system the QueryVariableInfo() call fails, so AllocatePool()
is not called in setup_efi_vars().

That'd inherently fail 
 if we were calling it after ExitBootServices() has happened, but I believe 
 that call is succeeding on your affected system as well.
 
 -- 
 Jiri Kosina
 SUSE Labs

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-29 Thread Jiri Kosina
On Wed, 29 May 2013, Russ Anderson wrote:

  Yes, but this call is clearly happening way before ExitBootServices() -- 
  see the surrounding code, see for example this in efi_main():
  
  [ ... snip ... ]
  setup_efi_vars(boot_params);
  
  setup_efi_pci(boot_params);
  
  status = efi_call_phys3(sys_table-boottime-allocate_pool,
  EFI_LOADER_DATA, sizeof(*gdt),
  (void **)gdt);
  if (status != EFI_SUCCESS) {
  efi_printk(Failed to alloc mem for gdt structure\n);
  goto fail;
  }
  [ ... snip ... ]
 
 Yes.  Note the failing call is sys_table-runtime while all the
 other calls are sys_table-boottime and seem to work.  Not sure
 why the sys_table-runtime call has a problem but it may be
 a clue.  Could something in the runtime path not be set up???

That was my original idea early today as well. My understanding of the 
UEFI spec is admittedly limited, but afaics calling runtime method from 
boot environment should be a valid thing to do ... ?

  We are calling QueryVariableInfo() in setup_efi_vars(), and later on 
  AllocatePool is being called (through boot table).
 
 On my system the QueryVariableInfo() call fails, so AllocatePool()
 is not called in setup_efi_vars().

But it's being called later on coming back to efi_main(). That was just a 
poor man's demonstration attempt why this code is running before 
ExitBootServices() has been called.

-- 
Jiri Kosina
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-29 Thread joeyli
於 四,2013-05-30 於 00:53 +0200,Jiri Kosina 提到:
 On Wed, 29 May 2013, Russ Anderson wrote:
 
   Yes, but this call is clearly happening way before ExitBootServices() -- 
   see the surrounding code, see for example this in efi_main():
   
   [ ... snip ... ]
 setup_efi_vars(boot_params);
   
 setup_efi_pci(boot_params);
   
 status = efi_call_phys3(sys_table-boottime-allocate_pool,
 EFI_LOADER_DATA, sizeof(*gdt),
 (void **)gdt);
 if (status != EFI_SUCCESS) {
 efi_printk(Failed to alloc mem for gdt structure\n);
 goto fail;
 }
   [ ... snip ... ]
  
  Yes.  Note the failing call is sys_table-runtime while all the
  other calls are sys_table-boottime and seem to work.  Not sure
  why the sys_table-runtime call has a problem but it may be
  a clue.  Could something in the runtime path not be set up???
 
 That was my original idea early today as well. My understanding of the 
 UEFI spec is admittedly limited, but afaics calling runtime method from 
 boot environment should be a valid thing to do ... ?

QueryVariableInfo() is a runtime services, all runtime services should
available bother on boot time and runtime:

UEFI spec 2.3.1 P.109:
  Runtime Services
  Functions that are available before and after any call to  
  ExitBootServices(). These functions are described in Section 7.

 
   We are calling QueryVariableInfo() in setup_efi_vars(), and later on 
   AllocatePool is being called (through boot table).
  
  On my system the QueryVariableInfo() call fails, so AllocatePool()
  is not called in setup_efi_vars().
 
 But it's being called later on coming back to efi_main(). That was just a 
 poor man's demonstration attempt why this code is running before 
 ExitBootServices() has been called.
 

Yes, I agreed your point, the space information of
EFI_VARIABLE_BOOTSERVICE_ACCESS should still return by
QueryVariableInfo() because we call it before ExitBootServices():

arch/x86/boot/compressed/eboot.c
efi_main(void *handle, efi_system_table_t *_table, struct boot_params 
*boot_params)
..
sys_table = _table;
/* Check if we were booted by the EFI firmware */
if (sys_table-hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE)
goto fail;
boot_params-secure_boot = get_secure_boot(sys_table)   
/* check does BIOS in secure boot mode */
setup_graphics(boot_params);
setup_efi_vars(boot_params);
/* Pass boot services variable info to runtime code, call QueryVariableInfo() */
...
status = exit_boot(boot_params, handle);
/* call ExitBootServices() */
...


Thanks a lot!
Joey Lee

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-29 Thread joeyli
於 三,2013-05-29 於 17:46 -0500,Russ Anderson 提到:
 On Thu, May 30, 2013 at 12:22:13AM +0200, Jiri Kosina wrote:
  On Wed, 29 May 2013, Russ Anderson wrote:
  
What appears to be happening is that your the EFI runtime services code
is calling into the EFI boot services code, which is definitely a bug in
your firmware because we're at runtime, but we've seen other machines
that do similar things so we usually handle it just fine. However, what
makes your case different, and the reason you see the above splat, is
that it's using the physical address of the EFI boot services region,
not the virtual one we setup with SetVirtualAddressMap(). Which is a
second firmware bug. Again, we have seen other machines that access
physical addresses after SetVirtualAddressMap(), but until now we
haven't had any non-optional code that triggered them.

The only reason I can see that the offending commit would introduce this
problem is because it calls QueryVariableInfo() at boot time. I notice
that your machine is an SGI UV one, is there any chance you could get a
firmware fix for this? If possible, it would be also good to confirm
that it's this chunk of code in setup_efi_vars(),

status = efi_call_phys4(sys_table-runtime-query_variable_info,
EFI_VARIABLE_NON_VOLATILE |
EFI_VARIABLE_BOOTSERVICE_ACCESS |
EFI_VARIABLE_RUNTIME_ACCESS, 
store_size,
remaining_size, var_size);
   
   This does trigger the problem.  Note that the definition of
   QueryVariableInfo() in the UEFI spec says:
   
 The returned MaximumVariableStorageSize, RemainingVariableStorageSize,
 MaximumVariableSize information may change immediately after the call
 based on other runtime activities including asynchronous error events.
 Also, these values associated with different attributes are not
 additive in nature.
   
   Note the values may be accurate at the point in time when returned,
   but may not be after that.
   
 After the system has transitioned into runtime (after
 ExitBootServices() is called), an implementation may not be able to
 accurately return information about the Boot Services variable store.
 In such cases, EFI_INVALID_PARAMETER should be returned.
   
   It is not clear to me exactly when ExitBootServices() is called.
   Our bios is returning a failing indication on the call.
  
  Yes, but this call is clearly happening way before ExitBootServices() -- 
  see the surrounding code, see for example this in efi_main():
  
  [ ... snip ... ]
  setup_efi_vars(boot_params);
  
  setup_efi_pci(boot_params);
  
  status = efi_call_phys3(sys_table-boottime-allocate_pool,
  EFI_LOADER_DATA, sizeof(*gdt),
  (void **)gdt);
  if (status != EFI_SUCCESS) {
  efi_printk(Failed to alloc mem for gdt structure\n);
  goto fail;
  }
  [ ... snip ... ]
 
 Yes.  Note the failing call is sys_table-runtime while all the
 other calls are sys_table-boottime and seem to work.  Not sure
 why the sys_table-runtime call has a problem but it may be
 a clue.  Could something in the runtime path not be set up???
 

Per UEFI spec Section 6, all runtime services should a available both on
boot time and runtime. And, we query the EFI_VARIABLE_BOOTSERVICE_ACCESS
space information before ExitBootServices(), that means we call it in
boot time, so, QueryVariableInfo() should return information to us.

Back to the kernel oops, the oops happened in runtime environment when
efivar_init running. As Matt's point out as following:

於 五,2013-05-24 於 08:43 +0100,Matt Fleming 提到: 
 On Thu, 23 May, at 03:32:34PM, Russ Anderson wrote:
 efi: mem127: type=4, attr=0xf, 
  range=[0x6bb22000-0x7ca9c000) (271MB)
 
 EFI_BOOT_SERVICES_CODE

Per UEFI 2.3.1 Section 6.2, type 4 is EfiBootServicesCode. This area available 
for
OS using after ExitBootServices():

UEFI 2.3.1 P.133
 EfiBootServicesData Memory available for general use.

So, any runtime services should not access this area, it's the first thing need 
to check
the BIOS code for why.

 efi: mem133: type=5, attr=0x800f, 
  range=[0x7daff000-0x7dbff000) (1MB)
 
 EFI_RUNTIME_SERVICES_CODE
 
 EFI Variables Facility v0.08 2004-May-17
 BUG: unable to handle kernel paging request at 7ca95b10
 IP: [88007dbf2140] 0x88007dbf213f
[...]
What appears to be happening is that your the EFI runtime services code
 is calling into the EFI boot services code, which is definitely a bug in
 your firmware because we're at runtime, but we've seen other machines
 that do similar things so we usually handle it just fine. However, what
 makes your case different, and the reason you see the above splat, is
 that it's using the 

Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-28 Thread Matt Fleming
On Fri, 24 May, at 03:05:39PM, Russ Anderson wrote:
> One of the BIOS guys will build a debug bios next week to help see
> what is going on in the query_variable_info() call.

That would be incredibly useful, thanks.

-- 
Matt Fleming, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-28 Thread Matt Fleming
On Fri, 24 May, at 03:49:38PM, Russ Anderson wrote:
> Why does the kernel still try to create /sys/firmware/efivars/
> entries in the original failure even though efi_call_phys4()
> failed?  Or does it always try to create those entries
> and GetNextVariable() blows up in the original failure
> but not in my experiment?

CONFIG_EFI_VARS will try to create /sys/firmware/efivars even without
the additional info from the early EFI boot variable code. The early
variable code is only a heuristic that is supposed to improve the
anti-bricking algorithm in the EFI variable code.

-- 
Matt Fleming, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-28 Thread Ingo Molnar

* Matt Fleming  wrote:

> What appears to be happening is that your the EFI runtime services code 
> is calling into the EFI boot services code, which is definitely a bug in 
> your firmware because we're at runtime, but we've seen other machines 
> that do similar things so we usually handle it just fine. However, what 
> makes your case different, and the reason you see the above splat, is 
> that it's using the physical address of the EFI boot services region, 
> not the virtual one we setup with SetVirtualAddressMap(). Which is a 
> second firmware bug. Again, we have seen other machines that access 
> physical addresses after SetVirtualAddressMap(), but until now we 
> haven't had any non-optional code that triggered them.
> 
> The only reason I can see that the offending commit would introduce this 
> problem is because it calls QueryVariableInfo() at boot time. I notice 
> that your machine is an SGI UV one, is there any chance you could get a 
> firmware fix for this? If possible, it would be also good to confirm 
> that it's this chunk of code in setup_efi_vars(),
> 
>   status = efi_call_phys4(sys_table->runtime->query_variable_info,
>   EFI_VARIABLE_NON_VOLATILE |
>   EFI_VARIABLE_BOOTSERVICE_ACCESS |
>   EFI_VARIABLE_RUNTIME_ACCESS, _size,
>   _size, _size);
> 
> that later makes GetNextVariable() jump to the physical address of the 
> EFI Boot Services region. Because if not, we need to do some more 
> digging.
> 
> Borislav, how are your 1:1 mapping patches coming along? In theory, once 
> those are merged we can gracefully workaround these kinds of issues.

Handling these gracefully without crashing boxes or expecting firmware to 
be sane (which is wishful thinking) would be _SO_ preferred ...

I suspect 1:1 mapped is what Windows does - and we simply need to provide 
a Windows-EFI compatible environment (which is reality), not just an 
EFI-spec environment (which is a fiction).

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-28 Thread Ingo Molnar

* Matt Fleming m...@console-pimps.org wrote:

 What appears to be happening is that your the EFI runtime services code 
 is calling into the EFI boot services code, which is definitely a bug in 
 your firmware because we're at runtime, but we've seen other machines 
 that do similar things so we usually handle it just fine. However, what 
 makes your case different, and the reason you see the above splat, is 
 that it's using the physical address of the EFI boot services region, 
 not the virtual one we setup with SetVirtualAddressMap(). Which is a 
 second firmware bug. Again, we have seen other machines that access 
 physical addresses after SetVirtualAddressMap(), but until now we 
 haven't had any non-optional code that triggered them.
 
 The only reason I can see that the offending commit would introduce this 
 problem is because it calls QueryVariableInfo() at boot time. I notice 
 that your machine is an SGI UV one, is there any chance you could get a 
 firmware fix for this? If possible, it would be also good to confirm 
 that it's this chunk of code in setup_efi_vars(),
 
   status = efi_call_phys4(sys_table-runtime-query_variable_info,
   EFI_VARIABLE_NON_VOLATILE |
   EFI_VARIABLE_BOOTSERVICE_ACCESS |
   EFI_VARIABLE_RUNTIME_ACCESS, store_size,
   remaining_size, var_size);
 
 that later makes GetNextVariable() jump to the physical address of the 
 EFI Boot Services region. Because if not, we need to do some more 
 digging.
 
 Borislav, how are your 1:1 mapping patches coming along? In theory, once 
 those are merged we can gracefully workaround these kinds of issues.

Handling these gracefully without crashing boxes or expecting firmware to 
be sane (which is wishful thinking) would be _SO_ preferred ...

I suspect 1:1 mapped is what Windows does - and we simply need to provide 
a Windows-EFI compatible environment (which is reality), not just an 
EFI-spec environment (which is a fiction).

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-28 Thread Matt Fleming
On Fri, 24 May, at 03:49:38PM, Russ Anderson wrote:
 Why does the kernel still try to create /sys/firmware/efivars/
 entries in the original failure even though efi_call_phys4()
 failed?  Or does it always try to create those entries
 and GetNextVariable() blows up in the original failure
 but not in my experiment?

CONFIG_EFI_VARS will try to create /sys/firmware/efivars even without
the additional info from the early EFI boot variable code. The early
variable code is only a heuristic that is supposed to improve the
anti-bricking algorithm in the EFI variable code.

-- 
Matt Fleming, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-28 Thread Matt Fleming
On Fri, 24 May, at 03:05:39PM, Russ Anderson wrote:
 One of the BIOS guys will build a debug bios next week to help see
 what is going on in the query_variable_info() call.

That would be incredibly useful, thanks.

-- 
Matt Fleming, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-27 Thread Russ Anderson
On Mon, May 27, 2013 at 12:27:12PM +0800, joeyli wrote:
> Hi Dave, 
> 
> 於 五,2013-05-24 於 17:05 -0400,Dave Jones 提到:
> > On Fri, May 24, 2013 at 12:02:15PM -0500, Russ Anderson wrote:
> >  > On Fri, May 24, 2013 at 11:11:11AM -0500, Robin Holt wrote:
> >  > > Russ,
> >  > > 
> >  > > Can we open a bug for the BIOS folks and see if we can get this 
> > addressed?
> >  > 
> >  > I already talked with them.  It is not in an area that we
> >  > normally change, so if there is a bug may be in the Intel
> >  > reference code.  More investigation is needed to track down
> >  > the actual problem, and that could take help from Intel.
> >  > 
> >  > Regardless of that, it is a kernel patch that triggers the
> >  > problem.  This isn't the first time a kernel change does
> >  > the "right thing" but trips across questionable bios/EFI/bootloader
> >  > implementation.  That still makes it a kernel bug.
> >  > 
> >  > I'm still digging to better understand the root problem.
> >  
> > When we rebased the Fedora 18 kernel to 3.9 we had a bunch of reports
> > from people who can no longer boot with what looks like similar symptoms.
> > 
> > https://bugzilla.redhat.com/show_bug.cgi?id=964335
> > 
> > Dave
> > 
> 
> The oops on the above bug is similar to my problem on Acer machine,
> could you please ask the reporter try the eccaf52f patch in urgent
> branch on Matt's efi git tree:
> 
> https://git.kernel.org/cgit/linux/kernel/git/mfleming/efi.git/commit/?h=urgent=eccaf52fee8305d5207ff110950a82c100e459bc

I tried that patch, but it does not fix the problem.

Thanks for the suggestion.


-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-27 Thread Russ Anderson
On Mon, May 27, 2013 at 12:27:12PM +0800, joeyli wrote:
 Hi Dave, 
 
 於 五,2013-05-24 於 17:05 -0400,Dave Jones 提到:
  On Fri, May 24, 2013 at 12:02:15PM -0500, Russ Anderson wrote:
On Fri, May 24, 2013 at 11:11:11AM -0500, Robin Holt wrote:
 Russ,
 
 Can we open a bug for the BIOS folks and see if we can get this 
  addressed?

I already talked with them.  It is not in an area that we
normally change, so if there is a bug may be in the Intel
reference code.  More investigation is needed to track down
the actual problem, and that could take help from Intel.

Regardless of that, it is a kernel patch that triggers the
problem.  This isn't the first time a kernel change does
the right thing but trips across questionable bios/EFI/bootloader
implementation.  That still makes it a kernel bug.

I'm still digging to better understand the root problem.
   
  When we rebased the Fedora 18 kernel to 3.9 we had a bunch of reports
  from people who can no longer boot with what looks like similar symptoms.
  
  https://bugzilla.redhat.com/show_bug.cgi?id=964335
  
  Dave
  
 
 The oops on the above bug is similar to my problem on Acer machine,
 could you please ask the reporter try the eccaf52f patch in urgent
 branch on Matt's efi git tree:
 
 https://git.kernel.org/cgit/linux/kernel/git/mfleming/efi.git/commit/?h=urgentid=eccaf52fee8305d5207ff110950a82c100e459bc

I tried that patch, but it does not fix the problem.

Thanks for the suggestion.


-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-26 Thread joeyli
於 一,2013-05-27 於 12:27 +0800,joeyli 提到:
> Hi Dave, 
> 
> 於 五,2013-05-24 於 17:05 -0400,Dave Jones 提到:
> > On Fri, May 24, 2013 at 12:02:15PM -0500, Russ Anderson wrote:
> >  > On Fri, May 24, 2013 at 11:11:11AM -0500, Robin Holt wrote:
> >  > > Russ,
> >  > > 
> >  > > Can we open a bug for the BIOS folks and see if we can get this 
> > addressed?
> >  > 
> >  > I already talked with them.  It is not in an area that we
> >  > normally change, so if there is a bug may be in the Intel
> >  > reference code.  More investigation is needed to track down
> >  > the actual problem, and that could take help from Intel.
> >  > 
> >  > Regardless of that, it is a kernel patch that triggers the
> >  > problem.  This isn't the first time a kernel change does
> >  > the "right thing" but trips across questionable bios/EFI/bootloader
> >  > implementation.  That still makes it a kernel bug.
> >  > 
> >  > I'm still digging to better understand the root problem.
> >  
> > When we rebased the Fedora 18 kernel to 3.9 we had a bunch of reports
> > from people who can no longer boot with what looks like similar symptoms.
> > 
> > https://bugzilla.redhat.com/show_bug.cgi?id=964335
> > 
> > Dave
> > 
> 
> The oops on the above bug is similar to my problem on Acer machine,
> could you please ask the reporter try the eccaf52f patch in urgent
> branch on Matt's efi git tree:
> 
> https://git.kernel.org/cgit/linux/kernel/git/mfleming/efi.git/commit/?h=urgent=eccaf52fee8305d5207ff110950a82c100e459bc
> 
> 
> Thanks!
> Joey Lee

Looks there have a couple of different machines on this bug. The oops
similar to my machine is reported by josephhenryblack.


Thanks
Joey Lee


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-26 Thread joeyli
Hi Dave, 

於 五,2013-05-24 於 17:05 -0400,Dave Jones 提到:
> On Fri, May 24, 2013 at 12:02:15PM -0500, Russ Anderson wrote:
>  > On Fri, May 24, 2013 at 11:11:11AM -0500, Robin Holt wrote:
>  > > Russ,
>  > > 
>  > > Can we open a bug for the BIOS folks and see if we can get this 
> addressed?
>  > 
>  > I already talked with them.  It is not in an area that we
>  > normally change, so if there is a bug may be in the Intel
>  > reference code.  More investigation is needed to track down
>  > the actual problem, and that could take help from Intel.
>  > 
>  > Regardless of that, it is a kernel patch that triggers the
>  > problem.  This isn't the first time a kernel change does
>  > the "right thing" but trips across questionable bios/EFI/bootloader
>  > implementation.  That still makes it a kernel bug.
>  > 
>  > I'm still digging to better understand the root problem.
>  
> When we rebased the Fedora 18 kernel to 3.9 we had a bunch of reports
> from people who can no longer boot with what looks like similar symptoms.
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=964335
> 
>   Dave
> 

The oops on the above bug is similar to my problem on Acer machine,
could you please ask the reporter try the eccaf52f patch in urgent
branch on Matt's efi git tree:

https://git.kernel.org/cgit/linux/kernel/git/mfleming/efi.git/commit/?h=urgent=eccaf52fee8305d5207ff110950a82c100e459bc


Thanks!
Joey Lee

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-26 Thread joeyli
Hi Dave, 

於 五,2013-05-24 於 17:05 -0400,Dave Jones 提到:
 On Fri, May 24, 2013 at 12:02:15PM -0500, Russ Anderson wrote:
   On Fri, May 24, 2013 at 11:11:11AM -0500, Robin Holt wrote:
Russ,

Can we open a bug for the BIOS folks and see if we can get this 
 addressed?
   
   I already talked with them.  It is not in an area that we
   normally change, so if there is a bug may be in the Intel
   reference code.  More investigation is needed to track down
   the actual problem, and that could take help from Intel.
   
   Regardless of that, it is a kernel patch that triggers the
   problem.  This isn't the first time a kernel change does
   the right thing but trips across questionable bios/EFI/bootloader
   implementation.  That still makes it a kernel bug.
   
   I'm still digging to better understand the root problem.
  
 When we rebased the Fedora 18 kernel to 3.9 we had a bunch of reports
 from people who can no longer boot with what looks like similar symptoms.
 
 https://bugzilla.redhat.com/show_bug.cgi?id=964335
 
   Dave
 

The oops on the above bug is similar to my problem on Acer machine,
could you please ask the reporter try the eccaf52f patch in urgent
branch on Matt's efi git tree:

https://git.kernel.org/cgit/linux/kernel/git/mfleming/efi.git/commit/?h=urgentid=eccaf52fee8305d5207ff110950a82c100e459bc


Thanks!
Joey Lee

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-26 Thread joeyli
於 一,2013-05-27 於 12:27 +0800,joeyli 提到:
 Hi Dave, 
 
 於 五,2013-05-24 於 17:05 -0400,Dave Jones 提到:
  On Fri, May 24, 2013 at 12:02:15PM -0500, Russ Anderson wrote:
On Fri, May 24, 2013 at 11:11:11AM -0500, Robin Holt wrote:
 Russ,
 
 Can we open a bug for the BIOS folks and see if we can get this 
  addressed?

I already talked with them.  It is not in an area that we
normally change, so if there is a bug may be in the Intel
reference code.  More investigation is needed to track down
the actual problem, and that could take help from Intel.

Regardless of that, it is a kernel patch that triggers the
problem.  This isn't the first time a kernel change does
the right thing but trips across questionable bios/EFI/bootloader
implementation.  That still makes it a kernel bug.

I'm still digging to better understand the root problem.
   
  When we rebased the Fedora 18 kernel to 3.9 we had a bunch of reports
  from people who can no longer boot with what looks like similar symptoms.
  
  https://bugzilla.redhat.com/show_bug.cgi?id=964335
  
  Dave
  
 
 The oops on the above bug is similar to my problem on Acer machine,
 could you please ask the reporter try the eccaf52f patch in urgent
 branch on Matt's efi git tree:
 
 https://git.kernel.org/cgit/linux/kernel/git/mfleming/efi.git/commit/?h=urgentid=eccaf52fee8305d5207ff110950a82c100e459bc
 
 
 Thanks!
 Joey Lee

Looks there have a couple of different machines on this bug. The oops
similar to my machine is reported by josephhenryblack.


Thanks
Joey Lee


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-24 Thread Dave Jones
On Fri, May 24, 2013 at 12:02:15PM -0500, Russ Anderson wrote:
 > On Fri, May 24, 2013 at 11:11:11AM -0500, Robin Holt wrote:
 > > Russ,
 > > 
 > > Can we open a bug for the BIOS folks and see if we can get this addressed?
 > 
 > I already talked with them.  It is not in an area that we
 > normally change, so if there is a bug may be in the Intel
 > reference code.  More investigation is needed to track down
 > the actual problem, and that could take help from Intel.
 > 
 > Regardless of that, it is a kernel patch that triggers the
 > problem.  This isn't the first time a kernel change does
 > the "right thing" but trips across questionable bios/EFI/bootloader
 > implementation.  That still makes it a kernel bug.
 > 
 > I'm still digging to better understand the root problem.
 
When we rebased the Fedora 18 kernel to 3.9 we had a bunch of reports
from people who can no longer boot with what looks like similar symptoms.

https://bugzilla.redhat.com/show_bug.cgi?id=964335

Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-24 Thread Russ Anderson
On Fri, May 24, 2013 at 08:11:01PM +, Matthew Garrett wrote:
> On Fri, 2013-05-24 at 15:05 -0500, Russ Anderson wrote:
> 
> > One other data point is if the query_variable_info call is hacked to
> > remove one of the EFI flags (ie comment out EFI_VARIABLE_BOOTSERVICE_ACCESS)
> > the efi_call_phys4() call fails with EFI_INVALID_PARAMETER and
> > the system boots.  Of course it does not create /sys/firmware/efivars
> > entries and complains "[Firmware Bug]: efi: Inconsistent initial sizes".
> > But at least it boots.
> 
> EFI_VARIABLE_RUNTIME_ACCESS is only legal if
> EFI_VARIABLE_BOOTSERVICE_ACCESS is set, so it's correct to throw
> EFI_INVALID_PARAMETER there.

Yes.  The point of the experiment was to see if it returned a
valid failure status (which it did) and see if the boot still
failed (which it didn't).  So something about going deeper
into that call seems to trigger the failure.

Why does the kernel still try to create /sys/firmware/efivars/
entries in the original failure even though efi_call_phys4()
failed?  Or does it always try to create those entries
and GetNextVariable() blows up in the original failure
but not in my experiment?

Thanks,
-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-24 Thread Matthew Garrett
On Fri, 2013-05-24 at 15:05 -0500, Russ Anderson wrote:

> One other data point is if the query_variable_info call is hacked to
> remove one of the EFI flags (ie comment out EFI_VARIABLE_BOOTSERVICE_ACCESS)
> the efi_call_phys4() call fails with EFI_INVALID_PARAMETER and
> the system boots.  Of course it does not create /sys/firmware/efivars
> entries and complains "[Firmware Bug]: efi: Inconsistent initial sizes".
> But at least it boots.

EFI_VARIABLE_RUNTIME_ACCESS is only legal if
EFI_VARIABLE_BOOTSERVICE_ACCESS is set, so it's correct to throw
EFI_INVALID_PARAMETER there.

-- 
Matthew Garrett | mj...@srcf.ucam.org
N�r��yb�X��ǧv�^�)޺{.n�+{zX����ܨ}���Ơz�:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf��^jǫy�m��@A�a���
0��h���i

Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-24 Thread Russ Anderson
On Fri, May 24, 2013 at 08:43:31AM +0100, Matt Fleming wrote:
> On Thu, 23 May, at 03:32:34PM, Russ Anderson wrote:
> >efi: mem127: type=4, attr=0xf, 
> > range=[0x6bb22000-0x7ca9c000) (271MB)
> 
> EFI_BOOT_SERVICES_CODE
> 
> >efi: mem133: type=5, attr=0x800f, 
> > range=[0x7daff000-0x7dbff000) (1MB)
> 
> EFI_RUNTIME_SERVICES_CODE
> 
> >EFI Variables Facility v0.08 2004-May-17
> >BUG: unable to handle kernel paging request at 7ca95b10
> >IP: [] 0x88007dbf213f
> 
> > [] ?  efi_call3+0x43/0x80
> > [] ?  virt_efi_get_next_variable+0x47/0x1c0
> > [] ?  create_efivars_bin_attributes+0x150/0x150
> > [] ?  efivar_init+0xd5/0x390
> > [] ?  efivar_update_sysfs_entries+0x90/0x90
> > [] ?  kobject_uevent+0xb/0x10
> > [] ?  kset_register+0x5b/0x70
> > [] ?  create_efivars_bin_attributes+0x150/0x150
> > [] ?  efivars_sysfs_init+0x87/0xf0
> > [] ?  do_one_initcall+0x15a/0x1b0
> > [] ?  do_basic_setup+0xad/0xce
> > [] ?  kernel_init_freeable+0x291/0x291
> > [] ?  sched_init_smp+0x15b/0x162
> > [] ?  kernel_init_freeable+0x20d/0x291
> > [] ?  rest_init+0x80/0x80
> > [] ?  kernel_init+0xe/0x180
> > [] ?  ret_from_fork+0x7c/0xb0
> > [] ?  rest_init+0x80/0x80
> 
> Here's the real call stack leading up to the crash.
> 
> What appears to be happening is that your the EFI runtime services code
> is calling into the EFI boot services code, which is definitely a bug in
> your firmware because we're at runtime, but we've seen other machines
> that do similar things so we usually handle it just fine. However, what
> makes your case different, and the reason you see the above splat, is
> that it's using the physical address of the EFI boot services region,
> not the virtual one we setup with SetVirtualAddressMap(). Which is a
> second firmware bug. Again, we have seen other machines that access
> physical addresses after SetVirtualAddressMap(), but until now we
> haven't had any non-optional code that triggered them.
> 
> The only reason I can see that the offending commit would introduce this
> problem is because it calls QueryVariableInfo() at boot time. I notice
> that your machine is an SGI UV one, is there any chance you could get a
> firmware fix for this? If possible, it would be also good to confirm
> that it's this chunk of code in setup_efi_vars(),
> 
>   status = efi_call_phys4(sys_table->runtime->query_variable_info,
>   EFI_VARIABLE_NON_VOLATILE |
>   EFI_VARIABLE_BOOTSERVICE_ACCESS |
>   EFI_VARIABLE_RUNTIME_ACCESS, _size,
>   _size, _size);

This call is failing, but not returning a valid EFI_* return status.
setup_efi_vars() returns at that point.  Maybe it is not set up
to do GetNextVariable() later on???  Why call GetNextVariable() if the
earlier call failed?

> that later makes GetNextVariable() jump to the physical address of the
> EFI Boot Services region. Because if not, we need to do some more
> digging.

One other data point is if the query_variable_info call is hacked to
remove one of the EFI flags (ie comment out EFI_VARIABLE_BOOTSERVICE_ACCESS)
the efi_call_phys4() call fails with EFI_INVALID_PARAMETER and
the system boots.  Of course it does not create /sys/firmware/efivars
entries and complains "[Firmware Bug]: efi: Inconsistent initial sizes".
But at least it boots.

One of the BIOS guys will build a debug bios next week to help see
what is going on in the query_variable_info() call.

> Borislav, how are your 1:1 mapping patches coming along? In theory, once
> those are merged we can gracefully workaround these kinds of issues.
> 
> -- 
> Matt Fleming, Intel Open Source Technology Center

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-24 Thread Russ Anderson
On Fri, May 24, 2013 at 11:11:11AM -0500, Robin Holt wrote:
> Russ,
> 
> Can we open a bug for the BIOS folks and see if we can get this addressed?

I already talked with them.  It is not in an area that we
normally change, so if there is a bug may be in the Intel
reference code.  More investigation is needed to track down
the actual problem, and that could take help from Intel.

Regardless of that, it is a kernel patch that triggers the
problem.  This isn't the first time a kernel change does
the "right thing" but trips across questionable bios/EFI/bootloader
implementation.  That still makes it a kernel bug.

I'm still digging to better understand the root problem.


> Robin
> 
> On Fri, May 24, 2013 at 08:43:31AM +0100, Matt Fleming wrote:
> > On Thu, 23 May, at 03:32:34PM, Russ Anderson wrote:
> > >efi: mem127: type=4, attr=0xf, 
> > > range=[0x6bb22000-0x7ca9c000) (271MB)
> > 
> > EFI_BOOT_SERVICES_CODE
> > 
> > >efi: mem133: type=5, attr=0x800f, 
> > > range=[0x7daff000-0x7dbff000) (1MB)
> > 
> > EFI_RUNTIME_SERVICES_CODE
> > 
> > >EFI Variables Facility v0.08 2004-May-17
> > >BUG: unable to handle kernel paging request at 7ca95b10
> > >IP: [] 0x88007dbf213f
> > 
> > This...
> > 
> > >Call Trace:
> > > [] ?  __alloc_pages_nodemask+0x154/0x2f0
> > > [] ?  alloc_page_interleave+0x9d/0xa0
> > > [] ?  put_dec+0x72/0x90
> > > [] ?  ida_get_new_above+0xb3/0x220
> > > [] ?  sub_alloc+0x74/0x1d0
> > > [] ?  sub_alloc+0x74/0x1d0
> > > [] ?  ida_get_new_above+0xb3/0x220
> > > [] ?  create_efivars_bin_attributes+0x150/0x150
> > 
> > is junk on the stack.
> > 
> > > [] ?  efi_call3+0x43/0x80
> > > [] ?  virt_efi_get_next_variable+0x47/0x1c0
> > > [] ?  create_efivars_bin_attributes+0x150/0x150
> > > [] ?  efivar_init+0xd5/0x390
> > > [] ?  efivar_update_sysfs_entries+0x90/0x90
> > > [] ?  kobject_uevent+0xb/0x10
> > > [] ?  kset_register+0x5b/0x70
> > > [] ?  create_efivars_bin_attributes+0x150/0x150
> > > [] ?  efivars_sysfs_init+0x87/0xf0
> > > [] ?  do_one_initcall+0x15a/0x1b0
> > > [] ?  do_basic_setup+0xad/0xce
> > > [] ?  kernel_init_freeable+0x291/0x291
> > > [] ?  sched_init_smp+0x15b/0x162
> > > [] ?  kernel_init_freeable+0x20d/0x291
> > > [] ?  rest_init+0x80/0x80
> > > [] ?  kernel_init+0xe/0x180
> > > [] ?  ret_from_fork+0x7c/0xb0
> > > [] ?  rest_init+0x80/0x80
> > 
> > Here's the real call stack leading up to the crash.
> > 
> > What appears to be happening is that your the EFI runtime services code
> > is calling into the EFI boot services code, which is definitely a bug in
> > your firmware because we're at runtime, but we've seen other machines
> > that do similar things so we usually handle it just fine. However, what
> > makes your case different, and the reason you see the above splat, is
> > that it's using the physical address of the EFI boot services region,
> > not the virtual one we setup with SetVirtualAddressMap(). Which is a
> > second firmware bug. Again, we have seen other machines that access
> > physical addresses after SetVirtualAddressMap(), but until now we
> > haven't had any non-optional code that triggered them.
> > 
> > The only reason I can see that the offending commit would introduce this
> > problem is because it calls QueryVariableInfo() at boot time. I notice
> > that your machine is an SGI UV one, is there any chance you could get a
> > firmware fix for this? If possible, it would be also good to confirm
> > that it's this chunk of code in setup_efi_vars(),
> > 
> > status = efi_call_phys4(sys_table->runtime->query_variable_info,
> > EFI_VARIABLE_NON_VOLATILE |
> > EFI_VARIABLE_BOOTSERVICE_ACCESS |
> > EFI_VARIABLE_RUNTIME_ACCESS, _size,
> > _size, _size);
> > 
> > that later makes GetNextVariable() jump to the physical address of the
> > EFI Boot Services region. Because if not, we need to do some more
> > digging.
> > 
> > Borislav, how are your 1:1 mapping patches coming along? In theory, once
> > those are merged we can gracefully workaround these kinds of issues.
> > 
> > -- 
> > Matt Fleming, Intel Open Source Technology Center
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majord...@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  r...@sgi.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-24 Thread Robin Holt
Russ,

Can we open a bug for the BIOS folks and see if we can get this addressed?

Robin

On Fri, May 24, 2013 at 08:43:31AM +0100, Matt Fleming wrote:
> On Thu, 23 May, at 03:32:34PM, Russ Anderson wrote:
> >efi: mem127: type=4, attr=0xf, 
> > range=[0x6bb22000-0x7ca9c000) (271MB)
> 
> EFI_BOOT_SERVICES_CODE
> 
> >efi: mem133: type=5, attr=0x800f, 
> > range=[0x7daff000-0x7dbff000) (1MB)
> 
> EFI_RUNTIME_SERVICES_CODE
> 
> >EFI Variables Facility v0.08 2004-May-17
> >BUG: unable to handle kernel paging request at 7ca95b10
> >IP: [] 0x88007dbf213f
> 
> This...
> 
> >Call Trace:
> > [] ?  __alloc_pages_nodemask+0x154/0x2f0
> > [] ?  alloc_page_interleave+0x9d/0xa0
> > [] ?  put_dec+0x72/0x90
> > [] ?  ida_get_new_above+0xb3/0x220
> > [] ?  sub_alloc+0x74/0x1d0
> > [] ?  sub_alloc+0x74/0x1d0
> > [] ?  ida_get_new_above+0xb3/0x220
> > [] ?  create_efivars_bin_attributes+0x150/0x150
> 
> is junk on the stack.
> 
> > [] ?  efi_call3+0x43/0x80
> > [] ?  virt_efi_get_next_variable+0x47/0x1c0
> > [] ?  create_efivars_bin_attributes+0x150/0x150
> > [] ?  efivar_init+0xd5/0x390
> > [] ?  efivar_update_sysfs_entries+0x90/0x90
> > [] ?  kobject_uevent+0xb/0x10
> > [] ?  kset_register+0x5b/0x70
> > [] ?  create_efivars_bin_attributes+0x150/0x150
> > [] ?  efivars_sysfs_init+0x87/0xf0
> > [] ?  do_one_initcall+0x15a/0x1b0
> > [] ?  do_basic_setup+0xad/0xce
> > [] ?  kernel_init_freeable+0x291/0x291
> > [] ?  sched_init_smp+0x15b/0x162
> > [] ?  kernel_init_freeable+0x20d/0x291
> > [] ?  rest_init+0x80/0x80
> > [] ?  kernel_init+0xe/0x180
> > [] ?  ret_from_fork+0x7c/0xb0
> > [] ?  rest_init+0x80/0x80
> 
> Here's the real call stack leading up to the crash.
> 
> What appears to be happening is that your the EFI runtime services code
> is calling into the EFI boot services code, which is definitely a bug in
> your firmware because we're at runtime, but we've seen other machines
> that do similar things so we usually handle it just fine. However, what
> makes your case different, and the reason you see the above splat, is
> that it's using the physical address of the EFI boot services region,
> not the virtual one we setup with SetVirtualAddressMap(). Which is a
> second firmware bug. Again, we have seen other machines that access
> physical addresses after SetVirtualAddressMap(), but until now we
> haven't had any non-optional code that triggered them.
> 
> The only reason I can see that the offending commit would introduce this
> problem is because it calls QueryVariableInfo() at boot time. I notice
> that your machine is an SGI UV one, is there any chance you could get a
> firmware fix for this? If possible, it would be also good to confirm
> that it's this chunk of code in setup_efi_vars(),
> 
>   status = efi_call_phys4(sys_table->runtime->query_variable_info,
>   EFI_VARIABLE_NON_VOLATILE |
>   EFI_VARIABLE_BOOTSERVICE_ACCESS |
>   EFI_VARIABLE_RUNTIME_ACCESS, _size,
>   _size, _size);
> 
> that later makes GetNextVariable() jump to the physical address of the
> EFI Boot Services region. Because if not, we need to do some more
> digging.
> 
> Borislav, how are your 1:1 mapping patches coming along? In theory, once
> those are merged we can gracefully workaround these kinds of issues.
> 
> -- 
> Matt Fleming, Intel Open Source Technology Center
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-24 Thread Matt Fleming
On Fri, 24 May, at 01:09:11PM, Borislav Petkov wrote:
> What do you mean, map boot time functions 1:1 too?

Yep, that's what I meant.

-- 
Matt Fleming, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-24 Thread Borislav Petkov
On Fri, May 24, 2013 at 08:43:31AM +0100, Matt Fleming wrote:
> What appears to be happening is that your the EFI runtime services
> code is calling into the EFI boot services code, which is definitely
> a bug in your firmware because we're at runtime, but we've seen
> other machines that do similar things so we usually handle it just
> fine. However, what makes your case different, and the reason you
> see the above splat, is that it's using the physical address of
> the EFI boot services region, not the virtual one we setup with
> SetVirtualAddressMap(). Which is a second firmware bug.

I'm speechless. Let's have someone else do the ranting this time:

http://www.happyassassin.net/2013/05/03/a-day-in-the-life-of-a-firmware-engineer/

> Again, we have seen other machines that access
> physical addresses after SetVirtualAddressMap(), but until now we
> haven't had any non-optional code that triggered them.
> 
> The only reason I can see that the offending commit would introduce this
> problem is because it calls QueryVariableInfo() at boot time. I notice
> that your machine is an SGI UV one, is there any chance you could get a
> firmware fix for this? If possible, it would be also good to confirm
> that it's this chunk of code in setup_efi_vars(),
> 
>   status = efi_call_phys4(sys_table->runtime->query_variable_info,
>   EFI_VARIABLE_NON_VOLATILE |
>   EFI_VARIABLE_BOOTSERVICE_ACCESS |
>   EFI_VARIABLE_RUNTIME_ACCESS, _size,
>   _size, _size);
> 
> that later makes GetNextVariable() jump to the physical address of the
> EFI Boot Services region. Because if not, we need to do some more
> digging.
> 
> Borislav, how are your 1:1 mapping patches coming along? In theory, once
> those are merged we can gracefully workaround these kinds of issues.

What do you mean, map boot time functions 1:1 too?

In any case, I think I have an idea about the bug I was discussing with
hpa recently but I need to do more experimenting. I have the next week
off, though, so don't hold your breath just yet :).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-24 Thread Matt Fleming
On Thu, 23 May, at 05:23:21PM, Russ Anderson wrote:
> Interesting data point.  The failure is on a rhel7/grub2 root.
> The identical kernel on a rhel6/grub root boots.  So maybe
> grub2 brings out the failure?  I suspect Fedora19/grub2 on
> EFI should hit the problem (for someone looking to reproduce
> it).
> 
> In both cases the kernel boot line options are the same.

I'll bet that rhel7 is using the EFI handover protocol which uses the
internal mechanisms of the EFI boot stub.

I don't know whether anyone will be able to reproduce it, it looks like
it's a bug that's specific to your firmware.

-- 
Matt Fleming, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [regression, bisected] x86: efi: Pass boot services variable info to runtime code

2013-05-24 Thread Matt Fleming
On Thu, 23 May, at 03:32:34PM, Russ Anderson wrote:
>efi: mem127: type=4, attr=0xf, 
> range=[0x6bb22000-0x7ca9c000) (271MB)

EFI_BOOT_SERVICES_CODE

>efi: mem133: type=5, attr=0x800f, 
> range=[0x7daff000-0x7dbff000) (1MB)

EFI_RUNTIME_SERVICES_CODE

>EFI Variables Facility v0.08 2004-May-17
>BUG: unable to handle kernel paging request at 7ca95b10
>IP: [] 0x88007dbf213f

This...

>Call Trace:
> [] ?  __alloc_pages_nodemask+0x154/0x2f0
> [] ?  alloc_page_interleave+0x9d/0xa0
> [] ?  put_dec+0x72/0x90
> [] ?  ida_get_new_above+0xb3/0x220
> [] ?  sub_alloc+0x74/0x1d0
> [] ?  sub_alloc+0x74/0x1d0
> [] ?  ida_get_new_above+0xb3/0x220
> [] ?  create_efivars_bin_attributes+0x150/0x150

is junk on the stack.

> [] ?  efi_call3+0x43/0x80
> [] ?  virt_efi_get_next_variable+0x47/0x1c0
> [] ?  create_efivars_bin_attributes+0x150/0x150
> [] ?  efivar_init+0xd5/0x390
> [] ?  efivar_update_sysfs_entries+0x90/0x90
> [] ?  kobject_uevent+0xb/0x10
> [] ?  kset_register+0x5b/0x70
> [] ?  create_efivars_bin_attributes+0x150/0x150
> [] ?  efivars_sysfs_init+0x87/0xf0
> [] ?  do_one_initcall+0x15a/0x1b0
> [] ?  do_basic_setup+0xad/0xce
> [] ?  kernel_init_freeable+0x291/0x291
> [] ?  sched_init_smp+0x15b/0x162
> [] ?  kernel_init_freeable+0x20d/0x291
> [] ?  rest_init+0x80/0x80
> [] ?  kernel_init+0xe/0x180
> [] ?  ret_from_fork+0x7c/0xb0
> [] ?  rest_init+0x80/0x80

Here's the real call stack leading up to the crash.

What appears to be happening is that your the EFI runtime services code
is calling into the EFI boot services code, which is definitely a bug in
your firmware because we're at runtime, but we've seen other machines
that do similar things so we usually handle it just fine. However, what
makes your case different, and the reason you see the above splat, is
that it's using the physical address of the EFI boot services region,
not the virtual one we setup with SetVirtualAddressMap(). Which is a
second firmware bug. Again, we have seen other machines that access
physical addresses after SetVirtualAddressMap(), but until now we
haven't had any non-optional code that triggered them.

The only reason I can see that the offending commit would introduce this
problem is because it calls QueryVariableInfo() at boot time. I notice
that your machine is an SGI UV one, is there any chance you could get a
firmware fix for this? If possible, it would be also good to confirm
that it's this chunk of code in setup_efi_vars(),

status = efi_call_phys4(sys_table->runtime->query_variable_info,
EFI_VARIABLE_NON_VOLATILE |
EFI_VARIABLE_BOOTSERVICE_ACCESS |
EFI_VARIABLE_RUNTIME_ACCESS, _size,
_size, _size);

that later makes GetNextVariable() jump to the physical address of the
EFI Boot Services region. Because if not, we need to do some more
digging.

Borislav, how are your 1:1 mapping patches coming along? In theory, once
those are merged we can gracefully workaround these kinds of issues.

-- 
Matt Fleming, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


  1   2   >