grub mishandles corrupt/missing primary GPT

2013-10-23 Thread Chris Murphy
https://bugzilla.redhat.com/show_bug.cgi?id=1022743

Gist is, starting with a disk with valid PMBR, primary GPT, and backup GPT, if 
I zero LBA 2, I can no longer boot from the disk. I get a grub rescue prompt.

Instead, if I merely corrupt a portion of the first partitiontypeguid to mimic 
corruption, I can still boot, whereas this primary GPT fails checksums with 
both gdisk and parted. 

This tells me that GRUB isn't checking for the validity of the primary GPT. And 
GRUB doesn't ever use the backup GPT.

Expected behavior is GRUB should check if the MBR is a PMBR (1st and only entry 
is type 0xEE) and if not then consider the disk MBR. If it is PMBR, check 
validity of the primary GPT header+table, if valid use it. If invalid, check 
validity of backup GPT header+table, if valid use it. If invalid, fail.
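
The expected behaviour described above can be sketched roughly as follows. This is a minimal Python illustration, not GRUB code. The function names (is_pmbr, gpt_valid, pick_partition_table), the 512-byte sector size, and the assumption that the backup header sits in the last sector of the image are all hypothetical choices for the example; the field offsets follow the GPT header layout in the UEFI specification.

```python
import struct
import zlib

SECTOR = 512  # assumed logical sector size


def is_pmbr(lba0: bytes) -> bool:
    """True if LBA 0 is a protective MBR: first and only entry is type 0xEE."""
    if lba0[510:512] != b"\x55\xaa":
        return False
    # The partition type is byte 4 of each 16-byte entry, starting at offset 446.
    types = [lba0[446 + 16 * i + 4] for i in range(4)]
    return types[0] == 0xEE and not any(types[1:])


def gpt_valid(disk: bytes, header_lba: int) -> bool:
    """Check the signature, header CRC32, MyLBA and entry-array CRC32."""
    hdr = disk[header_lba * SECTOR:(header_lba + 1) * SECTOR]
    if len(hdr) < 92 or hdr[:8] != b"EFI PART":
        return False
    size, stored_crc = struct.unpack_from("<II", hdr, 12)
    if not 92 <= size <= len(hdr):
        return False
    # The header CRC is computed with its own 4-byte field zeroed.
    if zlib.crc32(hdr[:16] + b"\0" * 4 + hdr[20:size]) != stored_crc:
        return False
    (my_lba,) = struct.unpack_from("<Q", hdr, 24)
    if my_lba != header_lba:
        return False
    (entries_lba,) = struct.unpack_from("<Q", hdr, 72)
    count, entry_size, table_crc = struct.unpack_from("<III", hdr, 80)
    start = entries_lba * SECTOR
    table = disk[start:start + count * entry_size]
    return len(table) == count * entry_size and zlib.crc32(table) == table_crc


def pick_partition_table(disk: bytes) -> str:
    """Return 'mbr', 'gpt-primary', 'gpt-backup' or 'fail'."""
    if not is_pmbr(disk[:SECTOR]):
        return "mbr"
    if gpt_valid(disk, 1):
        return "gpt-primary"
    last_lba = len(disk) // SECTOR - 1  # assumes the image size is known
    if gpt_valid(disk, last_lba):
        return "gpt-backup"
    return "fail"
```

With a disk image laid out this way, zeroing LBA 2 would make the primary entry-array checksum fail and the sketch would fall through to the backup instead of giving up.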

Chris Murphy
___
Grub-devel mailing list
Grub-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/grub-devel


Re: grub mishandles corrupt/missing primary GPT

2013-10-23 Thread Vladimir 'φ-coder/phcoder' Serbinenko
On 24.10.2013 03:38, Chris Murphy wrote:
> https://bugzilla.redhat.com/show_bug.cgi?id=1022743
> 
> Gist is, starting with a disk with valid PMBR, primary GPT, and backup
> GPT, if I zero LBA 2, I can no longer boot from the disk. I get a grub
> rescue prompt.
> 
> Instead, if I merely corrupt a portion of the first partitiontypeguid to
> mimic corruption, I can still boot, whereas this primary GPT fails
> checksums with both gdisk and parted. 
> 
> This tells me that GRUB isn't checking for the validity of the primary
> GPT. And GRUB doesn't ever use the backup GPT.
> 
> Expected behavior is GRUB should check if the MBR is a PMBR (1st and
> only entry is type 0xEE)
There are so called "hybrid" disks which we have to treat as GPT
> and if not then consider the disk MBR. If it is
> PMBR, check validity of the primary GPT header+table, if valid use it.
> If invalid, check validity of backup GPT header+table, if valid use it.
> If invalid, fail.
partmap module is size-critical and CRC32 verification is pretty big.
There are 3 problems with backup header:
1) Backup header would be preserved even when primary is deliberately
reformatted and if we use it then we'll use it even on disks where we
should use newly-created MBR
2) The disk size isn't always known (loopback over network device,
ieee1275 disks and CD-ROMs, possibly others)
3) There are some weird scenarios with USB enclosures "forgetting" last
disk sectors which leads to partition having two different back-headers.
Consider following scenario:
One formats with enclosure, then puts disk natively and moves backup
headers to real end of disk and later modifies partition table. Then
puts disk in enclosure again and then backup has older table.

Do you have ways to handle this?
Why primary would be corrupted in first place?
> 
> Chris Murphy






Re: grub mishandles corrupt/missing primary GPT

2013-10-23 Thread Chris Murphy
Thanks for the response:


On Oct 23, 2013, at 7:49 PM, Vladimir 'φ-coder/phcoder' Serbinenko 
 wrote:

> On 24.10.2013 03:38, Chris Murphy wrote:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1022743
>> 
>> Gist is, starting with a disk with valid PMBR, primary GPT, and backup
>> GPT, if I zero LBA 2, I can no longer boot from the disk. I get a grub
>> rescue prompt.
>> 
>> Instead, if I merely corrupt a portion of the first partitiontypeguid to
>> mimic corruption, I can still boot, whereas this primary GPT fails
>> checksums with both gdisk and parted. 
>> 
>> This tells me that GRUB isn't checking for the validity of the primary
>> GPT. And GRUB doesn't ever use the backup GPT.
>> 
>> Expected behavior is GRUB should check if the MBR is a PMBR (1st and
>> only entry is type 0xEE)
> There are so called "hybrid" disks which we have to treat as GPT

While technically a violation of the UEFI spec, I think this can be worked 
around by considering the disk GPT if the first entry in the MBR is type 0xEE. 
I don't know of a hybrid MBR implementation where an entry other than the first 
is 0xEE. 

But if there is no 0xEE entry at all, this is identical to a formerly GPT disk 
repartitioned as MBR by a utility that doesn't know anything about GPT, and 
thus doesn't erase the stale GPT data - and therefore must be treated as MBR.




>> and if not then consider the disk MBR. If it is
>> PMBR, check validity of the primary GPT header+table, if valid use it.
>> If invalid, check validity of backup GPT header+table, if valid use it.
>> If invalid, fail.
> partmap module is size-critical and CRC32 verification is pretty big.

So perhaps this test is difficult because it's GPT on BIOS, with a limited 
space BIOS boot partition. However, I think on UEFI computers this should still 
work with one valid GPT, rather than not boot at all. There's a lot more space 
for this there.

> There are 3 problems with backup header:
> 1) Backup header would be preserved even when primary is deliberately
> reformatted and if we use it then we'll use it even on disks where we
> should use newly-created MBR

Both primary and backup GPTs are preserved in this case since the primary is in 
LBA 1 and 2, and only LBA 0 is overwritten with the new MBR.

UEFI spec says if the MBR signature of 0xaa55 is intact, and there isn't an 
0xEE entry, and the partition entries are rational (physically on disk and 
don't overlap), then the two GPTs are considered stale and the disk is MBR.

> 2) The disk size isn't always known (loopback over network device,
> ieee1275 disks and CD-ROMs, possibly others)

The primary header contains the location of the backup GPT. If the header is 
sufficiently corrupt, and the backup GPT can't be located, then that's the same 
as an invalid backup GPT, and in that case fail.
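
As a rough illustration of that point: the primary header stores the backup header's location in its AlternateLBA field (byte offset 32 in the UEFI header layout), so the disk size does not have to be known to find it. The sketch below is hypothetical Python, not GRUB code. The function name and the choice to require only an intact signature before trusting the field are assumptions made for the example.

```python
import struct
from typing import Optional


def backup_lba_from_primary(header: bytes) -> Optional[int]:
    """Return the AlternateLBA recorded in a primary GPT header, or None.

    Only the signature is checked here (an assumption for this sketch);
    a header too damaged to carry the signature is treated as giving
    no usable backup location.
    """
    if len(header) < 92 or header[:8] != b"EFI PART":
        return None
    (alternate_lba,) = struct.unpack_from("<Q", header, 32)
    # LBA 0 is the MBR and LBA 1 is the primary header itself,
    # so neither can be a plausible backup location.
    if alternate_lba <= 1:
        return None
    return alternate_lba
```

If this returns None, the situation is the same as having an invalid backup GPT, and failing is the only option left.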

My point is we shouldn't fail when there is a valid locatable backup GPT. The 
whole point of having a second GPT is obviated with the current behavior.


> 3) There are some weird scenarios with USB enclosures "forgetting" last
> disk sectors which leads to partition having two different back-headers.
> Consider following scenario:
> One formats with enclosure, then puts disk natively and moves backup
> headers to real end of disk and later modifies partition table. Then
> puts disk in enclosure again and then backup has older table.

I don't think we can work around this kind of hardware vendor sabotage. If it 
looks like a valid GPT, but is actually stale, if it's used and contains 
incorrect information, then boot fails. Better to try than not try at all.


> 
> Do you have ways to handle this?
> Why primary would be corrupted in first place?

It's certainly uncommon. A Google search: corrupt "primary gpt" only turns up 
1900 results. But it is possible.

And this isn't the only mishandling I'm finding, so it's not like GRUB is 
unique. In fact just now by changing only a single byte in the primary GPT 
table (I changed the E to an F in the BIOS boot partition type UUID), the 
kernel suddenly has no idea what disklabel the disk is, and fails to mount 
rootfs. So I need to track that down too, but it seems like it knows the 
primary GPT table is corrupt, but then fails to use the backup GPT for some 
reason.

An argument against GRUB doing all of this work: maybe the bootloader should be 
able to blindly trust the primary GPT table with no validity checks? And 
instead rely on (presently non-existent) checks by the underlying OS to fix 
this problem? Something like an fsck_gpt, seeing as nothing else is in a good 
position to both check and fix such GPTs other than a partition tool.

The UEFI spec says "Software should ask a user for confirmation before 
restoring the primary GPT" and yet it also requires the unspecified software 
fix the primary GPT if corrupt. The spec actually uses the word "must". So per 
usual, the spec has rather lofty demands.


Chris Murphy

Re: grub mishandles corrupt/missing primary GPT

2013-10-24 Thread Lennart Sorensen
On Wed, Oct 23, 2013 at 09:07:21PM -0600, Chris Murphy wrote:
> While technically a violation of the UEFI spec, I think this can be worked 
> around by considering the disk GPT if the first entry in the MBR is type 
> 0xEE. I don't know of a hybrid MBR implementation where an entry other than 
> the first is 0xEE. 

Well everyone other than Microsoft seems to understand how useful support
for hybrid setups can be and hence support them.

> But if there is no 0xEE entry at all, this is identical to a formerly GPT 
> disk repartitioned as MBR by a utility that doesn't know anything about GPT, 
> and thus doesn't erase the stale GPT data - and therefore must be treated as 
> MBR.

That is true.  That does not mean there must ONLY be a 0xEE entry.

> So perhaps this test is difficult because it's GPT on BIOS, with a limited 
> space BIOS boot partition. However, I think on UEFI computers this should 
> still work with one valid GPT, rather than not boot at all. There's a lot 
> more space for this there.

Certainly if using the BIOS boot partition, there really isn't much of
a space excuse anymore, unless you run into limitations on how much ram
you can use in early boot.

> Both primary and backup GPTs are preserved in this case since the primary is 
> in LBA 1 and 2, and only LBA 0 is overwritten with the new MBR.
> 
> UEFI spec says if the MBR signature of 0xaa55 is intact, and there isn't an 
> 0xEE entry, and the partition entries are rational (physically on disk and 
> don't overlap), then the two GPTs are considered stale and the disk is MBR.
> 
> The primary header contains the location of the backup GPT. If the header is 
> sufficiently corrupt, and the backup GPT can't be located, then that's the 
> same as an invalid backup GPT, and in that case fail.
> 
> My point is we shouldn't fail when there is a valid locatable backup GPT. The 
> whole point of having a second GPT is obviated with the current behavior.

Sometimes backups are designed in and never used.  I don't recall ever
seeing any indication Microsoft ever used the second copy of the FAT
for anything other than filesystem repair tools.

> I don't think we can work around this kind of hardware vendor sabotage. If it 
> looks like a valid GPT, but is actually stale, if it's used and contains 
> incorrect information, then boot fails. Better to try than not try at all.
> 
> It's certainly uncommon. A Google search: corrupt "primary gpt" only turns up 
> 1900 results. But it is possible.
> 
> And this isn't the only mishandling I'm finding, so it's not like GRUB is 
> unique. In fact just now by changing only a single byte in the primary GPT 
> table (I changed the E to an F in the BIOS boot partition type UUID), the 
> kernel suddenly has no idea what disklabel the disk is, and fails to mount 
> rootfs. So I need to track that down too, but it seems like it knows the 
> primary GPT table is corrupt, but then fails to use the backup GPT for some 
> reason.
> 
> An argument against GRUB doing all of this work: maybe the bootloader should 
> be able to blindly trust the primary GPT table with no validity checks? And 
> instead rely on (presently non-existent) checks by the underlying OS to fix 
> this problem? Something like an fsck_gpt, seeing as nothing else is in a good 
> position to both check and fix such GPTs other than a partition tool.

Perhaps.  Certainly simpler.

I do wonder how Windows handles booting with a corrupt primary GPT.
Would you happen to know? (A quick google search didn't find an answer
to the question unfortunately).

> The UEFI spec says "Software should ask a user for confirmation before 
> restoring the primary GPT" and yet it also requires the unspecified software 
> fix the primary GPT if corrupt. The spec actually uses the word "must". So 
> per usual, the spec has rather lofty demands.

So it must fix it after asking the user for confirmation?

-- 
Len Sorensen



Re: grub mishandles corrupt/missing primary GPT

2013-10-24 Thread Chris Murphy

On Oct 24, 2013, at 7:39 AM, "Lennart Sorensen"  
wrote:

> On Wed, Oct 23, 2013 at 09:07:21PM -0600, Chris Murphy wrote:
>> While technically a violation of the UEFI spec, I think this can be worked 
>> around by considering the disk GPT if the first entry in the MBR is type 
>> 0xEE. I don't know of a hybrid MBR implementation where an entry other than 
>> the first is 0xEE. 
> 
> Well everyone other than Microsoft seems to understand how useful support
> for hybrid setups can be and hence support them.

Support is a very strong word. They're basically a craptastic workaround for 
prior unfortunate choices.

Apple uses them, it hardly supports them. Their tools routinely nuke hybrid 
MBRs in favor of PMBRs rendering the secondary OS unbootable; if the MBR and 
GPT aren't sync'd, they will bone the correct MBR with wrong GPT information, 
rendering the secondary OS unbootable and data inaccessible. And it does this 
silently.

I think it's OK to tiptoe around hybrid MBRs, and do something sensible, if 
possible. Supporting them is out of scope because there's no standard or agreed 
upon way to interpret them.



> 
>> But if there is no 0xEE entry at all, this is identical to a formerly GPT 
>> disk repartitioned as MBR by a utility that doesn't know anything about GPT, 
>> and thus doesn't erase the stale GPT data - and therefore must be treated as 
>> MBR.
> 
> That is true.  That does not mean there must ONLY be a 0xEE entry.

Well, there must be only an 0xEE entry to treat the disk as a pure GPT disk.

Once there's 0xEE and 1-3 additional entries, it's hybrid logic, very few 
combinations of which are sane. On Macs it's actually somewhat common for the 
MBR and GPT to disagree once you've used Boot Camp Assistant, because users 
think it's OK to resize OS X volumes in OS X Disk Utility, and then use the 
free space to either create an additional OS X partition or grow an existing 
Windows partition from within Windows. Oops, now the MBR and GPT don't agree 
with each other, so which one is correct? Well, it's ambiguous.

With a few exceptions, there's actually no way to know what's correct, which is 
why hybrid MBRs are ultimately shit. But again, I'm fine dodging piles of crap 
rather than cleaning up other people's messes.

> 
> I do wonder how Windows handles booting with a corrupt primary GPT.
> Would you happen to know? (A quick google search didn't find an answer
> to the question unfortunately).

I haven't tested it because I don't have a UEFI machine here, only Apple EFI. 
So I'm stuck with CSM-BIOS mode booting of Windows, which means it will only 
use MBR. I haven't figured out UEFI within qemu/kvm, and if that can boot 
Windows in UEFI mode.



> 
>> The UEFI spec says "Software should ask a user for confirmation before 
>> restoring the primary GPT" and yet it also requires the unspecified software 
>> fix the primary GPT if corrupt. The spec actually uses the word "must". So 
>> per usual, the spec has rather lofty demands.
> 
> So it must fix it after asking the user for confirmation?

Yes it's just being silly. But the take away is that (partitioning) tools are 
behaving wrongly if they understand GPT, yet ignore or can't fix problems with 
either GPT. The spec only says software, it doesn't specify what software, so 
I'm assuming partitioning tools. Obviously the kernel is software, but it's not 
in a position to ask the user anything.

Chris Murphy


Re: grub mishandles corrupt/missing primary GPT

2013-11-02 Thread Vladimir 'φ-coder/phcoder' Serbinenko
On 24.10.2013 20:17, Chris Murphy wrote:
> Yes it's just being silly. But the take away is that (partitioning) tools are 
> behaving wrongly if they understand GPT, yet ignore or can't fix problems 
> with either GPT. The spec only says software, it doesn't specify what 
> software, so I'm assuming partitioning tools. Obviously the kernel is 
> software, but it's not in a position to ask the user anything.
GRUB's logic is that it should be able to read a corrupted table as long as
it's not too corrupted, and let the kernel or a partitioning tool do the
permanent fix.





Re: grub mishandles corrupt/missing primary GPT

2013-12-09 Thread Phillip Susi

On 10/23/2013 9:49 PM, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> partmap module is size-critical and CRC32 verification is pretty
> big. There are 3 problems with backup header:

The grub core no longer fits in 63 sectors in all but the most trivial
configurations as it is, and a 2048 sector embed area has been
standard now for several years, so I don't think size is a problem.




Re: grub mishandles corrupt/missing primary GPT

2013-12-09 Thread Vladimir 'φ-coder/phcoder' Serbinenko
On 09.12.2013 16:28, Phillip Susi wrote:
> On 10/23/2013 9:49 PM, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>> partmap module is size-critical and CRC32 verification is pretty
>> big. There are 3 problems with backup header:
> 
> The grub core no longer fits in 63 sectors in all but the most trivial
> configurations as it is,
Not true. I've checked: all configs not involving compressed fs or
diskfilter fit in 31K.
> and a 2048 sector embed area has been
> standard now for several years, so I don't think size is a problem.
> 
We're speaking about GPT, nothing to do with MBR embed area.

My problem with that is that it increases complexity a lot in currently
simple code.
And also I had experience with backup header out of place due to disk
reconfiguration and primary header corrupted but still well enough to
have valid partitions. I could boot this system by using "gpt" linux
option. With proposed changes this system would become unbootable.





Re: grub mishandles corrupt/missing primary GPT

2013-12-09 Thread Chris Murphy

On Dec 9, 2013, at 8:54 AM, Vladimir 'φ-coder/phcoder' Serbinenko 
 wrote:

> On 09.12.2013 16:28, Phillip Susi wrote:
>> On 10/23/2013 9:49 PM, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>>> partmap module is size-critical and CRC32 verification is pretty
>>> big. There are 3 problems with backup header:
>> 
>> The grub core no longer fits in 63 sectors in all but the most trivial
>> configurations as it is,
> Not true. I've checked: all configs not involving compressed fs or
> diskfilter fit in 31K.
>> and a 2048 sector embed area has been
>> standard now for several years, so I don't think size is a problem.
>> 
> We're speaking about GPT, nothing to do with MBR embed area.
> 
> My problem with that is that it increases complexity a lot in currently
> simple code.
> And also I had experience with backup header out of place due to disk
> reconfiguration and primary header corrupted but still well enough to
> have valid partitions. I could boot this system by using "gpt" linux
> option. With proposed changes this system would become unbootable.

Technically if the alternate is invalid by being in the wrong location (either 
end of disk or where the primary header says it should be located), and the 
header is also invalid because the header is corrupt, then the disk has an 
invalid GPT. So long as GRUB knows a valid MBR without an 0xEE entry means any 
found GPT is stale (or rather, simply doesn't go looking for the GPT), it seems 
possibly reasonable for GRUB to blindly use the primary partition table. If it 
fails, it fails, even if it's unfortunate there's no fallback to a valid 
alternate GPT.

Maybe someone could argue it's a security problem for an invalid GPT being used 
despite being invalid?

Also, I have some evidence that newer Apple EFI firmware are repairing these 
cases. I have one older computer that I can consistently corrupt, and it 
remains corrupted through boot, even to the degree the (linux) kernel face 
plants by default if the primary header or table is corrupt, unless the gpt 
kernel parameter is used. Yet a newer computer boots without the kernel 
complaining, and upon startup completion the GPT is fixed. Identically 
performed installations were performed in those cases.

So maybe it can be argued the firmware has a role to play in fixing up GPT? Or 
maybe this is a hideously bad idea for firmware, which as we know is slightly 
less than massively bug ridden, to have such write privileges to the disk.


Chris Murphy


Re: grub mishandles corrupt/missing primary GPT

2013-12-09 Thread Vladimir 'φ-coder/phcoder' Serbinenko
On 10.12.2013 01:11, Chris Murphy wrote:
> 
> On Dec 9, 2013, at 8:54 AM, Vladimir 'φ-coder/phcoder' Serbinenko 
>  wrote:
> 
>> On 09.12.2013 16:28, Phillip Susi wrote:
>>> On 10/23/2013 9:49 PM, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
>>>> partmap module is size-critical and CRC32 verification is pretty
>>>> big. There are 3 problems with backup header:
>>>
>>> The grub core no longer fits in 63 sectors in all but the most trivial
>>> configurations as it is,
>> Not true. I've checked: all configs not involving compressed fs or
>> diskfilter fit in 31K.
>>> and a 2048 sector embed area has been
>>> standard now for several years, so I don't think size is a problem.
>>>
>> We're speaking about GPT, nothing to do with MBR embed area.
>>
>> My problem with that is that it increases complexity a lot in currently
>> simple code.
>> And also I had experience with backup header out of place due to disk
>> reconfiguration and primary header corrupted but still well enough to
>> have valid partitions. I could boot this system by using "gpt" linux
>> option. With proposed changes this system would become unbootable.
> 
> Technically if the alternate is invalid by being in the wrong location 
> (either end of disk or where the primary header says it should be located), 
> and the header is also invalid because the header is corrupt, then the disk 
> has an invalid GPT. So long as GRUB knows a valid MBR without an 0xEE entry 
> means any found GPT is stale (or rather, simply doesn't go looking for the 
> GPT), it seems possibly reasonable for GRUB to blindly use the primary 
> partition table. If it fails, it fails, even if it's unfortunate there's no 
> fallback to a valid alternate GPT.
It's already the case.
Probably the real remaining points are:
- Should we use backup headers under some conditions?
- Should msdos partitions be visible? Always? When it's not a PMBR? Or
when GPT is corrupt?
> 
> Maybe someone could argue it's a security problem for an invalid GPT being 
> used despite being invalid?
> 
CRC32 isn't a MAC. Anyone who can modify GPT can fix CRC32 as well.
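
A small sketch of why that is, in Python rather than GRUB code: a CRC32 needs no secret, so whoever rewrites the table can also rewrite the checksum, whereas a keyed MAC cannot be recomputed without the key. The key and the HMAC tag below are purely hypothetical; GPT has no such field, and the "table" contents are made-up stand-ins for the entry array.

```python
import hashlib
import hmac
import zlib


def crc_matches(table: bytes, stored_crc: int) -> bool:
    """The kind of integrity check GPT provides: an unkeyed CRC32."""
    return zlib.crc32(table) == stored_crc


def mac_matches(key: bytes, table: bytes, stored_tag: bytes) -> bool:
    """A keyed check, shown only for contrast (not part of GPT)."""
    expected = hmac.new(key, table, hashlib.sha256).digest()
    return hmac.compare_digest(expected, stored_tag)


original = b"entry: root at LBA 2048"   # stand-in for the real entry array
tampered = b"entry: root at LBA 4096"   # attacker-modified copy

# The attacker recomputes the CRC over the tampered data; no secret needed.
forged_crc = zlib.crc32(tampered)

# A MAC tag can only be produced by someone holding the key.
key = b"hypothetical secret"
genuine_tag = hmac.new(key, original, hashlib.sha256).digest()
```

Here crc_matches(tampered, forged_crc) succeeds, while mac_matches(key, tampered, genuine_tag) does not, which is the difference between detecting accidental corruption and detecting deliberate modification.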
> Also, I have some evidence that newer Apple EFI firmware are repairing these 
> cases. I have one older computer that I can consistently corrupt, and it 
> remains corrupted through boot, even to the degree the (linux) kernel face 
> plants by default if the primary header or table is corrupt, unless the gpt 
> kernel parameter is used. Yet a newer computer boots without the kernel 
> complaining, and upon startup completion the GPT is fixed. Identically 
> performed installations were performed in those cases.
> 
> So maybe it can be argued the firmware has a role to play in fixing up GPT? 
> Or maybe this is a hideously bad idea for firmware, which as we know is 
> slightly less than massively bug ridden, to have such write privileges to the 
> disk.
> 
Firmware writing to disk without being explicitly asked for it is a
bugware or spyware.
> 
> Chris Murphy






Re: grub mishandles corrupt/missing primary GPT

2013-12-09 Thread Chris Murphy

On Dec 9, 2013, at 5:55 PM, Vladimir 'φ-coder/phcoder' Serbinenko 
 wrote:

> On 10.12.2013 01:11, Chris Murphy wrote:
>> 
>> Technically if the alternate is invalid by being in the wrong location 
>> (either end of disk or where the primary header says it should be located), 
>> and the header is also invalid because the header is corrupt, then the disk 
>> has an invalid GPT. So long as GRUB knows a valid MBR without an 0xEE entry 
>> means any found GPT is stale (or rather, simply doesn't go looking for the 
>> GPT), it seems possibly reasonable for GRUB to blindly use the primary 
>> partition table. If it fails, it fails, even if it's unfortunate there's no 
>> fallback to a valid alternate GPT.
> It's already the case.
> Probably the real remaining points are:
> - Should we use backup headers under some conditions?

It would be nice. But if not by validating at least the table checksum, how? I 
don't know how big the CRC32 code is in comparison to code needed to evaluate 
the table with some heuristic approach. Also it seems like a bit flip of 
the most important partition data, the needed partition's start sector value 
(is the end value needed?) is a really rare case. The more likely scenario is 
some software alters the GPT and has a bad write or crash at that moment, in 
which case the cause of boot failure isn't a complete mystery.

> - Should msdos partitions be visible? Always? When it's not a PMBR? Or
> when GPT is corrupt?

I suggest parsing LBA 0 first for a conventional MBR, if it is, don't even 
parse LBA1 looking for a GPT. If the MBR is either hybrid or PMBR, then parse 
the GPT. I don't think it's a good idea to get into a case where GRUB looks at 
both MBR and GPT and has to figure out which partitions to honor or ignore if 
they aren't in sync. Even in the constrained Apple OS X Boot Camp 
implementation there has been a lot of data loss due to missteps in 
interpreting hybrid MBRs.


>> So maybe it can be argued the firmware has a role to play in fixing up GPT? 
>> Or maybe this is a hideously bad idea for firmware, which as we know is 
>> slightly less than massively bug ridden, to have such write privileges to 
>> the disk.
>> 
> Firmware writing to disk without being explicitly asked for it is a
> bugware or spyware.


Yes I definitely find this really interesting behavior. If the firmware does 
have the ability to write, I wonder if an arbitrary EFI application could have 
write permission? If so, it seems like a potentially huge attack vector. I 
don't see what else could be repairing the GPT: computer firmware, SSD 
firmware, GRUB, linux kernel. I think GRUB and linux are out, otherwise one of 
them would have fixed the GPT on other hardware that used an identical 
installation source.


Chris Murphy


Re: grub mishandles corrupt/missing primary GPT

2013-12-09 Thread Vladimir 'φ-coder/phcoder' Serbinenko
On 10.12.2013 02:56, Chris Murphy wrote:
> 
> On Dec 9, 2013, at 5:55 PM, Vladimir 'φ-coder/phcoder' Serbinenko 
>  wrote:
> 
>> On 10.12.2013 01:11, Chris Murphy wrote:
>>>
>>> Technically if the alternate is invalid by being in the wrong location 
>>> (either end of disk or where the primary header says it should be located), 
>>> and the header is also invalid because the header is corrupt, then the disk 
>>> has an invalid GPT. So long as GRUB knows a valid MBR without an 0xEE entry 
>>> means any found GPT is stale (or rather, simply doesn't go looking for the 
>>> GPT), it seems possibly reasonable for GRUB to blindly use the primary 
>>> partition table. If it fails, it fails, even if it's unfortunate there's no 
>>> fallback to a valid alternate GPT.
>> It's already the case.
>> Probably the real remaining points are:
>> - Should we use backup headers under some conditions?
> 
> It would be nice. But if not by validating at least the table checksum, how? 
> I don't know how big the CRC32 code is in comparison to code needed to 
> evaluate the table with some heuristic approach. Also it seems like a 
> bit flip of the most important partition data, the needed partition's start 
> sector value (is the end value needed?) is a really rare case. The more 
> likely scenario is some software alters the GPT and has a bad write or crash 
> at that moment, in which case the cause of boot failure isn't a complete 
> mystery.
> 
We need end value as well.
Here the interesting part is that the data you need is about 1% of
checksummed area, so in most cases checksum check gets more in the way
than it helps.
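
A back-of-the-envelope check of that "about 1%" figure, as a Python sketch. It assumes the common layout of 128 entries of 128 bytes each (the UEFI specification reserves at least 16384 bytes for the entry array) and that booting needs a single partition entry; both are assumptions for the arithmetic, not something measured in GRUB.

```python
ENTRY_COUNT = 128    # assumed: typical number of entries in the array
ENTRY_SIZE = 128     # assumed: typical size of one entry, in bytes
LBA_FIELDS = 8 + 8   # StartingLBA and EndingLBA, 8 bytes each

table_bytes = ENTRY_COUNT * ENTRY_SIZE

# Share of the checksummed array taken by the one entry being booted from.
one_entry_share = ENTRY_SIZE / table_bytes

# Share taken by just that entry's start and end sector values.
lba_fields_share = LBA_FIELDS / table_bytes

print(f"table: {table_bytes} bytes")
print(f"one entry: {one_entry_share:.2%} of the checksummed area")
print(f"start/end LBAs only: {lba_fields_share:.2%}")
```

Under these assumptions one entry is a bit under 1% of the array, so a checksum failure is far more likely to come from bytes that have no bearing on the partition being booted.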
>> - Should msdos partitions be visible? Always? When it's not a PMBR? Or
>> when GPT is corrupt?
> 
> I suggest parsing LBA 0 first for a conventional MBR, if it is, don't even 
> parse LBA1 looking for a GPT. If the MBR is either hybrid or PMBR, then parse 
> the GPT. I don't think it's a good idea to get into a case where GRUB looks 
> at both MBR and GPT and has to figure out which partitions to honor or ignore 
> if they aren't in sync. Even in the constrained Apple OS X Boot Camp 
> implementation there has been a lot of data loss due to missteps in 
> interpreting hybrid MBRs.
> 
GRUB has handling of multiple partmap scenarios but it won't handle all
the cases of desync correctly. E.g. partitions with same start but
different end would be recognized as same UUID with most filesystems but
the files may be unreadable in case of premature partition end.
> 
>>> So maybe it can be argued the firmware has a role to play in fixing up GPT? 
>>> Or maybe this is a hideously bad idea for firmware, which as we know is 
>>> slightly less than massively bug ridden, to have such write privileges to 
>>> the disk.
>>>
>> Firmware writing to disk without being explicitly asked for it is a
>> bugware or spyware.
> 
> 
> Yes I definitely find this really interesting behavior. If the firmware does 
> have the ability to write, I wonder if an arbitrary EFI application could 
> have write permission? If so, it seems like a potentially huge attack vector. 
> I don't see what else could be repairing the GPT: computer firmware, SSD 
> firmware, GRUB, linux kernel. I think GRUB and linux are out, otherwise one 
> of them would have fixed the GPT on other hardware that used an identical 
> installation source.
> 
Firmware has full write capability.
BIOS, EFI, IEEE1275, ARC(S) all have disk write functions usable by
bootloader
U-Boot has only read functions.
Remaining GRUB platforms have no disk functions and GRUB uses own drivers.

> 
> Chris Murphy






Re: grub mishandles corrupt/missing primary GPT

2013-12-10 Thread Phillip Susi

On 12/9/2013 10:54 AM, Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> Not true. I've checked: all configs not involving compressed fs or 
> diskfilter fit in 31K.

As I said, "trivial" configurations ;)

ext2 with no raid or lvm fits... btrfs or any combination of raid or
lvm doesn't.

> We're speaking about GPT, nothing to do with MBR embed area.

You seemed to be concerned that increasing the size to deal with GPT
properly would be bad for MBR setups.  MBR setups already have plenty
of spare room in the vast majority of cases.

> My problem with that is that it increases complexity a lot in
> currently simple code. And also I had experience with backup header
> out of place due to disk reconfiguration and primary header
> corrupted but still well enough to have valid partitions. I could
> boot this system by using "gpt" linux option. With proposed changes
> this system would become unbootable.

One very damaged configuration becomes unbootable, many other less
damaged configurations become bootable.  Good trade in my book.

