Bug#478238: grub-probe: fails to find drive for /dev/sda10

2008-05-11 Thread Török Edwin
[sending to grub-devel@ as requested]

Robert Millan wrote:
> On Sun, May 04, 2008 at 05:01:32PM +0300, Török Edwin wrote:
>>    Device Boot      Start         End      Blocks   Id  System
>> /dev/sda1   *           1        1275    10241406    7  HPFS/NTFS
>> /dev/sda2            1276        2248     7815622+  a6  OpenBSD
>> /dev/sda3            2249        5289    24426832+   f  W95 Ext'd (LBA)
>> /dev/sda4            6080        7296     9775552+  bf  Solaris
>> /dev/sda5            2249        2371      987966   82  Linux swap / Solaris
>> /dev/sda6            2372        3587     9767488+  83  Linux
>> /dev/sda7            3588        3600      104391   83  Linux
>> /dev/sda8            3601        4863    10145016   8e  Linux LVM
>> /dev/sda9            4864        5228     2931831   a6  OpenBSD
>> /dev/sda10           5229        5289      489951   83  Linux
>>
>> [...]
>> grub> ls (hd0,10)
>> error: unknown device
>> grub> ls (hd0,11)
>> error: unknown device
>> grub>
>
> I tried reproducing your setup, but I can't hit the same bug.  This starts to
> look really nasty.  Just spotted this:
>
>   /build/buildd/grub2-1.96+20080426/partmap/pc.c:141: partition 0: flag 0x80, type 0x7, start 0x3f, len 0x1388afc
>   [...]
>   /build/buildd/grub2-1.96+20080426/partmap/pc.c:141: partition 0: flag 0x0, type 0x82, start 0x2270f07, len 0x1e267c
>
> for which I can't find any explanation other than memory corruption.  Also,
> due to a missing fflush() call the output is somewhat scrambled, which makes
> it harder to track (I fixed this already in upstream).
>
> Could you:
>
>   - Apply the attached patch & run grub-probe again (this time output
>     will be a bit more readable)
   

No patch was attached, so I ran 'cvs diff -u -D2008-04-30' and applied
the resulting patch instead.
I found the problem, and it also explains why you couldn't reproduce it.

/dev/sda9 has type a6 but does not contain a valid OpenBSD disklabel, so
in partmap/pc.c:176 the iteration fails with the error
"invalid disk label magic 0x%x".
If I replace that return with a continue, it works.

The problem is that grub2 stops looking for more partitions as soon as
it encounters the invalid one. grub 0.97 worked perfectly, so I never
noticed that the partition had the wrong type!

Also if I change the partition type to 83 (as it should be) an unpatched
grub-probe can find that /boot is on /dev/sda10:
# grub-probe -t device /boot
/dev/sda10

I think grub2 should handle errors more gracefully: possibly mark the
partition as invalid, and keep going.
grub-probe was looking for /dev/sda10, and it shouldn't be affected by
/dev/sda9 being corrupted or invalid.
Think of it this way: if a partition gets corrupted, that shouldn't
prevent the system from booting, as long as the boot and root partitions
are still OK.
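
To make the difference concrete, here is a minimal sketch of the two
behaviours. It is not GRUB's code: struct part, find_partition,
ABORT_ON_ERROR, SKIP_ON_ERROR and example_disk are hypothetical names
made up for illustration. The only values taken from real life are the
BSD disklabel magic (0x82564557) and the bad magic 0x4542414c that
grub-emu printed for this disk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified model; none of these names exist in GRUB.  */
#define PART_TYPE_BSD   0xa6
#define BSD_LABEL_MAGIC 0x82564557u  /* magic number of a BSD disklabel */

struct part
{
  int number;            /* N as in /dev/sdaN */
  uint8_t type;          /* MBR partition type byte */
  uint32_t label_magic;  /* what a BSD label probe would read there */
};

enum on_error { ABORT_ON_ERROR, SKIP_ON_ERROR };

/* Walk the table looking for partition TARGET.
   Returns 1 if it was reached, 0 otherwise.  */
static int
find_partition (const struct part *parts, size_t n, int target,
                enum on_error policy)
{
  assert (parts != NULL);

  for (size_t i = 0; i < n; i++)
    {
      /* The partition itself is reported before its label is probed.  */
      if (parts[i].number == target)
        return 1;

      if (parts[i].type == PART_TYPE_BSD
          && parts[i].label_magic != BSD_LABEL_MAGIC)
        {
          fprintf (stderr, "invalid disk label magic 0x%x\n",
                   (unsigned int) parts[i].label_magic);
          if (policy == SKIP_ON_ERROR)
            continue;   /* ignore the broken label, keep iterating */
          return 0;     /* give up on the rest of the disk */
        }
    }

  return 0;
}

/* Tail of the layout in this report: sda8 (LVM), sda9 (a6 with a bad
   label), sda10 (Linux).  */
static const struct part example_disk[] = {
  { 8,  0x8e, 0 },
  { 9,  PART_TYPE_BSD, 0x4542414cu },
  { 10, 0x83, 0 },
};
```

With this table, looking up partition 10 fails under ABORT_ON_ERROR and
succeeds under SKIP_ON_ERROR, while partition 9 itself is reachable in
both cases. That matches what 'ls' shows in grub-emu: (hd0,9) is listed,
then the error appears and nothing after it is found.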

Compare what grub-emu says when sda9 has the wrong type:

grub> ls (hd0,10)
error: unknown device

And this is what it says when sda9 has the correct type:

grub> ls (hd0,10)
  Partition hd0,10: Filesystem type ext2, Label debian_BOOT



>   - Send it to [EMAIL PROTECTED]

Done

>   ?
>
> Maybe someone there has an idea, but if it's memory corruption and we can't
> reproduce it, tracing the problem remotely isn't going to work very well.
   

It wasn't memory corruption. However, I ran valgrind and it showed
some leaks, plus a call to stat() with a NULL parameter.
The attached patch fixes some of the valgrind warnings. Some leaks still
remain; I attached the new valgrind logs.
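
For the stat() warning, one possible stricter form of the check is
sketched below. This is an illustration only, not what the attached
patch does: is_regular_file is a hypothetical helper name. It also checks
the return value of stat() and uses the POSIX S_ISREG() macro, because
st_mode carries permission bits as well as the file type, so comparing it
to S_IFREG with == only matches a regular file whose mode is 000.

```c
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if PATH names a regular file, and 0 for
   a NULL path, a failed stat(), or any other file type.  */
static int
is_regular_file (const char *path)
{
  struct stat st;

  if (path == NULL)
    return 0;             /* never hand NULL to stat() */

  if (stat (path, &st) != 0)
    return 0;             /* st is not filled in on failure */

  return S_ISREG (st.st_mode) ? 1 : 0;
}
```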

P.S.: grub2 seems to work now; I am able to boot with it using the
text-mode menu. The default graphics mode doesn't work, so I will open a
separate bug about that.

Best regards,
--Edwin

diff -ur grub2-1.96+20080429/kern/disk.c ../grub2-1.96+20080429/kern/disk.c
--- grub2-1.96+20080429/kern/disk.c	2008-02-08 14:22:51.0 +0200
+++ ../grub2-1.96+20080429/kern/disk.c	2008-05-11 13:58:02.270673755 +0300
@@ -317,7 +317,10 @@
   /* Reset the timer.  */
   grub_last_time = grub_get_rtc ();
 
-  grub_free (disk->partition);
+  if(disk->partition) {
+	  grub_free (disk->partition->data);
+	  grub_free (disk->partition);
+  }
   grub_free ((void *) disk->name);
   grub_free (disk);
 }
diff -ur grub2-1.96+20080429/util/grub-probe.c ../grub2-1.96+20080429/util/grub-probe.c
--- grub2-1.96+20080429/util/grub-probe.c	2008-05-11 13:59:14.934811935 +0300
+++ ../grub2-1.96+20080429/util/grub-probe.c	2008-05-11 13:46:21.729236855 +0300
@@ -190,9 +190,10 @@
   struct stat st;
   grub_fs_t fs;
 
-  stat (path, &st);
+  if(path)
+	  stat (path, &st);
 
-  if (st.st_mode == S_IFREG)
+  if (path && st.st_mode == S_IFREG)
 	{
 	  /* Regular file.  Verify that we can read it properly.  */
 
==25071== Memcheck, a memory error detector.
==25071== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==25071== Using LibVEX rev 1804, a library for dynamic binary translation.
==25071== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==25071== Using valgrind-3.3.0-Debian, a dynamic binary instrumentation framework.

Bug#478238: grub-probe: fails to find drive for /dev/sda10

2008-05-06 Thread Robert Millan
On Sun, May 04, 2008 at 05:01:32PM +0300, Török Edwin wrote:
 
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sda1   *           1        1275    10241406    7  HPFS/NTFS
> /dev/sda2            1276        2248     7815622+  a6  OpenBSD
> /dev/sda3            2249        5289    24426832+   f  W95 Ext'd (LBA)
> /dev/sda4            6080        7296     9775552+  bf  Solaris
> /dev/sda5            2249        2371      987966   82  Linux swap / Solaris
> /dev/sda6            2372        3587     9767488+  83  Linux
> /dev/sda7            3588        3600      104391   83  Linux
> /dev/sda8            3601        4863    10145016   8e  Linux LVM
> /dev/sda9            4864        5228     2931831   a6  OpenBSD
> /dev/sda10           5229        5289      489951   83  Linux
> [...]
> grub> ls (hd0,10)
> error: unknown device
> grub> ls (hd0,11)
> error: unknown device
> grub>

I tried reproducing your setup, but I can't hit the same bug.  This starts to
look really nasty.  Just spotted this:

  /build/buildd/grub2-1.96+20080426/partmap/pc.c:141: partition 0: flag 0x80, type 0x7, start 0x3f, len 0x1388afc
  [...]
  /build/buildd/grub2-1.96+20080426/partmap/pc.c:141: partition 0: flag 0x0, type 0x82, start 0x2270f07, len 0x1e267c

for which I can't find any explanation other than memory corruption.  Also,
due to a missing fflush() call the output is somewhat scrambled, which makes
it harder to track (I fixed this already in upstream).
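
As a cross-check of the two "partition 0" lines, the arithmetic below
converts the logged sector values into fdisk's units, using the geometry
from the original report (255 heads, 63 sectors/track, 512-byte sectors)
and the fact that fdisk's "Blocks" column counts 1 KiB blocks. The helper
names are hypothetical. The first length works out to the Blocks value of
/dev/sda1 and the second to that of /dev/sda5, and the second start lies
63 sectors into cylinder 2249, where the extended partition begins. That
would be consistent with the second line being entry 0 of the extended
partition's own table; this is an inference from the numbers, not
something confirmed elsewhere in this thread.

```c
#include <stdint.h>

/* Geometry as printed by fdisk -l in the report.  */
#define HEADS             255u
#define SECTORS_PER_TRACK 63u
#define SECTORS_PER_CYL   (HEADS * SECTORS_PER_TRACK)   /* 16065 */

/* Hypothetical helper: 512-byte sectors to fdisk's 1 KiB blocks.  */
static uint32_t
sectors_to_kib_blocks (uint32_t sectors)
{
  return sectors / 2;
}

/* Hypothetical helper: first absolute sector of a 1-based cylinder.  */
static uint32_t
cylinder_first_sector (uint32_t cylinder)
{
  return (cylinder - 1) * SECTORS_PER_CYL;
}

/* Values copied from the two debug lines.  */
static const uint32_t first_len    = 0x1388afcu;  /* type 0x7  */
static const uint32_t second_start = 0x2270f07u;  /* type 0x82 */
static const uint32_t second_len   = 0x1e267cu;   /* type 0x82 */
```

sectors_to_kib_blocks (first_len) gives 10241406 and
sectors_to_kib_blocks (second_len) gives 987966, while
cylinder_first_sector (2249) + 63 equals second_start.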

Could you:

  - Apply the attached patch & run grub-probe again (this time output
    will be a bit more readable)

  - Send it to [EMAIL PROTECTED]

  ?

Maybe someone there has an idea, but if it's memory corruption and we can't
reproduce it, tracing the problem remotely isn't going to work very well.

Thank you

-- 
Robert Millan

<GPLv2> I know my rights; I want my phone call!
<DRM> What use is a phone call… if you are unable to speak?
(as seen on /.)



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#478238: grub-probe: fails to find drive for /dev/sda10

2008-05-04 Thread Török Edwin
Robert Millan wrote:
> On Mon, Apr 28, 2008 at 11:54:16AM +0300, Török Edwin wrote:
>> Package: grub-common
>> Version: 1.96+20080426-1
>> Severity: important
>>
>> --- Please enter the report below this line. ---
>>
>> # grub-probe -t device /boot
>> /dev/sda10
>>
>> # grub-probe -t drive /boot
>> grub-probe: error: Cannot find a GRUB drive for /dev/sda10.  Check your device.map.
>>
>> I attached the output of
>> 'grub-probe -d /dev/sda10 -vv'
>>
>> =-=
>> Output from fdisk -l:
>>
>> Disk /dev/sda: 60.0 GB, 60011642880 bytes
>> 255 heads, 63 sectors/track, 7296 cylinders
>> Units = cylinders of 16065 * 512 = 8225280 bytes
>> Disk identifier: 0x8f800100
>>
>>    Device Boot      Start         End      Blocks   Id  System
>> /dev/sda1   *           1        1275    10241406    7  HPFS/NTFS
>> /dev/sda2            1276        2248     7815622+  a6  OpenBSD
>> /dev/sda3            2249        5289    24426832+   f  W95 Ext'd (LBA)
>> /dev/sda4            6080        7296     9775552+  bf  Solaris
>> /dev/sda5            2249        2371      987966   82  Linux swap / Solaris
>> /dev/sda6            2372        3587     9767488+  83  Linux
>> /dev/sda7            3588        3600      104391   83  Linux
>> /dev/sda8            3601        4863    10145016   8e  Linux LVM
>> /dev/sda9            4864        5228     2931831   a6  OpenBSD
>> /dev/sda10           5229        5289      489951   83  Linux
>
> Could you run grub-emu and type the following commands?
>
>   ls, ls (hd0,1), ls (hd0,2), ..., ls (hd0,10)
>
> and send the output?  Thank you

Output from grub2:

GNU GRUB  version 1.96

 [ Minimal BASH-like line editing is supported. For the first word, TAB
   lists possible command completions. Anywhere else TAB lists possible
   device/file completions. ]

grub> ls
(VolGroup00-LogVol01) (VolGroup00-LogVol00) (host) (hd0) (hd0,1) (hd0,2)
(hd0,2,a) (hd0,2,b) (hd0,2,d) (hd0,2,e) (hd0,2,g) (hd0,2,h) (hd0,2,l)
(hd0,2,m) (hd0,2,n) (hd0,2,o) (hd0,2,p) (hd0,4) (hd0,5) (hd0,6) (hd0,7)
(hd0,8) (hd0,9)
error: invalid disk label magic 0x4542414c
grub> ls (hd0,1)
Partition hd0,1: Filesystem type ntfs
grub> ls (hd0,2)
Partition hd0,2: Filesystem type ufs
grub> ls (hd0,3)
error: unknown device
grub> ls (hd0,4)
Partition hd0,4: Unknown filesystem
grub> ls (hd0,5)
Partition hd0,5: Unknown filesystem
grub> ls (hd0,6)
Partition hd0,6: Filesystem type xfs, Label debian_ROOT
grub> ls (hd0,7)
Partition hd0,7: Filesystem type ext2, Label /boot
grub> ls (hd0,8)
Partition hd0,8: Unknown filesystem
grub> ls (hd0,9)
Partition hd0,9: Unknown filesystem
grub> ls (hd0,10)
error: unknown device
grub> ls (hd0,11)
error: unknown device
grub>


Output from grub-0.97:
grub> root (hd0,1)
 Filesystem type unknown, partition type 0xa6

grub> root (hd0,2)
 Filesystem type unknown, partition type 0xf

grub> root (hd0,3)
 Filesystem type unknown, partition type 0xbf

grub> root (hd0,4)
 Filesystem type unknown, partition type 0x82

grub> root (hd0,5)
 Filesystem type is xfs, partition type 0x83

grub> root (hd0,6)
 Filesystem type is ext2fs, partition type 0x83

grub> root (hd0,7)
 Filesystem type unknown, partition type 0x8e

grub> root (hd0,8)

Error 24: Attempt to access block outside partition

grub> root (hd0,9)
 Filesystem type is ext2fs, partition type 0x83

grub> root (hd0,10)

Error 22: No such partition

P.S. Sorry for the late reply, the mail from the bug tracker went into
the spam folder :(

Best regards,
--Edwin







Bug#478238: grub-probe: fails to find drive for /dev/sda10

2008-04-29 Thread Robert Millan
On Mon, Apr 28, 2008 at 11:54:16AM +0300, Török Edwin wrote:
> Package: grub-common
> Version: 1.96+20080426-1
> Severity: important
>
> --- Please enter the report below this line. ---
>
> # grub-probe -t device /boot
> /dev/sda10
>
> # grub-probe -t drive /boot
> grub-probe: error: Cannot find a GRUB drive for /dev/sda10.  Check your device.map.
>
> I attached the output of
> 'grub-probe -d /dev/sda10 -vv'
>
> =-=
> Output from fdisk -l:
>
> Disk /dev/sda: 60.0 GB, 60011642880 bytes
> 255 heads, 63 sectors/track, 7296 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Disk identifier: 0x8f800100
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sda1   *           1        1275    10241406    7  HPFS/NTFS
> /dev/sda2            1276        2248     7815622+  a6  OpenBSD
> /dev/sda3            2249        5289    24426832+   f  W95 Ext'd (LBA)
> /dev/sda4            6080        7296     9775552+  bf  Solaris
> /dev/sda5            2249        2371      987966   82  Linux swap / Solaris
> /dev/sda6            2372        3587     9767488+  83  Linux
> /dev/sda7            3588        3600      104391   83  Linux
> /dev/sda8            3601        4863    10145016   8e  Linux LVM
> /dev/sda9            4864        5228     2931831   a6  OpenBSD
> /dev/sda10           5229        5289      489951   83  Linux

Could you run grub-emu and type the following commands?

  ls, ls (hd0,1), ls (hd0,2), ..., ls (hd0,10)

and send the output?  Thank you

-- 
Robert Millan

The technological evasion of the license is as unacceptable as the
 legal evasion of the license [...].  That's the provision in section
 1 regarding keys. [...]  We say one thing: when you sell somebody a
 home... give him the keys  -- Eben Moglen on GPLv3


