I apologize in advance for the length of this post.  Since, however, I do
not know what information is needed to determine why this installation
failed, I am including everything which I have the least suspicion may be
contributing to the failure.

To begin, I describe the essentials of my recently purchased custom-built
desktop.  It is built on a Gigabyte GA-Z87N-WIFI mainboard, whose Z87
chipset is one of Intel's latest, designed for Haswell processors.  The
processor integrates Intel's HD Graphics 4600, and the mainboard provides
the required graphics outputs.  Two 2-terabyte hard drives are configured
as a RAID1 array.

I started out intending to install only the base system, thereby
shortening the reinstallation process if for any reason I needed to start
the installer over again -- such as to change the partitioning.  At that
stage, for example, I tried to install in UEFI mode.  Since I was
unsuccessful, I reverted to the standard BIOS.  My experience trying to
install in UEFI mode will be the subject of another thread.

Since Dec 31, when I started this process, I have lost count of the number
of false starts.  My guess is that I have initiated the installer about
fifteen times since then.

When I thought everything was ready for the desktop environment (DE) and
other things, I tried to install them manually via the xserver-xorg and
xfce4 metapackages.  This method was unsuccessful.  Examination of the log
files indicated that much was missing, including whatever package contains
startx.  (Running startx in a virtual terminal returned the message
"command not found".)
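For the record, here is my current understanding of what was missing --
hedged, since I have not yet confirmed it: on Debian the startx script is
shipped by the xinit package, which neither xserver-xorg nor the xfce4
metapackage pulls in on its own, while the task-xfce-desktop task would
bring in the whole stack including a display manager.

```shell
# Sketch of the manual install I now believe would have worked (package
# names assumed from Debian wheezy; to be run as root once the system
# boots):
#   apt-get install xinit task-xfce-desktop
# The variable below merely records the packages I think were missing.
missing_pkgs="xinit task-xfce-desktop"
echo "apt-get install $missing_pkgs"
```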

I then decided to let the installer install the DE and other selected
tasks; doing so, however, required initiating the installer once again --
in fact twice.  The first time, in the installer's partitioning section, I
chose the xfs file system for most of my partitions, with lilo as the boot
loader, but the installer would not install lilo.  (Does xfs still require
lilo instead of grub?)

The second time I used the ext4 file system with grub as the boot loader.
The boot process aborted with a "failed" message.  To see what would
happen, I typed Control-D to resume the boot.  In due course the xfce
login window appeared.  After I entered my user name and password the
monitor went blank.  At that point the computer was totally unresponsive
to any input.  I had to reboot.

This boot again ran to the point where the line containing Control-D
appeared.  This time I entered the root password in order to examine the
output of the dmesg command, various log files, and the information which
scrolls by on the monitor while the OS is loading -- hoping to glean some
idea of why the DE would not load.

The first problem encountered was that something was detecting and
enabling the Logitech USB optical mouse over and over, even to the point
of interrupting other commands and their output.  I solved that problem --
at least temporarily -- by disconnecting the mouse; it is not needed in a
virtual terminal anyway.

The first part of the dmesg output which scares me reads as follows.
-------------------------------------------------------------------------
[    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.2.0-4-amd64
root=/dev/mapper/TG1-root ro single
[    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[    0.000000] xsave/xrstor: enabled xstate_bv 0x7, cntxt size 0x340
[    0.000000] Checking aperture...
[    0.000000] No AGP bridge found
[    0.000000] Calgary: detecting Calgary via BIOS EBDA area
[    0.000000] Calgary: Unable to locate Rio Grande table in EBDA - bailing!
[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: at
/build/linux-rrsxby/linux-3.2.51/drivers/iommu/dmar.c:492
warn_invalid_dmar+0x77/0x85()
[    0.000000] Hardware name: Z87N-WIFI
[    0.000000] Your BIOS is broken; DMAR reported at address 0!
[    0.000000] BIOS vendor: American Megatrends Inc.; Ver: F4; Product
Version: To be filled by O.E.M.
[    0.000000] Modules linked in:
[    0.000000] Pid: 0, comm: swapper Not tainted 3.2.0-4-amd64 #1 Debian
3.2.51-1
[    0.000000] Call Trace:
[    0.000000]  [<ffffffff81046cbd>] ? warn_slowpath_common+0x78/0x8c
[    0.000000]  [<ffffffff81046d1f>] ? warn_slowpath_fmt_taint+0x3d/0x42
[    0.000000]  [<ffffffff81348632>] ? __pte+0x7/0x8
[    0.000000]  [<ffffffff816bde4c>] ? __early_set_fixmap+0x89/0x8d
[    0.000000]  [<ffffffff812757e5>] ? warn_invalid_dmar+0x77/0x85
[    0.000000]  [<ffffffff816df6d4>] ? check_zero_address+0xad/0xdc
[    0.000000]  [<ffffffff81358b26>] ? bad_to_user+0x620/0x620
[    0.000000]  [<ffffffff816df714>] ? detect_intel_iommu+0x11/0xaf
[    0.000000]  [<ffffffff816b1e38>] ? pci_iommu_alloc+0x3f/0x67
[    0.000000]  [<ffffffff816bdc2b>] ? mem_init+0x14/0xe5
[    0.000000]  [<ffffffff816ab94e>] ? start_kernel+0x1d0/0x3c3
[    0.000000]  [<ffffffff816ab140>] ? early_idt_handlers+0x140/0x140
[    0.000000]  [<ffffffff816ab3c4>] ? x86_64_start_kernel+0x104/0x111
[    0.000000] ---[ end trace 01021c3814caad1d ]---
----------------------------------------------------------------------------
The first thing that scares me is the line "Your BIOS is broken; DMAR
reported at address 0!"  After some online research I discovered that this
phenomenon also occurs with mainboard brands other than Gigabyte.  Three
possible ways of removing it were mentioned in online posts: disabling
VT-d, turning off the IOMMU, or updating the BIOS.

The closest thing to VT-d I could find in the BIOS was something called
"Intel Virtualization Technology".  I am not sure what that means or what
it does; disabling it, however, made no difference, so I reënabled it.  I
did not try turning off the IOMMU or updating the BIOS.
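Should the DMAR warning prove relevant after all, my reading suggests it
can be silenced from the kernel command line rather than from the BIOS.
Here is a sketch of the grub change I would try -- assuming grub2, and
assuming intel_iommu=off is still the applicable parameter -- shown
against a sample line rather than the live /etc/default/grub:

```shell
# Append intel_iommu=off to GRUB_CMDLINE_LINUX_DEFAULT; the real change
# would be made in /etc/default/grub, followed by update-grub as root.
sample='GRUB_CMDLINE_LINUX_DEFAULT="quiet"'
patched=$(printf '%s' "$sample" | sed 's/"$/ intel_iommu=off"/')
echo "$patched"    # -> GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=off"
```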

So I suppose the real questions at this point are the following.  What
purpose does this DMAR table serve?  Is the invalid DMAR referred to in
the "WARNING" line above sufficient to cause the DE not to load?

The other part of the dmesg output which concerns me follows.
---------------------------------------------------------------------------
[    1.240960]  sdb: sdb1 sdb2
[    1.241103] sd 1:0:0:0: [sdb] Attached SCSI disk
[    1.260609]  sda: sda1 sda2
[    1.260755] sd 0:0:0:0: [sda] Attached SCSI disk
[    1.593645] md: md0 stopped.
[    1.594503] md: bind<sdb1>
[    1.594659] md: bind<sda1>
[    1.595242] md: raid1 personality registered for level 1
[    1.595394] bio: create slab <bio-1> at 1
[    1.595484] md/raid1:md0: active with 2 out of 2 mirrors
[    1.595541] md0: detected capacity change from 0 to 248315904
[    1.596423]  md0: unknown partition table
[    1.683228] Refined TSC clocksource calibration: 3392.144 MHz.
[    1.683278] Switching to clocksource tsc
[    1.797451] md: md1 stopped.
[    1.797959] md: bind<sdb2>
[    1.798118] md: bind<sda2>
[    1.798620] md/raid1:md1: not clean -- starting background reconstruction
[    1.798673] md/raid1:md1: active with 2 out of 2 mirrors
[    1.798731] md1: detected capacity change from 0 to 1499865088000
[    1.806447]  md1: unknown partition table
[    1.999928] device-mapper: uevent: version 1.0.3
[    2.000006] device-mapper: ioctl: 4.22.0-ioctl (2011-10-19) initialised:
dm-de...@redhat.com
[    2.195467] EXT4-fs (dm-0): INFO: recovery required on readonly
filesystem
[    2.195518] EXT4-fs (dm-0): write access will be enabled during recovery
[    2.263170] md: resync of RAID array md1
[    2.263216] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[    2.263264] md: using maximum available idle IO bandwidth (but not more
than 200000 KB/sec) for resync.
[    2.263330] md: using 128k window, over a total of 1464712000k.
[    2.277910] EXT4-fs (dm-0): recovery complete
[    2.319337] EXT4-fs (dm-0): mounted filesystem with ordered data mode.
Opts: (null)
--------------------------------------------------------------------------
The lines above which I do not understand are 1.596423 and 1.806447, both
of which say that the system is not aware of partition tables for md0 and
md1.  Both devices are part of a RAID1: md0 contains only the /boot
partition, which happens to be empty because the boot loader is in the
MBR; and md1 is the only physical volume in LVM volume group TH1.  All the
other partitions are logical volumes in that volume group.
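If my reading is right -- and I hedge, because I am not certain -- those
two lines may be harmless: an md device used directly as a filesystem or
as an LVM physical volume carries no partition table of its own, so the
kernel finds nothing to report.  These are the checks I would run as root
to confirm the layout (mdadm and lvm2 assumed installed):

```shell
# Commands to confirm that md0 holds a filesystem and md1 an LVM PV
# directly, with no partition table expected on either:
#   mdadm --detail /dev/md0
#   pvs
#   lsblk
# Recorded here so I can paste them into the maintenance shell later.
check_cmds="mdadm --detail /dev/md0; pvs; lsblk"
echo "$check_cmds"
```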

The following quotation comes neither from the output of dmesg nor from
syslog; instead it appears at the end of the information which scrolls by
on the screen during the boot process.
--------------------------------------------------------------------------
[ ok ] setting up LVM Volume Groups ... done.
[ .... ] Starting remaining crypto disks .... [info] TG1-swap_crypt
(starting) ... TG1 -swap_crypt (started) ... TG1-swap_crypt (running) ...
[info] TG1-tmp_crypt (starting) ...
[  ok  mp_crypt (started ) ... done.  {sic}
[ ok ] Activating lvm and md swap ... done.
[....]  Checking file systems ... fsck from util-linux 2.20.1
fsck.ext4: Unable to resolve 'UUID=a5fdb692-2b34-4e18-8fd5-c24dde957071'
fsck.ext4: No such file or directory while trying to open
/dev/mapper/TH1-ken
Possibly non-existent device?
fsck.ext4: No such file or directory while trying to open
/dev/mapper/TH1-martin
Possibly non-existent device?
fsck.ext2: No such file or directory while trying to open
/dev/mapper/TH1-tmp_crypt
Possibly non-existent device?
fsck.ext4: No such file or directory while trying to open
/dev/mapper/TH1-var
Possibly non-existent device?
fsck died with exit status 8
failed (code 8).  {code 8 means "an operational error"  -- my comment.}
[....]  File system check failed.  A log is being saved in
/var/log/fsck.checkfs if
[FAIL] the location is writable.  Please repair the file system manually.
... failed!
[....] A maintenance shell will now be started.  CONTROL-D will terminate
this [warning] shell and resume system boot. ... (warning).
Give root password for maintenance
(or type Control-D to continue):
----------------------------------------------------------------------------
I am reasonably certain that this failure is the main -- possibly the only
-- reason the boot process fails to complete and load the DE.  I am also
at a loss as to how to fix it.  The /etc/fstab file shows that those four
partitions -- with file type ext4 -- are to be mounted in accordance with
the partitions created during installation.  The output of the blkid
command also shows the same information correctly.  In maintenance mode I
was able to access all the "failed" mount points and write files to them.
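For completeness, here is the cross-check I intend to run in the
maintenance shell, sketched against one fstab line reconstructed from the
fsck output above (the fields beyond the device name are my assumption,
not my real fstab).  The follow-up commands -- vgchange -ay and
update-initramfs -u -- are likewise guesses at the fix, on the theory that
the volume group is not yet active when fsck runs:

```shell
# Pull the device field out of an fstab entry so it can be compared with
# what blkid reports; the sample line is reconstructed, not my real fstab.
fstab_line="/dev/mapper/TH1-var /var ext4 defaults 0 2"
device=$(printf '%s\n' "$fstab_line" | awk '{print $1}')
echo "$device"    # -> /dev/mapper/TH1-var
# As root I would then run:
#   blkid                  # what the kernel actually sees
#   vgchange -ay           # activate all LVM volume groups
#   update-initramfs -u    # rebuild the initramfs afterwards
```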

I would appreciate any advice from anybody out there as to how to make this
computer operational.  Once again I apologize for the length of this post.

Regards, Ken Heard.
