Re: Trouble with a Bladecenter

2008-02-06 Thread Michael Tautschnig
> Hello, 
> 
> * Philipp Grau <[EMAIL PROTECTED]> [29.01.08 08:36]:
> > Up until now, we did not make any progress. But I'll do some more testing at
> > the end of the week and will report if there is success.
> 
> To give a short summary: There were some problems with the network setup.
> 
> The blades did the PXE boot just fine - there is a DHCP relay in our setup -
> fetched the kernel and initrd via TFTP and booted. Then the second round of
> DHCP requests, coming from the ip=dhcp parameter in the PXE configuration, did
> not reach the DHCP server. They were somehow dropped by the relay. We moved
> the DHCP server into the right network and now it works.
>

Hmm, that sounds very much like
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=432977, but I couldn't
reproduce it anymore after some time. In fact, if you find the time to try the
patch attached to the bug report, that would be pretty cool.

Best,
Michael





Re: Trouble with a Bladecenter

2008-02-06 Thread Philipp Grau
Hello, 

* Philipp Grau <[EMAIL PROTECTED]> [29.01.08 08:36]:
> Up until now, we did not make any progress. But I'll do some more testing at
> the end of the week and will report if there is success.

To give a short summary: There were some problems with the network setup.

The blades did the PXE boot just fine - there is a DHCP relay in our setup -
fetched the kernel and initrd via TFTP and booted. Then the second round of
DHCP requests, coming from the ip=dhcp parameter in the PXE configuration, did
not reach the DHCP server. They were somehow dropped by the relay. We moved
the DHCP server into the right network and now it works.

Philipp


Re: Trouble with a Bladecenter

2008-01-30 Thread Roel van der Made
Hi Max,

On Wed, 2008-01-30 at 05:23 +0100, Maximilian Wilhelm wrote:
> On Monday, 28 January, Roel van der Made typed the following:
> 
> Hi!
> 
> > I'm having exactly the same problem with our Dell 1955 chassis servers and
> > until now did not find what is preventing the NFS mount from succeeding.
> 
> I never had problems with getting FAI running on our 1955 Dell blades.
> 
> Did you try to put the same Vlan on both NICs/switchports?
> Maybe something swaps network cards after booting (udev comes to mind
> for doing things one doesn't expect, need or want...)

There we go. Although I was convinced I had tested this scenario, the kernel
is indeed swapping the NICs at boot time, so the MAC addresses and the
interface names no longer correspond, and the installation hangs at NIC
initialization.

I just saw this page: http://linux.dell.com/debian_9g.shtml which
mentions the issue and also the solution in a more recent kernel.

I will build a new kernel now (2.6.24) and see if this indeed solves the
problem permanently.
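
One way to confirm which interface name the booted kernel gave to the NIC that did the PXE boot is to compare MAC addresses. The following is only a rough sketch, and it rests on an assumption that is not part of this thread: that the pxelinux configuration contains "IPAPPEND 2", which makes pxelinux add BOOTIF=01-xx-xx-xx-xx-xx-xx (hardware type plus MAC of the boot NIC) to the kernel command line. The function names are made up for illustration.

```python
import os
import re


def parse_bootif(cmdline):
    """Return the MAC from a BOOTIF=01-aa-bb-cc-dd-ee-ff argument, or None.

    Assumption: pxelinux was configured with "IPAPPEND 2"; without it the
    kernel command line carries no BOOTIF argument at all.
    """
    match = re.search(r"\bBOOTIF=([0-9A-Fa-f]{2}(?:-[0-9A-Fa-f]{2}){6})\b", cmdline)
    if match is None:
        return None
    octets = match.group(1).lower().split("-")
    # The first octet is the hardware type (01 = Ethernet), not part of the MAC.
    return ":".join(octets[1:])


def interface_macs(sysfs_net="/sys/class/net"):
    """Map interface name -> MAC address as the running kernel sees it."""
    macs = {}
    for name in sorted(os.listdir(sysfs_net)):
        address_file = os.path.join(sysfs_net, name, "address")
        try:
            with open(address_file) as handle:
                macs[name] = handle.read().strip().lower()
        except OSError:
            continue  # entries without a readable address file are skipped
    return macs


def boot_interface(cmdline, macs):
    """Return the interface whose MAC matches BOOTIF, or None if none does."""
    mac = parse_bootif(cmdline)
    for name, address in macs.items():
        if address == mac:
            return name
    return None
```

On a booted client one would pass the contents of /proc/cmdline and the result of interface_macs() to boot_interface(). If that returns eth1 while the PXE configuration says eth0, the interfaces were swapped.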


> (Although I don't know if udev is used in newer FAI versions, it did
>  harm us as eth0 and eth1 were swapped after the first reboot ...)

Yep..

> Ciao
> Max

thanks!

Roel.


Re: Trouble with a Bladecenter

2008-01-30 Thread Roel van der Made

On Wed, 2008-01-30 at 10:15 +0100, Thomas Lange wrote:
> > On Wed, 30 Jan 2008 05:23:34 +0100, Maximilian Wilhelm <[EMAIL PROTECTED]> said:
> 
> > I never had problems with getting FAI running on our 1955 Dell blades.
> Which FAI version are you using? Which kernel version?

As mentioned in the thread earlier on:

I'm using FAI 3.2.4 with an etch nfsroot and the standard 2.6.18-5-486
kernel. I had a similar test environment running in VMware, but am quite
puzzled why this isn't working.

tnx

Roel.


Re: Trouble with a Bladecenter

2008-01-30 Thread Thomas Lange
> On Wed, 30 Jan 2008 05:23:34 +0100, Maximilian Wilhelm <[EMAIL PROTECTED]> said:

> I never had problems with getting FAI running on our 1955 Dell blades.
Which FAI version are you using? Which kernel version?

-- 
regards Thomas


Re: Trouble with a Bladecenter

2008-01-29 Thread Maximilian Wilhelm
On Monday, 28 January, Roel van der Made typed the following:

Hi!

> I'm having exactly the same problem with our Dell 1955 chassis servers and
> until now did not find what is preventing the NFS mount from succeeding.

I never had problems with getting FAI running on our 1955 Dell blades.

Did you try to put the same Vlan on both NICs/switchports?
Maybe something swaps network cards after booting (udev comes to mind
for doing things one doesn't expect, need or want...)

(Although I don't know if udev is used in newer FAI versions, it did
 harm us as eth0 and eth1 were swapped after the first reboot ...)

Ciao
Max
-- 
Follow the white penguin.


Re: Trouble with a Bladecenter

2008-01-28 Thread Philipp Grau
Hello, 

* Roel van der Made <[EMAIL PROTECTED]> [28.01.08 15:34]:
> Did Philipp find anything after these suggestions from Tim and Thomas?

Up until now, we did not make any progress. But I'll do some more testing at
the end of the week and will report if there is success.

Philipp



Re: Trouble with a Bladecenter

2008-01-28 Thread Roel van der Made
Hi,

I'm having exactly the same problem with our Dell 1955 chassis servers and
until now did not find what is preventing the NFS mount from succeeding.

On Fri, 2008-01-25 at 13:11 +0100, Thomas Lange wrote:
> > On Fri, 25 Jan 2008 11:17:38 +0100, Philipp Grau <[EMAIL PROTECTED]> said:
> 
> > The BL460c blades have two network interfaces, but even disabling one of
> > them in the server's BIOS
> Can you disable the other one in the BIOS?

Tried this one also, no success.

> > - Is there a way to get more verbose output from the kernel about what it is
> >   trying to mount?
> add debug to the kernel append line.
> > - Are there other ways to debug the problem?
> Check the mount request as Tim suggests.

I see no mount requests when tcpdumping on the nfs-server.

> ip=::eth1:dhcp should also work, IMO you do not need to supply all
> network parameters. But check the number of colons.

This also does not give any success.

I'm using FAI 3.2.4 with an etch nfsroot and standard 2.6.18-5-486
kernel.

Did Philipp find anything after these suggestions from Tim and Thomas?

thanks

Roel.





Re: Trouble with a Bladecenter

2008-01-25 Thread Thomas Lange
> On Fri, 25 Jan 2008 11:17:38 +0100, Philipp Grau <[EMAIL PROTECTED]> said:

> The BL460c blades have two network interfaces, but even disabling one of
> them in the server's BIOS
Can you disable the other one in the BIOS?

> - Is there a way to get more verbose output from the kernel about what it is
>   trying to mount?
add debug to the kernel append line.
  
> - Are there other ways to debug the problem?
Check the mount request as Tim suggests.

ip=::eth1:dhcp should also work, IMO you do not need to supply all
network parameters. But check the number of colons.
-- 
regards Thomas


Re: Trouble with a Bladecenter

2008-01-25 Thread Tim Cutts


On 25 Jan 2008, at 10:17 am, Philipp Grau wrote:

> Hello,
>
> this post is more a braindump to get my head clear than a real question, but
> if you have any suggestions you are more than welcome.
>
> Ok, this is the starting point: HP Bladecenter BL c-Class with BL460c G1
> blades. So far so good. We have a new FAI setup with version 3.2.4. The FAI
> server is a separate machine.

We have a very similar setup, although we're using FAI 3.1.8, not 3.2.4.

> PXE booting works fine, but when the blade should mount the nfsroot it simply
> hangs. I tried several stock Debian kernels.

Hmm - I don't think we had any such problems as that.  We have about
250 BL460c blades successfully FAI-installed.

> - Is there a way to get more verbose output from the kernel about what it is
>   trying to mount?

I take it the messages you're reporting are what you're seeing on the
console via the OA?

> - Are there other ways to debug the problem?

tcpdump on the NFS server, to see if you see the mount request at all?

Tim



--
The Wellcome Trust Sanger Institute is operated by Genome Research 
Limited, a charity registered in England with number 1021457 and a 
company registered in England with number 2742969, whose registered 
office is 215 Euston Road, London, NW1 2BE. 


Re: Trouble with a Bladecenter

2008-01-25 Thread Michael Tautschnig
[...]
> The BL460c blades have two network interfaces, but even disabling one of
> them in the server's BIOS - so that there is only eth0 - doesn't change
> anything. This is because the kernel detects the interfaces in the opposite
> order from the one in which the BIOS lists them.
>
[...]

Lovely - we don't actually have blades here, but firewalls with 5 network
interfaces whose order changes all the time. A bit of trying and guessing helps
if you change the append line in your pxelinux.cfg/ to the following:

append ip=192.168.248.5:192.168.248.29:192.168.248.1:255.255.255.192:frog.model:eth1:dhcp ...

That is, you set a fixed IP address and hostname for interface eth1, where eth1
is subject to change all the time. We don't reinstall our firewalls that often,
so the bit of overhead of trying eth0,eth1,eth2,eth3,eth4 is acceptable.
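
To make the trial and error less error-prone, the ip= value can be generated instead of typed. This is a minimal sketch, assuming the seven positional fields described in the kernel's nfsroot documentation (client IP, server IP, gateway, netmask, hostname, device, autoconf). The helper names are invented for illustration, and only the ip= part of the append line is produced.

```python
def build_ip_param(client="", server="", gateway="", netmask="",
                   hostname="", device="", autoconf=""):
    """Join the seven positional fields of the kernel's ip= parameter.

    Empty fields stay empty, so the number of colons is always six.
    """
    fields = [client, server, gateway, netmask, hostname, device, autoconf]
    return "ip=" + ":".join(fields)


def candidate_ip_params(devices, **fixed):
    """Build one ip= value per candidate device, keeping all other fields fixed."""
    return [build_ip_param(device=dev, **fixed) for dev in devices]


if __name__ == "__main__":
    # Addresses and hostname are the ones from the append line quoted above.
    for value in candidate_ip_params(
            ["eth0", "eth1", "eth2", "eth3", "eth4"],
            client="192.168.248.5", server="192.168.248.29",
            gateway="192.168.248.1", netmask="255.255.255.192",
            hostname="frog.model", autoconf="dhcp"):
        print(value)
```

Because empty fields are kept as empty strings, a DHCP-only variant for a given device comes out with the right number of colons as well, which is easy to get wrong by hand.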

HTH,
Michael


