date:20110309

Re: [CentOS] Next3 (eg ext3 snapshots support) on OpenNode / CentOS 5 / RHEL 5 howto and rpms

2011-03-09 Thread Lucian

On Thu, Mar 10, 2011 at 12:15 AM, Andres Toomsalu  wrote:
> We have created next3 and patched e2fsprogs rpms for OpenNode / CentOS 5 /
> RHEL 5, installable from the opennode-test yum repo. The provided next3
> kernel module is currently built against the RHEL5 OpenVZ kernel used in
> OpenNode, so installing this next3 rpm package on your CentOS 5 or RHEL 5
> host will also install the OpenVZ-patched RHEL5 kernel and a newer
> e2fsprogs package.
>
> Installation and usage instructions are provided by this howto document:
> http://opennode.activesys.org/documentation/howtos/next3-snapshots/
> There is also a simple next3 snapshotting automation script available for
> use with cron; usage and a download link are provided in the howto document
> referenced above.

Impressive.
I guess I picked the wrong time to upgrade to ext4 :-)
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] CentOS 5.5 does not recognise SAS drives with LSI 1068E Controller

2011-03-09 Thread William Warren

On 3/9/2011 12:10 PM, Peter Peltonen wrote:
> Hi,
>
> On Wed, Mar 9, 2011 at 6:57 PM, Les Mikesell  wrote:
>> Some controllers want to map arrays to volumes and present the volumes
>> to the OS instead of drives, so you have to go through the motions of
>> assigning the resources to volumes and initializing them even if you
>> only want one disk in the array or volume.
> I am pretty sure this was done already as that was what I had been
> told, and I remember seeing on the screen during the bootup messages
> about the drives being initialized and RAID5 working ok. But it's been
> a while since I've been working with hardware issues so I will double
> check this tomorrow and show you the config.
>
> So is it so that the LSI 1068E Controller *should* be supported by
> megaraid_sas driver and the net install should use it without any
> driver disk needed?
>
> Regards,
> Peter

Go into the configuration of the card itself and make sure the RAID
array is not only configured but also initialized and bootable.  Once it is
set up correctly, it should be seen correctly.  MegaRAID is the
"technical" name for just about all of its controller chips. :)

Re: [CentOS] CentOS 5.5 does not recognise SAS drives with LSI 1068E Controller

2011-03-09 Thread Drew

> Based on that info I assume the board having a "8x SAS Ports via LSI
> 1068E Controller". We received the server with 3 drives + 1 spare as
> hw RAID-5 preinstalled. During bootup I see that the drives are
> initialised and everything seems ok.
>
> The issue I am facing is that when trying to install CentOS no hard
> drives are recognised.

One other thing to check, which is rare but I've seen before with your
symptoms, is a controller that's not listed in the driver's PCI ID
list. The chip onboard is a 1068e but if Supermicro used a nonstandard
PCI ID, the driver wouldn't recognize it.
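
One way to check this, as a minimal sketch: `lspci -nn` prints each device's numeric vendor:device pair, and `modinfo -F alias DRIVER` prints the PCI aliases a driver claims. The helper names below and the sample ID 1000:0058 are illustrative assumptions, not values taken from this thread; substitute whatever `lspci -nn` reports for the controller.

```shell
# Turn a "vendor:device" pair as printed by `lspci -nn` (e.g. 1000:0058)
# into the prefix used in kernel module PCI aliases (uppercase hex).
pci_alias_prefix() {
    vendor=$(printf '%s' "${1%%:*}" | tr 'a-f' 'A-F')
    device=$(printf '%s' "${1##*:}" | tr 'a-f' 'A-F')
    printf 'pci:v0000%sd0000%s\n' "$vendor" "$device"
}

# Read alias lines (as printed by `modinfo -F alias DRIVER`) on stdin and
# succeed only if one of them starts with the prefix for the given ID.
# Aliases that wildcard the vendor or device field are not handled.
driver_lists_id() {
    grep -q "^$(pci_alias_prefix "$1")"
}

# Intended use on the affected host (not run here):
#   modinfo -F alias DRIVER | driver_lists_id 1000:0058 && echo "ID is listed"
```

If the ID is not listed, the driver will not bind to the controller on its own, which would match the symptoms described.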


-- 
Drew

"Nothing in life is to be feared. It is only to be understood."
--Marie Curie

Re: [CentOS] sata drives and controlers

2011-03-09 Thread David Brian Chait


> thats some old stuff.   :-/

Monumental understatement... why again do you want to put it back into
production, Michel?

Re: [CentOS] [Newbie] Reclaiming /boot space

2011-03-09 Thread Nico Kadel-Garcia

On Wed, Mar 9, 2011 at 7:48 PM, Dr. Ed Morbius  wrote:
> on 15:49 Wed 09 Mar, Keith Keller (kkel...@wombat.san-francisco.ca.us) wrote:
>> On Wed, Mar 09, 2011 at 01:44:18PM -0800, Dr. Ed Morbius wrote:
>> > on 09:24 Wed 09 Mar, Simon Matter (simon.mat...@invoca.ch) wrote:
>> > >
>> > > Yes, only that reinstall doesn't exist in EL4 :)
>> >
>> > It doesn't?
>> >
>> >     rpm -Uvh 
>>
>> Creating the package list is what yum does automatically; using rpm
>> directly means creating a list of URLs or downloading rpms
>> individually.
>
> See my other recent post to this thread for how that's done.
>
> Essentially:  use RPM to generate a list of all packages.  List all
> files in those packages.  Filter for files on /boot.  Identify the
> packages with files on /boot.   Reinstall those packages.
>
> It's a shell one-liner.

The package list, as of last night's cron jobs, is typically in
"/var/log/rpmpkgs", generated by the "/etc/cron.daily/rpm" script.
That script is not mandatory and does not seem to be in RHEL 6, so it
presumably will not be in CentOS 6 either, but lord, it's useful in this
kind of situation. It's also much faster to parse than manually issuing
"yum" commands.
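
As a sketch of how that file can be used here, assuming it holds one `name-version-release.arch.rpm` entry per line (that layout is an assumption about what the cron script writes, and `installed_kernels` is a made-up helper name):

```shell
# Print the stock kernel packages named in an rpmpkgs-style list on stdin,
# without the trailing ".rpm". Only packages named exactly "kernel" match;
# variants such as kernel-smp would need a wider pattern.
installed_kernels() {
    grep '^kernel-[0-9]' | sed 's/\.rpm$//'
}

# Intended use (not run here):
#   installed_kernels < /var/log/rpmpkgs
```

Each name printed is a kernel that still has files under /boot.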

Re: [CentOS] sata drives and controlers

2011-03-09 Thread John R Pierce

On 03/09/11 6:03 PM, Michel Donais wrote:
>> Is your server PCI 32bit, PCI 64bit, or PCI-X (64bit, 100-133Mhz), or is
>> it PCI-Express and if so does it have x4 or faster slots?
> The mother board is an MSI KT# MS6380E with an AMD2100XP cpu
> FSB is 166 mhz; chipset is 333mhz

That's some old stuff.   :-/

32-bit 33 MHz PCI slots, with an AGP slot for the video card.  No
server-grade cards will work at all, and a simple PCI SATA card will be
bottlenecked if more than one drive is actively transferring at once.
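
The bottleneck follows from simple arithmetic: a parallel PCI bus moves its width in bytes on every clock, and every device on the bus shares that total. A rough sketch (the function name is made up, and the integer math uses the nominal 33 MHz, so it gives 132 rather than the usually quoted 133 MB/s):

```shell
# Theoretical peak of a parallel PCI bus in MB/s:
# bus width in bytes times clock in MHz. All devices share this figure.
pci_peak_mb_per_s() {
    echo $(( $1 / 8 * $2 ))
}

# pci_peak_mb_per_s 32 33    -> plain 32-bit PCI
# pci_peak_mb_per_s 64 133   -> 64-bit PCI-X at 133 MHz
```

Real-world throughput is lower than this peak, so two busy drives on one plain PCI card will compete for it.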


Re: [CentOS] Which file system to use for a USB backup

2011-03-09 Thread Dale Dellutri

On Wed, Mar 9, 2011 at 5:51 PM, Todd Cary  wrote:
> Les -
>
> A lot of the data needs to be moved in time to servers in other
> organizations (e.g. Rotary) or the data may be used as a
> repository for someone with just a notebook computer who would
> plug the HD into the computer.  This is not my main data backup;
> I use rsync for that.
> http://www.toddcary.com/rotary/ is one example of data that needs
> to be shared.
>
> Can rsync take ext4 data and copy it to a fat32 drive?

Yes, but you have to give up permissions, and the modify time on FAT32 is
only accurate to 2 seconds.  To rsync from an ext3/4 directory to a
plugged-in USB drive, use something like:

  rsync -av --no-p --modify-window=1 / /media///

and you might need --delete.

More info at man rsync.
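
As a sketch, the same options can be wrapped in a small function. The function name, the RSYNC override, and the paths in the usage comment are assumptions made for illustration:

```shell
# Copy SRC to DEST on a FAT32 volume with the options given above:
# -a archive mode, -v verbose, --no-p skip permissions, and
# --modify-window=1 to tolerate FAT's coarse timestamps.
# Set RSYNC=echo to preview the command instead of running it.
sync_to_fat() {
    ${RSYNC:-rsync} -av --no-p --modify-window=1 "${1%/}/" "${2%/}/"
}

# Intended use (hypothetical paths, not run here):
#   sync_to_fat /srv/photos /media/usb/photos
```

The trailing slashes are added so that the contents of SRC land directly in DEST rather than in a nested directory.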

Another possibility: always use tar, and put something like a Windows version
of the 7-Zip executable on the USB drive as well as the data.  That way,
Windows users can get the files out of the tar archive.

-- 
Dale Dellutri

Re: [CentOS] sata drives and controlers

2011-03-09 Thread Michel Donais

> Is your server PCI 32bit, PCI 64bit, or PCI-X (64bit, 100-133Mhz), or is 
> it PCI-Express and if so does it have x4 or faster slots?

The motherboard is an MSI KT# MS6380E with an AMD 2100XP CPU.
FSB is 166 MHz; chipset is 333 MHz.

> As someone else said, a SATA card likely will NOT be bootable, unless it 
> has a boot eeprom on it, and these cost more.

One good point I have to take care of.
Many thanks


---
Michel Donais

Re: [CentOS] keepalived+LVS

2011-03-09 Thread bedo

Thanks Jure Pečar,
but I just want to know how to complete LVS/TUN using only keepalived.
My experiment needs a result.
2011/3/10 Jure Pečar 

> On Wed, 9 Mar 2011 12:44:38 +0800
> bedo  wrote:
>
> > oh,god , anybody help me!
>
> Do yourself a favour and use HAproxy or something higher level. I've had
> nothing but bad experiences with lvs for the past few years.
>
> --
>
> Jure Pečar
> http://jure.pecar.org
> http://f5j.eu

Re: [CentOS] [Newbie] Reclaiming /boot space

2011-03-09 Thread Dr. Ed Morbius

on 15:49 Wed 09 Mar, Keith Keller (kkel...@wombat.san-francisco.ca.us) wrote:
> On Wed, Mar 09, 2011 at 01:44:18PM -0800, Dr. Ed Morbius wrote:
> > on 09:24 Wed 09 Mar, Simon Matter (simon.mat...@invoca.ch) wrote:
> > > 
> > > Yes, only that reinstall doesn't exist in EL4 :)
> > 
> > It doesn't?
> > 
> > rpm -Uvh 
> 
> Creating the package list is what yum does automatically; using rpm
> directly means creating a list of URLs or downloading rpms
> individually.  

See my other recent post to this thread for how that's done.

Essentially:  use RPM to generate a list of all packages.  List all
files in those packages.  Filter for files on /boot.  Identify the
packages with files on /boot.   Reinstall those packages.

It's a shell one-liner.
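
The steps above can be sketched roughly as follows. This is not the one-liner from the other post, only an illustration of the same idea; `packages_on_boot` is a made-up name, and the rpm query in the comment assumes an rpm that supports the `[%{=NAME} %{FILENAMES}\n]` iterator syntax.

```shell
# Read "package path" pairs on stdin and print, once each, every package
# that owns at least one file under /boot.
packages_on_boot() {
    awk '$2 ~ /^\/boot\// { print $1 }' | sort -u
}

# Intended use on an RPM-based host (not run here):
#   rpm -qa --qf '[%{=NAME} %{FILENAMES}\n]' | packages_on_boot
```

The resulting names are the packages to reinstall.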

-- 
Dr. Ed Morbius, Chief Scientist /|
  Robot Wrangler / Staff Psychologist| When you seek unlimited power
Krell Power Systems Unlimited|  Go to Krell!



[CentOS] Next3 (eg ext3 snapshots support) on OpenNode / CentOS 5 / RHEL 5 howto and rpms

2011-03-09 Thread Andres Toomsalu

We have created next3 and patched e2fsprogs rpms for OpenNode / CentOS 5 /
RHEL 5, installable from the opennode-test yum repo. The provided next3 kernel
module is currently built against the RHEL5 OpenVZ kernel used in OpenNode, so
installing this next3 rpm package on your CentOS 5 or RHEL 5 host will also
install the OpenVZ-patched RHEL5 kernel and a newer e2fsprogs package.

Installation and usage instructions are provided by this howto document:
http://opennode.activesys.org/documentation/howtos/next3-snapshots/
There is also a simple next3 snapshotting automation script available for use
with cron; usage and a download link are provided in the howto document
referenced above.


Regards,
-- 
--
Andres Toomsalu, and...@active.ee




Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Michael Eager

Dr. Ed Morbius wrote:

> 
> You're NOT obliged to repeat information you've already posted (e.g.:
> home-brew system), but it's helpful to front-load data rather than have
> us tease it out of you.

No intention to have anyone tease information out of me.

The subject line says that the system is CentOS 5.5.  The other
info has been forthcoming, as much as I have been able to provide.
Sorry it wasn't all at the same time -- I didn't think that saying
the server was not a Dell or HP box was important.

>>> With what you've posted to date, it's not.
>> I could waste my time posting logs for you to tell me that they don't
>> point to any problem.  I'd rather skip that step.
> 
> Krell forfend you should post relevant and useful information which
> might be useful in actually diagnosing your problem (or pointing to
> likely candidates and/or further tests).

The logs are uninformative.  No messages for hours before the crash.

Thanks for the help.

-- 
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread John R Pierce

On 03/09/11 4:06 PM, Michael Eager wrote:
> I'll compare the values from lm_sensors with the bios
> temps to see if they are in line.

I find lm_sensors tends to be pretty useless on server-grade hardware,
as opposed to desktop.  Server hardware tends to have an IPMI
management processor, which is accessed over the network (after you
configure it) and can be centrally managed.  This includes temp+fan+power
monitoring as well as remote power and console.


Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread compdoc

>1280C is about the melting point of iron.  Wow!

The degree symbol was converted to text after pasting into the email and
became a '0'.

It actually shows 128C in lm_sensors.

Great little program, tho.






Re: [CentOS] Which file system to use for a USB backup

2011-03-09 Thread Les Mikesell

On 3/9/2011 5:51 PM, Todd Cary wrote:
>
> A lot of the data needs to be moved in time to servers in other
> organizations (e.g. Rotary) or the data may be used as a
> repository for someone with just a notebook computer who would
> plug the HD into the computer.  This is not my main data backup;
> I use rsync for that.
> http://www.toddcary.com/rotary/ is one example of data that needs
> to be shared.
>
> Can rsync take ext4 data and copy it to a fat32 drive?

Sure, it will copy the files, but you'll lose the attributes (owner,
ctime, etc.) that FAT32 doesn't store.  If you need those, you could
write tar archives up to the 4 GB size limit.  But, unless you need to
work with non-networked computers, I'd just rsync to some common network
location that is also exported via Samba.

-- 
   Les Mikesell
lesmikes...@gmail.com


Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Michael Eager

compdoc wrote:
> Err, that should read 128C
> 
> -Original Message-
> From: centos-boun...@centos.org [mailto:centos-boun...@centos.org] On Behalf
> Of compdoc
> Sent: Wednesday, March 09, 2011 4:50 PM
> To: 'CentOS mailing list'
> Subject: Re: [CentOS] Server hangs on CentOS 5.5
> 
> +36C and +39C are likely your cpu and motherboard temps. You have to look at
> the temps in the cmos and match them.
> 
> The +87C is likely just a misreading by lm_sensors. Anything running that
> hot won't be stable.
> 
> I use AMD as well, and lm_sensors tells me something is 1280C.

I'll compare the values from lm_sensors with the bios
temps to see if they are in line.

1280C is about the melting point of iron.  Wow!


-- 
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Michael Eager

Rudi Ahlers wrote:

> As far as I can see you were given a bucket load of advice, which you
> haven't even bothered to follow yet. You're the only one who could
> actually do anything about the problem.

I have followed quite a bit of the advice, which I have
appreciated and noted.  I've set up the monitor so that it
will not be blanked on a crash, installed monitoring software,
and checked a number of conditions which people have suggested.

No, I have not responded to the philosophical discussions
about vendor management, nor to the suggestions to RMA
something to somebody for unknown reasons.  No, I'm not
going to replace RAM or capacitors here and there on the off
chance that something might be bad.  (But I will look for
capacitors which show signs of bulging or leaking.)

> No amount of suggestions made on this list will fix the problem for
> you. You need to actually take apart the server and see what's going
> on.

I wasn't interested in anyone fixing the server for me.
I did ask for suggestions on how to improve the diagnostics
for the problem, which several people have responded to.
Again, I appreciate their suggestions greatly.

As I've said, I have a list of things to check when the
server is next taken down.

-- 
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

Re: [CentOS] Which file system to use for a USB backup

2011-03-09 Thread John R Pierce

On 03/09/11 3:51 PM, Todd Cary wrote:
> Can rsync take ext4 data and copy it to a fat32 drive?

rsync copies files.



Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread John R Pierce

On 03/09/11 2:31 PM, Michael Eager wrote:
> I'll repeat:  this is a house-made system.  There's no vendor to RMA to.
> It seems obvious to me:  RMA is not a diagnostic tool.
>

You built it, you get to fix it.  Sometimes the initial savings in
capital can come back and bite you in time wasted.


Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread compdoc

Err, that should read 128C

-Original Message-
From: centos-boun...@centos.org [mailto:centos-boun...@centos.org] On Behalf
Of compdoc
Sent: Wednesday, March 09, 2011 4:50 PM
To: 'CentOS mailing list'
Subject: Re: [CentOS] Server hangs on CentOS 5.5

+36C and +39C are likely your cpu and motherboard temps. You have to look at
the temps in the cmos and match them.

The +87C is likely just a misreading by lm_sensors. Anything running that
hot won't be stable.

I use AMD as well, and lm_sensors tells me something is 1280C.

heh





Re: [CentOS] Which file system to use for a USB backup

2011-03-09 Thread Todd Cary

Les -

A lot of the data needs to be moved in time to servers in other 
organizations (e.g. Rotary) or the data may be used as a 
repository for someone with just a notebook computer who would 
plug the HD into the computer.  This is not my main data backup; 
I use rsync for that.
http://www.toddcary.com/rotary/ is one example of data that needs 
to be shared.

Can rsync take ext4 data and copy it to a fat32 drive?

Todd

On 3/9/2011 3:16 PM, Les Mikesell wrote:
> On 3/9/2011 4:56 PM, Todd Cary wrote:
>> I have some photographs on my Centos 4 server that I want to copy
>> to a USB drive.  However, I want to be able to access the files
>> from Windows or Mac OS's.  Where should I look for instructions
>> on how to mount and format the USB drive and is FAT32 the only
>> option?
> After plugging it in, use 'dmesg' to see the device name that was just
> added and mount it wherever you want.  Maybe you should think about
> switching to Centos 5 (or 6 soon...) which should automount on the desktop.
>
> Fat32 is the only thing that will 'just work' across the different OS's
> and it is OK unless you are handling files>4GB.  But don't you have a
> network for that sort of thing?
>

-- 
Ariste Software
Petaluma, CA 94952

http://www.aristesoftware.com


Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread compdoc

+36C and +39C are likely your cpu and motherboard temps. You have to look at
the temps in the cmos and match them.

The +87C is likely just a misreading by lm_sensors. Anything running that
hot won't be stable.

I use AMD as well, and lm_sensors tells me something is 128°C.

heh



Re: [CentOS] [Newbie] Reclaiming /boot space

2011-03-09 Thread Keith Keller

On Wed, Mar 09, 2011 at 01:44:18PM -0800, Dr. Ed Morbius wrote:
> on 09:24 Wed 09 Mar, Simon Matter (simon.mat...@invoca.ch) wrote:
> > 
> > Yes, only that reinstall doesn't exist in EL4 :)
> 
> It doesn't?
> 
> rpm -Uvh 

Creating the package list is what yum does automatically; using rpm
directly means creating a list of URLs or downloading rpms
individually.  It's the same end result, but yum is a lot easier
if you know only the package names (and not specific URLs and/or
versions of packages).  (And it may not be the same end result if you
don't get the versions correct.)

I also read this snippet from man yum:

 reinstall
Will reinstall the identically versioned package as is currently
installed.  This does not work for "installonly" packages,  like
Kernels.

So maybe yum reinstall wouldn't fully fix the OP's problem after all?

--keith

-- 
kkel...@wombat.san-francisco.ca.us


Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Don Krause

On Mar 9, 2011, at 3:26 PM, compdoc wrote:

>> When we removed the heatsinks, the
>> cpus came up with them, even though
>> the socket lever was down in the lock position.
> 
> I've seen that in HP desktops too - the thermal paste became a hardened glue
> and the cpu gets pulled right out .
> 
> Another reason to leave the heat sink on.
> 

Umm, actually, that was a great reason to take the heatsink off. The machines 
wouldn't boot in that condition, reseating the cpus fixed them all.
Yes, we could have shipped them back, (they were brand new, broken out of the 
box) but didn't have the time to deal with that.

--
Don Krause   


Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Rudi Ahlers

On Thu, Mar 10, 2011 at 1:17 AM, Michael Eager  wrote:
> Rudi Ahlers wrote:
>> On Thu, Mar 10, 2011 at 12:31 AM, Michael Eager  wrote:
>>> Dr. Ed Morbius wrote:
>>>
>>>> If the issue is repeated but rare system failures on one of a set of
>>>> similarly configured hosts, I'd RMA the box and get a replacement.  End
>>>> of story.
>>> I'll repeat:  this is a house-made system.  There's no vendor to RMA to.
>>
>>
>>
>> I don't know where you are, but in our country we can RMA anything and
>> everything, apart from CPUs. So even a cheap desktop mobo could be
>> RMA'd, as long as I can prove to the supplier that it's faulty and it's
>> within the warranty period.
>
> I responded to Dr. Morbius' suggestion that I "RMA the box".
> There is no vendor to RMA the box to.
>
> If I knew that it was a motherboard problem, I could RMA it.
> Or disk, or PSU, or network card, or whatever.  But, as I've mentioned,
> there's no indication what causes the system to hang.  There is no
> way at this point to prove that it is a defective motherboard.
>
>
> --
> Michael Eager    ea...@eagercon.com
> 1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

As far as I can see you were given a bucket load of advice, which you
haven't even bothered to follow yet. You're the only one who could
actually do anything about the problem.

No amount of suggestions made on this list will fix the problem for
you. You need to actually take apart the server and see what's going
on.


-- 
Kind Regards
Rudi Ahlers
SoftDux

Website: http://www.SoftDux.com
Technical Blog: http://Blog.SoftDux.com
Office: 087 805 9573
Cell: 082 554 7532

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Michael Eager

compdoc wrote:
>> According to the man page, it apparently needs a kernel driver
>> named OpenIPMI, which it claims is installed in standard
>> distributions.  I don't find it on my system.
> 
> 
> lm_sensors is another, and I think installs ready to use from the repos.

sensors says that the three temp sensors read +36C, +39C, and +87C.
These appear to be AMD K10 temp sensors, although I might be
misreading sensors-detect.  Low/highs are (+127/+127, +127/+90,
+127/+127) respectively.  (I'm not sure if these are alarm set
points or something else.)

One fan is listed as 0 rpm.   Something to look into.
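
To pick out suspicious readings quickly, a small filter over the `sensors` output can help. This is only a sketch: it assumes lines of the form `label:  +NN.N...`, it does not tell temperatures from voltages, and the function name and the limit of 80 are arbitrary choices.

```shell
# Read `sensors`-style lines on stdin and print "label value" for every
# line whose first "+NN" reading is above the given limit.
hot_sensors() {
    sed -n 's/^\([^:]*\):[[:space:]]*+\([0-9][0-9]*\).*/\1 \2/p' |
        awk -v limit="$1" '$NF + 0 > limit + 0 { print }'
}

# Intended use (not run here):
#   sensors | hot_sensors 80
```

Anything it prints is worth comparing against the value shown in the BIOS.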

-- 
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread compdoc

>When we removed the heatsinks, the
>cpus came up with them, even though
>the socket lever was down in the lock position.

I've seen that in HP desktops too - the thermal paste became a hardened glue
and the cpu gets pulled right out .

Another reason to leave the heat sink on.





Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Dr. Ed Morbius

on 14:31 Wed 09 Mar, Michael Eager (ea...@eagerm.com) wrote:
> Dr. Ed Morbius wrote:
> 
> >If the issue is repeated but rare system failures on one of a set of
> >similarly configured hosts, I'd RMA the box and get a replacement.  End
> >of story.
> 
> I'll repeat:  this is a house-made system.  There's no vendor to RMA to.
> It seems obvious to me:  RMA is not a diagnostic tool.

You fab your own silicon?

I saw your reference to a homebrew machine after I'd posted.  You'd
neglected to provide this information initially.

Knowing some basic stuff like:  CPU architecture, memory allocation,
disk subsystem, kernel modules, etc.,

> >If you'd post
> >details of the host, more logging information, netconsole panic logs,
> >etc., it might be possible to narrow down possible causes.
> 
> The problem is that there are NO DIAGNOSTICS generated when the
> system hangs.  There's no panic and nothing in the logs which
> indicates any problem.  This is what I indicated from the get go.

uname -a
/proc/cpuinfo
/proc/meminfo
lspci
lsmod
/proc/mounts
/proc/scsi/scsi
/proc/partitions
dmidecode

... would be useful for starters.
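
A minimal sketch for gathering the items listed above in one go. The function name and the section headers are made up; commands that are missing or fail (dmidecode usually needs root) are noted rather than aborting the run.

```shell
# Print the output of each command and the contents of each /proc file
# listed above, each under a "== name ==" header.
collect_sysinfo() {
    for cmd in "uname -a" lspci lsmod dmidecode; do
        echo "== $cmd =="
        if command -v "${cmd%% *}" >/dev/null 2>&1; then
            $cmd 2>&1 || echo "(failed with status $?)"
        else
            echo "(command not found)"
        fi
    done
    for f in /proc/cpuinfo /proc/meminfo /proc/mounts /proc/scsi/scsi /proc/partitions; do
        echo "== $f =="
        if [ -r "$f" ]; then cat "$f"; else echo "(not readable)"; fi
    done
}

# Intended use (hypothetical output file name):
#   collect_sysinfo > sysinfo.txt
```

The resulting file can be attached to a post or put on a pastebin.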

If you've built your own kernel, your config options (if you're running
stock, we can get that from the package itself).

As would wiring up netconsole as I initially suggested.

If I can clarify:  YOU are the person with the problem.  WE are the
people you're turning to for assistance.  YOU are getting pissy.  YOU
should be focusing on providing relevant information, or noting that
it's not available.

You're NOT obliged to repeat information you've already posted (e.g.:
home-brew system), but it's helpful to front-load data rather than have
us tease it out of you.

> >With what you've posted to date, it's not.
> 
> I could waste my time posting logs for you to tell me that they don't
> point to any problem.  I'd rather skip that step.

Krell forfend you should post relevant and useful information which
might be useful in actually diagnosing your problem (or pointing to
likely candidates and/or further tests).

-- 
Dr. Ed Morbius, Chief Scientist /|
  Robot Wrangler / Staff Psychologist| When you seek unlimited power
Krell Power Systems Unlimited|  Go to Krell!

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Don Krause

On Mar 9, 2011, at 3:06 PM, compdoc wrote:

>> compdoc wrote:
>>> I'll re-seat the CPU, heatsink, and fan on the next downtime.
>>> 
>>> Is the CPU overheating? Pointless to reseat the cpu or even remove the
>>> heatsink, if not.
> 
>> No evidence to suggest that it is.
> 
> 
> As much as I love telling anecdotes, I have none to tell you concerning cpu
> reseating. I've never seen it fix a problem.

Funny, we actually had a whole stack of HP 4600s that needed the cpus 
reinstalled in order to function.

When we removed the heatsinks, the cpus came up with them, even though the 
socket lever was down in the lock position. 

We had to "twist" the CPU off the bottom of the heatsink, reinstall it in the 
socket, reinstall the heatsink, and the machines were fine.

--
Don Krause   


Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Michael Eager

Rudi Ahlers wrote:
> On Thu, Mar 10, 2011 at 12:31 AM, Michael Eager  wrote:
>> Dr. Ed Morbius wrote:
>>
>>> If the issue is repeated but rare system failures on one of a set of
>>> similarly configured hosts, I'd RMA the box and get a replacement.  End
>>> of story.
>> I'll repeat:  this is a house-made system.  There's no vendor to RMA to.
> 
> 
> 
> I don't know where you are, but in our country we can RMA anything and
> everything, apart from CPUs. So even a cheap desktop mobo could be
> RMA'd, as long as I can prove to the supplier that it's faulty and it's
> within the warranty period.

I responded to Dr. Morbius' suggestion that I "RMA the box".
There is no vendor to RMA the box to.

If I knew that it was a motherboard problem, I could RMA it.
Or disk, or PSU, or network card, or whatever.  But, as I've mentioned,
there's no indication what causes the system to hang.  There is no
way at this point to prove that it is a defective motherboard.

-- 
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

Re: [CentOS] Which file system to use for a USB backup

2011-03-09 Thread Les Mikesell

On 3/9/2011 4:56 PM, Todd Cary wrote:
> I have some photographs on my Centos 4 server that I want to copy
> to a USB drive.  However, I want to be able to access the files
> from Windows or Mac OS's.  Where should I look for instructions
> on how to mount and format the USB drive and is FAT32 the only
> option?

After plugging it in, use 'dmesg' to see the device name that was just 
added and mount it wherever you want.  Maybe you should think about 
switching to Centos 5 (or 6 soon...) which should automount on the desktop.
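
As a rough sketch of the dmesg step: the helper below just pulls the last sdX or sdXN name out of kernel log text. The function name is made up, the pattern is naive (any "sd" followed by a letter matches), and the device and mount point in the usage comment are hypothetical.

```shell
# Print the last sdX or sdXN device name mentioned in the text on stdin.
last_sd_device() {
    grep -o 'sd[a-z][0-9]*' | tail -n 1
}

# Intended use right after plugging the drive in (not run here):
#   dmesg | last_sd_device          # e.g. prints sdb1
#   mount -t vfat /dev/sdb1 /mnt/usb
```

Check the name against the dmesg text itself before mounting or formatting anything.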

Fat32 is the only thing that will 'just work' across the different OS's 
and it is OK unless you are handling files >4GB.  But don't you have a 
network for that sort of thing?
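
To see ahead of time whether anything would hit that limit, a sketch along these lines may help. FAT32 cannot store a file larger than 4 GiB minus one byte; the variable name, function name, and the path in the usage comment are made up.

```shell
# Largest file FAT32 can hold: 4 GiB minus one byte.
FAT32_MAX_BYTES=4294967295

# List regular files under DIR that are larger than BYTES
# (default: the FAT32 limit).
files_larger_than() {
    find "$1" -type f -size +"${2:-$FAT32_MAX_BYTES}"c
}

# Intended use (hypothetical path, not run here):
#   files_larger_than /srv/photos
```

No output means every file will fit.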

-- 
   Les Mikesell
lesmikes...@gmail.com

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread compdoc

>According to the man page, it apparently needs a kernel driver
>named OpenIPMI, which it claims is installed in standard
>distributions.  I don't find it on my system.


lm_sensors is another, and I think installs ready to use from the repos.

Failing that, you should reboot and look in the motherboard's bios/cmos. It
should display all that good stuff: fan speeds, voltage levels, temps.





Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread compdoc

> compdoc wrote:
>> I'll re-seat the CPU, heatsink, and fan on the next downtime.
> >
> >Is the CPU overheating? Pointless to reseat the cpu or even remove the
>> heatsink, if not.

>No evidence to suggest that it is.

As much as I love telling anecdotes, I have none to tell you concerning cpu
reseating. I've never seen it fix a problem.

Maybe that was something they needed to do back in 1998, but cpu and ram
sockets are a reliable technology these days.

Removing and then reinserting is likely to do more damage than it will fix.

I think you're on the right track - use diagnostic tools and see what you
can find. The more poking around you do the better.

I do agree about bad caps - even one with a bulging top can cause
crashing/rebooting. They need to be checked both on the motherboard and
inside the PSU.

However, capacitor problems become less likely the newer the motherboard
is, especially if it is 2 years old or less. They've been making some
excellent low-cost boards with solid caps for a while.

The older boards with that problem are still around but most have died by
now. Cheaper PSUs have a cap problem even these days, though.

Oh, and both the motherboard and PSU circuit board should be examined for
burned components. We have some hellacious lightning strikes here in Denver,
and stuff blows up.

Hey, I did manage an anecdote after all!


[CentOS] Which file system to use for a USB backup

2011-03-09 Thread Todd Cary

I have some photographs on my CentOS 4 server that I want to copy
to a USB drive.  However, I want to be able to access the files
from Windows or Mac OS.  Where should I look for instructions
on how to format and mount the USB drive, and is FAT32 the only
option?

Many thanks

-- 
Ariste Software
Petaluma, CA 94952

http://www.aristesoftware.com


Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Rudi Ahlers

On Thu, Mar 10, 2011 at 12:31 AM, Michael Eager  wrote:
> Dr. Ed Morbius wrote:
>
>> If the issue is repeated but rare system failures on one of a set of
>> similarly configured hosts, I'd RMA the box and get a replacement.  End
>> of story.
>
> I'll repeat:  this is a house-made system.  There's no vendor to RMA to.



I don't know where you are, but in our country we can RMA anything and
everything, apart from CPUs. So even a cheap desktop mobo could be
RMA'd, as long as I can prove to the supplier that it's faulty and it's
within the warranty period.

-- 
Kind Regards
Rudi Ahlers
SoftDux

Website: http://www.SoftDux.com
Technical Blog: http://Blog.SoftDux.com
Office: 087 805 9573
Cell: 082 554 7532

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Michael Eager

m.r...@5-cent.us wrote:
> Michael Eager wrote:
>> compdoc wrote:
>>>> I'll re-seat the CPU, heatsink, and fan on the next downtime.
>>> Is the CPU overheating? Pointless to reseat the cpu or even remove the
>>> heatsink, if not.
>> No evidence to suggest that it is.
> 
> Have you used ipmitool to see what the temperatures are?

No, I'm not familiar with ipmitool.   I just installed it and
the man page will take some time to read.  It looks like it
does everything and then more.

According to the man page, it apparently needs a kernel driver
named OpenIPMI, which it claims is installed in standard
distributions.  I don't find it on my system.   Running
"ipmitool sdr type Temperature" results in an error message
saying that it could not open /dev/ipmi0, etc.
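
That error usually means the IPMI kernel modules are not loaded, or that the board has no BMC at all (quite possible on a home-built system). A hedged sketch of the check follows. The module names ipmi_msghandler, ipmi_devintf and ipmi_si are the usual Linux ones; the find_ipmi_dev helper and its optional root-directory argument are invented here for illustration.

```shell
#!/bin/sh
# Sketch: report which IPMI device node exists, if any.
# These are the three paths ipmitool's "open" interface tries.
find_ipmi_dev() {
    root=${1:-}            # optional path prefix, only useful for testing
    for dev in /dev/ipmi0 /dev/ipmi/0 /dev/ipmidev/0; do
        if [ -e "$root$dev" ]; then
            printf '%s\n' "$dev"
            return 0
        fi
    done
    return 1
}

# Typical manual sequence, run as root (left commented out on purpose):
# modprobe ipmi_msghandler
# modprobe ipmi_devintf
# modprobe ipmi_si            # expected to fail if the board has no BMC
# find_ipmi_dev && ipmitool sdr type Temperature
```

If ipmi_si refuses to load, lm_sensors is probably the better route for temperatures on this box.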

-- 
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Michael Eager

Dr. Ed Morbius wrote:

> If the issue is repeated but rare system failures on one of a set of
> similarly configured hosts, I'd RMA the box and get a replacement.  End
> of story.

I'll repeat:  this is a house-made system.  There's no vendor to RMA to.
It seems obvious to me:  RMA is not a diagnostic tool.

> If you'd post
> details of the host, more logging information, netconsole panic logs,
> etc., it might be possible to narrow down possible causes.

The problem is that there are NO DIAGNOSTICS generated when the
system hangs.  There's no panic and nothing in the logs which
indicates any problem.  This is what I indicated from the get go.

> With what you've posted to date, it's not.

I could waste my time posting logs for you to tell me that they don't
point to any problem.  I'd rather skip that step.

-- 
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

Re: [CentOS] Creating the symbolic links in the /boot and /boot/grub/

2011-03-09 Thread Todd Cary

Unfortunately, I live out with the cows, so I am using DSL to
download the latest - it will take a while.  It has been a while
since I downloaded the four disks; however, I assume disk 1
contains all that I need to do a "rescue".

Once I get that down, I will use torrent to get all four disks.

Hey, guys, many thanks.  Any of you live in the SF Bay Area?  
Love to treat you to a beer.

Todd

On 3/9/2011 1:03 PM, Simon Matter wrote:
>> And here are the contents of grub.conf:
>>
>> # grub.conf generated by anaconda
>> #
>> # Note that you do not have to rerun grub after making changes to
>> this file
>> # NOTICE:  You have a /boot partition.  This means that
>> #  all kernel and initrd paths are relative to /boot/, eg.
>> #  root (hd0,0)
>> #  kernel /vmlinuz-version ro root=/dev/VolGroup00/LogVol00
>> #  initrd /initrd-version.img
>> #boot=/dev/hdc
>> default=0
>> timeout=5
>> splashimage=(hd0,0)/grub/splash.xpm.gz
>> hiddenmenu
>> title CentOS (2.6.9-100.EL)
>>   root (hd0,0)
>>   kernel /vmlinuz-2.6.9-100.EL ro
>> root=/dev/VolGroup00/LogVol00 rhgb quiet
>>   initrd /initrd-2.6.9-100.EL.img
> OK, the file listing of /boot from your last mail and now grub.conf, they
> look quite good. grub.conf has been updated by the kernel update, and also
> a new initrd-2.6.9-100.EL.img has been created, so that doesn't look bad.
>
> The only thing I'm not really sure about is whether grub is installed
> correctly now. Maybe you have to run grub-install again to be sure, but
> I'm just not so sure about grub's internals. Maybe someone can tell you
> more about this.
>
> As someone else mentioned, it's a very good idea to have a current CentOS
> 4.8 disk at hand so you could boot into rescue mode with 'linux rescue' at
> the boot prompt if something goes wrong.
>
> Simon
>
>> Todd
>>
>> On 3/9/2011 12:23 AM, Simon Matter wrote:
 I inadvertently missed using the list...here are my recent messages.
>>> As Nico suggested, download the kernel but also grub and redhat-logos,
>>> like so
>>> wget
>>> http://mirrors.kernel.org/centos/4.9/updates/i386/RPMS/kernel-2.6.9-100.EL.i686.rpm
>>> wget
>>> http://mirrors.kernel.org/centos/4.9/os/i386/CentOS/RPMS/redhat-logos-1.1.26-1.centos4.4.noarch.rpm
>>> wget
>>> http://mirrors.kernel.org/centos/4.9/os/i386/CentOS/RPMS/grub-0.95-3.8.i386.rpm
>>>
>>> Then do a
>>>
>>> rpm -Uvh --replacepkgs --replacefiles kernel-2.6.9-100.EL.i686.rpm
>>> redhat-logos-1.1.26-1.centos4.4.noarch.rpm grub-0.95-3.8.i386.rpm
>>>
>>> And then show us the contents of 'ls -laR /boot' and 'cat /etc/grub.conf'
>>>
>>> Simon
>>>
 On 3/8/2011 8:39 PM, Nico Kadel-Garcia wrote:
> On Tue, Mar 8, 2011 at 11:31 PM, Todd Cary
> wrote:
>> reinstall is not an option for yum.  I ran "yum install kernel" and
>> it
>> completed without errors however there are no links created.
> Oh, dear. Can you grab the RPM and do "rpm -U --replacepkgs
> [kernel-whatever].rpm"? You should be able to use "yum remove" on the
> old kernel packages, consistent with freeing up the space, and now
> install your new kernel with yum.
>
>> Would this be the correct ln command for vmlinuz-2.6.9-89.35.1
>>
>> # /boot/vmlinuz-2.6.9-89.35.1 /boot/vmlinuz
>>
>> Todd
>>
>> On 3/8/2011 7:04 PM, Nico Kadel-Garcia wrote:
>>> On Tue, Mar 8, 2011 at 9:58 PM, Todd Cary
>>> wrote:
 I started a new thread since the original one is getting rather
 long.

 I have retrieved the files I deleted in /boot and /boot/grub,
 however I need to make links for

 /boot/System.map  (System.map ->  System.map-2.6.9-89.35.1)
 /boot/vmlinuz  (vmlinuz ->  vmlinuz-2.6.9-89.35.1)
 /boot/grub/menu.lst (menu.lst ->  ./grub.conf)
>>> Instead, re-install your kernel. "yum reinstall kernel". This should
>>> regenerate your symlinks correctly, except possibly the grub.conf.
>>>
 If it was not so important to get it correct, I would appreciate
 the syntax for the command.  Usually I would figure it out.

 Since I have restored the files (I will double check to make sure
 they are all there), do I need to run grub-install?
>>> i think yes. The old location of the boot loader is listed in
>>> /boot/grub/grub.conf, and should be used as the argument to that
>>> command. grub is much smarter than LILO used to be, but I think the
>>> bootstrap procedure relies on knowing details of where the fiddly
>>> bits
>>> of grub live on the relevant ext2-compatible filesystem.
>>>
 My apologies for bothering everyone with such a dumb error on my
 part.

 Todd

 --
 Ariste Software
 Petaluma, CA 94952

 http://www.aristesoftware.com

 ___
 CentOS mailing list
 CentOS@centos.org

Re: [CentOS] [Newbie] Reclaiming /boot space

2011-03-09 Thread Dr. Ed Morbius

on 15:39 Tue 08 Mar, Todd Cary (t...@aristesoftware.com) wrote:
> Simon -
> 
> Did I screw up?  I deleted what was in /boot!

Yes, as others have noted.

Lessons:

1: Don't go randomly/arbitrarily deleting system files (unless you're
curious to see what happens when randomly deleting system files).

2: Understand how Linux functions.  E.g.: the boot process, and the
significance of the /boot directory/filesystem.

3: Use your package management system.  If you /are/ going to delete
arbitrary system files, doing it through your package manager is going
to a) give you some idea when you're about to do something really stupid
(generally other Really Important Stuff depends on them) and b) at the
very least does the damage in an orderly manner.

4: Have a boot disk.  Know how to use it.

5: Know how to restore GRUB and an emergency boot kernel.

Circling back to #3:  your package management system can also dig you
out of this hole.

You should be able to identify and replace all files in an arbitrary
tree, for example, /boot, using an RPM bash one-liner:

$ rpm -qa                          # lists all packages installed
$ rpm -ql PACKAGE                  # lists all files in a package
$ command | grep -q PATTERN        # success/fail on match / no match
$ command1 && command2             # runs command2 if command1 exits true
# rpm -Uvh --replacepkgs PACKAGES  # (re)installs packages
$ $( command )                     # substitutes the output of 'command'

Putting that together:

rpm -Uvh --replacepkgs $(
    for pkg in $( rpm -qa )
    do
        # print the package name if it owns any file under /boot
        rpm -ql $pkg | grep -q '^/boot' && echo $pkg
    done
)

Incidentally, the list of packages works out to:

filesystem-2.4.0-3.el5
grub-0.97-13.5
kernel-2.6.18-194.17.1.el5
redhat-logos-4.9.99-11.el5.centos

-- 
Dr. Ed Morbius, Chief Scientist /|
  Robot Wrangler / Staff Psychologist| When you seek unlimited power
Krell Power Systems Unlimited|  Go to Krell!

Re: [CentOS] [Newbie] Reclaiming /boot space

2011-03-09 Thread Dr. Ed Morbius

on 09:24 Wed 09 Mar, Simon Matter (simon.mat...@invoca.ch) wrote:
> > On Wed, Mar 9, 2011 at 10:12 AM, Simon Matter 
> > wrote:



> > Wouldn't it have been easier to reinstall the kernel & grub, i.e.:
> >
> > yum reinstall kernel grub
> >
> > Surely if yum reinstalls it, it would re-create the permissions &
> > symlinks as well?
> 
> Yes, only that reinstall doesn't exist in EL4 :)

It doesn't?

rpm -Uvh PACKAGE.rpm

The -U (upgrade) should occur regardless of the current install state,
though mucking with '--force' may be necessary.  '--replacepkgs' should
cover it.

-- 
Dr. Ed Morbius, Chief Scientist /|
  Robot Wrangler / Staff Psychologist| When you seek unlimited power
Krell Power Systems Unlimited|  Go to Krell!

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread m . roth

Michael Eager wrote:
> compdoc wrote:
>>> I'll re-seat the CPU, heatsink, and fan on the next downtime.
>>
>> Is the CPU overheating? Pointless to reseat the cpu or even remove the
>> heatsink, if not.
>
> No evidence to suggest that it is.

Have you used ipmitool to see what the temperatures are?

 mark


Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Michael Eager

m.r...@5-cent.us wrote:
> Michael Eager wrote:
> 
>> I'll have to stop the server to find out what the installed bios version
>> is and see whether there is an update.  Most bios updates appear to only
>> change supported CPUs.  Something else for the next downtime.
> 
> Nope: dmidecode, or lshw, is your friend.

Thanks.  Looks like there might be a newer bios available,
although the vendor identifies it as 'beta'.

-- 
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

Re: [CentOS] keepalived+LVS

2011-03-09 Thread Jure Pečar

On Wed, 9 Mar 2011 12:44:38 +0800
bedo  wrote:

> oh,god , anybody help me!

Do yourself a favour and use HAproxy or something higher level. I've had
nothing but bad experiences with LVS for the past few years.

-- 

Jure Pečar
http://jure.pecar.org
http://f5j.eu

Re: [CentOS] Creating the symbolic links in the /boot and /boot/grub/

2011-03-09 Thread Simon Matter

> And here are the contents of grub.conf:
>
> # grub.conf generated by anaconda
> #
> # Note that you do not have to rerun grub after making changes to
> this file
> # NOTICE:  You have a /boot partition.  This means that
> #  all kernel and initrd paths are relative to /boot/, eg.
> #  root (hd0,0)
> #  kernel /vmlinuz-version ro root=/dev/VolGroup00/LogVol00
> #  initrd /initrd-version.img
> #boot=/dev/hdc
> default=0
> timeout=5
> splashimage=(hd0,0)/grub/splash.xpm.gz
> hiddenmenu
> title CentOS (2.6.9-100.EL)
>  root (hd0,0)
>  kernel /vmlinuz-2.6.9-100.EL ro
> root=/dev/VolGroup00/LogVol00 rhgb quiet
>  initrd /initrd-2.6.9-100.EL.img

OK, the file listing of /boot from your last mail and now grub.conf, they
look quite good. grub.conf has been updated by the kernel update, and also
a new initrd-2.6.9-100.EL.img has been created, so that doesn't look bad.

The only thing I'm not really sure about is whether grub is installed
correctly now. Maybe you have to run grub-install again to be sure, but
I'm just not so sure about grub's internals. Maybe someone can tell you
more about this.

As someone else mentioned, it's a very good idea to have a current CentOS
4.8 disk at hand so you could boot into rescue mode with 'linux rescue' at
the boot prompt if something goes wrong.

Simon

>
> Todd
>
> On 3/9/2011 12:23 AM, Simon Matter wrote:
>>> I inadvertently missed using the list...here are my recent messages.
>> As Nico suggested, download the kernel but also grub and redhat-logos,
>> like so
>> wget
>> http://mirrors.kernel.org/centos/4.9/updates/i386/RPMS/kernel-2.6.9-100.EL.i686.rpm
>> wget
>> http://mirrors.kernel.org/centos/4.9/os/i386/CentOS/RPMS/redhat-logos-1.1.26-1.centos4.4.noarch.rpm
>> wget
>> http://mirrors.kernel.org/centos/4.9/os/i386/CentOS/RPMS/grub-0.95-3.8.i386.rpm
>>
>> Then do a
>>
>> rpm -Uvh --replacepkgs --replacefiles kernel-2.6.9-100.EL.i686.rpm
>> redhat-logos-1.1.26-1.centos4.4.noarch.rpm grub-0.95-3.8.i386.rpm
>>
>> And then show us the contents of 'ls -laR /boot' and 'cat /etc/grub.conf'
>>
>> Simon
>>
>>>
>>> On 3/8/2011 8:39 PM, Nico Kadel-Garcia wrote:
 On Tue, Mar 8, 2011 at 11:31 PM, Todd Cary
 wrote:
> reinstall is not an option for yum.  I ran "yum install kernel" and
> it
> completed without errors however there are no links created.
 Oh, dear. Can you grab the RPM and do "rpm -U --replacepkgs
 [kernel-whatever].rpm"? You should be able to use "yum remove" on the
 old kernel packages, consistent with freeing up the space, and now
 install your new kernel with yum.

> Would this be the correct ln command for vmlinuz-2.6.9-89.35.1
>
> # /boot/vmlinuz-2.6.9-89.35.1 /boot/vmlinuz
>
> Todd
>
> On 3/8/2011 7:04 PM, Nico Kadel-Garcia wrote:
>> On Tue, Mar 8, 2011 at 9:58 PM, Todd Cary
>> wrote:
>>> I started a new thread since the original one is getting rather
>>> long.
>>>
>>> I have retrieved the files I deleted in /boot and /boot/grub,
>>> however I need to make links for
>>>
>>> /boot/System.map  (System.map -> System.map-2.6.9-89.35.1)
>>> /boot/vmlinuz  (vmlinuz -> vmlinuz-2.6.9-89.35.1)
>>> /boot/grub/menu.lst (menu.lst -> ./grub.conf)
>> Instead, re-install your kernel. "yum reinstall kernel". This should
>> regenerate your symlinks correctly, except possibly the grub.conf.
>>
>>> If it was not so important to get it correct, I would appreciate
>>> the syntax for the command.  Usually I would figure it out.
>>>
>>> Since I have restored the files (I will double check to make sure
>>> they are all there), do I need to run grub-install?
>> i think yes. The old location of the boot loader is listed in
>> /boot/grub/grub.conf, and should be used as the argument to that
>> command. grub is much smarter than LILO used to be, but I think the
>> bootstrap procedure relies on knowing details of where the fiddly
>> bits
>> of grub live on the relevant ext2-compatible filesystem.
>>
>>> My apologies for bothering everyone with such a dumb error on my
>>> part.
>>>
>>> Todd
>>>
>>> --
>>> Ariste Software
>>> Petaluma, CA 94952
>>>
>>> http://www.aristesoftware.com
>>>
>>> ___
>>> CentOS mailing list
>>> CentOS@centos.org
>>> http://lists.centos.org/mailman/listinfo/centos
>>>
> --
> Ariste Software
> Petaluma, CA 94952
>
> http://www.aristesoftware.com
>
>
>>> --
>>> Ariste Software
>>> Petaluma, CA 94952
>>>
>>> http://www.aristesoftware.com
>>>
>>> ___
>>> CentOS mailing list
>>> CentOS@centos.org
>>> http://lists.centos.org/mailman/listinfo/centos
>>>
>>
>> ___
>> CentOS mailing list
>> CentOS@centos.org
>> http://lists.centos.org/mailman/listinfo/ce

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Michael Eager

compdoc wrote:
>> I'll re-seat the CPU, heatsink, and fan on the next downtime.
> 
> Is the CPU overheating? Pointless to reseat the cpu or even remove the
> heatsink, if not.

No evidence to suggest that it is.

-- 
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread John R Pierce

On 03/09/11 10:29 AM, Michael Eager wrote:
> I'll re-seat the CPU, heatsink, and fan on the next downtime.

do have on hand the supplies to clean off the old heatsink goo (I use
alcohol pads for this), and some fresh heatsink goop

check all fans when it's powered off that they spin easily.  I've seen
fans that were still spinning but felt a little stiff, and failed not
long thereafter.  and of course, clean out most of the dust that tends
to collect everywhere.




Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread compdoc

> I'll re-seat the CPU, heatsink, and fan on the next downtime.

Is the CPU overheating? Pointless to reseat the cpu or even remove the
heatsink, if not.



Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Dr. Ed Morbius

on 10:29 Wed 09 Mar, Michael Eager (ea...@eagerm.com) wrote:
> Les Mikesell wrote:
> 
> > Note that overheating can be localized or a bad heat sink mounting or 
> > fan on a CPU.
> 
> I'll re-seat the CPU, heatsink, and fan on the next downtime.

Very strongly advised.  It's a simple and very cheap approach.  I'd
check /all/ cables (power, disk) as well.

Visually scan for bad caps while you're doing this.  The pandemic of the
mid 2000s seems to have abated, but they can still ruin your whole day.

> Heat related problems usually present as a system which fails
> and will not reboot immediately, but will after they sit for a
> while to cool down.  This system doesn't do that.

Maybe, maybe not.

> I'll install sensord to log CPU temps in case this is a problem.

Good call.

> > There's not really a good way to approach intermittent failures.  It
> > may only break when you aren't looking.  Major component swaps or
> > taking it offline for extended diagnostics hoping to catch a glimpse
> > of the cause when it fails is about all you can do.

I disagree with this statement:  you start with the bleeding obvious and
easy to do (the cheap diagnostics), same as any garage mechanic or
doctor.  You instrument and increase log scrutiny.  You make damned sure
you're logging remotely as one of the first things a hosed system does
is stop writing to disk.
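
For the remote-logging part, netconsole is one option. The sketch below only assembles the module parameter, whose layout is src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac. Every address, port and interface name in it is a placeholder, and netconsole_param is a made-up helper.

```shell
#!/bin/sh
# Sketch: build the value for the netconsole= module parameter.
# args: src_ip dev tgt_ip tgt_mac [src_port] [tgt_port]
# 6665 and 6666 are netconsole's usual default source and target ports.
netconsole_param() {
    printf '%s@%s/%s,%s@%s/%s\n' \
        "${5:-6665}" "$1" "$2" "${6:-6666}" "$3" "$4"
}

# As root on the server that hangs (placeholder values throughout):
# modprobe netconsole \
#   "netconsole=$(netconsole_param 192.168.1.10 eth0 192.168.1.20 00:11:22:33:44:55)"
# The receiving host needs something listening on UDP port 6666,
# for example a syslog daemon or netcat.
```

Because the kernel sends these messages itself, they can still get out when the disk or userspace has already stopped responding.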

> Yes, most memory diagnostics are not very effective.
> 
> I'll have to stop the server to find out what the installed bios version
> is and see whether there is an update.  Most bios updates appear to only
> change supported CPUs.  Something else for the next downtime.

You haven't stated who's built this system, but many LOM / OMC systems
will provide basic information such as this.  dmidecode and lshw are
also very helpful here.

-- 
Dr. Ed Morbius, Chief Scientist /|
  Robot Wrangler / Staff Psychologist| When you seek unlimited power
Krell Power Systems Unlimited|  Go to Krell!

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Les Mikesell

On 3/9/2011 12:47 PM, Dr. Ed Morbius wrote:
>
> That represents an accounting failure, as opex is now subsidizing capex.
> Troubleshooting of known bad equipment should be an opex chargeback
> against capex or some capital reserve.
>
> This requires clueful beancounters.  Recent economic/business/finance
> history suggests a significant shortage of same.  Cue supply/demand and
> incentives off-topic digression.

Statistical stuff doesn't play out well in one-off situations.  If you 
have a large number of boxes you'll know about the right amount of spare 
parts and on-hand spares you need.  But individual units are about like 
light bulbs in breaking at random and if the only one you have breaks 
today it won't matter that their average life is in years.

-- 
   Les Mikesell
lesmikes...@gmail.com


Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Dr. Ed Morbius

on 11:52 Wed 09 Mar, Les Mikesell (lesmikes...@gmail.com) wrote:
> On 3/9/2011 11:32 AM, Michael Eager wrote:

> Memory diagnostics may take days to catch a problem.  Did you check for 
> a newer bios for your MB?  I mentioned before that it seemed strange, 
> but I've seen that fix mysterious problems even after the machines had 
> previously been reliable for a long time (and even more oddly, all the 
> machines in the lot weren't affected).

BIOS issues would tend to present similar issues on numerous systems,
especially if they're similarly configured.

Mind:  we've encountered a DSTATE bug with recent Dell PowerEdge systems
(r610, r410, r310), which has resulted in several BIOS revisions, the
latest of which simply disables the option entirely.  It's one of the
first things Dell techs mention when you call them these days (much to
our amusement).

If it's a single system (and assuming there are others similarly
configured), I'm leaning toward hardware or build-quality issues:  bad
RAM, other componentry, poor cable seating, etc.

-- 
Dr. Ed Morbius, Chief Scientist /|
  Robot Wrangler / Staff Psychologist| When you seek unlimited power
Krell Power Systems Unlimited|  Go to Krell!

[CentOS] scanning under CentOS

2011-03-09 Thread Boris Epstein

Hello listmates,

I've got an HP Officejet 7110 All-in-One Printer which I am trying to
get to scan for us. Details on the printer/scanner/FAX here:

http://h10025.www1.hp.com/ewfrf/wc/product?cc=us&lc=en&dlc=en&product=91472

Now xsane seems to handle individual pages OK, but multipage scans
through the ADF seem to be a complete mess. Hence the
questions:

1) Is there a good scanner interface package that would allow me to
just load the paper into the ADF, say "go" and watch the scanner go
through the pages I've put in and produce an output file when it's
done?

2) Has anyone worked with the scanner I am trying to use? What was
your experience like when you did?

Thanks.

Boris.

Re: [CentOS] Creating the symbolic links in the /boot and /boot/grub/

2011-03-09 Thread Les Mikesell

On 3/9/2011 12:35 PM, Todd Cary wrote:
> Simon -
>
> I performed the tasks as outlined and here are the contents of /boot:

It might be a good time to review what you can do with an install disk 
after booting with "linux rescue" at the prompt.  If you've made a 
mistake in the setup you can get back in to fix it that way - including 
reinstalling grub and doing a kernel update.
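
As a sketch of what that could look like for the links discussed in this thread: ln -s takes the target first and the link name second. make_boot_links is a made-up helper, the kernel version is whatever is actually present in /boot, and /dev/hdc is only a guess taken from the commented-out boot= line in the grub.conf posted earlier.

```shell
#!/bin/sh
# Sketch: recreate the three symlinks named in this thread.
# $1 = boot directory, $2 = kernel version string (e.g. 2.6.9-100.EL)
make_boot_links() {
    boot=$1
    ver=$2
    # Relative targets, so the links stay valid wherever /boot is mounted.
    ln -sf "vmlinuz-$ver"    "$boot/vmlinuz"
    ln -sf "System.map-$ver" "$boot/System.map"
    ln -sf ./grub.conf       "$boot/grub/menu.lst"
}

# From the rescue shell, once the installed system is mounted (as root):
# chroot /mnt/sysimage
# make_boot_links /boot 2.6.9-100.EL
# grub-install /dev/hdc       # assumption: this is the boot disk
```

Checking the result with 'ls -l /boot /boot/grub' before rebooting is cheap insurance.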

-- 
   Les Mikesell
lesmikes...@gmail.com

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread m . roth

Michael Eager wrote:

> I'll have to stop the server to find out what the installed bios version
> is and see whether there is an update.  Most bios updates appear to only
> change supported CPUs.  Something else for the next downtime.

Nope: dmidecode, or lshw, is your friend.
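
A minimal sketch of reading the BIOS details on the running system, assuming a dmidecode recent enough to support the -s option. bios_summary and the DMIDECODE override variable are hypothetical names used only for this example.

```shell
#!/bin/sh
# Sketch: print BIOS vendor, version and release date without a reboot.
# DMIDECODE may point at another command (useful for testing).
bios_summary() {
    d=${DMIDECODE:-dmidecode}
    printf 'vendor:  %s\n' "$("$d" -s bios-vendor)"
    printf 'version: %s\n' "$("$d" -s bios-version)"
    printf 'date:    %s\n' "$("$d" -s bios-release-date)"
}

# Needs root, since dmidecode reads the DMI tables directly:
# bios_summary
```

The version string can then be compared against the vendor's download page.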

 mark


Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Dr. Ed Morbius

on 10:37 Wed 09 Mar, Lamar Owen (lo...@pari.edu) wrote:
> On Wednesday, March 09, 2011 10:16:34 am Brunner, Brian T. wrote:
> > This would be far cheaper than the time spent troubleshooting the
> > running (sometimes hanging) system.
> 
> Let me interject here, that from a budgeting standpoint 'cheaper' has
> to be interpreted in the context of which budget the costs are coming
> out of.  New hardware is capex, and thus would come out of the capital
> budget, and admin time is opex, and thus would come out of the
> operating budget.  There may be sufficient funds in the operating
> budget to pay an admin $x,000 but the funds in the capital budget may
> be insufficient to buy a server costing $y,000, where y=x.  

That represents an accounting failure, as opex is now subsidizing capex.
Troubleshooting of known bad equipment should be an opex chargeback
against capex or some capital reserve.

This requires clueful beancounters.  Recent economic/business/finance
history suggests a significant shortage of same.  Cue supply/demand and
incentives off-topic digression.

The answer is still to communicate the issue upstream.  Estimating
replacement costs and likelihood will help in the relevant business /
organizational decision.

-- 
Dr. Ed Morbius, Chief Scientist /|
  Robot Wrangler / Staff Psychologist| When you seek unlimited power
Krell Power Systems Unlimited|  Go to Krell!

Re: [CentOS] Creating the symbolic links in the /boot and /boot/grub/

2011-03-09 Thread compdoc

How goes the repair? Got it all worked out?




Re: [CentOS] Creating the symbolic links in the /boot and /boot/grub/

2011-03-09 Thread Todd Cary

And here are the contents of grub.conf:

# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to 
this file
# NOTICE:  You have a /boot partition.  This means that
#  all kernel and initrd paths are relative to /boot/, eg.
#  root (hd0,0)
#  kernel /vmlinuz-version ro root=/dev/VolGroup00/LogVol00
#  initrd /initrd-version.img
#boot=/dev/hdc
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.9-100.EL)
 root (hd0,0)
 kernel /vmlinuz-2.6.9-100.EL ro 
root=/dev/VolGroup00/LogVol00 rhgb quiet
 initrd /initrd-2.6.9-100.EL.img

Todd

On 3/9/2011 12:23 AM, Simon Matter wrote:
>> I inadvertently missed using the list...here are my recent messages.
> As Nico suggested, download the kernel but also grub and redhat-logos,
> like so
> wget
> http://mirrors.kernel.org/centos/4.9/updates/i386/RPMS/kernel-2.6.9-100.EL.i686.rpm
> wget
> http://mirrors.kernel.org/centos/4.9/os/i386/CentOS/RPMS/redhat-logos-1.1.26-1.centos4.4.noarch.rpm
> wget
> http://mirrors.kernel.org/centos/4.9/os/i386/CentOS/RPMS/grub-0.95-3.8.i386.rpm
>
> Then do a
>
> rpm -Uvh --replacepkgs --replacefiles kernel-2.6.9-100.EL.i686.rpm
> redhat-logos-1.1.26-1.centos4.4.noarch.rpm grub-0.95-3.8.i386.rpm
>
> And then show us the contents of 'ls -laR /boot' and 'cat /etc/grub.conf'
>
> Simon
>
>>
>> On 3/8/2011 8:39 PM, Nico Kadel-Garcia wrote:
>>> On Tue, Mar 8, 2011 at 11:31 PM, Todd Cary
>>> wrote:
 reinstall is not an option for yum.  I ran "yum install kernel" and it
 completed without errors however there are no links created.
>>> Oh, dear. Can you grab the RPM and do "rpm -U --replacepkgs
>>> [kernel-whatever].rpm"? You should be able to use "yum remove" on the
>>> old kernel packages, consistent with freeing up the space, and now
>>> install your new kernel with yum.
>>>
 Would this be the correct ln command for vmlinuz-2.6.9-89.35.1

 # /boot/vmlinuz-2.6.9-89.35.1 /boot/vmlinuz

 Todd

 On 3/8/2011 7:04 PM, Nico Kadel-Garcia wrote:
> On Tue, Mar 8, 2011 at 9:58 PM, Todd Cary
> wrote:
>> I started a new thread since the original one is getting rather long.
>>
>> I have retrieved the files I deleted in /boot and /boot/grub,
>> however I need to make links for
>>
>> /boot/System.map  (System.map -> System.map-2.6.9-89.35.1)
>> /boot/vmlinuz  (vmlinuz -> vmlinuz-2.6.9-89.35.1)
>> /boot/grub/menu.lst (menu.lst -> ./grub.conf)
> Instead, re-install your kernel. "yum reinstall kernel". This should
> regenerate your symlinks correctly, except possibly the grub.conf.
>
>> If it was not so important to get it correct, I would appreciate
>> the syntax for the command.  Usually I would figure it out.
>>
>> Since I have restored the files (I will double check to make sure
>> they are all there), do I need to run grub-install?
> i think yes. The old location of the boot loader is listed in
> /boot/grub/grub.conf, and should be used as the argument to that
> command. grub is much smarter than LILO used to be, but I think the
> bootstrap procedure relies on knowing details of where the fiddly bits
> of grub live on the relevant ext2-compatible filesystem.
>
>> My apologies for bothering everyone with such a dumb error on my
>> part.
>>
>> Todd
>>
>> --
>> Ariste Software
>> Petaluma, CA 94952
>>
>> http://www.aristesoftware.com
>>
>> ___
>> CentOS mailing list
>> CentOS@centos.org
>> http://lists.centos.org/mailman/listinfo/centos
>>
 --
 Ariste Software
 Petaluma, CA 94952

 http://www.aristesoftware.com


>> --
>> Ariste Software
>> Petaluma, CA 94952
>>
>> http://www.aristesoftware.com
>>
>> ___
>> CentOS mailing list
>> CentOS@centos.org
>> http://lists.centos.org/mailman/listinfo/centos
>>
>
> ___
> CentOS mailing list
> CentOS@centos.org
> http://lists.centos.org/mailman/listinfo/centos
>
>

-- 
Ariste Software
Petaluma, CA 94952

http://www.aristesoftware.com


Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Dr. Ed Morbius

on 07:06 Wed 09 Mar, Michael Eager (ea...@eagerm.com) wrote:
> Dr. Ed Morbius wrote:
> >on 09:24 Tue 08 Mar, Michael Eager (ea...@eagerm.com) wrote:
> >>Hi --
> >>
> >>I'm running a server which is usually stable, but every
> >>once in a while it hangs.  The server is used as a file
> >>store using NFS and to run VMware machines.
> >>
> >>I don't see anything in /var/log/messages or elsewhere
> >>to indicate any problem or offer any clue why the system
> >>was hung.
> >>
> >>Any suggestions where I might look for a clue?
> >
> >I'd very strongly recommend you configure netconsole.  Though not entirely
> >clear from the name, it's actually an in-kernel network logging module,
> >which is very useful for kicking out kernel panics which otherwise
> >aren't logged to disk and can't be seen on a (nonresponsive) monitor.
> 
> I'll take a look at netconsole.
> 
> >Alternately, a serial console which actually retains all output sent to
> >it (some remote access systems support this, some don't) may help.
> >
> >Barring that, I'd start looking at individual HW components, starting
> >with RAM.
> 
> The problem with randomly replacing various components, other than the
> downtime and nuisance, is that there's no way to know that the change
> actually fixed any problem.  When the base rate is one unknown system
> hang every few weeks, how many weeks should I wait without a failure to
> conclude that the replaced component was the cause?  A failure which
> happens infrequently isn't really amenable to a random diagnostic
> approach.
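
One rough way to put a number on "how many weeks": a minimal sketch, assuming the hangs arrive independently at a steady average rate (a Poisson model, which is my assumption, not something established in this thread) and taking "every few weeks" to mean one hang per 3 weeks (also assumed).  The chance of seeing no hang in t weeks by luck alone is then exp(-t/3), so the quiet period needed before that chance drops to 5% is 3 * ln(20):

```shell
#!/bin/sh
# Assumed figures, for illustration only.
mean_weeks_between_hangs=3   # "every few weeks", taken as 3
residual_chance=0.05         # accept a 5% chance the quiet spell is just luck

# P(no hang in t weeks) = exp(-t / mean); solve for t at the chosen chance.
quiet_weeks=$(awk -v m="$mean_weeks_between_hangs" -v p="$residual_chance" \
    'BEGIN { printf "%.1f", -m * log(p) }')

echo "Wait about $quiet_weeks weeks without a hang"
```

Under those assumptions it comes to roughly nine weeks per swapped component, which is the cost of the swap-and-wait approach in concrete terms.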

This is where vendor management/relations starts coming into the
picture.

Your architecture should also tolerate single-point failures.

If the issue is repeated but rare system failures on one of a set of
similarly configured hosts, I'd RMA the box and get a replacement.  End
of story.

If that's not the case, well, then, I suppose YOUR problem is to figure
out when you've resolved the issue.  I've outlined the steps I'd take.
If this means weeks of uncertainty, then I'd communicate this fact, in
no uncertain terms, to my manager, along with the financial implications
of downtime.

If downtime is more expensive than system replacement costs, the
decision is pretty obvious, even if painful.

Note that most system problems /are/ single-source.  If you'd post
details of the host, more logging information, netconsole panic logs,
etc., it might be possible to narrow down possible causes.

With what you've posted to date, it's not.

-- 
Dr. Ed Morbius, Chief Scientist /|
  Robot Wrangler / Staff Psychologist| When you seek unlimited power
Krell Power Systems Unlimited|  Go to Krell!

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Dr. Ed Morbius

on 10:05 Wed 09 Mar, Lamar Owen (lo...@pari.edu) wrote:
> On Tuesday, March 08, 2011 04:44:54 pm Dr. Ed Morbius wrote:
> > I'd very strongly recommend you configure netconsole. 
> 
> Ok, now this is useful indeed.  Thanks for the information, even
> though I'm not the OP.  While I suspected the facility might be
> there, I hadn't really dug for it, but if this will catch things after
> filesystems go r/o (ext3 journal things, ya know) it could be worth
> its weight in gold for catching kernel errors from VMware guests
> (serial console not really an option with the hosts I have, 

Yep, it is.

Netconsole made me fall in love with Linux all over again.

> although I'm sure some enterprising soul has figured out how to
> redirect the VM guest serial port to something else). 
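
For anyone who has not set it up, a minimal sketch of loading netconsole by hand.  The parameter layout (source port@source IP/device, then target port@target IP/target MAC) is the one described in the kernel's netconsole documentation; every address, port and the MAC below are made-up placeholders, and the modprobe line is only echoed so nothing gets loaded by accident:

```shell
#!/bin/sh
# Placeholder values: substitute your own hosts.
src_port=6665
src_ip=192.168.1.10          # the server that hangs
src_dev=eth0
dst_port=6666
dst_ip=192.168.1.20          # the box collecting the logs
dst_mac=00:11:22:33:44:55    # MAC of the collector (or of the gateway towards it)

netconsole_param="netconsole=${src_port}@${src_ip}/${src_dev},${dst_port}@${dst_ip}/${dst_mac}"

# Echoed rather than executed; run the printed command as root to load the module.
echo "modprobe netconsole $netconsole_param"
```

The messages go out as UDP, so on the collecting side anything that listens on that port will do, for instance a syslog daemon accepting remote messages or netcat in UDP listen mode.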

-- 
Dr. Ed Morbius, Chief Scientist /|
  Robot Wrangler / Staff Psychologist| When you seek unlimited power
Krell Power Systems Unlimited|  Go to Krell!

Re: [CentOS] Creating the symbolic links in the /boot and /boot/grub/

2011-03-09 Thread Todd Cary

Simon -

I performed the tasks as outlined and here are the contents of /boot:

/boot/:
total 25121
drwxr-xr-x   5 root root     9216 Mar  9 10:31 .
drwxr-xr-x  24 root root     4096 Jan 24 08:34 ..
-rw-r--r--   1 root root    51676 Feb 17 22:41 config-2.6.9-100.EL
drwxr-xr-x   2 root root     1024 Mar  9 10:31 grub
-rw-r--r--   1 root root   444812 May  5  2007 grub-0.95-3.8.i386.rpm
-rw-r--r--   1 root root  1343054 Mar  9 10:29 initrd-2.6.9-100.EL.img
-rw-r--r--   1 root root 13409764 Feb 18 06:25 kernel-2.6.9-100.EL.i686.rpm
drwx------   2 root root    12288 Jan 12  2007 lost+found
-rw-r--r--   1 root root     9371 Aug 12  2006 message
-rw-r--r--   1 root root     9371 Aug 12  2006 message.ja
-rw-r--r--   1 root root  7919724 Aug 13  2006 redhat-logos-1.1.26-1.centos4.4.noarch.rpm
-rw-r--r--   1 root root    67797 Feb 17 22:41 symvers-2.6.9-100.EL.gz
-rw-r--r--   1 root root   770652 Feb 17 22:41 System.map-2.6.9-100.EL
drwx------   2 root root     9216 Mar  9 10:03 .Trash-root
-rw-r--r--   1 root root  1538264 Feb 17 22:41 vmlinuz-2.6.9-100.EL

/boot/grub:
total 345
drwxr-xr-x  2 root root   1024 Mar  9 10:31 .
drwxr-xr-x  5 root root   9216 Mar  9 10:31 ..
-rw-r--r--  1 root root     82 Mar  8 17:39 device.map
-rw-r--r--  1 root root   7956 Mar  8 17:40 e2fs_stage1_5
-rw-r--r--  1 root root   7684 Mar  8 17:40 fat_stage1_5
-rw-r--r--  1 root root   6996 Mar  8 17:40 ffs_stage1_5
-rw-------  1 root root    599 Mar  9 10:31 grub.conf
-rw-r--r--  1 root root   7028 Mar  8 17:40 iso9660_stage1_5
-rw-r--r--  1 root root   8448 Mar  8 17:40 jfs_stage1_5
-rw-------  1 root root   4240 Mar  8 20:24 menu.lst
-rw-r--r--  1 root root   7188 Mar  8 17:40 minix_stage1_5
-rw-r--r--  1 root root   9396 Mar  8 17:40 reiserfs_stage1_5
-rw-r--r--  1 root root   3605 Aug 12  2006 splash.xpm.gz
-rw-r--r--  1 root root    512 Mar  8 17:45 stage1
-rw-r--r--  1 root root 103688 Mar  8 17:45 stage2
-rw-r--r--  1 root root  67701 Mar  8 18:30 symvers-2.6.9-89.35.1.EL.gz
-rw-r--r--  1 root root  68477 Mar  8 18:30 symvers-2.6.9-89.35.1.ELsmp.gz
-rw-r--r--  1 root root   7272 Mar  8 17:40 ufs2_stage1_5
-rw-r--r--  1 root root   6612 Mar  8 17:40 vstafs_stage1_5
-rw-r--r--  1 root root   9308 Mar  8 17:40 xfs_stage1_5

/boot/lost+found:
total 22
drwx------  2 root root 12288 Jan 12  2007 .
drwxr-xr-x  5 root root  9216 Mar  9 10:31 ..

/boot/.Trash-root:
total 19
drwx------  2 root root 9216 Mar  9 10:03 .
drwxr-xr-x  5 root root 9216 Mar  9 10:31 ..



On 3/9/2011 12:23 AM, Simon Matter wrote:
>> I inadvertently missed using the list...here are my recent messages.
> As Nico suggested, download the kernel but also grub and redhat-logos,
> like so
> wget http://mirrors.kernel.org/centos/4.9/updates/i386/RPMS/kernel-2.6.9-100.EL.i686.rpm
> wget http://mirrors.kernel.org/centos/4.9/os/i386/CentOS/RPMS/redhat-logos-1.1.26-1.centos4.4.noarch.rpm
> wget http://mirrors.kernel.org/centos/4.9/os/i386/CentOS/RPMS/grub-0.95-3.8.i386.rpm
>
> Then do a
>
> rpm -Uvh --replacepkgs --replacefiles kernel-2.6.9-100.EL.i686.rpm
> redhat-logos-1.1.26-1.centos4.4.noarch.rpm grub-0.95-3.8.i386.rpm
>
> And then show us the contents of 'ls -laR /boot' and 'cat /etc/grub.conf'
>
> Simon
>
>>
>> On 3/8/2011 8:39 PM, Nico Kadel-Garcia wrote:
>>> On Tue, Mar 8, 2011 at 11:31 PM, Todd Cary
>>> wrote:
>>>> reinstall is not an option for yum.  I ran "yum install kernel" and it
>>>> completed without errors however there are no links created.
>>> Oh, dear. Can you grab the RPM and do "rpm -U --replacepkgs
>>> [kernel-whatever].rpm"? You should be able to use "yum remove" on the
>>> old kernel packages, consistent with freeing up the space, and now
>>> install your new kernel with yum.
>>>
>>>> Would this be the correct ln command for vmlinuz-2.6.9-89.35.1
>>>>
>>>> # /boot/vmlinuz-2.6.9-89.35.1 /boot/vmlinuz
>>>>
>>>> Todd
>>>>
>>>> On 3/8/2011 7:04 PM, Nico Kadel-Garcia wrote:
>>>>> On Tue, Mar 8, 2011 at 9:58 PM, Todd Cary
>>>>> wrote:
>>>>>> I started a new thread since the original one is getting rather long.
>>>>>>
>>>>>> I have retrieved the files I deleted in /boot and /boot/grub,
>>>>>> however I need to make links for
>>>>>>
>>>>>> /boot/System.map  (System.map -> System.map-2.6.9-89.35.1)
>>>>>> /boot/vmlinuz  (vmlinuz -> vmlinuz-2.6.9-89.35.1)
>>>>>> /boot/grub/menu.lst (menu.lst -> ./grub.conf)
>>>>> Instead, re-install your kernel. "yum reinstall kernel". This should
>>>>> regenerate your symlinks correctly, except possibly the grub.conf.
>>>>>
>>>>>> If it was not so important to get it correct, I would appreciate
>>>>>> the syntax for the command.  Usually I would figure it out.
>>>>>>
>>>>>> Since I have restored the files (I will double check to make sure
>>>>>> they are all there), do I need to run grub-install?
>>>>> I think yes. The old location of the boot loader is listed in
>>>>> /boot/grub/grub.conf, and should be used as the argument to that
>>>>> command. grub is much smarter than LILO used to be, but I think th
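
On the ln syntax asked about above: a minimal sketch, run in a scratch directory rather than the real /boot so it is safe to try.  The file names are the ones from the 2.6.9-89.35.1 question quoted above; the scratch directory and the variable names are mine.  The general form is ln -s TARGET LINKNAME.

```shell
#!/bin/sh
# Stand-in for /boot so nothing real is touched.
scratch=$(mktemp -d)
mkdir "$scratch/grub"
: > "$scratch/vmlinuz-2.6.9-89.35.1"
: > "$scratch/System.map-2.6.9-89.35.1"
: > "$scratch/grub/grub.conf"

# ln -s TARGET LINKNAME; relative targets are resolved from the link's directory.
ln -s vmlinuz-2.6.9-89.35.1    "$scratch/vmlinuz"
ln -s System.map-2.6.9-89.35.1 "$scratch/System.map"
ln -s ./grub.conf              "$scratch/grub/menu.lst"

# Record where each link points, then clean up.
vmlinuz_target=$(readlink "$scratch/vmlinuz")
sysmap_target=$(readlink "$scratch/System.map")
menu_target=$(readlink "$scratch/grub/menu.lst")
echo "vmlinuz -> $vmlinuz_target"
echo "System.map -> $sysmap_target"
echo "menu.lst -> $menu_target"
rm -rf "$scratch"
```

On the real system the same three ln commands would be run against /boot and /boot/grub, with the version matching whichever kernel is actually installed (2.6.9-100.EL in the listing at the top of this message).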

Re: [CentOS] Security updates for CentOS-5

2011-03-09 Thread Riccardo Veraldi

On 3/9/11 5:45 PM, Mark Foster wrote:
> Hello, I was wondering why there haven't seemed to be any security
> updates for centos-5 since Jan 6. Per
> https://rhn.redhat.com/errata/rhel-server-errata.html there are a ton of
> outstanding issues.
> Thanks.
My solution, at least for the kernel, was to get the src.rpm from RedHat

ftp://ftp.redhat.com/pub/redhat/linux/enterprise/5Server/en/os/SRPMS/kernel-2.6.18-238.5.1.el5.src.rpm

and build the kernel myself.


The CentOS staff is now working hard, full time, on the 5.6 release, so 
there has not been any update since January.


Riccardo



Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread compdoc

> During the next server downtime, I'll re-seat RAM


If the ram is passing memtest86+, I think reseating only serves to introduce
dust and dirt into an area where a tight connection was previously keeping
it out.

Gently press them down to make sure they're seated, sure. But pulling them
out only allows dirt to fall into the cavity, and increases chances of
damage from insertion or static electricity, etc.

Not to mention causing wear on the memory socket itself...






Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Michael Eager

Les Mikesell wrote:

> Note that overheating can be localized or a bad heat sink mounting or 
> fan on a CPU.

I'll re-seat the CPU, heatsink, and fan on the next downtime.

Heat related problems usually present as a system which fails
and will not reboot immediately, but will after they sit for a
while to cool down.  This system doesn't do that.

I'll install sensord to log CPU temps in case this is a problem.

> There's not really a good way to approach intermittent failures.  It may 
> only break when you aren't looking.  Major component swaps or taking it 
> offline for extended diagnostics hoping to catch a glimpse of the cause 
> when it fails is about all you can do.
> 
>> During the next server downtime, I'll re-seat RAM and
>> cables, check for excess dust, and do normal maintenance
>> as folks have suggested.  I might also run a memory diag.
>> I'll also look at the several excellent and appreciated
>> suggestions (some of which I've already installed) on how
>> to get a better picture on the state of the server when/if
>> there is a future failure.
> 
> Memory diagnostics may take days to catch a problem.  Did you check for 
> a newer bios for your MB?  I mentioned before that it seemed strange, 
> but I've seen that fix mysterious problems even after the machines had 
> previously been reliable for a long time (and even more oddly, all the 
> machines in the lot weren't affected).

Yes, most memory diagnostics are not very effective.

I'll have to stop the server to find out what the installed bios version
is and see whether there is an update.  Most bios updates appear to only
change supported CPUs.  Something else for the next downtime.

-- 
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread m . roth

Michael Eager wrote:
> m.r...@5-cent.us wrote:
>> Michael Eager wrote:
>>> John Hodrien wrote:
>>>> On Wed, 9 Mar 2011, Michael Eager wrote:
>> 
>> Here's one more, off-the-wall thought: do the setterm --powersave off,
>> and find some way to make it work, so that you can see what's on the
>> screen when it dies.
>
> Yes, I did this.  Switched to console screen.  The correct command
> is "setterm -powersave off -blank off", otherwise the screen gets
> blanked.  Turned the monitor off.  I hope it shows something
> useful on the next fault.

Best of luck. And thanks, I may try that.
>
>> What may be very important here is I recently had a problem
>> with a honkin' big server crashing... and it turned out that a user was
>> running a parallel processing job that kicked off three? four? dozen
>> threads, and towards the end of the job, every single thread wanted
>> 10G... on a system with 256G RAM (which size still boggles my mind). The
>> OOM-Killer didn't even have a chance to do its thing. Yes, he's
>> limited what his job requests, and the system hasn't crashed since.
>
> Strange.  OOM-Killer should get priority.  That's what it's for.
> Although it usually seems to kill the innocent bystanders before
> it gets around to killing the offenders.

Yeah, but apparently too many of them hit too quickly - that's all I can
think.

  mark


Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Michael Eager

m.r...@5-cent.us wrote:
> Michael Eager wrote:
>> John Hodrien wrote:
>>> On Wed, 9 Mar 2011, Michael Eager wrote:
>>>
>>>> The problem with randomly replacing various components, other than
>>>> the downtime and nuisance, is that there's no way to know that the
>>>> change actually fixed any problem.  When the base rate is one
>>>> unknown system hang every few weeks, how many weeks should I wait
>>>> without a failure to conclude that the replaced component was the
>>>> cause?  A failure which happens infrequently isn't really amenable
>>>> to a random diagnostic approach.
>>> So you pitch the whole thing over to being a test rig, and buy all new
>>> hardware?
>> I'll repeat from my original post:
>>
>> I don't see anything in /var/log/messages or elsewhere
>> to indicate any problem or offer any clue why the system
>> was hung.
>>
>> Any suggestions where I might look for a clue?
>>
>> I'm looking for diagnostics to focus on the cause of the crash.
>> My thanks for the several suggestions in this area.
>>
>> I'm not particularly interested in a listing of the myriad of
>> hypothetical causes absent observable evidence and some of
>> which are contradicted by evidence (such as overheating).
> 
> Here's one more, off-the-wall thought: do the setterm --powersave off, and
> find some way to make it work, so that you can see what's on the screen
> when it dies. 

Yes, I did this.  Switched to console screen.  The correct command
is "setterm -powersave off -blank off", otherwise the screen gets
blanked.  Turned the monitor off.  I hope it shows something
useful on the next fault.

> What may be very important here is I recently had a problem
> with a honkin' big server crashing... and it turned out that a user was
> running a parallel processing job that kicked off three? four? dozen
> threads, and towards the end of the job, every single thread wanted 10G...
> on a system with 256G RAM (which size still boggles my mind). The
> OOM-Killer didn't even have a chance to do its thing. Yes, he's limited
> what his job requests, and the system hasn't crashed since.

Strange.  OOM-Killer should get priority.  That's what it's for.
Although it usually seems to kill the innocent bystanders before
it gets around to killing the offenders.

-- 
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

Re: [CentOS] kernel vulnerabilities

2011-03-09 Thread Riccardo Veraldi

Ok
Thank you very much

On 09/mar/2011, at 17:48, Peter Kjellström  wrote:

> On Wednesday, March 09, 2011 05:06:21 pm Riccardo Veraldi wrote:
>> excuse me, could you be more helpful ?
>> Actually I am not able to get any security update from CentOS 5.5 repo.
>> Is there something I must change in the repo files ?
> 
> The kernel you're expecting is not an update for 5.5 but a part of 5.6. 5.6 
> (along with 4.9 and 6.0) is currently being built and tested by the CentOS 
> team. The short and frustrated first answer you got is due to an excessive 
> flood of "is it done yet? what's going on?"-type threads over the last few 
> weeks (consult the archives...).
> 
> /Peter

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Les Mikesell

On 3/9/2011 11:32 AM, Michael Eager wrote:
>
> I'm not particularly interested in a listing of the myriad of
> hypothetical causes absent observable evidence and some of
> which are contradicted by evidence (such as overheating).

Note that overheating can be localized or a bad heat sink mounting or 
fan on a CPU.

> I've encountered my share of bad power supplies, bad RAM,
> poorly seated cards, etc.  I've replaced failing capacitors
> in monitors (never on a motherboard).  I've replaced video
> cards, hard drives, bad cables.  And so forth.  Each of these
> had characteristics which pointed to the problem: kernel oops,
> POST failures, flickering screens, etc.  The problem I have is
> that there is a lack of diagnostic information to focus on the
> cause of the server failure.

Anything that happens quickly isn't going to show up in a log.

> I don't mean to appear unappreciative, but suggestions which
> amount to spending many hours making a series of unfocused
> modifications to the server, hoping that one of these random
> alterations fixes an infrequent problem, doesn't strike me as
> useful.  At the other extreme, the suggestions that I not look
> for the cause of the system failure and instead replace the
> server with one or three servers also doesn't seem to be a
> useful diagnostic approach either.

There's not really a good way to approach intermittent failures.  It may 
only break when you aren't looking.  Major component swaps or taking it 
offline for extended diagnostics hoping to catch a glimpse of the cause 
when it fails is about all you can do.

> During the next server downtime, I'll re-seat RAM and
> cables, check for excess dust, and do normal maintenance
> as folks have suggested.  I might also run a memory diag.
> I'll also look at the several excellent and appreciated
> suggestions (some of which I've already installed) on how
> to get a better picture on the state of the server when/if
> there is a future failure.

Memory diagnostics may take days to catch a problem.  Did you check for 
a newer bios for your MB?  I mentioned before that it seemed strange, 
but I've seen that fix mysterious problems even after the machines had 
previously been reliable for a long time (and even more oddly, all the 
machines in the lot weren't affected).

-- 
   Les Mikesell
 lesmikes...@gmail.com



Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread m . roth

Michael Eager wrote:
> John Hodrien wrote:
>> On Wed, 9 Mar 2011, Michael Eager wrote:
>>
>>> The problem with randomly replacing various components, other than
>>> the downtime and nuisance, is that there's no way to know that the
>>> change actually fixed any problem.  When the base rate is one
>>> unknown system hang every few weeks, how many weeks should I wait
>>> without a failure to conclude that the replaced component was the
>>> cause?  A failure which happens infrequently isn't really amenable
>>> to a random diagnostic approach.
>>
>> So you pitch the whole thing over to being a test rig, and buy all new
>> hardware?
>
> I'll repeat from my original post:
>
> I don't see anything in /var/log/messages or elsewhere
> to indicate any problem or offer any clue why the system
> was hung.
>
> Any suggestions where I might look for a clue?
>
> I'm looking for diagnostics to focus on the cause of the crash.
> My thanks for the several suggestions in this area.
>
> I'm not particularly interested in a listing of the myriad of
> hypothetical causes absent observable evidence and some of
> which are contradicted by evidence (such as overheating).

Here's one more, off-the-wall thought: do the setterm --powersave off, and
find some way to make it work, so that you can see what's on the screen
when it dies. What may be very important here is I recently had a problem
with a honkin' big server crashing... and it turned out that a user was
running a parallel processing job that kicked off three? four? dozen
threads, and towards the end of the job, every single thread wanted 10G...
on a system with 256G RAM (which size still boggles my mind). The
OOM-Killer didn't even have a chance to do its thing. Yes, he's limited
what his job requests, and the system hasn't crashed since.
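
Back-of-envelope, assuming it was three dozen threads (the count above is a guess either way), the shortfall works out like this:

```shell
#!/bin/sh
threads=36          # assumed: three dozen
per_thread_gb=10    # what each thread asked for near the end of the job
ram_gb=256          # installed RAM

demand_gb=$((threads * per_thread_gb))
shortfall_gb=$((demand_gb - ram_gb))
echo "demand ${demand_gb}G vs ${ram_gb}G RAM, short by ${shortfall_gb}G"
```

So even at the low end of that guess, the job wanted over 100G more than the box had, all at about the same moment.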

mark


Re: [CentOS] CentOS 5.5 does not recognise SAS drives with LSI 1068E Controller

2011-03-09 Thread Tom H

On Wed, Mar 9, 2011 at 11:51 AM,   wrote:
> Peter Peltonen wrote:
>> On Wed, Mar 9, 2011 at 6:33 PM,   wrote:
>>> Peter Peltonen wrote:

>>>> Based on that info I assume the board having a "8x SAS Ports via LSI
>>>> 1068E Controller". We received the server with 3 drives + 1 spare as
>>>> hw RAID-5 preinstalled. During bootup I see that the drives are
>>>> initialised and everything seems ok.
>>>>
>>>> The issue I am facing is that when trying to install CentOS no hard
>>>> drives are recognised.
>>>
>>> I recently had a problem like that with a Dell box. The trick is that
>>> with a hardware controller, it supersedes software RAID. What you need
>>> to do is go into the firmware controller configuration on boot, before
>>> you get to grub, and make sure everything's visible and correct. The
>>> controller can see the drives, but not present them to the o/s if you
>>> don't.
>>
>> Hmm, I am not sure if I understand you correctly: are you saying that
>> in the firmware configuration there might be an option that makes the
>> disks invisible for the OS? This sounds a bit strange and I wonder
>> what such config could be...
>>
>> Or are you suggesting that I should put the controller in "JBOD mode"
>> and then use software RAiD instead of hardware RAID? I would not like
>> to go with this option as I think the performance would suffer this
>> way?
>
> Nope. They may have said they "pre-installed" the RAID, but you really need
> to go into the setup (, or -f, or whatever), and see what it
> presents ->logically<- (key buzzword). If it hasn't been initialized, or
> put into logical configuration, then it simply will not present the
> logical drives to the o/s, and AFAIK, it will *not* present the physical
> drives at all.

I think that it's ctrl-r and that you have to set up "virtual disks"
using the "physical disks".

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Michael Eager

John Hodrien wrote:
> On Wed, 9 Mar 2011, Michael Eager wrote:
> 
>> The problem with randomly replacing various components, other than
>> the downtime and nuisance, is that there's no way to know that the
>> change actually fixed any problem.  When the base rate is one
>> unknown system hang every few weeks, how many weeks should I wait
>> without a failure to conclude that the replaced component was the
>> cause?  A failure which happens infrequently isn't really amenable
>> to a random diagnostic approach.
> 
> So you pitch the whole thing over to being a test rig, and buy all new
> hardware?

I'll repeat from my original post:

I don't see anything in /var/log/messages or elsewhere
to indicate any problem or offer any clue why the system
was hung.

Any suggestions where I might look for a clue?

I'm looking for diagnostics to focus on the cause of the crash.
My thanks for the several suggestions in this area.

I'm not particularly interested in a listing of the myriad of
hypothetical causes absent observable evidence and some of
which are contradicted by evidence (such as overheating).

I've encountered my share of bad power supplies, bad RAM,
poorly seated cards, etc.  I've replaced failing capacitors
in monitors (never on a motherboard).  I've replaced video
cards, hard drives, bad cables.  And so forth.  Each of these
had characteristics which pointed to the problem: kernel oops,
POST failures, flickering screens, etc.  The problem I have is
that there is a lack of diagnostic information to focus on the
cause of the server failure.

I don't mean to appear unappreciative, but suggestions which
amount to spending many hours making a series of unfocused
modifications to the server, hoping that one of these random
alterations fixes an infrequent problem, doesn't strike me as
useful.  At the other extreme, the suggestions that I not look
for the cause of the system failure and instead replace the
server with one or three servers also doesn't seem to be a
useful diagnostic approach either.

During the next server downtime, I'll re-seat RAM and
cables, check for excess dust, and do normal maintenance
as folks have suggested.  I might also run a memory diag.
I'll also look at the several excellent and appreciated
suggestions (some of which I've already installed) on how
to get a better picture on the state of the server when/if
there is a future failure.

Thanks all!

-- 
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

Re: [CentOS] CentOS 5.5 does not recognise SAS drives with LSI 1068E Controller

2011-03-09 Thread compdoc

> Hmm, I am not sure if I understand you correctly: are you saying
> that in the firmware configuration there might be an option that
> makes the disks invisible for the OS?

No, not as such. You just have to define the arrays: assign the drives as
needed. It's a rare thing that a factory will set up the controller and
drives in a way that suits your needs.

I think you mentioned that Centos does see the controller (but listing a
different number) and isn't seeing the drives, which is why others and I
are mentioning configuring the drives within the controller's bios.

The number Centos sees might just be the controller's chipset number rather
than the controller's part number...




Re: [CentOS] CentOS 5.5 does not recognise SAS drives with LSI 1068E Controller

2011-03-09 Thread compdoc

> Hmm, I am not sure if I understand you correctly: are you saying that
> in the firmware configuration there might be an option that makes the
> disks invisible for the OS?

Most controllers have a firmware you can enter at boot with a keystroke.
Once in, you create/prepare arrays or single drives, which the OS can then
see...



Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Leen de Braal

>>sure, if your time is worthless.  you can easily burn a couple hours
>>recapping a motherboard, which typically exceeds the boards worth.
>
> Amen. It's not enough to replace the bulging caps - you need to replace
> all the caps of the same brand as the damaged ones. Otherwise you'll just
> be doing it again later.
>
> And after ordering the exact replacements, and soldering them in, you've
> been down for days/weeks, and you'll be lucky if it hasn't been damaged in
> other ways from lack of filtered power.
>
> Recycle the motherboard (it's hazardous waste) and buy a modern one.
>
> By the way - don't forget to check the caps inside the PSU.

Very true. Had one server two weeks ago with a broken PSU because of caps.
It only showed after moving it: it rebooted several times before even
completing POST, and then stopped completely.




-- 
L. de Braal
BraHa Systems
NL - Terneuzen
T +31 115 649333
F +31 115 649444


Re: [CentOS] kernel vulnerabilities

2011-03-09 Thread Lamar Owen

On Wednesday, March 09, 2011 11:48:55 am Peter Kjellström wrote:
> The kernel you're expecting is not an update for 5.5 but a part of 5.6. 5.6 
> (along with 4.9 and 6.0) is currently being built and tested by the CentOS 
> team. 

Minor correction: 4.9 is released:
[root@localhost ~]# cat /etc/redhat-release
CentOS release 4.9 (Final)
[root@localhost ~]# uname -a
Linux localhost.localdomain 2.6.9-100.EL #1 Fri Feb 18 01:29:32 EST 2011 i686 athlon i386 GNU/Linux

SL has also just released their first alpha for their 4.9; see the SL lists for 
more information.

CentOS took the path of getting the updates for 4 and 5 done before 6; thus 
CentOS 4.9 is fully out there now.

Re: [CentOS] CentOS 5.5 does not recognise SAS drives with LSI 1068E Controller

2011-03-09 Thread Peter Peltonen

Hi,

On Wed, Mar 9, 2011 at 6:57 PM, Les Mikesell  wrote:
> Some controllers want to map arrays to volumes and present the volumes
> to the OS instead of drives, so you have to go through the motions of
> assigning the resources to volumes and initializing them even if you
> only want one disk in the array or volume.

I am pretty sure this was done already as that was what I had been
told, and I remember seeing on the screen during the bootup messages
about the drives being initialized and RAID5 working ok. But it's been
a while since I've been working with hardware issues so I will double
check this tomorrow and show you the config.

So is it the case that the LSI 1068E Controller *should* be supported by
the megaraid_sas driver, and that the net install should use it without
any driver disk needed?

Regards,
Peter

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Lamar Owen

On Wednesday, March 09, 2011 11:45:06 am Les Mikesell wrote:
> And if you are running Centos the one thing you 
> don't need is to pay for extra licenses to cover the backup/development 
> instances.

And this is significant, and really highlights the reasoning of the CentOS team 
in 'bug-for-bug' binary compatibility with the upstream EL.

That is, in your hypothetical 'three of everything' approach you'd run a fully 
entitled copy of the upstream on the production unit, and save costs by running 
CentOS on the backup and the backup backup.

This is another fine financial point, and I'll not use the semi-derogatory 
'bean counters' thing, because some money really is cheaper than other money, 
and I'm not making that up, it is reality.  In particular, capital can be 
donated, but rarely will opex be donation-driven.  I have quite a bit of 
donated capital here, capital that I don't have replacement capex budget for.  
Also, many grants are awarded with 'capex-only' stipulations in the awards; it 
is a violation of the grant agreement to use that grant money on opex.  
Likewise, there are some grants that have exactly the opposite stipulation, and 
there are a few that have both, and have further direct versus indirect opex 
stipulations.

The point is that CentOS saves on opex; not personnel opex, but subscription 
opex.  Support subscriptions are opex, not capex.  And while that fine a 
point might be lost on some, it is a point I deal with on virtually a daily 
basis.  I literally have to think about that distinction, and the various grant 
stipulations for monies that fund my salary, when filling out my biweekly 
timesheet; though salaried I am, that salary is funded between several grants, 
and most of those have different direct versus indirect cost budgets.

And CentOS has helped me keep things simpler in significant ways.

Re: [CentOS] CentOS 5.5 does not recognise SAS drives with LSI 1068E Controller

2011-03-09 Thread Les Mikesell

On 3/9/2011 10:47 AM, Peter Peltonen wrote:
>
>> I recently had a problem like that with a Dell box. The trick is that with
>> a hardware controller, it supersedes software RAID. What you need to do is
>> go into the firmware controller configuration on boot, before you get to
>> grub, and make sure everything's visible and correct. The controller can
>> see the drives, but not present them to the o/s if you don't.
>
> Hmm, I am not sure if I understand you correctly: are you saying that
> in the firmware configuration there might be an option that makes the
> disks invisible for the OS? This sounds a bit strange and I wonder
> what such config could be...

Some controllers want to map arrays to volumes and present the volumes 
to the OS instead of drives, so you have to go through the motions of 
assigning the resources to volumes and initializing them even if you 
only want one disk in the array or volume.

> Or are you suggesting that I should put the controller in "JBOD mode"
> and then use software RAiD instead of hardware RAID? I would not like
> to go with this option as I think the performance would suffer this
> way?

Depending on the raid level there may or may not be a performance 
difference, but the point is you have to configure the controller and 
drives the way you want them before they will show up at all.

-- 
   Les Mikesell
lesmikes...@gmail.com
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Lamar Owen

On Wednesday, March 09, 2011 10:48:29 am m.r...@5-cent.us wrote:
> Lamar Owen wrote:
> > Heat and airflow are two others.

> Hmmm... has the a/c been changed lately? Or maybe stuff outside the rack
> been moved, and so obstructed the airflow?

To follow up a little, I had a motherboard one time, with a factory-installed 
CPU, heatsink, and fan, that would not run for more than four or five hours 
before hanging.  This motherboard was in a system that was donated to us as 
being 'flaky' so I don't know the warranty status or what the original owner 
had or had not done, but it did have a factory seal sticker strip between the 
heatsink and the CPU and the motherboard socket, and that sticker was 
tamper-evident type, and there had been no tampering.

I decided I would refresh the heatsink compound, since even if the system were 
still covered by warranty, that warranty would only have been valid for the 
original purchaser.  So I pulled the sticker strip, which left little 'voids' on things, 
and pulled the heatsink.  At that point I laughed so hard I cried, as the 
heatsink still had the clear plastic protector film between the CPU and the 
heatsink compound.  From the factory.  I pulled the film, reinstalled the 
heatsink, and that system is and has been for several years rock-solid stable.

The issue of dust buildup follows from the heat and airflow.

There is another potential culprit, though, especially if this system has been 
in a raised floor environment, that some might find odd.  That culprit, or, 
rather, those culprits, are zinc whiskers.  Also, the metal components in the 
electronics themselves can exude whiskers; see the wikipedia article on the 
subject for more information ( 
https://secure.wikimedia.org/wikipedia/en/wiki/Whisker_%28metallurgy%29 )
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] CentOS 5.5 does not recognise SAS drives with LSI 1068E Controller

2011-03-09 Thread m . roth

Peter Peltonen wrote:
> Hi and thanks for your reply,
>
> On Wed, Mar 9, 2011 at 6:33 PM,   wrote:
>> Peter Peltonen wrote:
>>> Based on that info I assume the board having a "8x SAS Ports via LSI
>>> 1068E Controller". We received the server with 3 drives + 1 spare as
>>> hw RAID-5 preinstalled. During bootup I see that the drives are
>>> initialised and everything seems ok.
>>>
>>> The issue I am facing is that when trying to install CentOS no hard
>>> drives are recognised.
>> 
>> I recently had a problem like that with a Dell box. The trick is that
>> with a hardware controller, it supersedes software RAID. What you need
>> to do is go into the firmware controller configuration on boot, before
>> you get to grub, and make sure everything's visible and correct. The
>> controller can see the drives, but not present them to the o/s if you
>> don't.
>
> Hmm, I am not sure if I understand you correctly: are you saying that
> in the firmware configuration there might be an option that makes the
> disks invisible for the OS? This sounds a bit strange and I wonder
> what such config could be...
>
> Or are you suggesting that I should put the controller in "JBOD mode"
> and then use software RAiD instead of hardware RAID? I would not like
> to go with this option as I think the performance would suffer this
> way?

Nope. They may have said they "pre-installed" the RAID, but you really need
to go into the setup (, or -f, or whatever), and see what it
presents ->logically<- (key buzzword). If it hasn't been initialized, or
put into logical configuration, then it simply will not present the
logical drives to the o/s, and AFAIK, it will *not* present the physical
drives at all.

 mark

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] Security updates for CentOS-5

2011-03-09 Thread Peter Kjellström

On Wednesday, March 09, 2011 05:45:22 pm Mark Foster wrote:
> Hello, I was wondering why there haven't seemed to be any security
> updates for centos-5 since Jan 6. Per
> https://rhn.redhat.com/errata/rhel-server-errata.html there are a ton of
> outstanding issues.
> Thanks.

See the on-going thread "kernel vulnerabilities" and/or search the archives.

/Peter

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] Security updates for CentOS-5

2011-03-09 Thread Karanbir Singh

On 03/09/2011 04:45 PM, Mark Foster wrote:
> Hello, I was wondering why there haven't seemed to be any security
> updates for centos-5 since Jan 6. Per
> https://rhn.redhat.com/errata/rhel-server-errata.html there are a ton of
> outstanding issues.
> Thanks.

All of those apply to 5.6 ( where apply implies that they link into 5.6 
code; which literally just got finalised in the last day or so ).

Having said that, I've done a fair bit of work on the updates and hope 
to get them released into the 5.5/updates tree either later today or 
tomorrow. It might be a case of first hosting them into a testing repo, 
publicly so more people can confirm that deep linking and inherited 
issues are no longer a 'breaker'. But that testing will need to be 
fairly short and sweet, keep an eye on centos-devel for more info

- KB
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] kernel vulnerabilities

2011-03-09 Thread Peter Kjellström

On Wednesday, March 09, 2011 05:06:21 pm Riccardo Veraldi wrote:
> excuse me, could you be more helpful ?
> Actually I am not able to get any security update from CentOS 5.5 repo.
> Is there something I must change in the repo files ?

The kernel you're expecting is not an update for 5.5 but a part of 5.6. 5.6 
(along with 4.9 and 6.0) is currently being built and tested by the CentOS 
team. The short and frustrated first answer you got is due to an excessive 
flood of "is it done yet? what's going on?"-type threads over the last few 
weeks (consult the archives...).

/Peter

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] CentOS 5.5 does not recognise SAS drives with LSI 1068E Controller

2011-03-09 Thread Peter Peltonen

Hi and thanks for your reply,

On Wed, Mar 9, 2011 at 6:33 PM,   wrote:
> Peter Peltonen wrote:
>> Based on that info I assume the board having a "8x SAS Ports via LSI
>> 1068E Controller". We received the server with 3 drives + 1 spare as
>> hw RAID-5 preinstalled. During bootup I see that the drives are
>> initialised and everything seems ok.
>>
>> The issue I am facing is that when trying to install CentOS no hard
>> drives are recognised.
> 
> I recently had a problem like that with a Dell box. The trick is that with
> a hardware controller, it supersedes software RAID. What you need to do is
> go into the firmware controller configuration on boot, before you get to
> grub, and make sure everything's visible and correct. The controller can
> see the drives, but not present them to the o/s if you don't.

Hmm, I am not sure if I understand you correctly: are you saying that
in the firmware configuration there might be an option that makes the
disks invisible for the OS? This sounds a bit strange and I wonder
what such config could be...

Or are you suggesting that I should put the controller in "JBOD mode"
and then use software RAiD instead of hardware RAID? I would not like
to go with this option as I think the performance would suffer this
way?

Regards,
Peter
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

[CentOS] Security updates for CentOS-5

2011-03-09 Thread Mark Foster

Hello, I was wondering why there haven't seemed to be any security
updates for centos-5 since Jan 6. Per
https://rhn.redhat.com/errata/rhel-server-errata.html there are a ton of
outstanding issues.
Thanks.
-- 
Mark D. Foster 
http://mark.foster.cc/

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Les Mikesell

On 3/9/2011 9:55 AM, Brunner, Brian T. wrote:
>
> This is where mental ossification amongst bean-counters can kill a
> company.
> "Economic Opportunity Cost" should raise its head here: What would we do
> with the $capex if we paid $opex vs what would we do with the $opex if
> we paid $capex.  "The Time Value of Money vs The Money Value of Time" is
> another phrasing of this point-of-view.  Unfortunately this is no longer
> a CentOS topic.

The admin/operator's time is usually seen as a fixed cost and keeping a 
machine working is not supposed to take unplanned time.  So, if you want 
to keep something running you really need to buy 3 of them in the first 
place.  One as primary in production, one as a backup, and one to be 
developing/testing the next version on.  In some cases you can replace 
the third one with a virtual setup, and you might be able to have one 
backup as a spare for more than one live server but you can't skimp much 
more than that.  Everything breaks, so if one thing breaking causes a 
big problem, it wasn't planned realistically.  This should be a 'swap in 
the backup' while you run extensive diagnostics or get a warranty repair 
on the broken thing.  And if you are running CentOS the one thing you 
don't need is to pay for extra licenses to cover the backup/development 
instances.

-- 
   Les Mikesell
 lesmikes...@gmail.com
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] kernel vulnerabilities

2011-03-09 Thread David Sommerseth

On 09/03/11 17:06, Riccardo Veraldi wrote:
> excuse me, could you be more helpful ?
> Actually I am not able to get any security update from CentOS 5.5 repo.
> Is there something I must change in the repo files ?

What he meant was that you could do this:

http://lmgtfy.com/?q=centos+mailing+list+archive&l=1

And go through the archives.  There is plenty of information about your
question there.  But to summarize it again, there have not been any CentOS 5
updates since early January (just check the announce list, available above)
since they are working hard on getting CentOS 5.6 ready.

Otherwise, I recommend that you get familiar with what's called netiquette,
like this one:  
Also look at the bottom of the web page from the link above as well.
(Hint: top-posting)

kind regards,

David Sommerseth

> On 3/4/11 12:14 PM, Kai Schaetzl wrote:
>> the archive would have told you.
>>
>> Kai
>>
>>
>> ___
>> CentOS mailing list
>> CentOS@centos.org
>> http://lists.centos.org/mailman/listinfo/centos

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] CentOS 5.5 does not recognise SAS drives with LSI 1068E Controller

2011-03-09 Thread m . roth

Peter Peltonen wrote:
> I need to do a new CentOS net install on a new server having the
> Supermicro X7DVL-3 motherboard:
>
>   http://www.supermicro.com/products/motherboard/xeon1333/5000V/X7DVL-3.cfm
>
> Based on that info I assume the board having a "8x SAS Ports via LSI
> 1068E Controller". We received the server with 3 drives + 1 spare as
> hw RAID-5 preinstalled. During bootup I see that the drives are
> initialised and everything seems ok.
>
> The issue I am facing is that when trying to install CentOS no hard
> drives are recognised.

I recently had a problem like that with a Dell box. The trick is that with
a hardware controller, it supersedes software RAID. What you need to do is
go into the firmware controller configuration on boot, before you get to
grub, and make sure everything's visible and correct. The controller can
see the drives, but not present them to the o/s if you don't.

  mark

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread compdoc

>sure, if your time is worthless.  you can easily burn a couple hours
>recapping a motherboard, which typically exceeds the boards worth.

Amen. It's not enough to replace the bulging caps - you need to replace all
the caps of the same brand as the damaged ones. Otherwise you'll just be
doing it again later.

And after ordering the exact replacements, and soldering them in, you've
been down for days/weeks, and you'll be lucky if it hasn't been damaged in
other ways from lack of filtered power.

Recycle the motherboard (it's hazardous waste) and buy a modern one.

By the way - don't forget to check the caps inside the PSU.




___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] /etc/hosts - hostname alias for 127.0.0.1

2011-03-09 Thread Robert Spangler

On Tuesday 08 March 2011 12:39, the following was written:

>  >> And giving it 127.0.0.1 would tell it others to ignore it, I think.
>  >>
>  >> Where did your user come up with this idea - clearly, they have *no*
>  >> clue what they're doing, and need at least a brown bag lunch about 
>  >> TCP/IP, and they should not be allowed to dictate this. Their "idea" is 
>  >> a bug, and needs to be fixed.
>
>  
>
>  > You guys do know that the names in your host file only apply to YOU on
>  > that machine right?  It does not matter if you connect to 127.0.0.1 or
>  > something else UNLESS you specifically listen on a specific IP address
>  > on that machine AND you need to connect to that address from the machine
>  > itself.
>
>  
>  Let me expand on the above: if anyone on *any* other machine is trying to
>  connect to that, it won't work. If they try to point a browser to it,
>  unless they've done ssh -X to the server, they'll talk to their *own*
>  machine, and it won't be found.

Let me try another way to explain this to you.

If you try to get to the site xyz.com and you open your browser and type that 
in you are using what to get the ip address of that service?  Correct, DNS, 
as you don't have xyz.com listed in your LOCAL host file.

In DNS the site xyz.com resolves to 1.1.1.1

Now you ssh (ssh -X) into the xyz server. The server has the following in its 
hosts file:

127.0.0.1   xyz.com

You open a browser in the xyz server's X session. What is xyz.com going to 
resolve to? Correct, 127.0.0.1, and if the system is configured correctly to 
listen on that address you will connect.

Now let's say that the hosts file has the following:

127.0.0.1 xyz

You are still logged into the server with your x session going.
Now in your browser you type "xyz".  What address do you get and why?
If you type "xyz.com" into the same browser what address do you get and why?
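
One way to work through those two questions is a small sketch of the lookup 
order. It assumes the usual "hosts file first, then DNS" ordering and ignores 
search domains; HOSTS_TEXT and DNS_RECORDS are made-up stand-ins for the 
server's hosts file and for public DNS, not real data.

```python
# Sketch of a "hosts file first, then DNS" lookup.
# HOSTS_TEXT and DNS_RECORDS are hypothetical stand-ins, not real data.

HOSTS_TEXT = """
127.0.0.1   localhost
127.0.0.1   xyz
"""

DNS_RECORDS = {"xyz.com": "1.1.1.1"}


def parse_hosts(text):
    """Map every name and alias in hosts-file text to its address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), address)  # first entry wins
    return table


def resolve(name, hosts, dns):
    """Return (address, source), consulting the hosts table before DNS."""
    name = name.lower()
    if name in hosts:
        return hosts[name], "hosts file"
    if name in dns:
        return dns[name], "DNS"
    return None, "not found"


hosts = parse_hosts(HOSTS_TEXT)
for query in ("xyz", "xyz.com"):
    print(query, "->", resolve(query, hosts, DNS_RECORDS))
```

With only the short name in the hosts file, "xyz" is answered locally while 
"xyz.com" falls through to DNS, because the hosts table matches names exactly.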


-- 

Regards
Robert

Linux
The adventure of a lifetime.

Linux User #296285
Get Counted
http://counter.li.org/
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] kernel vulnerabilities

2011-03-09 Thread Riccardo Veraldi

excuse me, could you be more helpful ?
Actually I am not able to get any security update from CentOS 5.5 repo.
Is there something I must change in the repo files ?

thank you

On 3/4/11 12:14 PM, Kai Schaetzl wrote:
> the archive would have told you.
>
> Kai
>
>
> ___
> CentOS mailing list
> CentOS@centos.org
> http://lists.centos.org/mailman/listinfo/centos

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Brunner, Brian T.

centos-boun...@centos.org wrote:
> On Wednesday, March 09, 2011 10:16:34 am Brunner, Brian T. wrote:
>> This would be far cheaper than the time spent troubleshooting the
>> running (sometimes hanging) system.
> 
> Let me interject here, that from a budgeting standpoint
> 'cheaper' has to be interpreted in the context of which
> budget the costs are coming out of.  

This degenerates into "Your dollars are cheaper than my dollars".

> New hardware is capex,
> and thus would come out of the capital budget, and admin time
> is opex, and thus would come out of the operating budget.

This is where mental ossification amongst bean-counters can kill a
company.
"Economic Opportunity Cost" should raise its head here: What would we do
with the $capex if we paid $opex vs what would we do with the $opex if
we paid $capex.  "The Time Value of Money vs The Money Value of Time" is
another phrasing of this point-of-view.  Unfortunately this is no longer
a CentOS topic.

>> Starting with RAM and Power Supply is not random ... They're "The
>> Usual Suspects".
> 
> This is a very true statement.
> 
> Heat and airflow are two others.

RAM and PowerSupply are easy starting points: swap RAM between two
systems and see (in the next 3 months) if the problem moved,  swapping
power supplies is a bit trickier but doable if the systems are similar
enough.  Again, several months watching to see where the problem
manifests is a test of patience and diligence.  It's possible that doing
this will make the problem stop arising (RAM and PS are both good
enough, they just don't play well together).

Heat & airflow are harder to swap (says the guy who opened an office
desktop, and vacuumed out enough hair, lint, dust, dander, and ashes to
knit a grey angora hamster (with lung cancer)).

Insert spiffy .sig here:
Life is complex: it has both real and imaginary parts.

//me

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

[CentOS] CentOS 5.5 does not recognise SAS drives with LSI 1068E Controller

2011-03-09 Thread Peter Peltonen

I need to do a new CentOS net install on a new server having the
Supermicro X7DVL-3 motherboard:

  http://www.supermicro.com/products/motherboard/xeon1333/5000V/X7DVL-3.cfm

Based on that info I assume the board having a "8x SAS Ports via LSI
1068E Controller". We received the server with 3 drives + 1 spare as
hw RAID-5 preinstalled. During bootup I see that the drives are
initialised and everything seems ok.

The issue I am facing is that when trying to install CentOS no hard
drives are recognised.

I am a bit confused as others have reported CentOS 5.3 and 5.4 working
"out of the box" with the same controller:

  
http://www.linux-archive.org/centos/287219-installing-centos-5-4-64bit-server-lsi-sas-1068e-controller.html

So I assumed that the megaraid_sas driver shipped with CentOS should
support the controller?

What confuses me also is the output of lspci:

  MegaRAID SAS 3208 ELP

Does this mean it's not the 1068E controller inside but something else?
Or CentOS misidentifies it?
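
A rough way to check what the kernel itself sees, on any already-installed 
Linux system with the same card, is to read the PCI entries under sysfs. The 
sketch below assumes Python 3 is available and that /sys/bus/pci/devices is 
mounted; the function name and the subclass table are illustrative choices, 
not part of any CentOS tool.

```python
import os

# PCI class prefixes for mass-storage controllers (base class 0x01).
STORAGE_SUBCLASSES = {"0x0100": "SCSI", "0x0104": "RAID", "0x0107": "SAS"}


def storage_controllers(sysfs_root="/sys/bus/pci/devices"):
    """Yield (pci_address, kind, vendor:device, bound_driver_or_None)."""
    for addr in sorted(os.listdir(sysfs_root)):
        dev = os.path.join(sysfs_root, addr)

        def read(attr):
            # Each attribute is a one-line text file such as "0x010400".
            with open(os.path.join(dev, attr)) as fh:
                return fh.read().strip()

        kind = STORAGE_SUBCLASSES.get(read("class")[:6])
        if kind is None:
            continue  # not a SCSI/RAID/SAS controller
        # The "driver" symlink exists only if a kernel module claimed the device.
        driver_link = os.path.join(dev, "driver")
        driver = (os.path.basename(os.readlink(driver_link))
                  if os.path.islink(driver_link) else None)
        yield addr, kind, read("vendor") + ":" + read("device"), driver
```

Calling list(storage_controllers()) returns one tuple per storage controller. 
A driver of None would suggest that no loaded module claimed the device; a 
driver name would suggest the controller is handled and that the missing disks 
are more likely a configuration matter on the controller side.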

So I assume the controller is not supported and I need a binary driver
for it. For 1068e it should be:

  
http://www.lsi.com/storage_home/products_home/standard_product_ics/sas_ics/lsisas1068e/index.html

and the driver I need is inside the file
mptlinux-4.26.00.00-1-rhel5.5.x86_64.dd.gz

But how do I go about applying this driver during installation as the server
has no floppy drive?

And what happens if I get the driver installed and then the server's
kernel is updated? Do I need to reinstall the driver somehow?

Best regards,
Peter
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread m . roth

Lamar Owen wrote:
> On Wednesday, March 09, 2011 10:16:34 am Brunner, Brian T. wrote:

>> Starting with RAM and Power Supply is not random ... They're "The Usual
>> Suspects".
>
> This is a very true statement.
>
> Heat and airflow are two others.

Hmmm... has the a/c been changed lately? Or maybe stuff outside the rack
been moved, and so obstructed the airflow?

 mark

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Lamar Owen

On Wednesday, March 09, 2011 10:16:34 am Brunner, Brian T. wrote:
> This would be far cheaper than the time spent troubleshooting the
> running (sometimes hanging) system.

Let me interject here, that from a budgeting standpoint 'cheaper' has to be 
interpreted in the context of which budget the costs are coming out of.  New 
hardware is capex, and thus would come out of the capital budget, and admin 
time is opex, and thus would come out of the operating budget.  There may be 
sufficient funds in the operating budget to pay an admin $x,000 but the funds 
in the capital budget may be insufficient to buy a server costing $y,000, where 
y=x.  And if this is an educational institution, and there are grants involved, 
it may be the reverse situation.  So 'cheaper' only has meaning when the costs 
are coming out of the same budget.  So, yes, while it's easy for a 
single-budget entity to make this decision, it's not so easy when you have 
multiple budgets involved with different spending parameters and different 
funding entities. 

> Starting with RAM and Power Supply is not random ... They're "The Usual
> Suspects".

This is a very true statement.  

Heat and airflow are two others.
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] [Newbie] Reclaiming /boot space

2011-03-09 Thread Todd Cary

OK, here is what I have after manually moving files into /boot/ 
and /boot/grub/, leaving behind old kernel files.  Of course, no 
links.

/boot/:
total 9169
drwxr-xr-x   5 root root    9216 Mar  8 20:24 .
drwxr-xr-x  24 root root    4096 Jan 24 08:34 ..
-rw-r--r--   1 root root   51676 Feb 17 22:41 config-2.6.9-100.EL
-rw-r--r--   1 root root   51645 Mar  8 17:38 config-2.6.9-89.35.1.EL
-rw-r--r--   1 root root   51270 Mar  8 17:38 config-2.6.9-89.35.1.ELsmp
drwxr-xr-x   2 root root    1024 Mar  8 20:38 grub
-rw-r--r--   1 root root 1343048 Mar  8 20:24 initrd-2.6.9-100.EL.img
-rw-r--r--   1 root root  692224 Mar  8 17:51 initrd-2.6.9-89.35.1.EL.img
-rw-r--r--   1 root root       0 Mar  8 17:51 initrd-2.6.9-89.35.1.ELsmp.img
drwx------   2 root root   12288 Jan 12  2007 lost+found
-rw-r--r--   1 root root    9371 Mar  8 17:46 message
-rw-r--r--   1 root root    9371 Mar  8 17:46 message.ja
-rw-r--r--   1 root root   67797 Feb 17 22:41 symvers-2.6.9-100.EL.gz
-rw-r--r--   1 root root   67701 Mar  8 17:49 symvers-2.6.9-89.35.1.EL.gz
-rw-r--r--   1 root root   68477 Mar  8 17:49 symvers-2.6.9-89.35.1.ELsmp.gz
-rw-r--r--   1 root root  770652 Feb 17 22:41 System.map-2.6.9-100.EL
-rw-r--r--   1 root root  769061 Mar  8 17:48 System.map-2.6.9-89.35.1.EL
-rw-r--r--   1 root root  786055 Mar  8 17:48 System.map-2.6.9-89.35.1.ELsmp
drwx------   2 root root    9216 Mar  8 17:58 .Trash-root
-rw-r--r--   1 root root 1538264 Feb 17 22:41 vmlinuz-2.6.9-100.EL
-rw-r--r--   1 root root 1536995 Mar  8 17:47 vmlinuz-2.6.9-89.35.1.EL
-rw-r--r--   1 root root 1472967 Mar  8 17:47 vmlinuz-2.6.9-89.35.1.ELsmp

/boot/grub:
total 345
drwxr-xr-x  2 root root   1024 Mar  8 20:38 .
drwxr-xr-x  5 root root   9216 Mar  8 20:24 ..
-rw-r--r--  1 root root     82 Mar  8 17:39 device.map
-rw-r--r--  1 root root   7956 Mar  8 17:40 e2fs_stage1_5
-rw-r--r--  1 root root   7684 Mar  8 17:40 fat_stage1_5
-rw-r--r--  1 root root   6996 Mar  8 17:40 ffs_stage1_5
-rw-------  2 root root   4240 Mar  8 20:24 grub.conf
-rw-r--r--  1 root root   7028 Mar  8 17:40 iso9660_stage1_5
-rw-r--r--  1 root root   8448 Mar  8 17:40 jfs_stage1_5
-rw-------  2 root root   4240 Mar  8 20:24 menu.lst
-rw-r--r--  1 root root   7188 Mar  8 17:40 minix_stage1_5
-rw-r--r--  1 root root   9396 Mar  8 17:40 reiserfs_stage1_5
-rw-r--r--  1 root root    512 Mar  8 17:45 stage1
-rw-r--r--  1 root root 103688 Mar  8 17:45 stage2
-rw-r--r--  1 root root  67701 Mar  8 18:30 symvers-2.6.9-89.35.1.EL.gz
-rw-r--r--  1 root root  68477 Mar  8 18:30 symvers-2.6.9-89.35.1.ELsmp.gz
-rw-r--r--  1 root root   7272 Mar  8 17:40 ufs2_stage1_5
-rw-r--r--  1 root root   6612 Mar  8 17:40 vstafs_stage1_5
-rw-r--r--  1 root root   9308 Mar  8 17:40 xfs_stage1_5

/boot/lost+found:
total 22
drwx------  2 root root 12288 Jan 12  2007 .
drwxr-xr-x  5 root root  9216 Mar  8 20:24 ..

/boot/.Trash-root:
total 81058
drwx------  2 root root    9216 Mar  8 17:58 .
drwxr-xr-x  5 root root    9216 Mar  8 20:24 ..
-rw-r--r--  1 root root   51502 Jan 14  2009 config-2.6.9-78.0.13.EL
-rw-r--r--  1 root root   51127 Jan 14  2009 config-2.6.9-78.0.13.ELsmp
-rw-r--r--  1 root root   51614 Sep 15  2009 config-2.6.9-89.0.11.EL
-rw-r--r--  1 root root   51239 Sep 15  2009 config-2.6.9-89.0.11.ELsmp
-rw-r--r--  1 root root   51614 Nov  3  2009 config-2.6.9-89.0.16.EL
-rw-r--r--  1 root root   51239 Nov  3  2009 config-2.6.9-89.0.16.ELsmp
-rw-r--r--  1 root root   51614 Dec 15  2009 config-2.6.9-89.0.18.EL
-rw-r--r--  1 root root   51239 Dec 15  2009 config-2.6.9-89.0.18.ELsmp
-rw-r--r--  1 root root   51614 Jan  8  2010 config-2.6.9-89.0.19.EL
-rw-r--r--  1 root root   51239 Jan  8  2010 config-2.6.9-89.0.19.ELsmp
-rw-r--r--  1 root root   51614 Feb  2  2010 config-2.6.9-89.0.20.EL
-rw-r--r--  1 root root   51239 Feb  2  2010 config-2.6.9-89.0.20.ELsmp
-rw-r--r--  1 root root   51645 May  6  2010 config-2.6.9-89.0.25.EL
-rw-r--r--  1 root root   51270 May  6  2010 config-2.6.9-89.0.25.ELsmp
-rw-r--r--  1 root root   51645 Aug 20  2010 config-2.6.9-89.0.28.EL
-rw-r--r--  1 root root   51270 Aug 20  2010 config-2.6.9-89.0.28.ELsmp
-rw-r--r--  1 root root   51645 Oct 19 14:27 config-2.6.9-89.31.1.EL
-rw-r--r--  1 root root   51270 Oct 19 15:17 config-2.6.9-89.31.1.ELsmp
-rw-r--r--  1 root root   51645 Dec  2 07:23 config-2.6.9-89.33.1.EL
-rw-r--r--  1 root root   51270 Dec  2 07:41 config-2.6.9-89.33.1.ELsmp
-rw-r--r--  1 root root   51645 Jan 18 15:06 config-2.6.9-89.35.1.EL
-rw-r--r--  1 root root   51270 Jan 18 15:36 config-2.6.9-89.35.1.ELsmp
-rw-r--r--  1 root root      82 Jan 13  2007 device.map
-rw-r--r--  1 root root    7956 Jan 13  2007 e2fs_stage1_5
-rw-r--r--  1 root root    7684 Jan 13  2007 fat_stage1_5
-rw-r--r--  1 root root    6996 Jan 13  2007 ffs_stage1_5
-rw-------  1 root root    4091 Jan 28 07:21 grub.conf
-rw-r--r--  1 root root     101 Mar 14  2009 initrd-2.6.9-78.0.13.EL.img
-rw-r--r--  1 root root 1319564 Mar 14  2009 initrd-2.6.9-78.0.13.ELsmp.img
-rw-r--r--  1
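
The /boot listing above shows initrd-2.6.9-89.35.1.ELsmp.img at 0 bytes, so 
that kernel would presumably not boot as listed. A small check along these 
lines could flag such cases before a reboot. It is only a sketch: it assumes 
Python 3 and the vmlinuz-VERSION / initrd-VERSION.img naming seen in the 
listing, and check_boot is a hypothetical helper name.

```python
import os
import re

# Matches "vmlinuz-<version>" and "initrd-<version>.img".
KERNEL_FILE = re.compile(r"^(vmlinuz|initrd)-(.+?)(\.img)?$")


def check_boot(boot_dir="/boot"):
    """Return {kernel_version: problem} for kernels with no usable initrd."""
    sizes = {}  # (kind, version) -> size in bytes
    for name in os.listdir(boot_dir):
        match = KERNEL_FILE.match(name)
        if match:
            kind, version = match.group(1), match.group(2)
            sizes[(kind, version)] = os.path.getsize(
                os.path.join(boot_dir, name))

    problems = {}
    for kind, version in sorted(sizes):
        if kind != "vmlinuz":
            continue
        initrd_size = sizes.get(("initrd", version))
        if initrd_size is None:
            problems[version] = "initrd missing"
        elif initrd_size == 0:
            problems[version] = "initrd is empty"
    return problems
```

An empty result only means every kernel has a non-empty initrd next to it; it 
says nothing about whether the image is complete or whether grub.conf points 
at the right files.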

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Lamar Owen

On Wednesday, March 09, 2011 03:24:48 am Leen de Braal wrote:
> While you open the case, check for the bulging capacitor problem.
> Will have the effect you describe, freezing up the system so that even
> bios routines don't work (your fans).
> If that's the case, replace mainboard.

I've seen capacitor problems in the past, and they can be rather interesting.

What the caps do is open up (electrically speaking) meaning they no longer can 
smooth out the ripple in the output of the switching regulator; this ripple is 
very high frequency due to the switching regulator's design.  As the CPU draws 
more current (which happens when it's loaded, of course, since MOS gates by 
design consume the most power during the switching period (capacitor charging 
time constants on the gates of the transistors themselves)), the switching 
regulator has to supply more current, and if the caps are open they can't 
smooth out the deeper ripple.
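
To put rough numbers on that, here is a back-of-the-envelope sketch. It uses 
only the idealized capacitor relation dV = I * dt / C over one switching 
period and ignores ESR, the inductor, and the regulator's control loop, so it 
shows the trend rather than modelling a real regulator. All values are 
hypothetical.

```python
def ripple_volts(load_amps, switching_hz, capacitance_farads):
    """Idealized ripple: dV = I * dt / C, with dt = one switching period."""
    return load_amps / (switching_hz * capacitance_farads)


# Hypothetical numbers, chosen only to show the trend.
healthy = ripple_volts(load_amps=20.0, switching_hz=250e3,
                       capacitance_farads=8e-3)
degraded = ripple_volts(load_amps=20.0, switching_hz=250e3,
                        capacitance_farads=2e-3)
print("healthy bank:  %.3f V" % healthy)
print("degraded bank: %.3f V" % degraded)
```

In this simplified picture, losing three quarters of the effective capacitance 
quadruples the ripple at the same load, and a higher load current scales it up 
further.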

I actually had one motherboard blow two caps; one of the cases of one of the 
blown capacitors was violently ejected off of the 'guts' of the cap, hard 
enough that it dented the PC's case from the inside.

The PC kept running, until it was put under load, then it would lock up.  When 
the second cap blew, about an hour later, the PC hung; it would power up and 
run POST, and even run the BIOS setup's memory check and health check, but as 
soon as the CPU was shifted into protected mode as the OS booted it would hard 
hang due to the CPU's increased current draw overwhelming the ripple absorbing 
capacity of the remaining good capacitors on the CPU's switching regulator.

There's really only one way to determine this, and that's by putting an 
oscilloscope on the CPU's power supply output rails and looking for ripple 
while running a CPU burnin program.  The hard part of that is actually finding 
a good place to measure the output, thanks to the typical motherboard's 
multilayer design.  

And while with the proper desoldering equipment and training/experience one can 
re-cap a motherboard, I would not recommend doing so for a critical server, 
unless you want and can assume personal liability for that server's operation.  
Better to get a new motherboard with a warranty.  For a personal server that if 
it breaks isn't going to open you up to personal liability, sure, you can 
re-cap if you'd like and have the patience, time, equipment, and experience 
necessary to work on 6 to 8 layer PC boards, which may be soldered with RoHS 
lead-free solder, which requires special techniques.  Otherwise, as you said, 
you can damage the 'vias' (that is, the plated through holes the capacitor 
leads solder to, which may be used to connect to internal layers that you can't 
resolder) very easily.
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Re: [CentOS] [Newbie] Reclaiming /boot space

2011-03-09 Thread Todd Cary

OK, here is what I have after manually copying files back to 
/boot, leaving behind old kernel files:

/boot/:
total 9169
drwxr-xr-x   5 root root    9216 Mar  8 20:24 .
drwxr-xr-x  24 root root    4096 Jan 24 08:34 ..
-rw-r--r--   1 root root   51676 Feb 17 22:41 config-2.6.9-100.EL
-rw-r--r--   1 root root   51645 Mar  8 17:38 config-2.6.9-89.35.1.EL
-rw-r--r--   1 root root   51270 Mar  8 17:38 config-2.6.9-89.35.1.ELsmp
drwxr-xr-x   2 root root    1024 Mar  8 20:38 grub
-rw-r--r--   1 root root 1343048 Mar  8 20:24 initrd-2.6.9-100.EL.img
-rw-r--r--   1 root root  692224 Mar  8 17:51 initrd-2.6.9-89.35.1.EL.img
-rw-r--r--   1 root root       0 Mar  8 17:51 initrd-2.6.9-89.35.1.ELsmp.img
drwx------   2 root root   12288 Jan 12  2007 lost+found
-rw-r--r--   1 root root    9371 Mar  8 17:46 message
-rw-r--r--   1 root root    9371 Mar  8 17:46 message.ja
-rw-r--r--   1 root root   67797 Feb 17 22:41 symvers-2.6.9-100.EL.gz
-rw-r--r--   1 root root   67701 Mar  8 17:49 symvers-2.6.9-89.35.1.EL.gz
-rw-r--r--   1 root root   68477 Mar  8 17:49 symvers-2.6.9-89.35.1.ELsmp.gz
-rw-r--r--   1 root root  770652 Feb 17 22:41 System.map-2.6.9-100.EL
-rw-r--r--   1 root root  769061 Mar  8 17:48 System.map-2.6.9-89.35.1.EL
-rw-r--r--   1 root root  786055 Mar  8 17:48 System.map-2.6.9-89.35.1.ELsmp
drwx------   2 root root    9216 Mar  8 17:58 .Trash-root
-rw-r--r--   1 root root 1538264 Feb 17 22:41 vmlinuz-2.6.9-100.EL
-rw-r--r--   1 root root 1536995 Mar  8 17:47 vmlinuz-2.6.9-89.35.1.EL
-rw-r--r--   1 root root 1472967 Mar  8 17:47 vmlinuz-2.6.9-89.35.1.ELsmp

/boot/grub:
total 345
drwxr-xr-x  2 root root   1024 Mar  8 20:38 .
drwxr-xr-x  5 root root   9216 Mar  8 20:24 ..
-rw-r--r--  1 root root     82 Mar  8 17:39 device.map
-rw-r--r--  1 root root   7956 Mar  8 17:40 e2fs_stage1_5
-rw-r--r--  1 root root   7684 Mar  8 17:40 fat_stage1_5
-rw-r--r--  1 root root   6996 Mar  8 17:40 ffs_stage1_5
-rw-------  2 root root   4240 Mar  8 20:24 grub.conf
-rw-r--r--  1 root root   7028 Mar  8 17:40 iso9660_stage1_5
-rw-r--r--  1 root root   8448 Mar  8 17:40 jfs_stage1_5
-rw-------  2 root root   4240 Mar  8 20:24 menu.lst
-rw-r--r--  1 root root   7188 Mar  8 17:40 minix_stage1_5
-rw-r--r--  1 root root   9396 Mar  8 17:40 reiserfs_stage1_5
-rw-r--r--  1 root root    512 Mar  8 17:45 stage1
-rw-r--r--  1 root root 103688 Mar  8 17:45 stage2
-rw-r--r--  1 root root  67701 Mar  8 18:30 symvers-2.6.9-89.35.1.EL.gz
-rw-r--r--  1 root root  68477 Mar  8 18:30 symvers-2.6.9-89.35.1.ELsmp.gz
-rw-r--r--  1 root root   7272 Mar  8 17:40 ufs2_stage1_5
-rw-r--r--  1 root root   6612 Mar  8 17:40 vstafs_stage1_5
-rw-r--r--  1 root root   9308 Mar  8 17:40 xfs_stage1_5

/boot/lost+found:
total 22
drwx------  2 root root 12288 Jan 12  2007 .
drwxr-xr-x  5 root root  9216 Mar  8 20:24 ..

/boot/.Trash-root:
total 81058
drwx------  2 root root    9216 Mar  8 17:58 .
drwxr-xr-x  5 root root    9216 Mar  8 20:24 ..
-rw-r--r--  1 root root   51502 Jan 14  2009 config-2.6.9-78.0.13.EL
-rw-r--r--  1 root root   51127 Jan 14  2009 config-2.6.9-78.0.13.ELsmp
-rw-r--r--  1 root root   51614 Sep 15  2009 config-2.6.9-89.0.11.EL
-rw-r--r--  1 root root   51239 Sep 15  2009 config-2.6.9-89.0.11.ELsmp
-rw-r--r--  1 root root   51614 Nov  3  2009 config-2.6.9-89.0.16.EL
-rw-r--r--  1 root root   51239 Nov  3  2009 config-2.6.9-89.0.16.ELsmp
-rw-r--r--  1 root root   51614 Dec 15  2009 config-2.6.9-89.0.18.EL
-rw-r--r--  1 root root   51239 Dec 15  2009 config-2.6.9-89.0.18.ELsmp
-rw-r--r--  1 root root   51614 Jan  8  2010 config-2.6.9-89.0.19.EL
-rw-r--r--  1 root root   51239 Jan  8  2010 config-2.6.9-89.0.19.ELsmp
-rw-r--r--  1 root root   51614 Feb  2  2010 config-2.6.9-89.0.20.EL
-rw-r--r--  1 root root   51239 Feb  2  2010 config-2.6.9-89.0.20.ELsmp
-rw-r--r--  1 root root   51645 May  6  2010 config-2.6.9-89.0.25.EL
-rw-r--r--  1 root root   51270 May  6  2010 config-2.6.9-89.0.25.ELsmp
-rw-r--r--  1 root root   51645 Aug 20  2010 config-2.6.9-89.0.28.EL
-rw-r--r--  1 root root   51270 Aug 20  2010 config-2.6.9-89.0.28.ELsmp
-rw-r--r--  1 root root   51645 Oct 19 14:27 config-2.6.9-89.31.1.EL
-rw-r--r--  1 root root   51270 Oct 19 15:17 config-2.6.9-89.31.1.ELsmp
-rw-r--r--  1 root root   51645 Dec  2 07:23 config-2.6.9-89.33.1.EL
-rw-r--r--  1 root root   51270 Dec  2 07:41 config-2.6.9-89.33.1.ELsmp
-rw-r--r--  1 root root   51645 Jan 18 15:06 config-2.6.9-89.35.1.EL
-rw-r--r--  1 root root   51270 Jan 18 15:36 config-2.6.9-89.35.1.ELsmp
-rw-r--r--  1 root root      82 Jan 13  2007 device.map
-rw-r--r--  1 root root    7956 Jan 13  2007 e2fs_stage1_5
-rw-r--r--  1 root root    7684 Jan 13  2007 fat_stage1_5
-rw-r--r--  1 root root    6996 Jan 13  2007 ffs_stage1_5
-rw-------  1 root root    4091 Jan 28 07:21 grub.conf
-rw-r--r--  1 root root     101 Mar 14  2009 initrd-2.6.9-78.0.13.EL.img
-rw-r--r--  1 root root 1319564 Mar 14  2009 initrd-2.6.9-78.0.13.ELsmp.img
-rw-r--r--  1 root root 1342750 Sep 22  2009 init
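
A listing like the one above is easier to reason about when the sizes are added up per kernel version. The following is only a rough Python sketch, not something from the thread: the function name bytes_per_kernel and the regular expression are illustrative assumptions, and the input is a hand-built list of (size, filename) pairs rather than real ls output.

```python
import re
from collections import defaultdict

# Assumed naming pattern for kernel-related files in /boot, e.g.
# "vmlinuz-2.6.9-89.35.1.ELsmp" or "initrd-2.6.9-100.EL.img".
KERNEL_FILE = re.compile(
    r"^(?:config|initrd|symvers|System\.map|vmlinuz)-"
    r"(?P<version>\d[\w.\-]*?)(?:\.img|\.gz)?$"
)


def bytes_per_kernel(entries):
    """Sum file sizes per kernel version.

    entries: iterable of (size_in_bytes, filename) pairs.
    Files that do not look kernel-related (grub, message, ...) are skipped.
    """
    totals = defaultdict(int)
    for size, name in entries:
        match = KERNEL_FILE.match(name)
        if match:
            totals[match.group("version")] += size
    return dict(totals)


if __name__ == "__main__":
    # A few entries copied from the /boot listing in this message.
    sample = [
        (51676, "config-2.6.9-100.EL"),
        (1343048, "initrd-2.6.9-100.EL.img"),
        (1538264, "vmlinuz-2.6.9-100.EL"),
        (1472967, "vmlinuz-2.6.9-89.35.1.ELsmp"),
        (9371, "message"),
    ]
    for version, total in sorted(bytes_per_kernel(sample).items()):
        print(version, total)
```

The totals only show where the space goes; which versions are safe to drop still depends on what grub.conf boots and what the package manager thinks is installed.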

Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread Brunner, Brian T.

centos-boun...@centos.org wrote:
> On Wed, 9 Mar 2011, Michael Eager wrote:
> 
>> The problem with randomly replacing various components, other than
>> the downtime and nuisance, is that there's no way to know that the
>> change actually fixed any problem.  When the base rate is one
>> unknown system hang every few weeks, how many weeks should I wait
>> without a failure to conclude that the replaced component was the
>> cause?  A failure which happens infrequently isn't really amenable
>> to a random diagnostic approach.
> 
> So you pitch the whole thing over to being a test rig, and buy all
> new hardware? 

This would be far cheaper than the time spent troubleshooting the
running (sometimes hanging) system.
Starting with RAM and Power Supply is not random ... They're "The Usual
Suspects".
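
One rough way to put a number on the "how many weeks should I wait" question quoted above is to assume the hangs arrive independently at a steady average rate (a Poisson model, which is an assumption here, not something established in this thread). If nothing was actually fixed, the chance of seeing no hang for t weeks is exp(-t / mean), so the wait needed for a given confidence follows directly. The function name and the 95% default below are illustrative.

```python
import math


def weeks_without_hang(mean_weeks_between_hangs, confidence=0.95):
    """Hang-free weeks needed before the fix is believed at `confidence`.

    Assumes hangs follow a Poisson process: with the fault still present,
    P(no hang in t weeks) = exp(-t / mean_weeks_between_hangs).
    Solving exp(-t / mean) = 1 - confidence for t gives the result.
    """
    if mean_weeks_between_hangs <= 0:
        raise ValueError("mean interval must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    return -mean_weeks_between_hangs * math.log(1 - confidence)


if __name__ == "__main__":
    # "Every few weeks" taken as a mean of 3 weeks (an assumed figure).
    print(round(weeks_without_hang(3), 1))
```

Under that model, a mean of three weeks between hangs needs about nine hang-free weeks for 95% confidence, roughly three times the mean interval, which is the cost of swapping one part at a time.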

Insert spiffy .sig here:
Life is complex: it has both real and imaginary parts.

//me


Re: [CentOS] Server hangs on CentOS 5.5

2011-03-09 Thread John Hodrien

On Wed, 9 Mar 2011, Michael Eager wrote:

> The problem with randomly replacing various components, other than
> the downtime and nuisance, is that there's no way to know that the
> change actually fixed any problem.  When the base rate is one
> unknown system hang every few weeks, how many weeks should I wait
> without a failure to conclude that the replaced component was the
> cause?  A failure which happens infrequently isn't really amenable
> to a random diagnostic approach.

So you pitch the whole thing over to being a test rig, and buy all new
hardware?

jh

Re: [CentOS] Apache/Active Directory authentication

2011-03-09 Thread John Hodrien

On Wed, 9 Mar 2011, John Hodrien wrote:

> On Wed, 9 Mar 2011, Dvorkin, Asya wrote:
>
>> Thank you, John.
>>
>> I forgot to add that we cannot generate keytab from AD server for various
>> reasons that I have no control over.

And are you really sure this is the case?  If you can join to a domain, you
can get a keytab (you don't need AD admin rights to do this).

If you were just using Samba to do the join, something like:

use kerberos keytab = yes

in your smb.conf

and a:

net ads keytab create
net ads keytab add http

on the joined machine would get you a keytab suitable for web auth.

klist -k would then show you what you'd got.

jh

Re: [CentOS] connection speeds between nodes

2011-03-09 Thread m . roth

John Hodrien wrote:
> On Wed, 9 Mar 2011, Ross Walker wrote:
>
>> On Mar 8, 2011, at 12:02 PM, John Hodrien wrote:
>>
>>> The absolute definition of safe here is quite important.  In the event
>>> of a power loss, and a failure of the UPS, quite possibly also
>>> followed by a failure of the RAID battery you'll get data loss, as
>>> some writes won't be committed to disk despite the client thinking
>>> they are.
>>
>> Don't forget about kernel panics and accidentally pulled plugs...
>
> Sure, but the kernel's always in a position to screw you over.  While
> you're being negative, include bad memory on the RAID card, and then
> your life becomes really interesting.

Hey, you forgot a failing connection on the backplane for the RAID

 mark "they were in slot 5 & 6, now they're in 14 and 15"

