Re: [CentOS] Odd INFO "120 seconds" in logs for 2.6.18-194.3.1

2010-06-08 Thread Dianne Yumul
On Jun 8, 2010, at 12:08 PM, Dianne Yumul wrote:

> Hello,
> 
> I'm getting the same thing on one of our servers since upgrading to CentOS 
> 5.5:
> 
> INFO: task pdflush:21249 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> pdflush   D 1EE1  3540 21249 11   21226 (L-TLB)
>   f0b59f24 0046 af2b024f 1ee1 c04ec500  c041e314 000a 
>   f76ae550 af2b2566 1ee1 2317 0001 f76ae65c c180dc44 c198e200 
>    c180e5e4 f0b59fa8 f76ae550 c180dc44 c061bbc0 f76ae550  
> Call Trace:
> [] __next_cpu+0x12/0x21
> [] find_busiest_group+0x177/0x462
> [] schedule+0xbc/0xa55
> [] rwsem_down_read_failed+0x128/0x143
> [] .text.lock.rwsem+0x35/0x3a
> [] sync_supers+0x2f/0xb8
> [] wb_kupdate+0x36/0x10f
> [] pdflush+0x0/0x1a3
> [] pdflush+0x10b/0x1a3
> [] wb_kupdate+0x0/0x10f
> [] kthread+0xc0/0xed
> [] kthread+0x0/0xed
> [] kernel_thread_helper+0x7/0x10
> 
> From the bugs already filed, it seems to happen to many (or any?) processes 
> and some notice hangups and performance drops.  But our system seems okay, 
> probably because it has low traffic and is mostly idle.  But I'll still 
> reboot to the previous kernel version tonight.
> 
> dianne
> 
> On Jun 8, 2010, at 1:04 AM, Ireneusz Piasecki wrote:
> 
>> On 2010-06-08 09:54, Tsuyoshi Nagata wrote:
>>> Hi
>>> (2010/06/08 5:12), Steve Brooks wrote:
>>>> Jun  7 19:45:21 sraid3 kernel:  [] inode_wait+0x0/0xd
>>>> Jun  7 19:45:21 sraid3 kernel:  [] out_of_line_wait_on_bit+0x6c/0x78
>>>> Jun  7 19:45:21 sraid3 kernel:  [] wake_bit_function+0x0/0x23
>>>> Jun  7 19:45:21 sraid3 kernel:  [] ifind_fast+0x6e/0x83
>>> This message comes from ifind_fast() in Linux/fs/inode.c.
>>> The source code is below:
>>> 
>>> Linux/fs/inode.c:
>>> static struct inode *ifind_fast(struct super_block *sb,
>>>                 struct hlist_head *head, unsigned long ino)
>>> {
>>>         struct inode *inode;
>>>
>>>         spin_lock(&inode_lock);                 /* LOCK: this takes  */
>>>         inode = find_inode_fast(sb, head, ino); /* more than 120s    */
>>>         if (inode) {
>>>                 __iget(inode);
>>>                 spin_unlock(&inode_lock);       /* UNLOCK */
>>>                 wait_on_inode(inode);
>>>                 return inode;
>>>         }
>>>         spin_unlock(&inode_lock);
>>>         return NULL;
>>> }
>>> 
>>> I guess your file system has trouble with its i-node (file number) 
>>> resources.
>>> CAUSES:
>>>   Hard disk trouble (bit errors, RAID trouble)
>>>   i-node trouble (overflow, etc.)
>>>   Memory/CPU trouble (&inode_lock)
>>> 
>>> Buying fresh hard disks and rebuilding onto them is the convenient way.
>>> Or memtest86 can find DIMM trouble (or CPU or motherboard trouble).
>>> Or it is an ext4 bug in the 194.3.1 kernel; go back to ext3!
>>> 
>> OK, then I will test all of my CentOS 5.5 32 nodes: CPU, RAM, disks, etc. 
>> This came with the CentOS 5.5 kernel. Before that there were no such 
>> errors/warnings. Red Hat Bugzilla: 
>> https://bugzilla.redhat.com/show_bug.cgi?id=573106
>> 
>> I.Piasecki
>> 
>>> -tsuyoshi
>>> ___
>>> CentOS mailing list
>>> CentOS@centos.org
>>> http://lists.centos.org/mailman/listinfo/centos
>>> 

Oh crud, I did a top post! So sorry, won't happen again.

dianne


Re: [CentOS] Odd INFO "120 seconds" in logs for 2.6.18-194.3.1

2010-06-08 Thread Dianne Yumul
Hello,

I'm getting the same thing on one of our servers since upgrading to CentOS 5.5:

INFO: task pdflush:21249 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
pdflush   D 1EE1  3540 21249 11   21226 (L-TLB)
   f0b59f24 0046 af2b024f 1ee1 c04ec500  c041e314 000a 
   f76ae550 af2b2566 1ee1 2317 0001 f76ae65c c180dc44 c198e200 
    c180e5e4 f0b59fa8 f76ae550 c180dc44 c061bbc0 f76ae550  
Call Trace:
 [] __next_cpu+0x12/0x21
 [] find_busiest_group+0x177/0x462
 [] schedule+0xbc/0xa55
 [] rwsem_down_read_failed+0x128/0x143
 [] .text.lock.rwsem+0x35/0x3a
 [] sync_supers+0x2f/0xb8
 [] wb_kupdate+0x36/0x10f
 [] pdflush+0x0/0x1a3
 [] pdflush+0x10b/0x1a3
 [] wb_kupdate+0x0/0x10f
 [] kthread+0xc0/0xed
 [] kthread+0x0/0xed
 [] kernel_thread_helper+0x7/0x10

From the bugs already filed, it seems to happen to many (or any?) processes 
and some notice hangups and performance drops.  But our system seems okay, 
probably because it has low traffic and is mostly idle.  But I'll still reboot 
to the previous kernel version tonight.
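
For anyone wanting to see how often the warning fires and for which tasks, here is a minimal sketch. It assumes kernel messages land in /var/log/messages (the usual CentOS 5 location); the function name hung_task_summary is made up for illustration.

```shell
#!/bin/sh
# Count hung-task warnings per task name in a syslog-style file.
# $1 = log file to scan (assumed to contain the kernel's INFO lines).
hung_task_summary() {
    grep 'blocked for more than' "$1" |
        sed 's/.*task \([^:]*\):[0-9]* blocked.*/\1/' |
        sort | uniq -c | sort -rn
}

# Current watchdog timeout in seconds (0 means the check is disabled).
if [ -r /proc/sys/kernel/hung_task_timeout_secs ]; then
    cat /proc/sys/kernel/hung_task_timeout_secs
fi

# Assumed log location; skipped quietly if it is not readable.
if [ -r /var/log/messages ]; then
    hung_task_summary /var/log/messages
fi
```

Each output line is a count followed by a task name, most frequent first. Note that writing 0 to hung_task_timeout_secs only silences the message; it does not address whatever is blocking the task.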

dianne

On Jun 8, 2010, at 1:04 AM, Ireneusz Piasecki wrote:

> On 2010-06-08 09:54, Tsuyoshi Nagata wrote:
>> Hi
>> (2010/06/08 5:12), Steve Brooks wrote:
>>> Jun  7 19:45:21 sraid3 kernel:  [] inode_wait+0x0/0xd
>>> Jun  7 19:45:21 sraid3 kernel:  [] out_of_line_wait_on_bit+0x6c/0x78
>>> Jun  7 19:45:21 sraid3 kernel:  [] wake_bit_function+0x0/0x23
>>> Jun  7 19:45:21 sraid3 kernel:  [] ifind_fast+0x6e/0x83
>> This message comes from ifind_fast() in Linux/fs/inode.c.
>> The source code is below:
>> 
>> Linux/fs/inode.c:
>> static struct inode *ifind_fast(struct super_block *sb,
>>                 struct hlist_head *head, unsigned long ino)
>> {
>>         struct inode *inode;
>>
>>         spin_lock(&inode_lock);                 /* LOCK: this takes  */
>>         inode = find_inode_fast(sb, head, ino); /* more than 120s    */
>>         if (inode) {
>>                 __iget(inode);
>>                 spin_unlock(&inode_lock);       /* UNLOCK */
>>                 wait_on_inode(inode);
>>                 return inode;
>>         }
>>         spin_unlock(&inode_lock);
>>         return NULL;
>> }
>> 
>> I guess your file system has trouble with its i-node (file number) 
>> resources.
>> CAUSES:
>>   Hard disk trouble (bit errors, RAID trouble)
>>   i-node trouble (overflow, etc.)
>>   Memory/CPU trouble (&inode_lock)
>> 
>> Buying fresh hard disks and rebuilding onto them is the convenient way.
>> Or memtest86 can find DIMM trouble (or CPU or motherboard trouble).
>> Or it is an ext4 bug in the 194.3.1 kernel; go back to ext3!
>> 
> OK, then I will test all of my CentOS 5.5 32 nodes: CPU, RAM, disks, etc. 
> This came with the CentOS 5.5 kernel. Before that there were no such 
> errors/warnings. Red Hat Bugzilla: 
> https://bugzilla.redhat.com/show_bug.cgi?id=573106
> 
> I.Piasecki
> 
>> -tsuyoshi


Re: [CentOS] New selinux-policy breaks logwatch emails?

2010-01-13 Thread Dianne Yumul
On Jan 8, 2010, at 4:54 PM, James Rankin wrote:

> For anyone else finding this:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=553492
> 

Here's a stupid question: can we install the rpm provided at the link above 
(see comment 12)?  Or is modifying the local policy the correct way?

Thanks,

Dianne


Re: [CentOS] Problem YUM Centos 5

2009-10-30 Thread Dianne Yumul

On Oct 30, 2009, at 8:43 AM, Adriano Frare wrote:


> Dear Friends,
>
> Today I ran the command yum update and received the error below.
>
> === BEGIN =
> Loaded plugins: fastestmirror
> Determining fastest mirrors
> Traceback (most recent call last):
>  File "/usr/bin/yum", line 29, in ?
>    yummain.user_main(sys.argv[1:], exit_code=True)
>  File "/usr/share/yum-cli/yummain.py", line 229, in user_main
>    errcode = main(args)



Whenever yum gets weird I usually run "yum clean all" and everything  
gets better.  CentOS 5.4 is out so if you haven't upgraded yet, check  
the release notes before running your next "yum update."


Thanks,

Dianne


Re: [CentOS] crontab won't work

2009-10-30 Thread Dianne Yumul

On Oct 30, 2009, at 9:29 AM, Niki Kovacs wrote:

> Now here's what I have on the local backup server:
>
> [r...@grossebertha:~] # crontab -l
> 24 17 * * * /usr/local/bin/sauvegarde.sh


You may have checked already, but make sure that crond is running with  
/sbin/service crond status. On my system I get "crond (pid 1706) is running".
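
A minimal sketch of that check, plus a look in the cron log to see whether the job was ever launched. It assumes cron logs to /var/log/cron with lines of the form "(user) CMD (command)"; the function name cron_runs is made up for illustration.

```shell
#!/bin/sh
# Print cron log lines showing that crond launched the given command.
# $1 = command exactly as written in the crontab
# $2 = cron log file (assumed default: /var/log/cron)
cron_runs() {
    grep -F "CMD ($1)" "${2:-/var/log/cron}"
}

# Is the daemon up? pgrep exits non-zero when no crond process exists.
if pgrep -x crond >/dev/null 2>&1; then
    echo "crond is running"
else
    echo "crond is NOT running"
fi
```

For the crontab above, something like `cron_runs /usr/local/bin/sauvegarde.sh` (run as root) would show whether crond started the script at 17:24. No matching lines suggests the job never fired; matching lines suggest the problem is inside the script.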


Thanks,

Dianne___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] OT: Testing and monitoring hardware RAID

2009-05-28 Thread Dianne Yumul
On May 28, 2009, at 2:01 PM, Les Mikesell wrote:

> The thing you need to know about RAIDs at runtime is whether or not one
> or more of the drives have already failed since their job is to hide
> this fact from you but you still need to replace it before you lose the
> other one and your data...

Got it. Thank you Les, I appreciate the explanation.

dianne


Re: [CentOS] OT: Testing and monitoring hardware RAID

2009-05-28 Thread Dianne Yumul

On May 28, 2009, at 1:22 PM, Karanbir Singh wrote:

> What sort of license is this distributed under?


I'm sorry but I'm not sure. It's software that supposedly comes with  
it when you buy it. But the server was assembled by somebody else and  
they probably neglected to include the CD when they shipped the  
server to us. Here's the page with the download link.


http://www.adaptec.com/en-US/speed/raid/aac/sm/asm-linux_v2_12(922)_rpm.htm


I'm going to give it a try but don't know when because I would have  
to bring the server down just to get the serial number.


Thanks Karanbir and everyone for all the help.

dianne


Re: [CentOS] OT: Testing and monitoring hardware RAID

2009-05-28 Thread Dianne Yumul
On May 28, 2009, at 1:00 PM, Phil Schaffner wrote:

> If you feel you must test the ability of the RAID1
> to recover from a failed drive, the power down, remove drive, boot,
> power down, replace drive, boot process will test the ability of the
> system to re-sync the drives.  A more rigorous test would be to zero out
> the partition table of the removed drive, or to use a new blank drive to
> test the recovery; however, what I think Karanbir meant by "silly" is
> that all these tests simply confirm that the RAID is working as
> designed.  If it doesn't, then you should have bought different hardware
> to start with.

Thanks Phil for the clarification, it felt like a light bulb suddenly  
turned on in my head. I should be more concerned with monitoring the  
health of the hard drives then.

dianne


Re: [CentOS] OT: Testing and monitoring hardware RAID

2009-05-28 Thread Dianne Yumul
Thanks Scott and Guy, I appreciate the advice/warning.

I can't download it at the moment because the Adaptec site requires  
the serial number. I have to reboot and get it from the SMOR (Storage  
Manager on ROM) utility. Then I'll have to see if it's worth the  
trouble to install and run :)

Would you have any suggestions on how I would test the RAID 1  
configuration? (Sorry if you already saw the question in the previous  
posts.)

Thanks,

dianne


Re: [CentOS] OT: Testing and monitoring hardware RAID

2009-05-28 Thread Dianne Yumul
Thanks for the help Karanbir and Rainer.

> That sounds quite silly - also going through reboots means downtime,
> isnt that the sort of thing that raid1 was designed to protect against
> anyway.

I guess it would be silly :). I just wanted to make sure it was doing  
what it ought to.

>> Have you looked into Adaptec-supplied management s/w? In pretty much
>> every case with such HBAs the most functional way to look at state and
>> do any management on RAID tends to be from vendor-supplied s/w
>
> Adaptec - I don't know.
> They acquired a lot of companies over the years and thus the quality of
> the management-apps is/was sometimes questionable - and you never knew
> which one was good and which one wasn't as it varied between different
> revisions of the same hardware (which might have a similar name but a
> totally different tech inside...).

I checked the Adaptec site again and found a link to the Adaptec  
Storage Manager.
I don't know how I missed that, thanks for the push in the right  
direction.
Hopefully it works.

Thanks,

dianne


[CentOS] OT: Testing and monitoring hardware RAID

2009-05-27 Thread Dianne Yumul
Hello everyone,

I was hoping to get recommendations on the proper way to test a RAID  
1 hardware configuration.  The controller is an Adaptec 2200S.  I  
found an article, but not for this controller, that suggests powering  
off the system, pulling one of the drives, booting the OS and powering off  
again. Would this be the way to go?

Also any suggestions on what to use to monitor it as well? Can't get  
smartmontools to work.

Thank you and have a nice day.

Dianne


Re: [CentOS] Odd SELinux messages during+after 5.3 upgrade (system_mail_t and postfix_postdrop_t access rpm_var_lib_t)

2009-04-16 Thread Dianne Yumul

Dan Mensom wrote:

> Does anyone know what these accesses are?
>
> Also, on a related note, is it normally best practice to 'setenforce 0'
> during a 5.x upgrade?



I also got these types of messages.  I just did a yum update from  
5.2.  The output from audit2allow is as follows:


allow useradd_t rpm_t:tcp_socket { read write };
allow useradd_t rpm_var_lib_t:file { read write };
allow useradd_t var_lib_t:file write;
allow useradd_t var_t:file read;

I have similar messages for auditctl_t, cupsd_t, groupadd_t, rdisc_t,  
restorecon_t, restorecond_t, semanage_t and setrans_t.  It looks like  
they only happened during the upgrade, and I haven't gotten any  
since.  Just wondering too if these messages are normal (everything  
is working flawlessly) and if there's anything I should've done to  
ensure the upgrade is complete.
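
In case it helps anyone compare notes, here is a small sketch for listing which source domains show up in audit2allow output like the rules above. The function name denied_domains is made up for illustration, and the audit log path in the comment is the usual default rather than something confirmed for every setup.

```shell
#!/bin/sh
# List the distinct source domains named in audit2allow-style "allow"
# rules read from stdin (second field of each "allow ..." line).
denied_domains() {
    awk '$1 == "allow" { print $2 }' | sort -u
}

# Typical use (needs root and the policycoreutils tools):
#   audit2allow -i /var/log/audit/audit.log | denied_domains
```

This only summarizes what was denied; it does not load any policy, so it is safe to run while deciding whether the denials matter.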


Thank you.

Dianne
Wells Gaming Research
(800) 854-6809
(775) 826-3232






Re: [CentOS] Closest Fedora to CentOS 4?

2008-04-18 Thread Dianne Yumul

On Apr 18, 2008, at 10:05 AM, Tony Mountifield wrote:

> Hi, I want to take a SRPM that is available for various versions of Fedora
> and rebuild it on a CentOS 4 system.


I read somewhere that RHEL 4 is based on Fedora Core 3, but  
can't find the article now. Somebody will correct me though if I'm  
wrong.




> Which release of Fedora is the closest to CentOS 4? In other words, which
> would be the best FC to take the SRPM for?
> Cheers
> Tony
> --
> Tony Mountifield
> Work: [EMAIL PROTECTED] - http://www.softins.co.uk
> Play: [EMAIL PROTECTED] - http://tony.mountifield.org


Re: [CentOS] EDAC error

2008-01-28 Thread Dianne Yumul

On Jan 28, 2008, at 2:46 AM, Peter Kjellstrom wrote:
> It's safe to not load EDAC at all, but also safe to leave it loaded and ignore
> the error (I'd actually call it a warning). If the functionality is very
> important to you then you might want to do as EDAC suggests and investigate
> BIOS upgrades (or just have a look at the relevant BIOS settings).
>
> /Peter
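
For reference, a minimal sketch for seeing which EDAC modules are actually loaded before deciding either way. The function name edac_modules is made up for illustration; it simply filters lsmod output.

```shell
#!/bin/sh
# Filter lsmod-style output (read from stdin) down to EDAC module names.
# NR > 1 skips the "Module  Size  Used by" header line.
edac_modules() {
    awk 'NR > 1 && $1 ~ /edac/ { print $1 }'
}

# Live check; prints nothing if no EDAC driver is loaded.
lsmod 2>/dev/null | edac_modules
```

If one did want to stop the driver from loading, a `blacklist` line for the module in a file under /etc/modprobe.d/ is the usual mechanism. The exact module name is an assumption here (e7xxx_edac would be the guess from the "EDAC e7xxx" message), so check the output above first.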


I think I will do as you suggest and just ignore it for now. Thank  
you very much.


Dianne


[CentOS] EDAC error

2008-01-19 Thread Dianne Yumul

Hello,

I upgraded to CentOS 5.1 and everything went smoothly (Thanks for the  
awesome work!). But after rebooting, I get the following error:


EDAC MC: Ver: 2.0.1 Nov 30 2007
EDAC e7xxx: error reporting device not found:vendor 8086 device 0x2541 (broken BIOS?)


I found http://edacbugs.buttersideup.com/show_bug.cgi?id=21 with  
Google but no solution. Is it safe to ignore the error or remove the  
EDAC module? I read their wiki but I'm new to this and I don't want  
to break anything.


Please advise on what to do next.

Thank you so much.

dianne