Re: [SCIENTIFIC-LINUX-USERS] yum error, SL 6.3, file is encrypted or is not a database
I have the same experience. After Pat rebuilt the repo, everything works for me using Larry's suggestion of clean all then update. Thanks for the help. Joe

On 4/23/16 8:00 PM, P. Larry Nelson wrote:
Forgot to say that one should do a 'yum clean all' and then 'yum update' works. - Larry

P. Larry Nelson wrote on 4/23/16 9:49 PM:
Fixed! Thanks Pat! - Larry

Pat Riehecky wrote on 4/23/16 5:52 PM:
Weird, the only change to the repo on April 21 was a security errata that was published just like the rest. I'll rebuild the metadata across the board just to be safe. Pat

On 04/23/2016 05:38 PM, P. Larry Nelson wrote:
I am having the same problem with 3 of my SL5.x systems. One is 5.1 and two are 5.4. All my other SL 5.x systems are 5.5 and have had no problems, nor have I seen this problem with any of my SL6.x systems. The problem seems to be with the sl-security repo. If I do a 'yum update --disablerepo=sl-security' on the 5.1 and 5.4 systems I do NOT get the:

Error: file is encrypted or is not a database

This just started happening with the early morning auto yum update on 4/22/16. - Larry

Joseph Areeda wrote on 4/23/16 4:35 PM:
I see people are having the same problem with some of the version 7 repos. But I don't understand how to figure out which repo is causing the problem. Are people disabling all the repos and enabling one at a time? Thanks, Joe

On 4/23/16 1:53 PM, Joseph Areeda wrote:
We started getting this error a couple of days ago on a machine that has been auto-updating for years. I would have assumed it was corruption of a local database, but it happened on two systems simultaneously. Googling for that error message produces nothing on yum but several hits on SQLite. I'd appreciate any insight into what the error means and how to track down exactly which repo or file on my system is causing the problem. Below is what I see; yum update also produces the same error message.
Thanks, Joe

[root@mavraki yum.repos.d]# yum clean all
Loaded plugins: fastestmirror, refresh-packagekit, security
Cleaning repos: CONDOR-stable VDT-Production-sl6 elrepo lscsoft-epel lscsoft-pegasus lscsoft-production sl sl-security
Cleaning up Everything
Cleaning up list of fastest mirrors
[root@mavraki yum.repos.d]# yum repolist
Loaded plugins: fastestmirror, refresh-packagekit, security
Determining fastest mirrors
 * elrepo: elrepo.org
 * sl: ftp1.scientificlinux.org
 * sl-security: ftp1.scientificlinux.org
CONDOR-stable                    | 2.9 kB   00:00
CONDOR-stable/primary_db         | 427 kB   00:00
VDT-Production-sl6               | 1.3 kB   00:00
VDT-Production-sl6/primary       |  35 kB   00:00
VDT-Production-sl6                           11/11
elrepo                           | 2.9 kB   00:00
elrepo/primary_db                | 732 kB   00:00
lscsoft-epel                     | 2.7 kB   00:00
lscsoft-epel/primary_db          | 4.2 MB   00:02
lscsoft-pegasus                  | 2.6 kB   00:00
lscsoft-pegasus/primary_db       | 5.8 kB   00:00
lscsoft-production               | 2.9 kB   00:00
lscsoft-production/primary_db    | 301 kB   00:00
sl                               | 3.5 kB   00:00
sl/primary_db                    | 4.2 MB   00:03
sl-security                      | 3.0 kB   00:00
sl-security/primary_db           |  12 MB   00:06
Error: file is encrypted or is not a database
[root@mavraki yum.repos.d]#
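One way to track down which repo's metadata is corrupt, when the server-side rebuild isn't an option, is to bisect from the client: disable everything and re-enable one repo at a time. A rough sketch, using the repo IDs from the repolist above (adjust the list to match `yum repolist all` on your own box):

```shell
#!/bin/bash
# Bisect which repo delivers the corrupt sqlite metadata by enabling
# one repo at a time.  Repo IDs below are the ones from the repolist
# output above.
repos="CONDOR-stable VDT-Production-sl6 elrepo lscsoft-epel \
lscsoft-pegasus lscsoft-production sl sl-security"

for repo in $repos; do
    yum clean all >/dev/null 2>&1
    # makecache downloads and parses the metadata; if it fails with
    # "file is encrypted or is not a database", this repo is the culprit.
    if ! yum --disablerepo='*' --enablerepo="$repo" makecache >/dev/null 2>&1; then
        echo "bad metadata: $repo"
    fi
done
```

The `yum clean all` between passes keeps a previously cached good database from masking a bad download.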
Re: yum error, SL 6.3, file is encrypted or is not a database
I see people are having the same problem with some of the version 7 repos. But I don't understand how to figure out which repo is causing the problem. Are people disabling all the repos and enabling one at a time? Thanks, Joe

On 4/23/16 1:53 PM, Joseph Areeda wrote:
We started getting this error a couple of days ago on a machine that has been auto-updating for years. I would have assumed it was corruption of a local database, but it happened on two systems simultaneously. Googling for that error message produces nothing on yum but several hits on SQLite. I'd appreciate any insight into what the error means and how to track down exactly which repo or file on my system is causing the problem. Below is what I see; yum update also produces the same error message. Thanks, Joe

[root@mavraki yum.repos.d]# yum clean all
Loaded plugins: fastestmirror, refresh-packagekit, security
Cleaning repos: CONDOR-stable VDT-Production-sl6 elrepo lscsoft-epel lscsoft-pegasus lscsoft-production sl sl-security
Cleaning up Everything
Cleaning up list of fastest mirrors
[root@mavraki yum.repos.d]# yum repolist
Loaded plugins: fastestmirror, refresh-packagekit, security
Determining fastest mirrors
 * elrepo: elrepo.org
 * sl: ftp1.scientificlinux.org
 * sl-security: ftp1.scientificlinux.org
CONDOR-stable                    | 2.9 kB   00:00
CONDOR-stable/primary_db         | 427 kB   00:00
VDT-Production-sl6               | 1.3 kB   00:00
VDT-Production-sl6/primary       |  35 kB   00:00
VDT-Production-sl6                           11/11
elrepo                           | 2.9 kB   00:00
elrepo/primary_db                | 732 kB   00:00
lscsoft-epel                     | 2.7 kB   00:00
lscsoft-epel/primary_db          | 4.2 MB   00:02
lscsoft-pegasus                  | 2.6 kB   00:00
lscsoft-pegasus/primary_db       | 5.8 kB   00:00
lscsoft-production               | 2.9 kB   00:00
lscsoft-production/primary_db    | 301 kB   00:00
sl                               | 3.5 kB   00:00
sl/primary_db                    | 4.2 MB   00:03
sl-security                      | 3.0 kB   00:00
sl-security/primary_db           |  12 MB   00:06
Error: file is encrypted or is not a database
[root@mavraki yum.repos.d]#
yum error, SL 6.3, file is encrypted or is not a database
We started getting this error a couple of days ago on a machine that has been auto-updating for years. I would have assumed it was corruption of a local database, but it happened on two systems simultaneously. Googling for that error message produces nothing on yum but several hits on SQLite. I'd appreciate any insight into what the error means and how to track down exactly which repo or file on my system is causing the problem. Below is what I see; yum update also produces the same error message. Thanks, Joe

[root@mavraki yum.repos.d]# yum clean all
Loaded plugins: fastestmirror, refresh-packagekit, security
Cleaning repos: CONDOR-stable VDT-Production-sl6 elrepo lscsoft-epel lscsoft-pegasus lscsoft-production sl sl-security
Cleaning up Everything
Cleaning up list of fastest mirrors
[root@mavraki yum.repos.d]# yum repolist
Loaded plugins: fastestmirror, refresh-packagekit, security
Determining fastest mirrors
 * elrepo: elrepo.org
 * sl: ftp1.scientificlinux.org
 * sl-security: ftp1.scientificlinux.org
CONDOR-stable                    | 2.9 kB   00:00
CONDOR-stable/primary_db         | 427 kB   00:00
VDT-Production-sl6               | 1.3 kB   00:00
VDT-Production-sl6/primary       |  35 kB   00:00
VDT-Production-sl6                           11/11
elrepo                           | 2.9 kB   00:00
elrepo/primary_db                | 732 kB   00:00
lscsoft-epel                     | 2.7 kB   00:00
lscsoft-epel/primary_db          | 4.2 MB   00:02
lscsoft-pegasus                  | 2.6 kB   00:00
lscsoft-pegasus/primary_db       | 5.8 kB   00:00
lscsoft-production               | 2.9 kB   00:00
lscsoft-production/primary_db    | 301 kB   00:00
sl                               | 3.5 kB   00:00
sl/primary_db                    | 4.2 MB   00:03
sl-security                      | 3.0 kB   00:00
sl-security/primary_db           |  12 MB   00:06
Error: file is encrypted or is not a database
[root@mavraki yum.repos.d]#
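The SQLite hits are on point: yum stores repo metadata as SQLite databases under /var/cache/yum, and this error is SQLite's complaint about a file that isn't a valid database (often a truncated or mis-served download). A hedged check of the cached files — the exact cache layout varies by release and architecture, so both path patterns below are guesses to adjust:

```shell
# Inspect the cached repo databases; a healthy one reports something
# like "SQLite 3.x database".  Cache layout is /var/cache/yum/<repo>/
# on older releases and /var/cache/yum/<arch>/<ver>/<repo>/ on newer.
for db in /var/cache/yum/*/primary_db* /var/cache/yum/*/*/*/primary_db*; do
    [ -e "$db" ] || continue
    echo "== $db"
    file "$db"
done
```

Any file that `file` reports as gzip data, HTML, or just "data" is a download that never got decompressed or validated properly.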
SL7-RC1 Installer bug reporting
I downloaded SL-7-x86_64-DVD.iso, verified the sha256 hash, and started a test-and-install on the VM that I've been using for the beta releases. After specifying the language it reported a "bad file descriptor" error. I want to try a fresh VM before I discuss that problem, but following the bug reporting options on the error message I have a screen that looks like: The question right now is: do we really want to report SL7 issues to Red Hat Customer Support? It's fine with me, but I wonder if I found a deep-down reference to the upstream provider that has been missed. Joe
Re: [SL-Users] Re: Scientific Linux 7 ALPHA - Updating
On 07/05/2014 11:02 AM, Nico Kadel-Garcia wrote:
On Sat, Jul 5, 2014 at 11:23 AM, Joseph Areeda wrote:
On 07/05/2014 08:03 AM, Nico Kadel-Garcia wrote:
On Sat, Jul 5, 2014 at 10:43 AM, Joseph Areeda wrote:
On 07/04/2014 09:12 PM, Nico Kadel-Garcia wrote:

Set up a local rsync mirror from any of the locally fast upstream repositories, and slap a web server in front of it. Use the 'netinstall' from the local mirror, or even a PXE setup, not the full DVD, to point to the local mirror. Ideally, set up a kickstart file too on the local mirror, ideally tied to the PXE setup. Then just use the PXE or netinstall ISO to do a kickstarted, network-based re-install. That way, you don't even have to download the DVD images, which are quite bulky. If you like, I'll post my download scripts.

Thanks Nico, I would like to see your scripts.

Let me put them up at github.com, and give me a day.

What I will be testing is what changes are needed to our packaged and unpackaged applications. Excuse the basic questions but I'm more of a developer than a sysadmin. If I understand your recommendations, the system including user accounts will be rebuilt on each boot. So if I want to work through multiple reboots I could put my home directory on an NFS mount and end up with fresh software but the same environment. Correct?

That's one workable way, yes. If PXE and kickstart can be set up correctly, the kickstart can even set up your NFS-mounted home directory and local authentication and sudo privileges, and the PXE can allow a "rebuild me from scratch" option at boot time that will select and automatically use the relevant kickstart file. It can even be set to auto-rebuild every time, if you want. I've done this a lot for hardware testing, and for building clusters. One of the limitations is local bandwidth: if you're rebuilding a bunch of times from the upstream SL 7 Alpha website, well, that's rude. It's hundreds of megabytes, possibly even gigabytes, of bandwidth.
Set up a local mirror to pull from instead, and keep *that* updated.

Thanks. I'm sure others in our collaboration will be doing more extensive tests. Best, Joe

Thanks again, I appreciate the help. I've set up kickstart once, successfully, and I'm ready to do battle with PXE. Just one quick question for now; I'm sure more will follow later. The mirror I've started is rsync://mirror.mcs.anl.gov/scientific-linux/7rolling/x86_64/os. I think that's all I need to maintain, correct? I'm in Los Angeles. Best, Joe
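A minimal sketch of the mirror-plus-web-server setup Nico describes — this is not his script; the destination path and the cron cadence are assumptions, and the rsync URL is the ANL one mentioned above:

```shell
#!/bin/bash
# Pull the SL7 rolling tree from a public mirror and serve it locally
# over an existing web server.
MIRROR=rsync://mirror.mcs.anl.gov/scientific-linux/7rolling/x86_64/os
DEST=/var/www/html/sl/7rolling/x86_64/os   # assumed DocumentRoot layout

mkdir -p "$DEST"
# -a archive, -S handle sparse files, -H preserve hard links,
# --delete so removed upstream packages disappear locally too
rsync -aSH --delete "$MIRROR/" "$DEST/"
# Run this from cron nightly, then point the netinstall/kickstart URL at
#   http://yourserver/sl/7rolling/x86_64/os
```

As far as I can tell, the os/ tree for your architecture is all a netinstall needs; mirror the updates trees as well once they exist, if you want those served locally too.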
Re: [SL-Users] Re: Scientific Linux 7 ALPHA - Updating
On 07/05/2014 08:03 AM, Nico Kadel-Garcia wrote:
On Sat, Jul 5, 2014 at 10:43 AM, Joseph Areeda wrote:
On 07/04/2014 09:12 PM, Nico Kadel-Garcia wrote:
On Fri, Jul 4, 2014 at 10:11 AM, Akemi Yagi wrote:

All, Please refrain from posting anything other than testing results of the released SL packages in this thread. Let's keep this one free of trolls. Akemi

The SL 7 Alpha is running well in PC Virtualbox (my virtualization toolsuite of choice). The add-on tools for more graceful focus switching and for mouse management are not yet installable, but I expect that to be fixed upstream by PC Virtualbox, now that RHEL 7 is in production.

I can report it also runs well in an SL6 Virtualbox VM. I saw the firstboot problem report. My question is how often should we download and reinstall from scratch vs. using yum update (or autoupdate or cron)? I assume the only reason to download the DVD image is to test changes to the install procedure, and yum update will end up with the same installation. I just want to confirm how best to help the process.

If you're a weasel and want to save speed and bandwidth: Set up a local rsync mirror from any of the locally fast upstream repositories, and slap a web server in front of it. Use the 'netinstall' from the local mirror, or even a PXE setup, not the full DVD, to point to the local mirror. Ideally, set up a kickstart file too on the local mirror, ideally tied to the PXE setup. Then just use the PXE or netinstall ISO to do a kickstarted, network-based re-install. That way, you don't even have to download the DVD images, which are quite bulky. If you like, I'll post my download scripts.

Thanks Nico, I would like to see your scripts. What I will be testing is what changes are needed to our packaged and unpackaged applications. Excuse the basic questions but I'm more of a developer than a sysadmin. If I understand your recommendations, the system including user accounts will be rebuilt on each boot.
So if I want to work through multiple reboots I could put my home directory on an NFS mount and end up with fresh software but the same environment. Correct? Thanks. I'm sure others in our collaboration will be doing more extensive tests. Best, Joe
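A hedged sketch of the rebuild-every-time setup being discussed: a kickstart file that reinstalls from a local mirror and leaves /home on NFS so the environment survives each rebuild. Every hostname, path, and password here is a placeholder, and the directive set is the common RHEL-family one, not a tested SL7-alpha config:

```shell
# Write a minimal kickstart file; serve it next to the mirror and pass
# ks=http://... on the installer command line (or via the PXE config).
cat > ks.cfg <<'EOF'
install
url --url=http://mirror.example.edu/sl/7rolling/x86_64/os
lang en_US.UTF-8
keyboard us
timezone America/Los_Angeles
rootpw --plaintext changeme
clearpart --all --initlabel
autopart
reboot
%post
# keep home directories on the file server so reinstalls are painless
echo 'fileserver:/export/home  /home  nfs  defaults  0 0' >> /etc/fstab
%end
EOF
```

With that in place, "rebuild me from scratch" is just a PXE menu entry pointing at this file.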
Re: [SL-Users] Re: Scientific Linux 7 ALPHA - Updating
On 07/04/2014 09:12 PM, Nico Kadel-Garcia wrote:
On Fri, Jul 4, 2014 at 10:11 AM, Akemi Yagi wrote:

All, Please refrain from posting anything other than testing results of the released SL packages in this thread. Let's keep this one free of trolls. Akemi

The SL 7 Alpha is running well in PC Virtualbox (my virtualization toolsuite of choice). The add-on tools for more graceful focus switching and for mouse management are not yet installable, but I expect that to be fixed upstream by PC Virtualbox, now that RHEL 7 is in production.

I can report it also runs well in an SL6 Virtualbox VM. I saw the firstboot problem report. My question is how often should we download and reinstall from scratch vs. using yum update (or autoupdate or cron)? I assume the only reason to download the DVD image is to test changes to the install procedure, and yum update will end up with the same installation. I just want to confirm how best to help the process. Joe
Problem with OpenJDK getting system time zone
Hi, I have a strange problem. I have a little Java app that displays an analog clock with a few weird additions like GPS time. It's been working for years. A recent update, or the last switch to DST, has it returning the time zone as GMT-08:00 instead of PDT. If I use Oracle's jdk1.7.0_45 it works fine, but if I use openjdk-1.7.0.51 from the sl-security repo it does not. Does anyone know how OpenJDK gets its timezone information? Thanks, Joe
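I can't claim certainty about OpenJDK's exact lookup order, but on RHEL-family systems the JDK generally tries the TZ environment variable, then the ZONE= line in /etc/sysconfig/clock, then matches /etc/localtime against the zoneinfo files; if none of those resolve to a named zone it falls back to a raw offset like GMT-08:00. Some hedged things to check (clock.jar is a stand-in for your app):

```shell
# Where might the JVM be getting its zone from?
echo "TZ=${TZ:-unset}"
cat /etc/sysconfig/clock             # want e.g. ZONE="America/Los_Angeles"
ls -l /etc/localtime                 # copied file or symlink?
cmp /etc/localtime /usr/share/zoneinfo/America/Los_Angeles && echo same

# Workaround while investigating: pin the zone on the java command line.
java -Duser.timezone=America/Los_Angeles -jar clock.jar
```

A copied (not symlinked) /etc/localtime combined with a missing or stale ZONE= line is one plausible way an update could leave the JVM with only the raw offset.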
Re: Requiered steps to configure samba
Hi Pritam, I think this article would be a good place to start: https://www.linux.com/learn/tutorials/296391-easy-samba-setup Joe

On 09/21/2013 11:32 PM, Pritam Khedekar wrote:
> Hi All,
>
> I want to configure Samba. I have a Windows machine with a shared
> directory that I want to access from my SL 6.4 (Fermi) system. Please
> reply with step-by-step instructions. I am new to Linux but I know the
> power of Linux...
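Since the goal here is the client direction (a Windows share mounted on SL), the short version doesn't need a Samba server at all, just the CIFS client tools. A hedged sketch; the hostname, share name, and username are placeholders:

```shell
# On the SL 6.4 box, as root:
yum install cifs-utils samba-client        # mount.cifs and smbclient
smbclient -L //winbox -U pritam            # list the shares Windows offers
mkdir -p /mnt/winshare
mount -t cifs //winbox/shared /mnt/winshare -o username=pritam
# To make it permanent, add a line like this to /etc/fstab (a
# credentials file keeps the password out of fstab):
#   //winbox/shared  /mnt/winshare  cifs  credentials=/root/.smbcred  0 0
```

Setting up Samba as a server, to share SL directories to Windows, is the longer job the linked tutorial covers.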
Re: slow loading browser homepage
On 09/15/2013 07:15 AM, sascha.fo...@safo.at wrote:
> On Sat, 14 Sep 2013 20:46:55 -0700
> Todd And Margo Chester wrote:
>> On 09/14/2013 05:34 PM, Tom Rosmond wrote:
>>> T.
>>>
>>> No luck. Making your suggested changes didn't solve the problem. I
>>> think it is because for some reason 'resolv.conf' didn't recreate, even
>>> after a reboot. So without it there was no nameservice and nothing
>>> worked.
>> I forgot to tell you to restart your networking daemon. Sorry.
>>
>>> I put the original back in place and that restored nameservice,
>>> but at the original slowdown. I assume this is because of the DNS
>>> mismatch between 'ifcfg-eth0' and 'resolv.conf'? I tried putting the
>>> Google DNS values in 'resolv.conf' and restarting 'eth0', and now the
>>> file was recreated, but with my own router and ISP nameservice
>>> addresses. The 'dhclient' daemon seems to insist on that.
>>>
>>> This problem is not unique to me. I see similar threads in various
>>> Linux forums (Ubuntu, Red Hat, etc.) complaining about slow nameservice
>>> compared to Windows. And no clear resolution of the problem.
>> You have probably gone as far as you can go.
> The easiest solution would be to follow Joseph Areeda's advice and
> check the router's DHCP server configuration.
>
> As we can see in the dhclient-eth0.leases file, the router sends the
> following DNS servers and default gateway:
> Primary: 192.168.0.1
> Secondary: 216.177.225.9
> Gateway: 192.168.1.1
>
> Now you should already see what's wrong here. Since this is a home
> router it will probably put itself as the primary DNS server in the
> network, but as you can see it points to another IP address (192.168.0.1).
>
> Is there actually a DNS server running at that IP? (I guess not.)
>
> Now to the question of why it works in Windows XP and not in Scientific
> Linux. That's because of the difference in how the resolvers work.
> Windows XP sends a request to all configured DNS servers and just takes
> the first response (the secondary DNS answers in that case).
>
> Scientific Linux sends a request to the primary DNS server, waits for 5
> seconds, and if there is no answer it will move on to the secondary DNS.
> This is why every lookup takes an additional 5 seconds in your case.
>
> Regards,
> Sascha

One other option that hasn't been discussed is to use a fixed IP address and specify everything manually. It's fairly straightforward with the Network Manager GUI or the /etc/sysconfig/network-scripts configuration files. The reasons this may be a viable option are:
* You want to use the SL system as a server of some sort for other systems on your LAN.
* You're uncomfortable messing with the router's DHCP settings.
Typically these routers will allocate a small block of IP addresses for DHCP, somewhere between 20 and 50. You must manage the rest of the address space and be sure to only assign each IP to one device. I've also found that printers work much better with fixed IPs. Joe
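To flesh out the fixed-IP option: on SL the per-interface files live in /etc/sysconfig/network-scripts. A sketch with placeholder addresses; pick an IPADDR outside the router's DHCP pool, per the caveat above:

```shell
# Example static config; in real use this is written as root to
# /etc/sysconfig/network-scripts/ifcfg-eth0.
cat > ifcfg-eth0 <<'EOF'
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.168.1.50
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
DNS1=8.8.8.8
DNS2=8.8.4.4
EOF
# Then: service network restart
```

If you stay on DHCP instead, adding "options timeout:2 rotate" to /etc/resolv.conf shortens the 5-second wait on a dead primary nameserver, though as noted above dhclient tends to rewrite that file.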
Re: [SCIENTIFIC-LINUX-USERS] Metadata file does not match checksum
On 07/10/2013 07:54 AM, Pat Riehecky wrote:
yum clean expire-cache

Thanks Pat, I did try a 'yum clean all', which I think includes expire-cache. Anyway, trying that explicitly does not seem to fix it:

joe@george:~$ sudo yum clean expire-cache
Loaded plugins: fastestmirror, refresh-packagekit, security
Cleaning repos: CONDOR-stable LDG_EPEL6 LDG_SL6.1-base LDG_SL6.1-securityupdates VDT-Production-sl6
              : elrepo google-chrome lscsoft lscsoft-testing rpmforge sl sl-livecd-extra sl-security
              : sl6x sl6x-security
17 metadata files removed
joe@george:~$ yum info swig2
Loaded plugins: fastestmirror, refresh-packagekit, security
Loading mirror speeds from cached hostfile
 * elrepo: elrepo.org
 * rpmforge: mirror.hmc.edu
 * sl: ftp1.scientificlinux.org
 * sl-security: ftp1.scientificlinux.org
 * sl6x: ftp1.scientificlinux.org
 * sl6x-security: ftp1.scientificlinux.org
sl-livecd-extra/primary        |  28 kB  00:00
http://www.livecd.ethz.ch/download/sl-livecd-extra/6.4/x86_64/repodata/primary.xml.gz: [Errno -1] Metadata file does not match checksum
Trying other mirror.
Error: failure: repodata/primary.xml.gz from sl-livecd-extra: [Errno 256] No more mirrors to try.

I wonder if I should remove that repo from the list. I'm not sure what packages are in it though. Joe
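On the "should I remove that repo" question: the yum-utils tools can show what actually came from it before you decide. A hedged sketch — the .repo filename is a guess, so check /etc/yum.repos.d to see what it's really called:

```shell
yum install yum-utils                      # provides yumdb and repoquery
yumdb search from_repo sl-livecd-extra     # installed packages from that repo
repoquery --repoid=sl-livecd-extra -a      # everything the repo offers
# If nothing important shows up, disable rather than delete it:
sed -i 's/enabled=1/enabled=0/' /etc/yum.repos.d/sl-livecd-extra.repo
yum clean metadata
```

Disabling keeps the repo definition around in case a package from it needs updating later.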
Metadata file does not match checksum
Greetings, I'm getting an error from yum (see below) on a system installed from LiveCD. I remember reading the solution to this but can't seem to find that email or website. My memory and search abilities seem to be fading. Anybody know the mystic incantation off the top of their head? Thanks, Joe

~$ yum search swig
Loaded plugins: fastestmirror, refresh-packagekit, security
Loading mirror speeds from cached hostfile
 * elrepo: elrepo.org
 * rpmforge: mirror.hmc.edu
 * sl: ftp1.scientificlinux.org
 * sl-security: ftp1.scientificlinux.org
 * sl6x: ftp1.scientificlinux.org
 * sl6x-security: ftp1.scientificlinux.org
sl-livecd-extra/primary        |  28 kB  00:00
http://www.livecd.ethz.ch/download/sl-livecd-extra/6.4/x86_64/repodata/primary.xml.gz: [Errno -1] Metadata file does not match checksum
Trying other mirror.
Error: failure: repodata/primary.xml.gz from sl-livecd-extra: [Errno 256] No more mirrors to try.

J
Re: LO destroyed envelopes
This may be obvious and obnoxious, but I've dealt with similar printer problems by printing to a PDF, then printing the PDF. The only good thing to say about it is you don't have to look at the PPDs. Joe

On 06/25/2013 02:06 PM, Mark Stodola wrote:
On 06/25/2013 01:21 PM, Todd And Margo Chester wrote:
On 06/25/2013 10:40 AM, Mark Stodola wrote:
On 06/25/2013 12:07 PM, Todd And Margo Chester wrote:

Hi All, Can you guys tell if this is finger pointing or if this really is not a Libre Office problem? https://bugs.freedesktop.org/show_bug.cgi?id=42327 Many thanks, -T

Having had my fair share of odd behavior with CUPS, I would lean toward that as the culprit. There are several different filters that get used depending on the mime type provided. For instance, on SL 5, texttopaps did very bad things, causing me to force texttops for text/plain processing. Some of these filters have been known to double-rotate, which might be what you are experiencing. It might also be worth skimming through the PPD for the printer to see if the paper definitions or orientation are wrong. If the PPD contains a page orientation, and the program specifies a rotation, this can also lead to incorrect orientation. -Mark

Hi Mark, The frustrating thing is that I have no problems printing from anything else. I can print an envelope just fine from Wine/Word Pro, which also uses CUPS. Other programs, portrait or landscape, print just as they are told. Anyway, I opened up the following with Red Hat: https://bugzilla.redhat.com/show_bug.cgi?id=977976 Maybe, someday, I will be able to print an envelope through LO. Thank you for your response. -T

If you have the time and patience, you can look into turning up the debug/log level of CUPS to see what is going on between the programs. You can also intercept the print queue contents by leaving the spool enabled but the printer disabled. With enough poking, you should be able to pin down where the problem exists.
I am guessing (not sure) that most word processors generate PostScript and send it to CUPS. You can compare the PostScript generated in the spool from each of your programs to see how they differ. It may be worthwhile to open the PostScript in a text editor to see if there is other metadata that could be affecting the outcome as well. -Mark
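Mark's two suggestions, sketched as commands. `cupsctl --debug-logging` exists on reasonably recent CUPS; on older releases set LogLevel in cupsd.conf by hand instead. The printer name and file name are placeholders:

```shell
# 1. Turn up the logging.
cupsctl --debug-logging            # or set "LogLevel debug" in /etc/cups/cupsd.conf
tail -f /var/log/cups/error_log    # watch which filters run, and their arguments

# 2. Stop the printer but keep the spool, so the job data sticks around.
cupsdisable MyPrinter              # queue still accepts jobs, nothing prints
lp -d MyPrinter envelope.pdf
ls -l /var/spool/cups/             # the d* files hold the spooled job data
# Compare the spooled output from LO vs. Wine/Word Pro in a text editor;
# cupsenable MyPrinter when done.
```

The error_log at debug level names each filter in the chain, which is where a double rotation would show up.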
Re: Help finding a hardware problem (I think)
I can't thank you all enough for bearing with me as I stumble my way through this. I now understand the logic behind running memtest uninterrupted for a long period (>24 hr) and will do that.

I have to take back my comment about kmod-nvidia. I repeatedly messed up /etc/selinux/config trying to disable it, and that is what was causing the kernel panics. I suppose that's a sign I'm not paying enough attention.

The purpose of running from LiveCD is not necessarily to find a hardware problem but to remove the hard disks and the installed software from the equation. The idea being, IF I got one of these rare and random failures while running that way, I could rule out insidious package conflicts, mangled configurations, and the system disk as the cause.

As far as finding a computer repair professional whom I would go to for a problem like this, well, all I can say is I've been living in this town for 32 years working in computing. I do have an outstanding doctor, a great car mechanic, an exceptional plumber... but I haven't found a computer guy better than me at this. That is not to imply that I am any good at it.

I am now up and running with SL6.4 on a spinning disk (to remove the SSD and a bunch of useful and needed packages from the equation). I'll try to get some work done today and see if it crashes. My next step is to swap memory and GPU with another box and see if the problem follows. I hope I'm not posting too much useless (to others) information to the list. Joe

On 04/24/2013 09:10 AM, Yasha Karant wrote:
A small comment: stress testing is cumulative only if the underlying system has no recovery mechanism. (An understanding of this in detail requires non-equilibrium statistical mechanics but can be summarized with non-equilibrium "thermodynamics".) My experience with failing electronics and magnetics -- depending upon the exact failure mode -- is that non-interrupted stress testing is better than interrupted in terms of finding failures.
A simple example: suppose a failure mode is temperature dependent, and temperature depends upon the amount of work being done. An interrupted but cumulative stress test might never reach the "critical" temperature, whereas a continued stress test might. Yasha Karant

On 04/24/2013 08:03 AM, Joseph Areeda wrote:
Thanks for the tips Konstantin, I assume that your recommendation for 24 hrs of memtest is cumulative and I can probably see the same results starting it each night when I quit for the day. When I mentioned SMART I was talking about the self-tests, not the status that comes up. I've also copied large files around and checked their md5sums.

I played with LiveCD for 4 or 5 hours today; much of it was trying to install it on a different spinning hard drive. I did see one time when the SSD was shown in the disk utility but all the partitions were zero length. That's where my root directory used to be.

I also found that the nvidia drivers in ELREPO don't seem to work with 6.4. I seem to be able to run fine (at least for a while) unless I install kmod-nvidia; then I get a kernel panic on the next reboot (3 times until I tracked it down). It says something like "not syncing attempt xxx (can't read my writing) PID 1 comm init not tainted 2.6.32.258.2.1". That's another problem, I think.

Right now I suspect, not necessarily in order:
* Bad SSD. Run time is reported as 1.8 years. I did have /usr /usr/local /tmp swap and /home on spinning media but...
* Bad memory: still a good possibility
* Some insidious incompatibility with all the packages from multiple repos. I really hope it's not that; I don't load much I don't need.

And as for finding a real computer repairman, let me know if you have one in Los Angeles. This is similar to a problem I had with an iMac. The geniuses at the store took three trips to convince them something was wrong, and that was after about an hour each time with the phone support people.
That one turned out to be a flaky memory DIMM that passed all the quick diagnostics. Oh well, the saga continues. It's nice to have a group to go to for ideas. Thank you all. Joe

On 04/23/2013 04:20 PM, Konstantin Olchanski wrote:
On Tue, Apr 23, 2013 at 11:44:22AM -0700, Joseph Areeda wrote:
I'm having this strange behavior that I think is a hardware problem ...
* System freezes, mouse and keyboard dead, sshd unresponsive sometimes

First action is to run memtest86 (Q: which one? Google finds several. A: all of them). Run memtest86 for 24 hours at least; if it reports memory errors, hangs, freezes, or the machine turns off, you definitely have a hardware problem. Suspect parts are in this order: RAM, power supply, CPU socket (bent pins), mobo, CPU. If memtest86 runs fine for 24 hours and more, there *still* could be a hardware problem. (memtest86 does not test the video, the disk, the network, and the USB interfaces.)
Re: Help finding a hardware problem (I think)
Thanks for the tips Konstantin, I assume that your recommendation for 24 hrs of memtest is cumulative and I can probably see the same results starting it each night when I quit for the day. When I mentioned SMART I was talking about the self-tests, not the status that comes up. I've also copied large files around and checked their md5sums.

I played with LiveCD for 4 or 5 hours today; much of it was trying to install it on a different spinning hard drive. I did see one time when the SSD was shown in the disk utility but all the partitions were zero length. That's where my root directory used to be.

I also found that the nvidia drivers in ELREPO don't seem to work with 6.4. I seem to be able to run fine (at least for a while) unless I install kmod-nvidia; then I get a kernel panic on the next reboot (3 times until I tracked it down). It says something like "not syncing attempt xxx (can't read my writing) PID 1 comm init not tainted 2.6.32.258.2.1". That's another problem, I think.

Right now I suspect, not necessarily in order:
* Bad SSD. Run time is reported as 1.8 years. I did have /usr /usr/local /tmp swap and /home on spinning media but...
* Bad memory: still a good possibility
* Some insidious incompatibility with all the packages from multiple repos. I really hope it's not that; I don't load much I don't need.

And as for finding a real computer repairman, let me know if you have one in Los Angeles. This is similar to a problem I had with an iMac. The geniuses at the store took three trips to convince them something was wrong, and that was after about an hour each time with the phone support people. That one turned out to be a flaky memory DIMM that passed all the quick diagnostics. Oh well, the saga continues. It's nice to have a group to go to for ideas. Thank you all. Joe

On 04/23/2013 04:20 PM, Konstantin Olchanski wrote:
On Tue, Apr 23, 2013 at 11:44:22AM -0700, Joseph Areeda wrote:
I'm having this strange behavior that I think is a hardware problem ...
* System freezes, mouse and keyboard dead, sshd unresponsive sometimes

First action is to run memtest86 (Q: which one? Google finds several. A: all of them). Run memtest86 for 24 hours at least; if it reports memory errors, hangs, freezes, or the machine turns off, you definitely have a hardware problem. Suspect parts are in this order: RAM, power supply, CPU socket (bent pins), mobo, CPU. If memtest86 runs fine for 24 hours and more, there *still* could be a hardware problem. (memtest86 does not test the video, the disk, the network, and the USB interfaces.)

disk utility show ... SMART [is] fine.

The SMART "health report" is useless. I had dead disks report "SMART OK" and perfectly functional disks report "SMART Failure, replace your disk now". This is free advice. For advice that would actually get your computer working again, you would want to hire a proper computer repairman.
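The "copy large files around and check their md5sums" test mentioned earlier in the thread can be scripted to run unattended alongside memtest. A sketch; in real use, point src at a multi-gigabyte file on the suspect disk and raise the pass count well beyond 3:

```shell
#!/bin/bash
# Repeatedly copy a file and verify the checksum after each pass;
# a mismatch points at disk, controller, or RAM corruption.
set -e
src=$(mktemp)   # placeholder; use a big file on the suspect disk
dst=$(mktemp)
dd if=/dev/urandom of="$src" bs=1024 count=1024 2>/dev/null

orig=$(md5sum "$src" | awk '{print $1}')
passes=3        # raise this for a real soak test
for i in $(seq 1 $passes); do
    cp "$src" "$dst"
    got=$(md5sum "$dst" | awk '{print $1}')
    if [ "$got" != "$orig" ]; then
        echo "checksum mismatch on pass $i"
        exit 1
    fi
done
echo "all $passes passes OK"
rm -f "$src" "$dst"
```

Logging the output to a file gives a timeline to correlate against the random freezes.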
Re: how to find internet dead spots
Hi Todd, If you mean the DSL goes out for a while, what I've done is pretty low tech but works for reporting downtime: a cron job from inside that pings a couple of servers on the outside, and one on the outside that pings the server in question. I usually grep for the summary line and redirect it out to a log file. Something like:

#!/bin/bash
ips="example.com another.example.com"
for ip in $ips; do
    dat=$(date +"%Y%m%d %H%M")
    res=$(ping -c 3 $ip | grep loss | awk '{print $6, ",", $10}')
    echo $dat "," $ip "," $res >> /home/joe/ping.stats
done

Joe

On 3/27/13 12:36 PM, Todd And Margo Chester wrote:
Hi All, I have a CentOS 5.x server sitting on a DSL line acting as a firewall. I have noticed that there are dead spots, up to a minute, every so often in their Internet service. It could be a storm on someone's part, but the worst they run is IMAP. No music; no video. Is there a utility I can run to map this? Many thanks, -T
Resolved: ssh returns "Permission denied (gssapi-keyex,gssapi-with-mic)."
Well, this has been a thorn in my side for months but I think I've figured it out. At least I found a plausible reason for it, and it's been working longer than it has before. The problem turned out to be that I had both gsisshd and sshd running, and the fix was to use chkconfig to disable gsisshd.

The really weird part that made it hard to figure out was that ssh would work for days then suddenly stop. "sudo service sshd restart" would get it to work again for a few days. I had installed the GSI server stuff because we will (hopefully) move to certificate-based access soon, not thinking that it would be enabled on install. The take-home lesson is to think before you install potentially conflicting services. Thanks, Joe

On 11/21/2012 02:16 PM, Joseph Areeda wrote:
I can't figure out what causes this error. I can "fix" it by regenerating the server key on the system I'm trying to connect to and restarting sshd, but that seems to be temporary as the same problem comes back in a week or so. Rebooting the server does not fix it. Does anyone know what that error means? I am using ssh, not gsissh, although I do have the Globus Toolkit installed to contact grid computers. I'm pretty sure it's a misconfiguration on my part but I can't figure out what I did or didn't do. Thanks, Joe
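For anyone hitting this later, the conflict is visible directly if you look for it. These are SysV-era commands as used on SL6, and "gsisshd" is the service name from this setup; yours may differ:

```shell
# Which daemon owns port 22?  With two ssh daemons installed, whichever
# one grabs the port first wins, which is consistent with the
# works-for-days-then-stops pattern described above.
netstat -tlnp | grep ':22 '
chkconfig --list | grep -i ssh     # are both sshd and gsisshd enabled?

# The fix that worked here:
chkconfig gsisshd off
service gsisshd stop
service sshd restart
```

The gssapi-keyex/gssapi-with-mic message is just the list of auth methods the answering daemon was willing to offer, which is why it pointed at GSI in the first place.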
Re: Will HTML5 eventually sub for Java?
Well, I'll add my 2¢ but I don't think I have a definitive answer for you. First of all, HTML5 is meant to obviate the need for many browser plugins, and when combined with JavaScript it will be able to substitute for some of the things applets are used for. Java is much more than a browser plug-in, and HTML5 has nothing to do with its other uses. Joe

On 01/18/2013 05:26 PM, Todd And Margo Chester wrote:
Hi All, With all the security problems in Java right now, does anyone know if HTML5 will eventually sub for Java? And, will HTML5 have its own list of prodigious security problems? Many thanks, -T
Re: openmpi compilation options
Arnau, I'm an OpenMPI novice, but I believe this FAQ answers your question: http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge Joe On 01/11/2013 06:53 AM, Arnau Bria wrote: Hi all, I'd like to know if the openmpi (1.5.4-1.el6) provided by SL6.3 was compiled with the option: --with-sge I don't know where I should look for it, so, apart from the reply, if someone could tell me how to do it in the future I'd really appreciate it! TIA, Arnau
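The check the FAQ describes can be run directly on the installed package (`ompi_info` ships with Open MPI; the exact component line will vary by version, so treat the expected output as illustrative):

```shell
# ompi_info reports Open MPI's compile-time configuration.
# If it was built --with-sge, gridengine shows up as an MCA "ras"
# (resource allocation) component, something like:
#   MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.5.4)
ompi_info | grep -i gridengine

# No output here means the build has no SGE support.
```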
Re: SL 6 etc. on ARM CPU units
I'm pretty sure there are Debian ports for ARM, including the Raspberry Pi. Here's an interesting project out of the UK, http://www.southampton.ac.uk/~sjc/raspberrypi/ , where the guy built a 64-node cluster using Lego for the supports. I'm also sure it was a lot of work, like others have mentioned. Perhaps when the upstream providers get the kernel and the drivers going in the Fedora and Red Hat branches we'll see SL7 or 8 available for ARM also. Joe On 12/07/2012 11:27 AM, Konstantin Olchanski wrote: Please do not confuse 3 separate issues: 1) Linux userland: this is pretty much universal and will run on any CPU as long as you have a cross-compiler and as long as the "autoconf" tools do not try too hard to prevent you from cross-compiling the stuff. 2) Linux kernel: is also pretty much universal and assumes very little about the CPU. There *is* some assembly code that needs to be ported when you move between CPUs (say from hypothetical SuperARM to hypothetical HyperARM). I believe current versions of the Linux kernel have this support for all existing ARM CPU variations. 3) Linux device drivers: in the PC world, devices are standardized around the PCI bus architecture (from the CPU, PCIe looks like PCI, on purpose) and most device drivers are universal, so if you have a PCI/PCIe-based ARM machine with PC-type peripherals ("South Bridge", ethernet, video, etc.), you are good to go. If you have an ARM machine with strange devices (i.e. the Raspberry Pi), you have to wait for the manufacturer to release the specs, then you can write the drivers, then you can run Linux. Rinse, repeat for the next revision of the CPU ASIC (because they moved the registers around or used a slightly different ethernet block). It helps if you have some standardized interfaces; for example, on the Raspberry Pi you have standard USB, so you can use "all supported" USB WiFi adapters right away. 4) boot loader: is different for each type of machine, each type of boot device media. Period.
(Even on PCs there is no longer any standard standard - some use old-school BIOS booting, others use EFI boot, some need BIOS/ACPI help, some do not). This makes it 4 issues, if you count the first (Linux userland) non-issue. K.O. On Fri, Dec 07, 2012 at 01:01:36PM -0600, SLtryer wrote: On 10/23/2012 12:37 PM, Konstantin Olchanski wrote: An "ARM platform" does not exist. Unlike the "PC platform", where "PC hardware" is highly standardized and almost any OS can run on almost any vendor's hardware, the "ARM platform" is more like the early Linux days, where instead of 3 video card makers there were 23 of them, all incompatible, all without Linux drivers. If you had the "wrong" video card, too bad, no soup for you. In the ARM world, there is a zoo of different ARM processors, all incompatible with each other (think as if each Android device had a random CPU - a 16-bit i8086, or a 32-bit i386, or a 64-bit i7 - the variation in capabilities is that high). Then each device contains random I/O chips connected in its own special way - there is no PCI/PCIe bus where everything is standardized. There are several WiFi chips, several Bluetooth, USB, etc. chips. Some have Linux drivers, some do not. As a result, there is no generic Linux that will run on every ARM machine. Not to be argumentative, but I always believed that the advantage of *nix was that it could be ported to numerous platforms, regardless of hardware. You even mention the "early Linux days," when there was little or no standardization of PC hardware. Yet the platform didn't disappear from use simply because there might have been porting issues, most of which were caused more by proprietary secrets and hardware defects than by the ever-present fact of diversity of hardware. But one could make the same argument even today: that there are many different CPU platforms, e.g., and that they are not standardized. One example I am thinking of is the Intel v. Amdahl CPU compatibility issue.
Even though most of the Linux system will run on either without modification, there are still some unique issues to each of them; from having worked with and studied VirtualBox, there are differences in how each manufacturer chose to implement the ring structure that permits virtualization to work as nicely as it does on these platforms. For the most part they are compatible, but the kernel developers have to be aware of certain implementation issues, including a bug in the Intel CPU platform that requires a VirtualBox workaround (for optimizing the code or something; I forget). And this is in addition to Linux supporting umpteen different processing platforms besides the x86 types. New hardware appears constantly, and some Linux user somewhere wants to use it on their system. I feel that variety of hardware and variation in hardware implementation is a fact, and a main reason why Linux and Unix are so powerful and ubiquitous.
Re: clients slow down due to unknown process
Hi David, I am certainly no expert, but this looks to me like the classic NFS symptoms when the server gets overloaded, or a disk or the network gets flaky. If it were me, I'd try to get the class to do more local I/O (if possible). Perhaps a scratch area on the local disk would solve the problem. I think you could reproduce the problem by writing a test script that does heavy I/O to the network folders, then running it on more and more machines and watching the I/O throughput approach zero with the machines hung while waiting for NFS. Again, I'm no expert, so feel free to ignore me. Joe On 11/29/2012 10:49 AM, David Fitzgerald wrote: Last night during class time I had a chance to check some of the machines with the frozen displays, and I am not sure what to make of what I found. Running 'lsof -p $PID' (PID being 5044) on one of the affected machines gave this, which doesn't tell me much:

10.10.10  5044  root  cwd  DIR  8,7  4096  2  /
10.10.10  5044  root  rtd  DIR  8,7  4096  2  /
10.10.10  5044  root  txt  unknown  /proc/5044/exe

I also ran pstree and I will put that output below, but I think I may be barking up the wrong tree. While some of my clients were freezing up, I saw that my NFS server was getting very high 'top' loads. Fortunately I have sysstat running on the server, and after class 'sar -u' showed that %iowait went from less than 1 before class to a high of 53 after class began, and stayed high until class ended.
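The reproduction Joe suggests — a script that hammers the shared filesystem and reports throughput — might look like this minimal sketch (the function name is hypothetical; point `target_dir` at an NFS-mounted directory and run it from many clients at once; here it defaults to a local temp dir so it runs anywhere):

```python
import os
import tempfile
import time

def measure_write_throughput(target_dir, total_mb=64, chunk_mb=1):
    """Write total_mb of data to a file in target_dir and return MB/s.

    Run concurrently from many clients against an NFS mount to watch
    aggregate throughput collapse as the server's %iowait climbs.
    """
    chunk = b"x" * (chunk_mb * 1024 * 1024)
    path = os.path.join(target_dir, "iostress.dat")
    start = time.time()
    with open(path, "wb") as f:
        for _ in range(total_mb // chunk_mb):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # force the data out, like NFS sync writes
    elapsed = time.time() - start
    os.remove(path)
    return total_mb / elapsed

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print("%.1f MB/s" % measure_write_throughput(d, total_mb=16))
```

Comparing the number against a local scratch disk, while watching `sar -u` on the server, should show whether NFS is the bottleneck.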
Here is the relevant 'chunk' of the sar -u output (columns are %user %nice %system %iowait %steal %idle):

05:20:01 PM  all  0.03  0.00  0.07   0.17  0.00  99.73
05:30:01 PM  all  0.03  0.00  0.03   0.11  0.00  99.83
05:40:01 PM  all  0.18  0.00  0.50   1.88  0.00  97.44
05:50:01 PM  all  0.16  0.00  1.12   6.93  0.00  91.78
06:00:01 PM  all  0.73  0.00  5.23  32.61  0.00  61.43
06:10:01 PM  all  0.77  0.00  6.55  53.67  0.00  39.01
06:20:01 PM  all  0.13  0.00  4.81  27.81  0.00  67.25
06:30:01 PM  all  0.13  0.00  6.69  21.71  0.00  71.47
06:40:01 PM  all  0.11  0.00  3.47  33.34  0.00  63.08
06:50:01 PM  all  0.11  0.00  3.20  31.02  0.00  65.67
07:00:01 PM  all  0.24  0.00  3.93  30.79  0.00  65.05
07:10:01 PM  all  0.16  0.00  3.63  20.51  0.00  75.71
07:20:01 PM  all  0.18  0.00  5.23   1.45  0.00  93.13
07:30:01 PM  all  0.10  0.00  5.72   0.70  0.00  93.48
Average:     all  0.06  0.01  0.46   2.13  0.00  97.34

The NFS server is a virtual machine running under ESXi 4.1, and VMware Tools IS installed. Could this be slow disk access, and thus a VMware misconfiguration? I hate to admit it, but I am at a loss. I can run other sar reports on yesterday's (Wednesday's) data if anyone thinks there may be something in there to help.
For what it's worth, here is the output from pstree from one of the affected clients, and I do NOT see the PID that I was looking for:

init(1)-+-NetworkManager(1782)-+-dhclient(1808)
        |                      `-{NetworkManager}(1809)
        |-abrtd(2341)
        |-acpid(2039)
        |-anacron(3615)
        |-atd(2413)
        |-atieventsd(2421)---authatieventsd.(4134)
        |-auditd(1547)-+-audispd(1549)-+-sedispatch(1550)
        |              |               `-{audispd}(1551)
        |              `-{auditd}(1548)
        |-automount(2134)-+-{automount}(2135)
        |                 |-{automount}(2136)
        |                 |-{automount}(2139)
        |                 |-{automount}(2142)
        |                 |-{automount}(2143)
        |                 `-{automount}(2144)
        |-avahi-daemon(1794)---avahi-daemon(1795)
        |-bonobo-activati(4549)---{bonobo-activat}(4550)
        |-cachefilesd(1597)
        |-certmonger(2435)
        |-clock-applet(4644)
        |-console-kit-dae(2521)-+-{console-kit-da}(2522)
        |                       |-{console-kit-da}(2523)
        |                       |-{console-kit-da}(2524)
        |                       |-{console-kit-da}(2525)
        |                       |-{console-kit-da}(2526)
        |                       |-{console-kit-da}(2527)
        |                       |-{console-kit-da}(2528)
        |                       |-{console-kit-da}(2529)
        |                       |-{console-kit-da}(2530)
        |                       |-{console-kit-da}(2531)
        |                       |-{console-kit-da}(2532)
        |                       |-{console-kit-da}(2533)
        |                       |-{console-kit-da}(2534)
        |                       |-{console-kit-da}(2535)
        |                       |-{console-kit-da}(2536)
Re: ssh returns "Permission denied (gssapi-keyex,gssapi-with-mic)."
Thanks for the comments, Paul. I was surprised when I joined the collaboration and saw home directories world-readable, but that decision was made long before I arrived, and changing it remains above my pay grade. The reason I doubt that's my current problem is that regenerating the server key files works. I can log in fine today and I haven't changed permissions. I also don't have a problem logging into other systems from that machine that are [supposed to be] set up the same way. When it happens again, I will check if changing permissions helps. Also for the record, I waited until my existing Kerberos tickets expired. These are to other services, not that machine. I can log in fine with an expired or valid TGT hanging around, and after kdestroy. Happy holidays, Joe On 11/22/2012 08:32 AM, Paul Robert Marino wrote: Well, there is your problem. The user's home directory needs to be 700 unless you turn off strict key checking in the sshd configuration file. Also, the public key should be 600 as well. Making home directories world- or group-readable isn't a good plan for collaboration, because many applications store sensitive information like passwords, and cached information like session data, in the home directory. Instead, consider creating group directories and setting the setgid bit on them so the group permissions are inherited by any files created in the directories. Making home directories world- or group-readable is a lazy solution to an easily solved problem. It's a common mistake that causes loads of problems, because many applications which are written to be secure purposely break when you do it. I highly suggest you come up with a better plan for collaboration than that.
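Paul's group-directory suggestion, sketched as a shell session (the `/tmp/projshare` path is illustrative, and the `chgrp collab` line assumes a 'collab' group exists on your system):

```shell
# Create a shared project directory instead of opening up home dirs.
mkdir -p /tmp/projshare

# chgrp collab /tmp/projshare    # give it the collaboration group (assumed name)

# rwx for owner and group, nothing for others, plus setgid on the
# directory so files created inside inherit the group ownership:
chmod 2770 /tmp/projshare

ls -ld /tmp/projshare            # should show drwxrws--- with the 's' = setgid
```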
On Nov 21, 2012 11:10 PM, "Joseph Areeda" <newsre...@areeda.com> wrote: On 11/21/2012 07:08 PM, Alan Bartlett wrote: On 22 November 2012 01:18, Joseph Areeda <newsre...@areeda.com> wrote: The user's directory is 755, which is the convention for grid computers in our collaboration, and the plan is for this machine to be on our soon-to-be-delivered cluster. The .ssh directory is 700. This doesn't change between the working and non-working state. Good, you've checked the directory. Now what about the files within it? Hopefully they are all 600? Alan. Alan, The private keys are all 600 and the public keys are 644. I keep a few different ones for going to different systems. Joe
Re: ssh returns "Permission denied (gssapi-keyex,gssapi-with-mic)."
On 11/21/2012 07:08 PM, Alan Bartlett wrote: On 22 November 2012 01:18, Joseph Areeda wrote: The user's directory is 755 which is the convention for grid computers in our collaboration and the plan is for this machine to be on our soon to be delivered cluster. The .ssh directory is 700. This doesn't change between the working and non-working state. Good, you've checked the directory. Now what about the files within it? Hopefully they are all 600? Alan. Alan, The private keys are all 600 and the public keys are 644. I keep a few different ones for going to different systems. Joe
Re: ssh returns "Permission denied (gssapi-keyex,gssapi-with-mic)."
Thank you Paul, Steven and Steve, I think Kerberos may be the issue. I do NOT use Kerberos to access this machine; I have a lot to learn before I turn that and LDAP on. But I do use it to access several services in our collaboration, so the client machine often has a valid Kerberos TGT (and probably more often an expired ticket). I think it's worth experimenting with the client in different states of Kerberosity (or whatever that word should be). The user's directory is 755, which is the convention for grid computers in our collaboration, and the plan is for this machine to be on our soon-to-be-delivered cluster. The .ssh directory is 700. This doesn't change between the working and non-working state. I tarred the /etc/ssh directory and saved it for next time, but wouldn't generating new keys make them almost completely different? Generating new keys makes no sense to me either, but it does work. Well, at least it has been the only thing I've done coincident with resolving the problem the last 3 times this has happened. I also save the triple-verbose ssh output. I really appreciate the discussion, gentlemen; it helps a lot. Best, Joe On 11/21/2012 04:58 PM, Paul Robert Marino wrote: On Nov 21, 2012 7:57 PM, "Paul Robert Marino" <prmari...@gmail.com> wrote: OK. To be clear, are you using Kerberos or not? If the answer is no and you are just using ssh keys, the most common cause of this issue is that the user's home directory is group- or world-readable. In the most secure mode, which is the default, if the user's home and/or ~/.ssh directory has anything other than 700 or 500 set as the permissions, it will reject the public key (the one on the server you are trying to connect to); this becomes obvious with -vvv but not -vv. On Nov 21, 2012 7:34 PM, "Steven C Timm" <t...@fnal.gov> wrote: Shouldn't need to regenerate the keys; once you get them generated they should be good for the life of the machine.
Save copies of the keys as they are now, and if your system goes bad, do diffs to see what changed, if anything. Steve Timm From: owner-scientific-linux-us...@listserv.fnal.gov On Behalf Of Joseph Areeda Sent: Wednesday, November 21, 2012 5:46 PM To: owner-scientific-linux-us...@listserv.fnal.gov Cc: scientific-linux-users Subject: Re: ssh returns "Permission denied (gssapi-keyex,gssapi-with-mic)." Thank you Tam, and Steven, I just confirmed that regenerating the keys (ssh-keygen -t dsa -f ssh_host_dsa_key && ssh-keygen -t rsa -f ssh_host_rsa_key) in /etc/ssh "fixes the problem". So ssh -vv shows me how it's supposed to look. I'll save that and do a diff when it happens again. As I continue my googling I can report on a few things it's not: The server machine has a fixed IP address and DNS/rDNS appears to be working. The time issue Steven mentioned does not seem to be it, although I may stop using pool machines and set up a local ntp server so everybody gets the same time. I can ssh and gsissh to other servers.

Server: ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ping-audit-207- .ACTS.           1 u    5  128  377   19.867    5.804   1.927
+10504.x.rootbsd 198.30.92.2      2 u  129  128  376   45.146  -28.571   5.558
+ntp.sunflower.c 132.236.56.250   3 u   77  128  355   63.836  -14.753   5.360
-ntp2.ResComp.Be 128.32.206.55    3 u  126  128  377   22.112    7.311   2.022

Client: ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 64.147.116.229  .ACTS.           1 u   47  128    0   13.543    0.567   0.000
*nist1-chi.ustim .ACTS.           1 u   25  128  377  106.619   14.458   5.896
+name3.glorb.com 69.36.224.15     2 u   64  128  377   88.564  -27.542   3.631
+131.211.8.244   .PPS.            1 u   81  128  377  167.107    3.259   2.340

The only setting I change in sshd_config is to turn off password auth, but this machine is being brought up behind a firewall and I haven't done that yet.
Re: ssh returns "Permission denied (gssapi-keyex,gssapi-with-mic)."
Thank you Tam, and Steven, I just confirmed that regenerating the keys (ssh-keygen -t dsa -f ssh_host_dsa_key && ssh-keygen -t rsa -f ssh_host_rsa_key) in /etc/ssh "fixes the problem". So ssh -vv shows me how it's supposed to look. I'll save that and do a diff when it happens again. As I continue my googling I can report on a few things it's not: The server machine has a fixed IP address and DNS/rDNS appears to be working. The time issue Steven mentioned does not seem to be it, although I may stop using pool machines and set up a local ntp server so everybody gets the same time. I can ssh and gsissh to other servers.

Server: ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ping-audit-207- .ACTS.           1 u    5  128  377   19.867    5.804   1.927
+10504.x.rootbsd 198.30.92.2      2 u  129  128  376   45.146  -28.571   5.558
+ntp.sunflower.c 132.236.56.250   3 u   77  128  355   63.836  -14.753   5.360
-ntp2.ResComp.Be 128.32.206.55    3 u  126  128  377   22.112    7.311   2.022

Client: ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 64.147.116.229  .ACTS.           1 u   47  128    0   13.543    0.567   0.000
*nist1-chi.ustim .ACTS.           1 u   25  128  377  106.619   14.458   5.896
+name3.glorb.com 69.36.224.15     2 u   64  128  377   88.564  -27.542   3.631
+131.211.8.244   .PPS.            1 u   81  128  377  167.107    3.259   2.340

The only setting I change in sshd_config is to turn off password auth, but this machine is being brought up behind a firewall and I haven't done that yet. Also, if it was a config problem, I doubt changing the key would fix it, even temporarily. I will report back with the ssh -vv stuff when it happens again. At least now I have a chance of figuring out what's going on. Best, Joe On 11/21/2012 02:30 PM, Tam Nguyen wrote: Hi Joe, Did you look at the sshd_config file? I ran into a similar error output, but it may not necessarily be the same issue you're having. In my case, the sshd_config file on one of my users' machines had been edited and renamed. I backed up that file, copied in a default sshd_config file, and then tested it. Good luck.
-T On Wed, Nov 21, 2012 at 5:16 PM, Joseph Areeda <newsre...@areeda.com> wrote: I can't figure out what causes this error. I can "fix" it by regenerating the server key on the system I'm trying to connect to and restarting sshd, but that seems to be temporary as the same problem comes back in a week or so. Rebooting the server does not fix it. Does anyone know what that error means? I am using ssh, not gsissh, although I do have the Globus Toolkit installed to contact grid computers. I'm pretty sure it's a misconfiguration on my part but I can't figure out what I did or didn't do. Thanks, Joe
ssh returns "Permission denied (gssapi-keyex,gssapi-with-mic)."
I can't figure out what causes this error. I can "fix" it by regenerating the server key on the system I'm trying to connect to and restarting sshd but that seems to be temporary as the same problem comes back in a week or so. Rebooting the server does not fix it. Does anyone know what that error means? I am using ssh not gsissh although I do have globus toolkit installed to contact grid computers. I'm pretty sure it's a misconfiguration on my part but I can't figure out what I did or didn't do. Thanks, Joe
Re: The opposite SL and VirtualBox problem
thanks, I did spot that and updated everything so the versions match. Stupid mistake on my part. Joe On 10/02/2012 02:05 PM, Akemi Yagi wrote: I would not do that. The matching version of kernel-devel you need ( 2.6.32-220.17.1.el6 ) is available here: http://ftp.scientificlinux.org/linux/scientific/6.2/x86_64/updates/security/ Remove the link you manually created and install the right version of kernel-devel. If/when you update the kernel, remember to update kernel-devel to the same version.
Re: The opposite SL and VirtualBox problem
Well, I'm not going to touch Nico's comment because I don't know KVM. For me it's a "devil you know" kind of thing. I've had good experience with VBox on multiple OSes and am just playing in my comfort zone. I do have reasons to explore other VMs, but none of them pressing. I just want to install one of the University's "free" site-license copies of Windows as a courtesy to our students. Joe On 10/2/12 3:15 AM, David Sommerseth wrote: ----- Original Message ----- From: "Joseph Areeda" To: SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV Sent: Tuesday, 2 October, 2012 12:33:59 AM Subject: The opposite SL and VirtualBox problem I want to run Windows as a guest system on my SL6.3 box. Installing vbox from the Oracle repository gives me an error trying to create the kernel modules. Just a silly question: why bother with VirtualBox when you have KVM built into the OS? Use the SPICE protocol (yum search spice) and you'll even get decent console performance. And it's really easy to set up and configure using virt-manager. kind regards, David Sommerseth
Re: The opposite SL and VirtualBox problem
On 10/01/2012 04:24 PM, Akemi Yagi wrote: On Mon, Oct 1, 2012 at 3:33 PM, Joseph Areeda wrote: I want to run Windows as a guest system on my SL6.3 box. Installing vbox from the Oracle repository gives me an error trying to create the kernel modules. When trying to do it manually, I run /etc/init.d/vboxdrv setup and get:

Stopping VirtualBox kernel modules [ OK ]
Uninstalling old VirtualBox DKMS kernel modules [ OK ]
Trying to register the VirtualBox kernel modules using DKMS
Error! Your kernel headers for kernel 2.6.32-220.17.1.el6.x86_64 cannot be found at /lib/modules/2.6.32-220.17.1.el6.x86_64/build or /lib/modules/2.6.32-220.17.1.el6.x86_64/source.

and when I look for those files I see a broken link:

ll /lib/modules/2.6.32-220.23.1.el6.x86_64/build
lrwxrwxrwx 1 root root 51 Jun 20 09:54 /lib/modules/2.6.32-220.23.1.el6.x86_64/build -> ../../../usr/src/kernels/2.6.32-220.23.1.el6.x86_64

It looks like that file should be linked to:

ls /usr/src/kernels/2.6.32-279.9.1.el6.x86_64/
arch  drivers  include  kernel  Makefile.common  net  security  tools
block  firmware  init  lib  mm  samples  sound  usr
crypto  fs  ipc  Makefile  Module.symvers  scripts  System.map  virt

I'm going to try just fixing the link, but it seems like the kernel-header rpm has a problem. Or am I missing something? Would not be the first time, or even a rare occurrence. Joe You need the kernel-devel package (not kernel-headers). That version must match your *running* kernel. You can find the version of your running kernel by: uname -r Then install kernel-devel of that version. Akemi Thanks Akemi, I think I see the problem now. A yum search produces only one listing for kernel-devel, and yum info says:

Installed Packages
Name    : kernel-devel
Arch    : x86_64
Version : 2.6.32
Release : 279.9.1.el6

uname -a says: Linux 2.6.32-220.17.1.el6.x86_64 #1 SMP Tue May 15 17:16:46 CDT 2012 x86_64 x86_64 x86_64 GNU/Linux I'm using the repo maintained by the collaboration I'm in, and there seems to be an issue.
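Akemi's rule can be scripted: install the kernel-devel package whose version-release string matches the running kernel exactly (the yum and ls lines are illustrative and need root / an installed package):

```shell
# The running kernel's full version-release string:
running=$(uname -r)
echo "running kernel: $running"

# kernel-devel must match it exactly for out-of-tree modules (VirtualBox, DKMS):
# yum install "kernel-devel-$running"      # illustrative; needs root

# Afterwards, the build symlink should resolve instead of dangling:
# ls -l "/lib/modules/$running/build"
```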
For the record, fixing that broken link did allow me to build the kernel module and run VBox. I wonder if I introduced any instabilities. Joe
The opposite SL and VirtualBox problem
I want to run Windows as a guest system on my SL6.3 box. Installing vbox from the Oracle repository gives me an error trying to create the kernel modules. When trying to do it manually, I run /etc/init.d/vboxdrv setup and get:

Stopping VirtualBox kernel modules [ OK ]
Uninstalling old VirtualBox DKMS kernel modules [ OK ]
Trying to register the VirtualBox kernel modules using DKMS
Error! Your kernel headers for kernel 2.6.32-220.17.1.el6.x86_64 cannot be found at /lib/modules/2.6.32-220.17.1.el6.x86_64/build or /lib/modules/2.6.32-220.17.1.el6.x86_64/source.

and when I look for those files I see a broken link:

ll /lib/modules/2.6.32-220.23.1.el6.x86_64/build
lrwxrwxrwx 1 root root 51 Jun 20 09:54 /lib/modules/2.6.32-220.23.1.el6.x86_64/build -> ../../../usr/src/kernels/2.6.32-220.23.1.el6.x86_64

It looks like that file should be linked to:

ls /usr/src/kernels/2.6.32-279.9.1.el6.x86_64/
arch  drivers  include  kernel  Makefile.common  net  security  tools
block  firmware  init  lib  mm  samples  sound  usr
crypto  fs  ipc  Makefile  Module.symvers  scripts  System.map  virt

I'm going to try just fixing the link, but it seems like the kernel-header rpm has a problem. Or am I missing something? Would not be the first time, or even a rare occurrence. Joe
Re: X11 server won't start after yum upgrade
Thank you Malcolm. I'm running 64-bit, but I'll bet I can find something close that might work. Joe On 07/18/2012 02:11 AM, Malcolm MacCallum wrote: I downloaded (using Firefox under Windows XP) xorg-x11-server-Xorg-1.7.7-29.el6.i686.rpm xorg-x11-server-common-1.7.7-29.el6.i686.rpm from the 6.1 i386 os filestore, http://ftp.scientificlinux.org/linux/scientific/6.1/i386/os/Packages/ I then ran rpm with (if I recall well) rpm --oldpackage -i (maybe also with --replacefiles) on each of them and all was well again. I'm not saying this is the best or even a good way to fix things! Malcolm - Original Message - From: "Joseph Areeda" To: "Malcolm MacCallum" Cc: scientific-linux-us...@fnal.gov Sent: Wednesday, 18 July, 2012 3:09:20 AM Subject: Re: X11 server won't start after yum upgrade Malcolm, Which rpm's worked for you? I have sshd running so I've been able to survive, but none of the suggestions so far let me log in from the console on my VM. thanks Joe On 07/17/2012 01:41 AM, Malcolm MacCallum wrote: I have solved my problem by downloading the relevant rpm files to my Windows partition and running rpm with suitable arguments, so I am now back to a fully functioning SL with Gnome etc. But it would still have been nice to have instructions that worked 'out of the box'.
Re: X11 server won't start after yum upgrade
Malcolm, Which rpm's worked for you? I have sshd running so I've been able to survive but none of the suggestions so far let me log in from the console on my VM. thanks Joe On 07/17/2012 01:41 AM, Malcolm MacCallum wrote: I have solved my problem by downloading the relevant rpm files to my Windows partition and running rpm with suitable arguments, so I am now back to a fully functioning SL with Gnome etc. But it would still have been nice to have instructions that worked 'out of the box'.
Re: X11 server won't start after yum upgrade
I have the same problem with a virtual machine under VirtualBox. I'm running: Linux 2.6.32-220.23.1.el6.x86_64 #1 SMP Mon Jun 18 09:58:09 CDT 2012 x86_64 x86_64 x86_64 GNU/Linux The log file Xorg.0.log says:

[ 38.144] (II) LoadModule: "vboxvideo"
[ 38.145] (II) Loading /usr/lib64/xorg/modules/drivers/vboxvideo_drv.so
[ 38.145] (II) Module vboxvideo: vendor="Oracle Corporation"
[ 38.145] compiled for 1.5.99.901, module version = 1.0.1
[ 38.145] Module class: X.Org Video Driver
[ 38.145] ABI class: X.Org Video Driver, version 9.0
[ 38.145] (EE) module ABI major version (9) doesn't match the server's version (10)
[ 38.145] (II) UnloadModule: "vboxvideo"
[ 38.145] (II) Unloading vboxvideo
[ 38.145] (EE) Failed to load module "vboxvideo" (module requirement mismatch, 0)
[ 38.145] (EE) No drivers available.
[ 38.145] Fatal server error:
[ 38.145] no screens found
[ 38.145] Please consult the Scientific Linux support at https://www.scientificlinux.org/maillists

Everything else is working fine, but I can't use the console. Joe
Re: USB unresponsive
I forgot to include the relevant part of /var/log/messages:

Jun 18 18:24:12 george kernel: drivers/hid/usbhid/hid-core.c: can't reset device, :00:1a.0-1.2.2.4/input0, status -71
Jun 18 18:24:12 george kernel: usb 1-1.2.2: clear tt 1 (00a0) error -71
Jun 18 18:24:12 george kernel: drivers/hid/usbhid/hid-core.c: can't reset device, :00:1a.0-1.2.2.3/input0, status -71
Jun 18 18:24:12 george kernel: usb 1-1.2.2: USB disconnect, address 6
Jun 18 18:24:12 george kernel: usb 1-1.2.2.1: USB disconnect, address 7
Jun 18 18:24:12 george kernel: usb 1-1.2.2.2: USB disconnect, address 8
Jun 18 18:24:12 george kernel: usb 1-1.2.2.3: USB disconnect, address 9
Jun 18 18:24:17 george kernel: drivers/hid/usbhid/hid-core.c: can't reset device, :00:1a.0-1.2.2.3/input1, status -110
Jun 18 18:24:17 george kernel: usb 1-1.2.2: clear tt 1 (0090) error -19
Jun 18 18:24:17 george kernel: usb 1-1.2.2.4: USB disconnect, address 10

After that the system is unresponsive and I have to reboot with the power switch. Those devices are:

Jun 18 14:47:40 george kernel: generic-usb 0003:04F2:0833.0002: input,hidraw1: USB HID v1.11 Keyboard [CHICONY USB Keyboard] on usb-:00:1a.0-1.2.2.3/input0
Jun 18 14:47:40 george kernel: generic-usb 0003:045E:0029.0004: input,hidraw3: USB HID v1.00 Mouse [Microsoft Microsoft IntelliMouse® Optical] on usb-:00:1a.0-1.2.2.4/input0

One last question: should I post text only to this list, or are people happy with HTML formatting? Joe On 06/18/2012 07:42 PM, Joseph Areeda wrote: Greetings, This is my first post to this list; I'm hoping for some insight into a vexing problem. My situation is 4 computers running various operating systems: Ubuntu, Scientific Linux 6.2, Debian Squeeze, Mac OS X Lion, and Windows 7. All except the Mac and one Ubuntu are multiple-boot and used for cross-platform development and testing. I have a manual switch box with computers on ports 1-4 and a powered USB hub with mouse, keyboard, scanner, microphone and a USB headset adapter.
There is also an HDMI and DVI switch box, but they are not part of the problem. Together they give me a more flexible KVM switch. I can watch a long-running job on Monitor #2 connected to one system while working on another using Monitor #1. When I boot up, everything works fine. I can also switch between systems freely. However, if I leave one of the Linux systems disconnected for a long while, it doesn't respond when I switch back to it. I have not seen this behavior with Windows or Mac. The Mac is sometimes disconnected for days, but Windows usually gets booted back into Linux when I'm done with it. SL6 usually dies, but I can usually ssh into Ubuntu and reboot cleanly. SL6 doesn't respond to pings or ssh. As long as I keep the switch box on SL6 it has run for weeks. Now, I have tried 2 different USB switch boxes, and when it doesn't respond, it doesn't respond even if I plug the hub, or the mouse and keyboard, directly into the USB ports on the system. I don't believe it has anything to do with the KVM the OP mentioned or my switch boxes. Searching the web I found a comment that said a powered hub per system worked with his USB KVM switch. I suspect we're seeing some sort of USB timeout. I suppose I can get a powered hub per system, but I built these machines with 6 or 10 USB 2 and 2 or 4 USB 3 ports, so I really don't need them. The powered hub may, however, convince Linux that there is something plugged into that port and keep it alive. Does anyone know of a reason for this, or even better, a fix for it? Thanks Joe
USB unresponsive
Greetings, This is my first post to this list; I'm hoping for some insight into a vexing problem. My situation is 4 computers running various operating systems: Ubuntu, Scientific Linux 6.2, Debian Squeeze, Mac OS X Lion, and Windows 7. All except the Mac and one Ubuntu are multiple-boot and used for cross-platform development and testing. I have a manual switch box with computers on ports 1-4 and a powered USB hub with mouse, keyboard, scanner, microphone and a USB headset adapter. There is also an HDMI and DVI switch box, but they are not part of the problem. Together they give me a more flexible KVM switch. I can watch a long-running job on Monitor #2 connected to one system while working on another using Monitor #1. When I boot up, everything works fine. I can also switch between systems freely. However, if I leave one of the Linux systems disconnected for a long while, it doesn't respond when I switch back to it. I have not seen this behavior with Windows or Mac. The Mac is sometimes disconnected for days, but Windows usually gets booted back into Linux when I'm done with it. SL6 usually dies, but I can usually ssh into Ubuntu and reboot cleanly. SL6 doesn't respond to pings or ssh. As long as I keep the switch box on SL6 it has run for weeks. Now, I have tried 2 different USB switch boxes, and when it doesn't respond, it doesn't respond even if I plug the hub, or the mouse and keyboard, directly into the USB ports on the system. I don't believe it has anything to do with the KVM the OP mentioned or my switch boxes. Searching the web I found a comment that said a powered hub per system worked with his USB KVM switch. I suspect we're seeing some sort of USB timeout. I suppose I can get a powered hub per system, but I built these machines with 6 or 10 USB 2 and 2 or 4 USB 3 ports, so I really don't need them. The powered hub may, however, convince Linux that there is something plugged into that port and keep it alive.
Does anyone know of a reason for this or even better a fix for it? Thanks Joe