Re: [SCIENTIFIC-LINUX-USERS] yum error, SL 6.3, file is encrypted or is not a database

2016-04-24 Thread Joseph Areeda
I have the same experience. After Pat rebuilt the repo, everything
works for me using Larry's suggestion of 'clean all' then 'update'.


 Thanks for the help.
 Joe

On 4/23/16 8:00 PM, P. Larry Nelson wrote:

Forgot to say that one should do a 'yum clean all' and then
'yum update' works.

- Larry

P. Larry Nelson wrote on 4/23/16 9:49 PM:

Fixed!  Thanks Pat!

- Larry

Pat Riehecky wrote on 4/23/16 5:52 PM:
Weird, the only change to the repo on April 21 was a security errata that
was published just like the rest.

I'll rebuild the metadata across the board just to be safe.

Pat

On 04/23/2016 05:38 PM, P. Larry Nelson wrote:
I am having the same problem with 3 of my SL5.x systems. One is 5.1 and
two are 5.4.
All my other SL 5.x are 5.5 and have had no problems, nor have I seen this
problem with any of my SL6.x systems.

The problem seems to be with the sl-security repo.
If I do a 'yum update --disablerepo=sl-security' on the 5.1 and 5.4 systems
I do NOT get the:

Error: file is encrypted or is not a database

This just started happening with the early morning auto yum update 
on 4/22/16.


- Larry

Joseph Areeda wrote on 4/23/16 4:35 PM:
I see people are having the same problem with some of the version 7 repos.
But I don't understand how to figure out which repo is causing the problem.
Are people disabling * (all repos) and enabling one at a time?
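(For what it's worth, one way to test the repos one at a time is roughly the
following, with sl-security just as an example repo id:

   yum clean metadata --disablerepo='*' --enablerepo=sl-security
   yum --disablerepo='*' --enablerepo=sl-security makecache

whichever repo makes makecache throw the SQLite error is the broken one.)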

 Thanks,
 Joe

On 4/23/16 1:53 PM, Joseph Areeda wrote:
We started getting this error a couple of days ago on a machine that has
been auto-updating for years. I would assume that it was a corruption of a
local database but it happened on two systems simultaneously.

 Googling for that error message produces nothing on yum but several hits
on SQLite.

 I'd appreciate any insight into what the error means and how to track down
exactly which repo or file on my system is causing the problem.

 Below is what I see; yum update also produces the same error message.


 Thanks,
 Joe

[root@mavraki yum.repos.d]# yum clean all
Loaded plugins: fastestmirror, refresh-packagekit, security
Cleaning repos: CONDOR-stable VDT-Production-sl6 elrepo lscsoft-epel
lscsoft-pegasus lscsoft-production sl sl-security
Cleaning up Everything
Cleaning up list of fastest mirrors
[root@mavraki yum.repos.d]# yum repolist
Loaded plugins: fastestmirror, refresh-packagekit, security
Determining fastest mirrors
 * elrepo: elrepo.org
 * sl: ftp1.scientificlinux.org
 * sl-security: ftp1.scientificlinux.org
CONDOR-stable | 2.9 kB 00:00
CONDOR-stable/primary_db | 427 kB 00:00
VDT-Production-sl6 | 1.3 kB 00:00
VDT-Production-sl6/primary |  35 kB 00:00
VDT-Production-sl6 11/11
elrepo | 2.9 kB 00:00
elrepo/primary_db | 732 kB 00:00
lscsoft-epel | 2.7 kB 00:00
lscsoft-epel/primary_db | 4.2 MB 00:02
lscsoft-pegasus | 2.6 kB 00:00
lscsoft-pegasus/primary_db | 5.8 kB 00:00
lscsoft-production | 2.9 kB 00:00
lscsoft-production/primary_db | 301 kB 00:00
sl | 3.5 kB 00:00
sl/primary_db | 4.2 MB 00:03
sl-security | 3.0 kB 00:00
sl-security/primary_db |  12 MB 00:06
Error: file is encrypted or is not a database
[root@mavraki yum.repos.d]#













Re: yum error, SL 6.3, file is encrypted or is not a database

2016-04-23 Thread Joseph Areeda
I see people are having the same problem with some of the version 7
repos. But I don't understand how to figure out which repo is causing
the problem. Are people disabling * (all repos) and enabling one at a time?


 Thanks,
 Joe

On 4/23/16 1:53 PM, Joseph Areeda wrote:
We started getting this error a couple of days ago on a machine that has been
auto-updating for years. I would assume that it was a corruption of a
local database but it happened on two systems simultaneously.


 Googling for that error message produces nothing on yum but several
hits on SQLite.


 I'd appreciate any insight into what the error means and how to track
down exactly which repo or file on my system is causing the problem.


 Below is what I see, yum update also produces the same error message.

 Thanks,
 Joe

[root@mavraki yum.repos.d]# yum clean all
Loaded plugins: fastestmirror, refresh-packagekit, security
Cleaning repos: CONDOR-stable VDT-Production-sl6 elrepo lscsoft-epel 
lscsoft-pegasus lscsoft-production sl sl-security

Cleaning up Everything
Cleaning up list of fastest mirrors
[root@mavraki yum.repos.d]# yum repolist
Loaded plugins: fastestmirror, refresh-packagekit, security
Determining fastest mirrors
 * elrepo: elrepo.org
 * sl: ftp1.scientificlinux.org
 * sl-security: ftp1.scientificlinux.org
CONDOR-stable | 2.9 kB 00:00
CONDOR-stable/primary_db | 427 kB 00:00
VDT-Production-sl6 | 1.3 kB 00:00
VDT-Production-sl6/primary |  35 kB 00:00
VDT-Production-sl6 11/11
elrepo | 2.9 kB 00:00
elrepo/primary_db | 732 kB 00:00
lscsoft-epel | 2.7 kB 00:00
lscsoft-epel/primary_db | 4.2 MB 00:02
lscsoft-pegasus | 2.6 kB 00:00
lscsoft-pegasus/primary_db | 5.8 kB 00:00
lscsoft-production | 2.9 kB 00:00
lscsoft-production/primary_db | 301 kB 00:00
sl | 3.5 kB 00:00
sl/primary_db | 4.2 MB 00:03
sl-security | 3.0 kB 00:00
sl-security/primary_db |  12 MB 00:06
Error: file is encrypted or is not a database
[root@mavraki yum.repos.d]#


yum error, SL 6.3, file is encrypted or is not a database

2016-04-23 Thread Joseph Areeda
We started getting this error a couple of days ago on a machine that has been
auto-updating for years. I would assume that it was a corruption of a
local database but it happened on two systems simultaneously.


 Googling for that error message produces nothing on yum but several
hits on SQLite.


 I'd appreciate any insight into what the error means and how to track
down exactly which repo or file on my system is causing the problem.


 Below is what I see, yum update also produces the same error message.

 Thanks,
 Joe

[root@mavraki yum.repos.d]# yum clean all
Loaded plugins: fastestmirror, refresh-packagekit, security
Cleaning repos: CONDOR-stable VDT-Production-sl6 elrepo lscsoft-epel 
lscsoft-pegasus lscsoft-production sl sl-security

Cleaning up Everything
Cleaning up list of fastest mirrors
[root@mavraki yum.repos.d]# yum repolist
Loaded plugins: fastestmirror, refresh-packagekit, security
Determining fastest mirrors
 * elrepo: elrepo.org
 * sl: ftp1.scientificlinux.org
 * sl-security: ftp1.scientificlinux.org
CONDOR-stable | 2.9 kB 00:00
CONDOR-stable/primary_db | 427 kB 00:00
VDT-Production-sl6 | 1.3 kB 00:00
VDT-Production-sl6/primary |  35 kB 00:00
VDT-Production-sl6 11/11
elrepo | 2.9 kB 00:00
elrepo/primary_db | 732 kB 00:00
lscsoft-epel | 2.7 kB 00:00
lscsoft-epel/primary_db | 4.2 MB 00:02
lscsoft-pegasus | 2.6 kB 00:00
lscsoft-pegasus/primary_db | 5.8 kB 00:00
lscsoft-production | 2.9 kB 00:00
lscsoft-production/primary_db | 301 kB 00:00
sl | 3.5 kB 00:00
sl/primary_db | 4.2 MB 00:03
sl-security | 3.0 kB 00:00
sl-security/primary_db |  12 MB 00:06
Error: file is encrypted or is not a database
[root@mavraki yum.repos.d]#


SL7-RC1 Installer bug reporting

2014-09-27 Thread Joseph Areeda
I downloaded SL-7-x86_64-DVD.iso verified the sha256 hash and started a 
test and install on the VM that I've been using for the beta releases.


After specifying the language it reported a "bad file descriptor"
error.  I want to try a fresh VM before I discuss that problem but
following the bug reporting options on the error message I have a screen
that looks like:

[screenshot not included in the archive]
The question right now is do we really want to report SL7 issues to Red 
Hat Customer Support?  It's fine with me but I wonder if I found a deep 
down reference to the upstream provider that has been missed.


Joe


Re: [SL-Users] Re: Scientific Linux 7 ALPHA - Updating

2014-07-05 Thread Joseph Areeda

On 07/05/2014 11:02 AM, Nico Kadel-Garcia wrote:

On Sat, Jul 5, 2014 at 11:23 AM, Joseph Areeda  wrote:

On 07/05/2014 08:03 AM, Nico Kadel-Garcia wrote:

On Sat, Jul 5, 2014 at 10:43 AM, Joseph Areeda 
wrote:

On 07/04/2014 09:12 PM, Nico Kadel-Garcia wrote:

Set up a local rsync mirror from any of the locally fast upstream
repositories, and slap a web server in front of it. Use the
'netinstall' from the local mirror, or even a PXE setup, not the full
DVD, to point to the local mirror. Ideally, set up a kickstart file
too on the local mirror, ideally tied to the PXE setup.

Then just use the PXE or netinstall ISO to do a kickstarted,
network-based re-install. That way, you don't even have to download the DVD
images, which are quite bulky. If you like, I'll post my download
scripts.

Thanks Nico,

I would like to see your scripts.

Let me put them up at github.com, and give me a day.


What I will be testing is what changes are needed to our packaged and
unpackaged applications.

Excuse the basic questions but I'm more of a developer than a sysadmin.  If
I understand your recommendations the system including user accounts will be
rebuilt on each boot.  So if I want to work through multiple reboots I could
put my home directory on an NFS mount and end up with fresh software but the
same environment. Correct?

That's one workable way, yes. If PXE and kickstart can be set up
correctly, the "kickstart" can even set up your NFS mounted home
directory and local authentication and sudo privileges, and the PXE
can allow a "rebuild me from scratch" option at boot time. that will
select and automatically use the relevant kickstart file. It can even
be set to auto-rebuild every time, if you want.

I've done this a lot for hardware testing, and for building clusters.
One of the limitations is local bandwidth: if you're rebuilding a
bunch of times from the upstream SL 7 Alpha website, well, that's
rude. It's hundreds of megs, possibly even gigs, of bandwidth. Set
up a local mirror to pull from instead, and keep *that* updated.


Thanks.  I'm sure others in our collaboration will be doing more extensive
tests.

Best,
Joe

Thanks again, I appreciate the help.

I've set up kickstart once, successfully, and I'm ready to do battle with 
PXE.  Just one quick question for now; I'm sure more will follow later:


The mirror I've started is 
rsync://mirror.mcs.anl.gov/scientific-linux/7rolling/x86_64/os  I think 
that's all I need to maintain, correct?  I'm in Los Angeles.


Best,
Joe


Re: [SL-Users] Re: Scientific Linux 7 ALPHA - Updating

2014-07-05 Thread Joseph Areeda

On 07/05/2014 08:03 AM, Nico Kadel-Garcia wrote:

On Sat, Jul 5, 2014 at 10:43 AM, Joseph Areeda  wrote:

On 07/04/2014 09:12 PM, Nico Kadel-Garcia wrote:

On Fri, Jul 4, 2014 at 10:11 AM, Akemi Yagi  wrote:

All,

Please refrain from posting anything other than testing results of the
released SL packages in this thread. Let's keep this one free of
trolls.

Akemi

The SL 7 Alpha is running well in PC Virtualbox (my virtualization
toolsuite of choice). The add-on tools for more graceful focus
switching and for mouse management are not yet installable, but I
expect that to be fixed upstream by PC Virtualbox, now that RHEL 7 is
in production.

I can report it also runs well in an SL6 Virtualbox VM.

I saw the firstboot problem report.

My question is how often should we download and reinstall from scratch vs
using yum update (or autoupdate or cron)?

I assume the only reason to download the DVD image is to test changes to the
install procedure and yum update will end up with the same installation.

I just want to confirm how best to help the process.

If you're a weasel and want to save speed and bandwidth:

Set up a local rsync mirror from any of the locally fast upstream
repositories, and slap a web server in front of it. Use the
'netinstall' from the local mirror, or even a PXE setup, not the full
DVD, to point to the local mirror. Ideally, set up a kickstart file
too on the local mirror, ideally tied to the PXE setup.

Then just use the PXE or netinstall ISO to do a kickstarted,
network-based re-install. That way, you don't even have to download the DVD
images, which are quite bulky. If you like, I'll post my download
scripts.

Thanks Nico,

I would like to see your scripts.

What I will be testing is what changes are needed to our packaged and 
unpackaged applications.


Excuse the basic questions but I'm more of a developer than a sysadmin.  
If I understand your recommendations the system including user accounts 
will be rebuilt on each boot.  So if I want to work through multiple 
reboots I could put my home directory on an NFS mount and end up with 
fresh software but the same environment. Correct?


Thanks.  I'm sure others in our collaboration will be doing more 
extensive tests.


Best,
Joe


Re: [SL-Users] Re: Scientific Linux 7 ALPHA - Updating

2014-07-05 Thread Joseph Areeda

On 07/04/2014 09:12 PM, Nico Kadel-Garcia wrote:

On Fri, Jul 4, 2014 at 10:11 AM, Akemi Yagi  wrote:

All,

Please refrain from posting anything other than testing results of the
released SL packages in this thread. Let's keep this one free of
trolls.

Akemi

The SL 7 Alpha is running well in PC Virtualbox (my virtualization
toolsuite of choice). The add-on tools for more graceful focus
switching and for mouse management are not yet installable, but I
expect that to be fixed upstream by PC Virtualbox, now that RHEL 7 is
in production.

I can report it also runs well in an SL6 Virtualbox VM.

I saw the firstboot problem report.

My question is how often should we download and reinstall from scratch 
vs using yum update (or autoupdate or cron)?


I assume the only reason to download the DVD image is to test changes to 
the install procedure and yum update will end up with the same installation.


I just want to confirm how best to help the process.

Joe


Problem with OpenJDK getting system time zone

2014-04-06 Thread Joseph Areeda

Hi,

I have a strange problem.  I have a little java app that displays an 
analog clock with a few weird additions like GPS time.  It's been 
working for years.  A recent update or the last switch to DST has it 
return the time zone as GMT-08:00 instead of PDT.


If I use Oracle's jdk1.7.0_45 it works fine but if I use
openjdk-1.7.0.51 from the sl-security repo it does not.


Does anyone know how openjdk gets its timezone information?
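(For anyone else chasing this: as far as I know, OpenJDK on RHEL-family
systems usually checks the TZ environment variable, then ZONE= in
/etc/sysconfig/clock, and finally tries to match /etc/localtime against the
zoneinfo files from the tzdata/tzdata-java packages, so comparing those is a
reasonable first step; the zone name below is just an example:

   echo "TZ=$TZ"
   grep ZONE /etc/sysconfig/clock
   md5sum /etc/localtime /usr/share/zoneinfo/America/Los_Angeles   # should match if PDT is intended
   rpm -q tzdata tzdata-java
)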

Thanks,
Joe


Re: Requiered steps to configure samba

2013-09-22 Thread Joseph Areeda
Hi Pritam,

I think this article would be a good place to start:

https://www.linux.com/learn/tutorials/296391-easy-samba-setup

Joe

On 09/21/2013 11:32 PM, Pritam Khedekar wrote:
> Hi All,
>
> I want to configure samba -- what i want to do is, i have windows
> machine share dir.. want to access in my SL 6.4 fermi... please reply
> with step by step instructions. I am new to linux bt i know the power
> of linux...


Re: slow loading browser homepage

2013-09-15 Thread Joseph Areeda

On 09/15/2013 07:15 AM, sascha.fo...@safo.at wrote:
> On Sat, 14 Sep 2013 20:46:55 -0700
> Todd And Margo Chester  wrote:
>> On 09/14/2013 05:34 PM, Tom Rosmond wrote:
>>> T.
>>>
>>> No luck.  Making your suggested changes didn't solve the problem.  I
>>> think it is because for some reason 'resolv.conf' didn't recreate, even
>>> after a reboot.  So without it there was no nameservice and nothing
>>> worked.
>> I forgot to tell you to restart your networking daemon.   Sorry.
>>
>>> I put the original back in place and that restored nameservice,
>>> but at the original slowdown.  I assume this is because of the DNS
>>> mismatch between 'ifcfg-eth0' and 'resolv.conf'?   I tried putting the
>>> Google DNS values in 'resolv.conf' and restarting 'eth0', and now the
>>> file was recreated, but with my own router and ISP nameservice
>>> addresses.  The 'dhclient' deamon seems to insist on that.
>>>
>>> This problem is not unique to me.  I see similar threads in various
>>> Linux forums (Ubuntu, Redhat, etc) complaining about slow nameservice
>>> compared to Windows.  And no clear resolution of the problem.
>> You have probably gone as far as you can go.
> The easiest solution would be to follow Joseph Areeda's advice and
> check the routers DHCP-Server configuration.
>
> As we can see in the dhclient-eth0.leases file the router sends the
> following DNS-Servers and Defaultgateway:
> Primary: 192.168.0.1
> Secondary: 216.177.225.9
> Gateway: 192.168.1.1
>
> Now you should already see what's wrong here. Since this is a home
> router it will probably put itself as the primary DNS-Server in the
> network, but as you can see it points to another IP address (192.168.0.1).
>
> Is there actually a DNS-Server running at that IP? (I guess not)
>
> Now to the question of why it works in Windows XP and not in Scientific
> Linux. That's because of the difference in how the resolvers work.
> Windows XP sends a request to all configured DNS-Servers and just takes
> the first response (the secondary DNS answers in that case).
>
> Scientific Linux sends a request to the primary DNS-Server, waits for 5
> seconds and if there is no answer it will move on to the secondary DNS.
> This is why every lookup takes an additional 5 seconds in your case.
>
> Regards,
> Sascha
One other option that hasn't been discussed is to use a fixed IP address
and specify everything manually.  It's fairly straightforward with the
Network Manager GUI or the /etc/sysconfig/network-scripts configuration files.

The reasons this may be a viable option are:

  * If you want to use the SL system as a server of some sort for other
system in your LAN
  * You're uncomfortable messing with the router's DHCP settings.

Typically these routers will allocate a small block of IP addresses for
DHCP somewhere between 20 and 50.  You must manage the rest of the
address space and be sure to only assign each IP to one device.
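A minimal static setup, for example in
/etc/sysconfig/network-scripts/ifcfg-eth0 (the addresses are placeholders;
use values that match your LAN), looks something like:

   DEVICE=eth0
   ONBOOT=yes
   BOOTPROTO=none
   IPADDR=192.168.1.100
   NETMASK=255.255.255.0
   GATEWAY=192.168.1.1
   DNS1=192.168.1.1

followed by a restart of the network service (or of the interface in
Network Manager).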

I've also found that printers work much better with fixed IP.

Joe



Re: [SCIENTIFIC-LINUX-USERS] Metadata file does not match checksum

2013-07-10 Thread Joseph Areeda


On 07/10/2013 07:54 AM, Pat Riehecky wrote:

yum clean expire-cache


Thanks Pat,

I did try a 'yum clean', which I think is 'clean all' including expire-cache.
Anyway, trying that explicitly does not seem to fix it:

   joe@george:~$ sudo yum clean expire-cache
   Loaded plugins: fastestmirror, refresh-packagekit, security
   Cleaning repos: CONDOR-stable LDG_EPEL6 LDG_SL6.1-base
   LDG_SL6.1-securityupdates VDT-Production-sl6
  : elrepo google-chrome lscsoft lscsoft-testing
   rpmforge sl sl-livecd-extra sl-security
  : sl6x sl6x-security
   17 metadata files removed
   joe@george:~$ yum info swig2
   Loaded plugins: fastestmirror, refresh-packagekit, security
   Loading mirror speeds from cached hostfile
 * elrepo: elrepo.org
 * rpmforge: mirror.hmc.edu
 * sl: ftp1.scientificlinux.org
 * sl-security: ftp1.scientificlinux.org
 * sl6x: ftp1.scientificlinux.org
 * sl6x-security: ftp1.scientificlinux.org
   sl-livecd-extra/primary |  28 kB 00:00
   
http://www.livecd.ethz.ch/download/sl-livecd-extra/6.4/x86_64/repodata/primary.xml.gz:
   [Errno -1] Metadata file does not match checksum
   Trying other mirror.
   Error: failure: repodata/primary.xml.gz from sl-livecd-extra: [Errno
   256] No more mirrors to try.

I wonder if I should remove that repo from the list.  I'm not sure what 
packages are in it though.
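(If it helps, yum-utils has a couple of commands for answering that, assuming
the repo's metadata can still be read:

   repoquery --repoid=sl-livecd-extra -a    # what the repo offers
   yumdb search from_repo sl-livecd-extra   # what on this box was installed from it
)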


Joe


Metadata file does not match checksum

2013-07-10 Thread Joseph Areeda

Greetings,

I'm getting an error from yum (see below) on a system installed from LiveCD.

I remember reading the solution to this but can't seem to find that 
email or website.  My memory and search abilities seem to be fading.


Anybody know the mystic incantation off the top of their head?

Thanks

Joe

   ~$ yum search swig
   Loaded plugins: fastestmirror, refresh-packagekit, security
   Loading mirror speeds from cached hostfile
 * elrepo: elrepo.org
 * rpmforge: mirror.hmc.edu
 * sl: ftp1.scientificlinux.org
 * sl-security: ftp1.scientificlinux.org
 * sl6x: ftp1.scientificlinux.org
 * sl6x-security: ftp1.scientificlinux.org
   sl-livecd-extra/primary |  28 kB 00:00
   
http://www.livecd.ethz.ch/download/sl-livecd-extra/6.4/x86_64/repodata/primary.xml.gz:
   [Errno -1] Metadata file does not match checksum
   Trying other mirror.
   Error: failure: repodata/primary.xml.gz from sl-livecd-extra: [Errno
   256] No more mirrors to try.


J


Re: LO destroyed envelopes

2013-06-25 Thread Joseph Areeda
This may be obvious and obnoxious but I've dealt with similar printer 
problems by printing to a pdf then printing the pdf.


The only thing good to say about it is you don't have to look at the ppd's.

Joe

On 06/25/2013 02:06 PM, Mark Stodola wrote:

On 06/25/2013 01:21 PM, Todd And Margo Chester wrote:

On 06/25/2013 10:40 AM, Mark Stodola wrote:

On 06/25/2013 12:07 PM, Todd And Margo Chester wrote:

Hi All,

Can you guys tell if this is finger pointing
or if this really is not a Libre Office problem?

https://bugs.freedesktop.org/show_bug.cgi?id=42327

Many thanks,
-T


Having had my fair share of odd behavior with CUPS, I would lean toward
that as the culprit. There are several different filters that get used
depending on the mime type provided. For instance, on SL 5, texttopaps
did very bad things, causing me to force texttops for text/plain
processing. Some of these filters have been known to double-rotate,
which might be what you are experiencing. It might also be worth
skimming through the ppd for the printer to see if the paper definitions
or orientation are wrong. If the ppd contains a page orientation, and
the program specifies a rotation, this can also lead to incorrect
orientation.

-Mark



Hi Mark,

The frustrating thing is that I have no problems printing
from anything else. I can print an envelope just fine
from Wine/Word Pro, which also uses CUPS. Other programs,
portrait or landscape, print just as they are told.

Anyway, I opened up the following with Red Hat:
https://bugzilla.redhat.com/show_bug.cgi?id=977976

Maybe, someday, I will be able to print an envelope
through LO.

Thank you for your response.
-T


If you have the time and patience, you can look into turning up the 
debug/log level of cups to see what is going on between the programs. 
You can also intercept the print queue contents by leaving the spool 
enabled but the printer disabled.  With enough poking, you should be 
able to pin down where the problem exists.
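(Roughly, with a made-up queue name:

   cupsctl --debug-logging           # raise cupsd's log level; watch /var/log/cups/error_log
   cupsdisable Envelope_Printer      # stop the printer but leave the queue accepting jobs
   # print the envelope from LO as usual, then look at the held job data:
   ls -l /var/spool/cups/            # job data files look like d00042-001 (numbers will differ)
   less /var/spool/cups/d00042-001   # the generated PostScript and job headers
   cupsenable Envelope_Printer       # re-enable when done
)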


I am guessing (not sure) that most word processors generate postscript 
and send it to CUPS.  You can compare the postscript generated in the 
spool from each of your programs to see how they differ.  It may be 
worthwhile to open the postscript in a text editor to see if there is 
other meta-data that could be affecting the outcome as well.


-Mark



Re: Help finding a hardware problem (I think)

2013-04-24 Thread Joseph Areeda
I can't thank you all enough for bearing with me as I stumble my way 
through this.


I now understand the logic behind running memtest uninterrupted for a 
long period (>24hr) and will do that.


I have to take back my comment about kmod-nvidia.  I repeatedly messed 
up /etc/selinux/config trying to disable it, and that is what was causing 
the kernel panics.  I suppose that's a sign I'm not paying enough attention.


The purpose of running from LiveCD is not to necessarily find a hardware 
problem but to remove the hard disks and the installed software from the 
equation.  The idea being IF I got one of these rare and random failures 
while running that way I could rule out insidious package conflicts, 
mangled configurations and the system disk as the cause.


As far as finding a computer repair professional whom I would go to for 
a problem like this, well all I can say is I've been living in this town 
for 32 years working in computing, I do have an outstanding doctor, a 
great car mechanic, an exceptional plumber... but I haven't found a 
computer guy better than me at this.  That is not to imply that I am any 
good at it.


I am now up and running with SL6.4 on a spinning disk (to remove the SSD 
and a bunch of useful and needed packages from the equation). I'll try to 
get some work done today and see if it crashes.


My next step is to swap memory and GPU with another box and see if the 
problem follows.


I hope I'm not posting too much useless (to others) information to the list.

Joe

On 04/24/2013 09:10 AM, Yasha Karant wrote:
A small comment:  stress testing is cumulative only if the underlying 
system has no recovery mechanism. (An understanding of this in detail 
requires non-equilibrium statistical mechanics but can be summarized 
with non-equilibrium "thermodynamics").  My experience with failing 
electronics and magnetics -- depending upon the exact failure mode -- 
is that non-interrupted stress testing is better than interrupted in 
terms of finding failures.  A simple example: suppose a failure mode 
is temperature dependent, and temperature depends upon the amount of 
work being done.  An interrupted but cumulative stress test might 
never reach the "critical" temperature, whereas a continued stress 
test might.


Yasha Karant

On 04/24/2013 08:03 AM, Joseph Areeda wrote:

Thanks for the tips Konstantin,

I assume that your recommendation for 24 hrs of memtest is cumulative
and I can probably see the same results starting it each night when I
quit for the day.

When I mentioned SMART I was talking about the self tests not the status
that comes up.  I've also copied large files around and checked their
md5sum's.

I played with LiveCD for 4 or 5 hours today, much of it was trying to
install it on a different spinning hard drive.

I did see one time when the SSD was shown in the disk utility but all
the partitions were zero length.  That's where my root directory used 
to be.


I also found that the nvidia drivers in ELREPO don't seem to work with
6.4.  I seem to be able to run fine (at least for a while) unless I
install kmod-nvidia then I get a kernel panic on the next reboot (3
times until I tracked it down).  It says something like "not syncing
attempt xxx(can't read my writing) PID 1 comm init not tainted
2.6.32.258.2.1.  That's another problem I think.

Right now I suspect not necessarily in order:

  * Bad SSD.  Run time is reported as 1.8 years.  I did have /usr
/usr/local /tmp swap and /home on spinning media but...
  * Bad memory:  still a good possibility
  * Some insidious incompatibility with all packages from multiple
repos.  I really hope it's not that, I don't load much I don't need.

And as for finding a real computer repairman, let me know if you have
one in Los Angeles.  This is similar to a problem I had with an iMac.
The geniuses at the store took three trips to convince them something
was wrong and that was after about an hour each time with the phone
support people.  That one turned out to be a flaky memory DIMM that
passed all the quick diagnostics.

Oh well the saga continues.  It's nice to have a group to go to for ideas.
Thank you all.

Joe


On 04/23/2013 04:20 PM, Konstantin Olchanski wrote:

On Tue, Apr 23, 2013 at 11:44:22AM -0700, Joseph Areeda wrote:
I'm having this strange behavior that I think is a hardware problem 
...

* System freezes, mouse and keyboard dead, sshd unresponsive sometimes

First action is to run memtest86 (Q: which one? google finds 
several. A: all of them).


Run memtest86 for 24 hours at least - if it reports memory errors, 
hangs, freezes or
machine turns off, you definitely have a hardware problem. Suspect 
parts
are in this order: RAM, power supply, CPU socket (bent pins), mobo, 
CPU.


If memtest86 runs fine for 24 hours and more, there *still* could be
a hardware problem. (memtest86 does not test the video, the disk, the
network and the usb interfaces.)

Re: Help finding a hardware problem (I think)

2013-04-24 Thread Joseph Areeda

Thanks for the tips Konstantin,

I assume that your recommendation for 24 hrs of memtest is cumulative 
and I can probably see the same results starting it each night when I 
quit for the day.


When I mentioned SMART I was talking about the self tests not the status 
that comes up.  I've also copied large files around and checked their 
md5sum's.
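(To be specific, I mean the smartmontools self tests, something along the
lines of the following with your own device name:

   smartctl -t long /dev/sda       # start the extended offline self-test
   smartctl -l selftest /dev/sda   # read the results once it finishes
   smartctl -H -A /dev/sda         # overall health plus the raw attribute table
)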


I played with LiveCD for 4 or 5 hours today, much of it was trying to 
install it on a different spinning hard drive.


I did see one time when the SSD was shown in the disk utility but all 
the partitions were zero length.  That's where my root directory used to be.


I also found that the nvidia drivers in ELREPO don't seem to work with 
6.4.  I seem to be able to run fine (at least for a while) unless I 
install kmod-nvidia then I get a kernel panic on the next reboot (3 
times until I tracked it down).  It says something like "not syncing 
attempt xxx(can't read my writing) PID 1 comm init not tainted 
2.6.32.258.2.1.  That's another problem I think.


Right now I suspect not necessarily in order:

 * Bad SSD.  Run time is reported as 1.8 years.  I did have /usr
   /usr/local /tmp swap and /home on spinning media but...
 * Bad memory:  still a good possibility
 * Some insidious incompatibility with all packages from multiple
   repos.  I really hope it's not that, I don't load much I don't need.

And as for finding a real computer repairman, let me know if you have 
one in Los Angeles.  This is similar to a problem I had with an iMac.  
The geniuses at the store took three trips to convince them something 
was wrong and that was after about an hour each time with the phone 
support people.  That one turned out to be a flaky memory DIMM that 
passed all the quick diagnostics.


Oh well the saga continues.  It's nice to have a group to go to for ideas.  
Thank you all.


Joe


On 04/23/2013 04:20 PM, Konstantin Olchanski wrote:

On Tue, Apr 23, 2013 at 11:44:22AM -0700, Joseph Areeda wrote:

I'm having this strange behavior that I think is a hardware problem ...
* System freezes, mouse and keyboard dead, sshd unresponsive sometimes


First action is to run memtest86 (Q: which one? google finds several. A: all of 
them).

Run memtest86 for 24 hours at least - if it reports memory errors, hangs, 
freezes or
machine turns off, you definitely have a hardware problem. Suspect parts
are in this order: RAM, power supply, CPU socket (bent pins), mobo, CPU.

If memtest86 runs fine for 24 hours and more, there *still* could be a hardware
problem. (memtest86 does not test the video, the disk, the network
and the usb interfaces).


disk utility show ... SMART [is] fine.


SMART "health report" is useless. I had dead disks report "SMART OK" and perfectly 
functional disks report "SMART Failure, replace your disk now".

This is free advice. For advice that would actually get your computer
working again, you would want to hire a proper computer repairman.





Re: how to find internet dead spots

2013-03-27 Thread Joseph Areeda

Hi Todd,

If you mean the DSL goes out for a while, what I've done is pretty low-tech
but works for reporting downtime.


A cron job from inside that pings a couple of servers on the outside and 
one on the outside that pings the server in question.  I usually grep 
for the summary line and redirect it out to a log file.


Something like:

   #!/bin/bash
   # ping a few hosts and log the packet-loss and total-time fields
   ips="example.com  another.example.com "
   for ip in $ips; do
       dat=`date +"%Y%m%d %H%M"`
       # $6 and $10 are the "% packet loss" and time fields of ping's summary line
       res=`ping -c 3 $ip | grep loss | awk '{print $6, ",", $10 }'`
       echo $dat "," $ip "," $res >> /home/joe/ping.stats
   done
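and a crontab entry to run it, say, every five minutes (the script path is
just an example):

   */5 * * * * /home/joe/bin/pingcheck.sh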

Joe


On 3/27/13 12:36 PM, Todd And Margo Chester wrote:


Hi All,

I have a Cent OS 5.x server sitting on a DSL line
acting as a firewall.  I have noticed that there are
dead spots, up to a minute, every so often in their
Internet service.

It could be a storm on someone's part, but the worst
they run is IMAP.  No music; no video.

Is there a utility I can run to map this?

Many thanks,
-T






Resolved: ssh returns "Permission denied (gssapi-keyex,gssapi-with-mic)."

2013-02-08 Thread Joseph Areeda
Well this has been a thorn in my side for months but I think I've 
figured it out.  At least I found a plausible reason for it and it's 
been working longer than it has before.


The problem turned out to be that I had both gsisshd and sshd running, and the 
fix was to use chkconfig to disable gsisshd.
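(In case anyone hits the same thing, the fix was basically the following,
assuming the service is registered as gsisshd:

   sudo service gsisshd stop          # stop the conflicting daemon now
   sudo chkconfig gsisshd off         # keep it from starting at boot
   chkconfig --list | grep -i ssh     # confirm only sshd is left enabled
)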


The really weird part that made it hard to figure out was that ssh would 
work for days then suddenly stop.  "sudo service sshd restart" would get 
it to work again for a few days.


I had installed the gsi server stuff because we will (hopefully) move to 
that certificate based access soon, not thinking that it would be 
enabled on install.


The take home lesson is think before you install potentially conflicting 
services.


Thanks,
Joe

On 11/21/2012 02:16 PM, Joseph Areeda wrote:

I can't figure out what causes this error.

I can "fix" it by regenerating the server key on the system I'm trying 
to connect to and restarting sshd but that seems to be temporary as 
the same problem comes back in a week or so.  Rebooting the server 
does not fix it.


Does anyone know what that error means?  I am using ssh not gsissh 
although I do have globus toolkit installed to contact grid computers.


I'm pretty sure it's a misconfiguration on my part but I can't figure 
out what I did or didn't do.


Thanks,

Joe


Re: Will HTML5 eventually sub for Java?

2013-01-18 Thread Joseph Areeda

Well, I'll add my 2¢ but don't think I have a definitive answer for you.

First of all, HTML5 is meant to obviate the need for many browser
plugins and, when combined with JavaScript, will be able to substitute for
some of the things applets are used for.


Java is much more than a browser plug-in and HTML5 has nothing to do 
with its other uses.


Joe

On 01/18/2013 05:26 PM, Todd And Margo Chester wrote:

Hi All,

With all the security problems in Java right now, does
anyone know if HTML5 will eventually sub for Java?

And, will HTML5 have its own list of prodigious security
problems?

Many thanks,
-T


Re: openmpi compilation options

2013-01-11 Thread Joseph Areeda

Arnau,
I'm an Open MPI novice but I believe this FAQ answers your question: 
http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge
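(You can also ask the installed build directly; if it was compiled with
--with-sge, ompi_info should list a gridengine component, something like:

   ompi_info | grep -i gridengine
)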


Joe

On 01/11/2013 06:53 AM, Arnau Bria wrote:

Hi all,

I'd like to know if openmpi (1.5.4-1.el6) provided by SL6.3 was compiled
with the option:

--with-sge

I don't know where I should look for it, so, apart from the reply, if
someone could tell me how to do it in the future I'll  really
appreciate it!

TIA,
Arnau


Re: SL 6 etc. on ARM CPU units

2012-12-08 Thread Joseph Areeda
I'm pretty sure there are Debian ports for ARM including the Raspberry Pi.
Here's an interesting project out of the UK
http://www.southampton.ac.uk/~sjc/raspberrypi/ where the guy built a
64-node cluster using Lego for the supports.


I'm also sure it was a lot of work like others have mentioned.

Perhaps when the upstream providers get the kernel and the drivers going 
in the Fedora and RedHat branches we'll see SL7 or 8 available for ARM also.


Joe

On 12/07/2012 11:27 AM, Konstantin Olchanski wrote:

Please do not confuse 3 separate issues:

1) Linux userland: this is pretty much universal and will
run on any CPU as long as you have a cross-compiler
and as long as the "autoconf" tools do not try too hard
to prevent you from cross-compiling the stuff.

2) Linux kernel: is also pretty much universal and assumes
very little about the CPU. There *is* some assembly code
that needs to be ported when you move between CPUs (say
from hypothetical SuperARM to hypothetical HyperARM). I believe
current versions of Linux kernel have this support for
all existing ARM CPU variations.

3) Linux device drivers: in the PC world devices are standardized
around the PCI bus architecture (from the CPU, PCIe looks like PCI,
on purpose) and most devices drivers are universal, so if you
have a PCI/PCIe based ARM machine with PC-type peripherals ("South Bridge",
ethernet, video, etc), you are good to go. If you have an ARM machine
with strange devices (i.e. the RaspberryPI), you have to wait
for the manufacturer to release the specs, then you can write
the drivers, then you can run Linux. Rinse, repeat for the next
revision of the CPU ASIC (because they moved the registers around
or used a slightly different ethernet block). It helps if you have
some standardized interfaces, for example on the RaspberryPI you have
standard USB, so you can use "all supported" USB-Wifi adapters right away.

4) boot loader: is different for each type of machine, each type
of boot device media. period. (Even on PCs there is no longer any
single standard - some use old-school BIOS booting, others use EFI boot,
some need BIOS/ACPI help, some do not).

This makes it 4 issues, if you count the first (linux userland) non-issue.


K.O.


On Fri, Dec 07, 2012 at 01:01:36PM -0600, SLtryer wrote:

On 10/23/2012 12:37 PM, Konstantin Olchanski wrote:

An "ARM platform" does not exist.

Unlike the "PC platform" where "PC hardware" is highly standardized
and almost any OS can run on almost any vendor hardware,
the "ARM platform" is more like the early Linux days where instead
of 3 video card makers there were 23 of them, all incompatible,
all without Linux drivers. If you had the "wrong" video card,
too bad, no soup for you.

In the ARM world, there is a zoo of different ARM processors,
all incompatible with each other (think as if each Android device
had a random CPU - a 16-bit i8086, or a 32-bit i386, or a 64-bit i7 -
the variation in capabilities is that high).

Then each device contains random i/o chips connected in its own
special way - there is no PCI/PCIe bus where everything is standardized.
There are several WiFi chips, several Bluetooth, USB, etc chips. Some
have Linux drivers, some do not.

As result, there is no generic Linux that will run on every ARM machine.

Not to be argumentative, but I always believed that the advantage of
*nix* was that it could be ported to numerous platforms, regardless
of hardware.  You even mention the "early Linux days," when there
was little or no standardization of PC hardware.  Yet, the platform
didn't disappear from use simply because there might have been
porting issues, most of which were caused more by proprietary
secrets and hardware defects than the ever-present fact of diversity
of hardware.

But one could make the same argument even today:  That there are
many different CPU platforms, e.g., and that they are not
standardized.  One example I am thinking of is the Intel vs. AMD
CPU compatibility issue.  Even though most of the Linux system will
run on either without modification, there are still some unique
issues to each of them; from having worked and studied VirtualBox,
there are differences in how each manufacturer chose to implement
the ring structure that permits virtualization to work as nicely as
it does on these platforms.  For the most part, they are compatible,
but the kernel developers have to be aware of certain implemention
issues, including a bug in the Intel CPU platform that requires a
VirtualBox workaround (for optimizing the code or something; I
forget).

And this is in addition to Linux supporting umpteen different
processing platforms besides the x86 types.  New hardware appears
constantly, and some Linux user somewhere wants to use it on their
system.  I feel that variety of hardware and variation in hardware
implementation is a fact, and a main reason why Linux and Unix are
so powerful and ubiquitous.

Re: clients slow down due to unknown process

2012-12-01 Thread Joseph Areeda

Hi David,

I am certainly no expert but this looks to me like the classic NFS 
symptoms when the server gets overloaded, or a disk or the network gets 
flaky.


If it were me, I'd try to get the class to do more local i/o (if 
possible).  Perhaps a scratch area on the local disk would solve the 
problem.


I think you could reproduce the problem by writing a test script that
does heavy i/o to the network folders and then running it on more and more
machines, watching the i/o throughput approach zero with the machines
hung while waiting for NFS.
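(A rough sketch of such a script, with a made-up mount point:

   #!/bin/bash
   # crude NFS load/throughput probe; /net/class is a placeholder for the real mount
   target=/net/class/$(hostname)-iotest
   while true; do
       start=$(date +%s)
       dd if=/dev/zero of="$target" bs=1M count=256 conv=fsync 2>/dev/null
       echo "$(date '+%F %T') $(hostname) wrote 256MB in $(( $(date +%s) - start )) s"
       rm -f "$target"
       sleep 30
   done
)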


Again, I'm no expert; feel free to ignore me.

Joe

On 11/29/2012 10:49 AM, David Fitzgerald wrote:

Last night during class time I had a chance to check some of the machines with 
the frozen displays, and I am not sure what to make of what I found.  Running 
'lsof -p $PID' (with PID being 5044) on one of the affected machines gave 
this, which doesn't tell me much:

10.10.10 5044 root  cwd   DIR8,7 40962 /
10.10.10 5044 root  rtd   DIR8,7 40962 /
10.10.10 5044 root  txt   unknown  /proc/5044/exe


I also ran pstree and I will put that output below, but I think I may be 
barking up the wrong tree.  While some of my clients were freezing up, I saw 
that my NFS server was getting very high 'top' loads.  Fortunately I  have 
sysstat running on the server and after class 'sar -u' showed that %iowait went 
from less than 1 before class to a high of 53 after class began, and stayed 
high until class ended.  Here is the relevant 'chunk' of the sar -u  output:

                CPU     %user     %nice   %system   %iowait    %steal     %idle
05:20:01 PM all  0.03  0.00  0.07  0.17  0.00 99.73
05:30:01 PM all  0.03  0.00  0.03  0.11  0.00 99.83
05:40:01 PM all  0.18  0.00  0.50  1.88  0.00 97.44
05:50:01 PM all  0.16  0.00  1.12  6.93  0.00 91.78
06:00:01 PM all  0.73  0.00  5.23 32.61  0.00 61.43
06:10:01 PM all  0.77  0.00  6.55 53.67  0.00 39.01
06:20:01 PM all  0.13  0.00  4.81 27.81  0.00 67.25
06:30:01 PM all  0.13  0.00  6.69 21.71  0.00 71.47
06:40:01 PM all  0.11  0.00  3.47 33.34  0.00 63.08
06:50:01 PM all  0.11  0.00  3.20 31.02  0.00 65.67
07:00:01 PM all  0.24  0.00  3.93 30.79  0.00 65.05
07:10:01 PM all  0.16  0.00  3.63 20.51  0.00 75.71
07:20:01 PM all  0.18  0.00  5.23  1.45  0.00 93.13
07:30:01 PM all  0.10  0.00  5.72  0.70  0.00 93.48
Average:all  0.06  0.01  0.46  2.13  0.00 97.34


  The NFS server is a virtual machine running in ESXi 4.1 and VMware Tools IS 
installed.  Could this be slow disk access, and thus a VMware misconfiguration? 
 I hate to admit it, but I am at a loss.

I can run other sar reports on yesterday's (Wednesday's) data if anyone thinks 
there may be something in there to help.

For what its worth, here is the output from pstree from one of the affected 
clients, and I do NOT see the PID that I was looking for:

init(1)-+-NetworkManager(1782)-+-dhclient(1808)
 |  `-{NetworkManager}(1809)
 |-abrtd(2341)
 |-acpid(2039)
 |-anacron(3615)
 |-atd(2413)
 |-atieventsd(2421)---authatieventsd.(4134)
 |-auditd(1547)-+-audispd(1549)-+-sedispatch(1550)
 |  |   `-{audispd}(1551)
 |  `-{auditd}(1548)
 |-automount(2134)-+-{automount}(2135)
 | |-{automount}(2136)
 | |-{automount}(2139)
 | |-{automount}(2142)
 | |-{automount}(2143)
 | `-{automount}(2144)
 |-avahi-daemon(1794)---avahi-daemon(1795)
 |-bonobo-activati(4549)---{bonobo-activat}(4550)
 |-cachefilesd(1597)
 |-certmonger(2435)
 |-clock-applet(4644)
 |-console-kit-dae(2521)-+-{console-kit-da}(2522)
 |   |-{console-kit-da}(2523)
 |   |-{console-kit-da}(2524)
 |   |-{console-kit-da}(2525)
 |   |-{console-kit-da}(2526)
 |   |-{console-kit-da}(2527)
 |   |-{console-kit-da}(2528)
 |   |-{console-kit-da}(2529)
 |   |-{console-kit-da}(2530)
 |   |-{console-kit-da}(2531)
 |   |-{console-kit-da}(2532)
 |   |-{console-kit-da}(2533)
 |   |-{console-kit-da}(2534)
 |   |-{console-kit-da}(2535)
 |   |-{console-kit-da}(2536)
   

Re: ssh returns "Permission denied (gssapi-keyex,gssapi-with-mic)."

2012-11-22 Thread Joseph Areeda

Thanks for the comments Paul.

I was surprised when I joined the collaboration and saw home directories 
world readable but that decision was made long before I arrived and 
changing it remains above my pay grade.


The reason I doubt that's my current problem is because regenerating the 
server key files works.  I can log in fine today and I haven't changed 
permissions.  I also don't have a problem logging into other systems from 
that machine that are [supposed to be] set up the same way.


When it happens again, I will check if changing permissions helps.

Also, for the record, I waited until my existing Kerberos tickets 
expired.  These are for other services, not that machine.  I can log in 
fine with an expired or valid TGT hanging around and after kdestroy.


Happy holidays,
Joe




On 11/22/2012 08:32 AM, Paul Robert Marino wrote:


Well there is your problem
The user's home directory needs to be 700 unless you turn off strict 
key checking in the sshd configuration file. Also the public key 
should be 600 as well.


Making home directories world- or group-readable isn't a good plan for 
collaboration because many applications store sensitive information 
like passwords and cached information like session data in the home 
directory. Instead consider creating group directories and setting the 
setgid bit on them so the group permissions are inherited by any files 
created in the directories.
Making home directories world- or group-readable is a lazy solution to 
an easily solved problem. It's a common mistake that causes loads of 
problems because many applications which are written to be secure 
purposely break when you do it.
I highly suggest you come up with a better plan for collaboration than 
that.
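(For example, with a hypothetical group named gridusers:

   mkdir /shared/gridwork
   chgrp gridusers /shared/gridwork
   chmod 2775 /shared/gridwork   # the leading 2 is the setgid bit: new files inherit the group
)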


On Nov 21, 2012 11:10 PM, "Joseph Areeda" <newsre...@areeda.com> wrote:


On 11/21/2012 07:08 PM, Alan Bartlett wrote:

    On 22 November 2012 01:18, Joseph Areeda <newsre...@areeda.com> wrote:

The user's directory is 755 which is the convention for
grid computers in
our collaboration and the plan is for this machine to be
on our soon to be
delivered cluster.  The .ssh directory is 700.  This
doesn't change between
the working and non-working state.

Good, you've checked the directory.

Now what about the files within it? Hopefully they are all 600?

Alan.

Alan,

The private keys are all 600 and the public keys are 644.  I keep
a few different ones for going to different systems.

Joe



Re: ssh returns "Permission denied (gssapi-keyex,gssapi-with-mic)."

2012-11-21 Thread Joseph Areeda

On 11/21/2012 07:08 PM, Alan Bartlett wrote:

On 22 November 2012 01:18, Joseph Areeda  wrote:

The user's directory is 755 which is the convention for grid computers in
our collaboration and the plan is for this machine to be on our soon to be
delivered cluster.  The .ssh directory is 700.  This doesn't change between
the working and non-working state.

Good, you've checked the directory.

Now what about the files within it? Hopefully they are all 600?

Alan.

Alan,

The private keys are all 600 and the public keys are 644.  I keep a few 
different ones for going to different systems.


Joe


Re: ssh returns "Permission denied (gssapi-keyex,gssapi-with-mic)."

2012-11-21 Thread Joseph Areeda

Thank you Paul, Steven and Steve,

I think Kerberos may be the issue.  I do NOT use Kerberos to access this 
machine, I have a lot to learn before I turn that and LDAP on.  But I do 
use it to access several services in our collaboration so the client 
machine often has a valid Kerberos TGT (and probably more often an 
expired ticket).  I think it's worth experimenting with the client in 
different states of Kerberosity (or whatever that word should be).


The user's directory is 755 which is the convention for grid computers 
in our collaboration and the plan is for this machine to be on our soon 
to be delivered cluster.  The .ssh directory is 700.  This doesn't 
change between the working and non-working state.


I tarred the /etc/ssh directory and saved it for next time but wouldn't 
generating new keys make them almost completely different?  Generating 
new keys makes no sense to me either, but it does work.  Well, at least 
it has been the only thing I've done coincident with resolving the 
problem the last 3 times this has happened.


I also save the triple verbose ssh output.

I really appreciate the discussion gentlemen, it helps a lot.

Best,
Joe

On 11/21/2012 04:58 PM, Paul Robert Marino wrote:
On Nov 21, 2012 7:57 PM, "Paul Robert Marino" <prmari...@gmail.com> wrote:


Ok
To be clear, are you using kerberos or not?
If the answer is no and you are just using ssh keys, the most
common cause of this issue is that the user's home directory is
group or world readable. In the most secure mode, which is the
default, if the user's home and/or the ~/.ssh directory has
anything other than 700 or 500 set as the permissions it will
reject the public key (the one on the server you are trying to
connect to); this becomes obvious with -vvv but not -vv.

On Nov 21, 2012 7:34 PM, "Steven C Timm" <t...@fnal.gov> wrote:

Shouldn’t need to regenerate the keys. Once you get them
generated they should be good for the life of the machine.

Save copies of the keys as they are now and if your system
goes bad, do differences to see what changed, if anything.

Steve Timm

*From:* owner-scientific-linux-us...@listserv.fnal.gov
[mailto:owner-scientific-linux-us...@listserv.fnal.gov] *On Behalf Of* Joseph Areeda
*Sent:* Wednesday, November 21, 2012 5:46 PM
*To:* owner-scientific-linux-us...@listserv.fnal.gov
*Cc:* scientific-linux-users
*Subject:* Re: ssh returns "Permission denied
(gssapi-keyex,gssapi-with-mic)."

Thank you Tam, and Steven,

I just confirmed that regenerating the keys (ssh-keygen -t dsa
-f ssh_host_dsa_key && ssh-keygen -t rsa -f ssh_host_rsa_key) in
/etc/ssh "fixes the problem"

So ssh -vv shows me how it's supposed to look.  I'll save that
and do a diff when it happens again.

As I continue my googling I can report on a few things it's not

Server machine has a fixed ip address and dns/rdns appears
working.

Time issue Steven mentioned does not seem to be it, although I
may stop using pool machines and set up a local ntp server so
everybody gets the same time.  I can ssh and gsissh to other
servers.

Server:
ntpq -p

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ping-audit-207- .ACTS.           1 u    5  128  377   19.867    5.804   1.927
+10504.x.rootbsd 198.30.92.2      2 u  129  128  376   45.146  -28.571   5.558
+ntp.sunflower.c 132.236.56.250   3 u   77  128  355   63.836  -14.753   5.360
-ntp2.ResComp.Be 128.32.206.55    3 u  126  128  377   22.112    7.311   2.022



Client:

ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 64.147.116.229  .ACTS.           1 u   47  128    0   13.543    0.567   0.000
*nist1-chi.ustim .ACTS.           1 u   25  128  377  106.619   14.458   5.896
+name3.glorb.com 69.36.224.15     2 u   64  128  377   88.564  -27.542   3.631
+131.211.8.244   .PPS.            1 u   81  128  377  167.107    3.259   2.340





The only setting I change in sshd_config is to turn off
password auth but this machine is being brought up behind a firewall
and I haven't done that yet.

Re: ssh returns "Permission denied (gssapi-keyex,gssapi-with-mic)."

2012-11-21 Thread Joseph Areeda

Thank you Tam, and Steven,

I just confirmed that regenerating the keys (ssh-keygen -t dsa -f 
ssh_host_dsa_key && ssh-keygen -t rsa -f ssh_host_rsa_key) in /etc/ssh "fixes 
the problem"


So ssh -vv shows me how it's supposed to look.  I'll save that and do a 
diff when it happens again.


As I continue my googling I can report on a few things it's not

Server machine has a fixed ip address and dns/rdns appears working.

Time issue Steven mentioned does not seem to be it, although I may stop 
using pool machines and set up a local ntp server so everybody gets the 
same time.  I can ssh and gsissh to other servers.


Server:
ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ping-audit-207- .ACTS.           1 u    5  128  377   19.867    5.804   1.927
+10504.x.rootbsd 198.30.92.2      2 u  129  128  376   45.146  -28.571   5.558
+ntp.sunflower.c 132.236.56.250   3 u   77  128  355   63.836  -14.753   5.360
-ntp2.ResComp.Be 128.32.206.55    3 u  126  128  377   22.112    7.311   2.022


Client:

ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 64.147.116.229  .ACTS.           1 u   47  128    0   13.543    0.567   0.000
*nist1-chi.ustim .ACTS.           1 u   25  128  377  106.619   14.458   5.896
+name3.glorb.com 69.36.224.15     2 u   64  128  377   88.564  -27.542   3.631
+131.211.8.244   .PPS.            1 u   81  128  377  167.107    3.259   2.340




The only setting I change in sshd_config is to turn off password auth 
but this machine is being brought up behind a firewall and I haven't 
done that yet.  Also if it was a config problem I doubt changing the key 
would fix it, even temporarily.


I will report back with the ssh -vv stuff when it happens again.
At least now I have a chance of figuring out what's going on.

Best,
Joe


On 11/21/2012 02:30 PM, Tam Nguyen wrote:

Hi Joe,
Did you look at the sshd_config file?
I ran into a similar error output but it may not necessarily be the
same issue you're having.  In my case, the sshd_config file on one of my
users' machines was edited and renamed.  I backed up that file, copied in a
default sshd_config file, then tested it.


Good luck.
-T

On Wed, Nov 21, 2012 at 5:16 PM, Joseph Areeda <newsre...@areeda.com> wrote:


I can't figure out what causes this error.

I can "fix" it by regenerating the server key on the system I'm
trying to connect to and restarting sshd but that seems to be
temporary as the same problem comes back in a week or so.
 Rebooting the server does not fix it.

Does anyone know what that error means?  I am using ssh not gsissh
although I do have globus toolkit installed to contact grid computers.

I'm pretty sure it's a misconfiguration on my part but I can't
figure out what I did or didn't do.

Thanks,

Joe




ssh returns "Permission denied (gssapi-keyex,gssapi-with-mic)."

2012-11-21 Thread Joseph Areeda

I can't figure out what causes this error.

I can "fix" it by regenerating the server key on the system I'm trying 
to connect to and restarting sshd but that seems to be temporary as the 
same problem comes back in a week or so.  Rebooting the server does not 
fix it.


Does anyone know what that error means?  I am using ssh not gsissh 
although I do have globus toolkit installed to contact grid computers.


I'm pretty sure it's a misconfiguration on my part but I can't figure 
out what I did or didn't do.


Thanks,

Joe


Re: The opposite SL and VirtualBox problem

2012-10-02 Thread Joseph Areeda

thanks,

I did spot that and updated everything so the versions match.

Stupid mistake on my part.

Joe

On 10/02/2012 02:05 PM, Akemi Yagi wrote:

I would not do that. The matching version of kernel-devel you need (
2.6.32-220.17.1.el6 ) is available here:

http://ftp.scientificlinux.org/linux/scientific/6.2/x86_64/updates/security/

Remove the link you manually created and install the right version of
kernel-devel. If/when you update the kernel, remember to update
kernel-devel to the same version.


Re: The opposite SL and VirtualBox problem

2012-10-02 Thread Joseph Areeda

Well, I'm not going to touch Nico's comment because I don't know KVM.

For me it's the Devil you know kind of thing.  I've had good experience 
with Vbox on multiple OS and am just playing in my comfort zone.


I do have reasons to explore other VMs but none of them pressing.  I 
just want to install one of the University's "free" site-license copies of 
Windows as a courtesy to our students.


Joe


On 10/2/12 3:15 AM, David Sommerseth wrote:


----- Original Message -

From: "Joseph Areeda" 
To: SCIENTIFIC-LINUX-USERS@LISTSERV.FNAL.GOV
Sent: Tuesday, 2 October, 2012 12:33:59 AM
Subject: The opposite SL and VirtualBox problem

I want to run Windows as a guest system on my Sl6.3 box.

Installing vbox from the Oracle repository gives me an error trying to
create the kernel modules.

Just a silly question.  Why bother with VirtualBox when you have KVM built into 
the OS?  Use the SPICE protocol (yum search spice) and you'll even get 
decent console performance.  And it's really easy to set up and configure using 
virt-manager.


kind regards,

David Sommerseth
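(For reference, the KVM/virt-manager route David mentions is roughly the
following on SL6; the group names are the RHEL6-family ones:

   yum groupinstall "Virtualization" "Virtualization Client" "Virtualization Platform"
   service libvirtd start
   virt-manager &    # create the Windows guest from the GUI, optionally with a SPICE display
)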




Re: The opposite SL and VirtualBox problem

2012-10-01 Thread Joseph Areeda

On 10/01/2012 04:24 PM, Akemi Yagi wrote:

On Mon, Oct 1, 2012 at 3:33 PM, Joseph Areeda  wrote:

I want to run Windows as a guest system on my Sl6.3 box.

Installing vbox from the Oracle repository gives me an error trying to
create the kernel modules.

When trying to do it manually,  I run /etc/init.d/vboxdrv -setup and get:


Stopping VirtualBox kernel modules [  OK  ]
Uninstalling old VirtualBox DKMS kernel modules[  OK  ]
Trying to register the VirtualBox kernel modules using DKMS
Error! Your kernel headers for kernel 2.6.32-220.17.1.el6.x86_64 cannot be
found at
/lib/modules/2.6.32-220.17.1.el6.x86_64/build or
/lib/modules/2.6.32-220.17.1.el6.x86_64/source.


and when I look for those files I see a broken link


ll /lib/modules/2.6.32-220.23.1.el6.x86_64/build
lrwxrwxrwx 1 root root 51 Jun 20 09:54
/lib/modules/2.6.32-220.23.1.el6.x86_64/build ->
../../../usr/src/kernels/2.6.32-220.23.1.el6.x86_64


It looks like that file should be linked to:



ls /usr/src/kernels/2.6.32-279.9.1.el6.x86_64/
arch    drivers   include  kernel    Makefile.common  net      security    tools
block   firmware  init     lib       mm               samples  sound       usr
crypto  fs        ipc      Makefile  Module.symvers   scripts  System.map  virt


I'm going to try just fixing the link, but it seems like the kernel-headers rpm
has a problem.  Or am I missing something?  It would not be the first time, or
even a rare occurrence.

Joe

You need the kernel-devel package (not kernel-headers). That version
must match your *running* kernel. You can find the version of your
running kernel by:

uname -r

Then install kernel-devel of that version.

Akemi
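
A quick way to see the mismatch being described here, before installing anything:

    uname -r             # the running kernel
    rpm -q kernel-devel  # the kernel-devel version(s) actually installed
    # the DKMS build only succeeds when one of the installed kernel-devel
    # versions matches the running kernel exactly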

Thanks Akemi,

I think I see the problem now.  A yum search produces only one listing 
for kernel-devel and yum info says:


Installed Packages
Name: kernel-devel
Arch: x86_64
Version : 2.6.32
Release : 279.9.1.el6


uname -a says
Linux  2.6.32-220.17.1.el6.x86_64 #1 SMP Tue May 15 17:16:46 
CDT 2012 x86_64 x86_64 x86_64 GNU/Linux


I'm using the repo maintained by the collaboration I'm in and there 
seems to be an issue.


For the record, fixing that broken link did allow me to build the kernel 
module and run VBox.  I wonder if I introduced any instabilities.
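
One way to check whether the forced link left anything mismatched is to look at 
what DKMS actually built and what the module was built against (vboxdrv is the 
module name from the setup output above):

    dkms status                         # shows which kernel each module was built for
    modinfo vboxdrv | grep -i vermagic  # should match the output of uname -r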


Joe


The opposite SL and VirtualBox problem

2012-10-01 Thread Joseph Areeda

I want to run Windows as a guest system on my SL6.3 box.

Installing vbox from the Oracle repository gives me an error trying to 
create the kernel modules.


When trying to do it manually,  I run /etc/init.d/vboxdrv -setup and get:


Stopping VirtualBox kernel modules [  OK  ]
Uninstalling old VirtualBox DKMS kernel modules[  OK  ]
Trying to register the VirtualBox kernel modules using DKMS
Error! Your kernel headers for kernel 2.6.32-220.17.1.el6.x86_64 
cannot be found at
/lib/modules/2.6.32-220.17.1.el6.x86_64/build or 
/lib/modules/2.6.32-220.17.1.el6.x86_64/source.


and when I look for those files I see a broken link


ll /lib/modules/2.6.32-220.23.1.el6.x86_64/build
lrwxrwxrwx 1 root root 51 Jun 20 09:54 
/lib/modules/2.6.32-220.23.1.el6.x86_64/build -> 
../../../usr/src/kernels/2.6.32-220.23.1.el6.x86_64


It looks like that file should be linked to:



ls /usr/src/kernels/2.6.32-279.9.1.el6.x86_64/
arch    drivers   include  kernel    Makefile.common  net      security    tools
block   firmware  init     lib       mm               samples  sound       usr
crypto  fs        ipc      Makefile  Module.symvers   scripts  System.map  virt


I'm going to try just fixing the link, but it seems like the kernel-headers 
rpm has a problem.  Or am I missing something?  It would not be the first 
time, or even a rare occurrence.


Joe


Re: X11 server won't start after yum upgrade

2012-07-18 Thread Joseph Areeda

Thank you Malcolm.
I'm running 64-bit but I'll bet I can find something close that might work.

Joe
On 07/18/2012 02:11 AM, Malcolm MacCallum wrote:

I downloaded (using Firefox under Windows XP)
xorg-x11-server-Xorg-1.7.7-29.el6.i686.rpm
xorg-x11-server-common-1.7.7-29.el6.i686.rpm
from the 6.1 i386 os filestore,
http://ftp.scientificlinux.org/linux/scientific/6.1/i386/os/Packages/
I then ran rpm with (if I recall well)
rpm --oldpackage -i 
(maybe also with --replacefiles)
on each of them and all was well again. I'm not saying this is the
best or even a good way to fix things!

Malcolm
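
A rough sketch of that manual downgrade, with the URLs assembled from the 
filestore path and file names given in the message above (these are the i686 
packages; a 64-bit install would want the x86_64 equivalents from the matching 
tree):

    wget http://ftp.scientificlinux.org/linux/scientific/6.1/i386/os/Packages/xorg-x11-server-Xorg-1.7.7-29.el6.i686.rpm
    wget http://ftp.scientificlinux.org/linux/scientific/6.1/i386/os/Packages/xorg-x11-server-common-1.7.7-29.el6.i686.rpm
    # -U --oldpackage is the usual downgrade invocation; -i (possibly with
    # --replacefiles) was what was actually used here
    rpm -Uvh --oldpackage \
        xorg-x11-server-Xorg-1.7.7-29.el6.i686.rpm \
        xorg-x11-server-common-1.7.7-29.el6.i686.rpm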

- Original Message -
From: "Joseph Areeda" 
To: "Malcolm MacCallum" 
Cc: scientific-linux-us...@fnal.gov
Sent: Wednesday, 18 July, 2012 3:09:20 AM
Subject: Re: X11 server won't start after yum upgrade

Malcolm,
Which rpm's worked for you?
I have sshd running so I've been able to survive but none of the
suggestions so far let me log in from the console on my VM.

thanks
Joe

On 07/17/2012 01:41 AM, Malcolm MacCallum wrote:

I have solved my problem by downloading the relevant rpm files to my
Windows partition and running rpm with suitable arguments, so I am now
back to a fully functioning SL with Gnome etc. But it would still have
been nice to have instructions that worked 'out of the box'.




Re: X11 server won't start after yum upgrade

2012-07-17 Thread Joseph Areeda

Malcolm,
Which rpm's worked for you?
I have sshd running so I've been able to survive but none of the 
suggestions so far let me log in from the console on my VM.


thanks
Joe

On 07/17/2012 01:41 AM, Malcolm MacCallum wrote:

I have solved my problem by downloading the relevant rpm files to my
Windows partition and running rpm with suitable arguments, so I am now
back to a fully functioning SL with Gnome etc. But it would still have
been nice to have instructions that worked 'out of the box'.


Re: X11 server won't start after yum upgrade

2012-07-12 Thread Joseph Areeda

I have the same problem with a virtual machine under VirtualBox.

I'm running:
Linux  2.6.32-220.23.1.el6.x86_64 #1 SMP Mon Jun 18 09:58:09 CDT
2012 x86_64 x86_64 x86_64 GNU/Linux

The log file Xorg.0.log says:

[    38.144] (II) LoadModule: "vboxvideo"
[    38.145] (II) Loading
/usr/lib64/xorg/modules/drivers/vboxvideo_drv.so
[    38.145] (II) Module vboxvideo: vendor="Oracle Corporation"
[    38.145]    compiled for 1.5.99.901, module version = 1.0.1
[    38.145]    Module class: X.Org Video Driver
[    38.145]    ABI class: X.Org Video Driver, version 9.0
[    38.145] (EE) module ABI major version (9) doesn't match the
server's version (10)
[    38.145] (II) UnloadModule: "vboxvideo"
[    38.145] (II) Unloading vboxvideo
[    38.145] (EE) Failed to load module "vboxvideo" (module
requirement mismatch, 0)
[    38.145] (EE) No drivers available.
[    38.145]
Fatal server error:
[    38.145] no screens found
[    38.145]
Please consult the Scientific Linux support
 at https://www.scientificlinux.org/maillists

  
Everything else is working fine but I can't use the console.
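
The decisive lines are the (EE) ones: the vboxvideo driver was built against the 
older X server ABI. They can be pulled out of the log (stock EL6 path) with 
something like:

    grep -E '\(EE\)|ABI' /var/log/Xorg.0.log

The usual remedies are rebuilding the VirtualBox guest additions against the 
updated server or, as described earlier in this thread, downgrading the server 
packages.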

Joe
  



Re: USB unresponsive

2012-06-18 Thread Joseph Areeda

I forgot to include the relevant part of /var/log/messages:

   Jun 18 18:24:12 george kernel: drivers/hid/usbhid/hid-core.c: can't
   reset device, :00:1a.0-1.2.2.4/input0, status -71
   Jun 18 18:24:12 george kernel: usb 1-1.2.2: clear tt 1 (00a0) error -71
   Jun 18 18:24:12 george kernel: drivers/hid/usbhid/hid-core.c: can't
   reset device, :00:1a.0-1.2.2.3/input0, status -71
   Jun 18 18:24:12 george kernel: usb 1-1.2.2: USB disconnect, address 6
   Jun 18 18:24:12 george kernel: usb 1-1.2.2.1: USB disconnect, address 7
   Jun 18 18:24:12 george kernel: usb 1-1.2.2.2: USB disconnect, address 8
   Jun 18 18:24:12 george kernel: usb 1-1.2.2.3: USB disconnect, address 9
   Jun 18 18:24:17 george kernel: drivers/hid/usbhid/hid-core.c: can't
   reset device, :00:1a.0-1.2.2.3/input1, status -110
   Jun 18 18:24:17 george kernel: usb 1-1.2.2: clear tt 1 (0090) error -19
   Jun 18 18:24:17 george kernel: usb 1-1.2.2.4: USB disconnect, address 10

After that the system is unresponsive and I have to reboot with the power 
switch.  Those devices are:


Jun 18 14:47:40 george kernel: generic-usb 0003:04F2:0833.0002: 
input,hidraw1: USB HID v1.11 Keyboard [CHICONY USB Keyboard] on 
usb-:00:1a.0-1.2.2.3/input0
Jun 18 14:47:40 george kernel: generic-usb 0003:045E:0029.0004: 
input,hidraw3: USB HID v1.00 Mouse [Microsoft Microsoft IntelliMouse® 
Optical] on usb-:00:1a.0-1.2.2.4/input0
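
A sketch of how one might catch the same events live the next time the switch 
box is flipped back, e.g. from a screen session started beforehand (both are 
standard EL6 tools):

    lsusb                     # enumerate what is attached while things still work
    udevadm monitor --kernel  # leave running to log the disconnect/reset
                              # events as they happen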



One last question: should I post text-only to this list, or are people 
happy with HTML formatting?


Joe


On 06/18/2012 07:42 PM, Joseph Areeda wrote:

Greetings,

This is my first post to this list; I'm hoping for some insight into a 
vexing problem.


My situation is four computers running various operating systems: Ubuntu, 
Scientific Linux 6.2, Debian Squeeze, Mac OS X Lion, and Windows 7. All 
except the Mac and one Ubuntu box are multi-boot and used for cross-platform 
development and testing.


I have a manual switch box with computers on ports 1-4 and a powered 
USB hub with mouse, keyboard, scanner, microphone and a USB headset 
adapter.


There is also an HDMI and DVI switch box, but they are not part of 
the problem. Together they give me a more flexible KVM switch. I can 
watch a long-running job on Monitor #2 connected to one system while 
working on another using Monitor #1.


When I boot up everything works fine. I can also switch between 
systems freely. However, if I leave one of the Linux systems 
disconnected for a long while, it doesn't respond when I switch back to 
it. I have not seen this behavior with Windows or the Mac. The Mac is 
sometimes disconnected for days, but the Windows box usually gets rebooted 
back into Linux when I'm done with it.


SL6 usually dies; Ubuntu I can usually ssh into and reboot cleanly, but 
SL6 doesn't respond to pings or ssh.  As long as I keep the switch box 
on SL6 it has run for weeks.


Now, I have tried two different USB switch boxes, and when the system doesn't 
respond it doesn't respond even if I plug the hub, or the mouse and 
keyboard, directly into the USB ports on the system. I don't believe it 
has anything to do with the KVM the OP mentioned, or with my switch boxes.


Searching the web, I found a comment saying that a powered hub per system 
worked with the poster's USB KVM switch. I suspect we're seeing some sort of 
USB timeout. I suppose I could get a powered hub per system, but I built 
these machines with 6 or 10 USB 2 ports and 2 or 4 USB 3 ports, so I really 
don't need them. The powered hub may, however, convince Linux that there 
is something plugged into the port and keep it alive.


Does anyone know of a reason for this or even better a fix for it?

Thanks
Joe


USB unresponsive

2012-06-18 Thread Joseph Areeda

Greetings,

This is my first post to this list; I'm hoping for some insight into a 
vexing problem.


My situation is four computers running various operating systems: Ubuntu, 
Scientific Linux 6.2, Debian Squeeze, Mac OS X Lion, and Windows 7. All 
except the Mac and one Ubuntu box are multi-boot and used for cross-platform 
development and testing.


I have a manual switch box with computers on ports 1-4 and a powered USB 
hub with mouse, keyboard, scanner, microphone and a USB headset adapter.


There is also an HDMI and DVI switch box, but they are not part of the 
problem. Together they give me a more flexible KVM switch. I can watch a 
long-running job on Monitor #2 connected to one system while working on 
another using Monitor #1.


When I boot up everything works fine. I can also switch between systems 
freely. However, if I leave one of the Linux systems disconnected for a 
long while, it doesn't respond when I switch back to it. I have not seen 
this behavior with Windows or the Mac. The Mac is sometimes disconnected for 
days, but the Windows box usually gets rebooted back into Linux when I'm done with it.


SL6 usually dies; Ubuntu I can usually ssh into and reboot cleanly, but 
SL6 doesn't respond to pings or ssh.  As long as I keep the switch box 
on SL6 it has run for weeks.


Now, I have tried two different USB switch boxes, and when the system doesn't 
respond it doesn't respond even if I plug the hub, or the mouse and 
keyboard, directly into the USB ports on the system. I don't believe it 
has anything to do with the KVM the OP mentioned, or with my switch boxes.


Searching the web, I found a comment saying that a powered hub per system 
worked with the poster's USB KVM switch. I suspect we're seeing some sort of 
USB timeout. I suppose I could get a powered hub per system, but I built 
these machines with 6 or 10 USB 2 ports and 2 or 4 USB 3 ports, so I really 
don't need them. The powered hub may, however, convince Linux that there is 
something plugged into the port and keep it alive.


Does anyone know of a reason for this or even better a fix for it?
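
Purely speculative, but if the USB-timeout hunch is right, autosuspend is the 
obvious knob; a sketch of checking and disabling it (the parameter and the 
grub.conf path are the stock EL6 ones, and whether it actually helps here is an 
open question):

    cat /sys/module/usbcore/parameters/autosuspend   # current delay in seconds
    # to disable autosuspend at boot, append to the kernel line in /boot/grub/grub.conf:
    #   usbcore.autosuspend=-1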

Thanks
Joe