SL6.1 is missing various bug fix and fasttrack updates?

2011-09-14 Thread Vladimir Mosgalin
Hello everybody.

I wonder why SL6 (with yum-conf-sl6x and yum-conf-sl-other installed and
fastbugs enabled) is missing a lot of updates from TUV.

For example, bug fix updates to curl, glibc, binutils, portreserve, and
xmlrpc-c:
http://rhn.redhat.com/errata/RHBA-2011-1284.html
http://rhn.redhat.com/errata/RHBA-2011-1255.html
http://rhn.redhat.com/errata/RHBA-2011-1179.html
http://rhn.redhat.com/errata/RHBA-2011-1186.html
http://rhn.redhat.com/errata/RHBA-2011-1285.html

The glibc update is almost a month old, for example; the corresponding SRPMs
for glibc, binutils and the others are freely available, yet no SL repo
includes these packages.

There is also a whole bunch of fasttrack updates which, I thought, were
supposed to appear in the SL6 fastbugs repo, but they don't. SRPMs for (at
least most of) these are available too.
From the list on
http://rhn.redhat.com/errata/rhel-server-fastrack-6-errata.html?by=date
I can see lots of packages like
cpufrequtils
libcgroup
powertop
tmpwatch
smartmontools
vte
newt
qt
setup
doxygen
sudo
tuned
mingetty
DeviceKit-power
attr
perl-Net-DNS
qt3
file

which aren't in the SL repos (and there may be some others, too).


My question is simple: is this situation under control and will these
updates appear in SL at some point, or did something break so that they
were missed and won't be rebuilt until someone fixes the build process?
I can see that some "bug fix" and fasttrack updates do get rebuilt for SL -
for example, the selinux-policy "bug fix" appeared in the repos - but the
older glibc "bug fix" update didn't.

Thanks!

-- 

Vladimir


Re: [SCIENTIFIC-LINUX-USERS] SL6.1 is missing various bug fix and fasttrack updates?

2011-09-14 Thread Vladimir Mosgalin
Hi Pat Riehecky!

 On 2011.09.14 at 13:42:43 -0500, Pat Riehecky wrote next:

> Sorry about that, the fastbugs process got modified when 6.1 came
> out and it hadn't been fully restored to working.  The delay was
> mostly because of human (ie ME!) error.
> 
> If you do a  yum clean all  the packages you are looking for should
> be in fastbugs for 6.1 and 6.x now.  Typically we release
> bugfix/enhancement updates on a weekly basis as occasionally the
> packages don't build easily and its nice to have a bit of flexible
> time in getting them released.  When security errata doesn't build
> it gets full attention, when an enhancement doesn't build we care
> and work hard at it, but its just not the same.
> 
> Let me know if you notice anything still missing.
> 

Thanks, that was very fast!
Yes, I do see all the packages in fastbugs. I'll wait a bit before
installing until the debuginfo packages become available, since I don't
want to end up with an incompatible glibc debuginfo (they will show up
eventually, right? I understand that glibc and some other debuginfo
packages are large and mirrors take time to catch up with them).

I must say, I'm pleasantly surprised by how fast updates appear in SL and
that debuginfo is nearly always up to date (aside from a few mirror lag
issues). It's a real breath of fresh air after CentOS, which I had been
using for years before, to always be able to analyze with perf & oprofile
because all the debuginfo is actually there and gets updated together with
the packages. Please keep up the good work, SL is definitely the best
distro for many of my tasks.


-- 

Vladimir


Re: [SCIENTIFIC-LINUX-USERS] SL6.1 is missing various bug fix and fasttrack updates?

2011-09-14 Thread Vladimir Mosgalin
Hi Pat Riehecky!

 On 2011.09.14 at 15:02:53 -0500, Pat Riehecky wrote next:

> Anyway, the mirrors should pick everything up on their next sync,
> provided what we've got posted now is actually accurate.
> 
> Can I have you give
> ftp://ftp.scientificlinux.org/linux/scientific/6x/archive/debuginfo/
> a quick look over for the debuginfo packages you want.  If they are
> posted there, everyone who mirrors SL 6 should pick them up at their
> next update.  If not, then they are hiding out in a directory I
> haven't found yet and should get posted.

Of the 59 fresh packages due for update on my system (with the fastbugs
repo fully enabled), all have correct debuginfo packages in this directory.

-- 

Vladimir


Re: [SCIENTIFIC-LINUX-USERS] SL6.1 is missing various bug fix and fasttrack updates?

2011-09-15 Thread Vladimir Mosgalin
Hi Pat Riehecky!

 On 2011.09.14 at 15:02:53 -0500, Pat Riehecky wrote next:

> Anyway, the mirrors should pick everything up on their next sync,
> provided what we've got posted now is actually accurate.
> 
> Can I have you give
> ftp://ftp.scientificlinux.org/linux/scientific/6x/archive/debuginfo/
> a quick look over for the debuginfo packages you want.  If they are
> posted there, everyone who mirrors SL 6 should pick them up at their
> next update.  If not, then they are hiding out in a directory I
> haven't found yet and should get posted.

The mirrors did pick up all the new debuginfo rpms, but they are still
ignored; it looks like the repodata wasn't updated and doesn't list any of
the new debuginfo packages.


-- 

Vladimir


Re: momentarily disabling synaptic touchpad

2011-09-19 Thread Vladimir Mosgalin
Hi William Shu!

 On 2011.09.19 at 16:28:32 -0700, William Shu wrote next:

> I have SL 6.0 installed on a Seagate FreeAgent GoFlex USB drive, which
> I use on various laptops (and desktops). The touchpad is so sensitive
> on some machines and I would like to disable it. At the same time, the
> attached mouse seems to be selectively responsive, notably its left
> button. Looking through the docs etc,a synaptics input driver has been
> installed, but the corresponding xorg.conf file is not in place for me
> to modify. (From a separate thread on nVidia, creating this file is
> NOT automatic in SL 6.)  
> 
> Question 1: If I create the xorg.conf file, would that later create
> problems for me when I switch to other machines--legacy or recent? I
> would not want some of the clashes (no/incorrect video, etc.) I
> experienced with SL 52 on USB sticks.

A full-fledged xorg.conf will create problems when you switch machines;
for example, the video card's PCI bus ID and the video card driver cause
trouble. The good thing is, you don't have to create xorg.conf: modern
Xorg supports small config snippets in which you tweak only one part of
the configuration and let everything else be autoconfigured. Sadly, this
doesn't work for some things like video (if you have to set gamma, you
have to provide all the sections - video card, display, etc.), but it
works perfectly for input devices.
So it's best to have no xorg.conf at all, if your system can handle it
(if you install the nvidia binary drivers, you probably must have that
file :( )


Here lies the trouble, however: SL6 uses Xorg 7.4, which doesn't support
udev-based configuration (that appeared in Xorg 8 and higher) or config
snippets in /etc/X11/xorg.conf.d; it uses the older method of hal-based
configuration with config snippets in .fdi files. Check out this page,
where you can find an exact solution which should work in SL6.0:
http://en.gentoo-wiki.com/wiki/Synaptics_Touchpad/Xorg_7.3

(of course, skip the kernel & X11 compiling part :)


So, just create an .fdi file and customize it to your liking with options
from the synaptics manpage; you can enable synclient real-time
configuration there, too (a rough example is sketched below). You won't
have to touch anything else, including xorg.conf, and this file won't
interfere with systems that don't have touchpads at all. This should
answer your second question, too.
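
For illustration, a minimal .fdi sketch along the lines of the hal policy
files shipped with SL6 - the path and the options shown are just examples,
so adapt them to what you actually want:

<?xml version="1.0" encoding="UTF-8"?>
<!-- e.g. /etc/hal/fdi/policy/11-x11-synaptics.fdi (path is an example) -->
<deviceinfo version="0.2">
  <device>
    <match key="info.capabilities" contains="input.touchpad">
      <merge key="input.x11_driver" type="string">synaptics</merge>
      <!-- allow run-time tweaking with synclient/syndaemon -->
      <merge key="input.x11_options.SHMConfig" type="string">on</merge>
      <!-- example option from synaptics(4): one-finger tap = left click -->
      <merge key="input.x11_options.TapButton1" type="string">1</merge>
    </match>
  </device>
</deviceinfo>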

> Question 3: Can the touchpad sometimes interfere with the use of the
> mouse? If so, how to minimize interference. the [replacement] mouse I
> use may not be the right quality?

Seems unlikely; I'd suspect a faulty mouse, but one can never be sure.
Still, I've never heard of or experienced any kind of interference between
a mouse, touchpad and touchscreen - they all seem to work in any combination.


-- 

Vladimir


Re: momentarily disabling synaptic touchpad

2011-09-19 Thread Vladimir Mosgalin
Hi William Shu!

 On 2011.09.19 at 16:28:32 -0700, William Shu wrote next:

> I have SL 6.0 installed on a Seagate FreeAgent GoFlex USB drive, which
> I use on various laptops (and desktops). The touchpad is so sensitive
> on some machines and I would like to disable it. At the same time, the
> attached mouse seems to be selectively responsive, notably its left
> button. Looking through the docs etc,a synaptics input driver has been
> installed, but the corresponding xorg.conf file is not in place for me
> to modify. (From a separate thread on nVidia, creating this file is
> NOT automatic in SL 6.)

Btw, once you enable real-time configuration, you can do lots of magic
with your touchpad. For example, the snippet below (it works on modern
systems; I haven't checked it on SL6, but it should probably work) tweaks
the touchpad so that a two-finger tap works as the middle button that's
always missing on touchpads (the touchpad must support multitouch, of
course), and also disables the touchpad while you are typing text, so it
won't send annoying commands while you type.

syndaemon -i 1 -d -K && xinput set-int-prop \"SynPS/2 Synaptics TouchPad\" 
\"Synaptics Two-Finger Pressure\" 32 10 &","0","*")

It's kind of magic, so please don't ask me how it works, I have no idea :)
It just does!

-- 

Vladimir


Re: Need KVM HD settings advice

2011-09-20 Thread Vladimir Mosgalin
Hi Todd And Margo Chester!

 On 2011.09.19 at 18:03:08 -0700, Todd And Margo Chester wrote next:

> What I need help with is getting the optimum performance
> settings while converting over my old hard (virtual) drive.
> 
> This is what I have gathered from these parts as to the best
> settings:
> 
> - controller: virtio
> - kvm option: cache=none
> - qcow2 disk format with metadata preallocation
> - create your disk image with:
>  qemu-img create -f qcow2 -o \
>  size=400,preallocation=metadata vdisk.img

If you don't need snapshots and such, you might get better performance
with LVM volumes for storing images, connected as "raw images". At least
that's the only thing I'm using in production, and it works well.

For that, you create an LVM volume the size of your virtual hard drive,
convert your image to a raw .img and write it over the LVM volume
(a rough sketch follows below).
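
A rough sketch of that conversion, assuming a volume group named vg0 and
the 400 GB size from your example - adjust the names and size to your setup:

# create an LV the size of the guest disk
lvcreate -L 400G -n guest1 vg0
# convert the qcow2 image to raw, writing straight onto the LV
qemu-img convert -O raw vdisk.img /dev/vg0/guest1

Then point the KVM guest at /dev/vg0/guest1 as a raw disk.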

Storing images in a filesystem is flexible, of course, but I believe it's
too risky in the face of random power-offs or hangs/resets, and performance
may be lower.

> And, I am vague how a *.qcow file/drive and a *.img file/drive
> relate.

.img is (usually) a simple raw image, a byte-for-byte copy of the hard disk,
while qcow is a qemu-specific format with various extra features.
http://en.wikipedia.org/wiki/Qcow
http://wiki.qemu.org/download/qemu-doc.html#disk_005fimages

-- 

Vladimir


Re: Need KVM HD settings advice

2011-09-20 Thread Vladimir Mosgalin
Hi Nico Kadel-Garcia!

 On 2011.09.20 at 08:48:06 -0400, Nico Kadel-Garcia wrote next:

> > If you don't need snapshots and such, you might get better performance
> > with LVM volumes for storing images, connected as "raw images". At least
> > that's only thing that I'm using in production and it works well.
> 
> LVM has its uses. But the ability to re-allocate space without having
> to manipulate your partition tables is *vital* in a dynamic
> environmemnt, and it's a lot easier to do with image files.

Maybe. So far I've been fine doing it in a few stages (grow the LV on the
host with lvresize, grow the second partition - which is an LVM partition
too - with fdisk in the guest, then use pvresize & lvresize in the guest);
I must admit it might be a bit too complicated for some uses, but then
again, in production disks don't grow *that* often (see the sketch below).
The ability to do this without a guest reboot depends on the guest's
ability to re-read the disk geometry, which is rather independent of the
image format.
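
Roughly, the stages look like this; all the device, VG and LV names here
are just placeholders:

# on the host: grow the LV backing the guest disk
lvresize -L +20G /dev/vg0/guest1
# in the guest: grow the second partition (the LVM PV) with fdisk/parted, then
pvresize /dev/vda2
lvresize -r -L +10G /dev/vg_guest/lv_data   # -r also grows the filesystem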

If I had to resize disks all the time for some reason, then qcow images
might be better, but then again, in such an environment I'd soon start
thinking about a SAN that stores images sparsely, be that something
hardware-based or just a Solaris/OpenIndiana zfs-backed iSCSI provider :)

How do qcow volumes compare to LVM-backed images in terms of performance,
btw? Are there any good numbers? All I can find are comparisons between
qcow & qcow2, or qcow with various features, or information like "qcow
used to suck, but with this commit fixing this problem it's XX times
faster on these operations!"

> Tuning nthe partition the images reside on, now *that* is invaluable.
> Turn off noatime, use a fast and simple file system. LVM can be handy
> for doing backup snapshots when you're re-arranging and migrating
> filesystem images, but its management of additional space and release
> of the snapshots is somewhat undermanaged.

Ugh. Tried that, won't try it ever again. LVM snapshots behave so badly
under load that I just can't find a way to use them anywhere in
production. I tried deploying them in a pure testing environment too, but
they were unable to handle even the slightest load; depending on how the
snapshots are created, either operating on a drive with an active snapshot
cuts IO performance by something like 5 times, or operating on the
snapshot itself is that slow, and removing a snapshot puts an insane load
on the system, too.

Sadly, only fully COW-based filesystems like zfs and btrfs can provide
adequately working snapshots :-/

> One kicker you may not have noticed: if your disk for your KVM server
> has 4096 byte blocks, you *REALLY, REALLY, REALLY* want the virtualized
> OS to use partitions aligned on 4096 byte block boundaries. The
> virtualized OS's have no way to detect the underlying disk layout, and
> can burn incredible amounts of resources re-aligning everything for
> disk access.

Hm. But is that even needed, when modern operating systems try their
best to maintain larger-scale alignment even without 4k sectors?
All modern Linux distributions, and even Windows 7, try to use 1MB
alignment for partitions. Since LVM itself doesn't destroy alignment, in
practice everything will be aligned regardless (a quick check is sketched below).
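
A couple of quick ways to check the alignment from the host; the device
names are just examples:

# parted >= 2.1 can check a partition's alignment directly
parted /dev/sda align-check optimal 1
# or inspect the start sector yourself (a multiple of 2048 means 1MB-aligned
# with 512-byte sectors)
cat /sys/block/sda/sda1/start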


-- 

Vladimir


Re: momentarily disabling synaptic touchpad

2011-09-20 Thread Vladimir Mosgalin
Hi William Shu!

 On 2011.09.20 at 13:24:20 -0700, William Shu wrote next:

> Vladimir,
> Thank you so much for the suggestion. I followed the instructions on the web 
> page. For a moment it seems it will not work, then the touchpad was actually 
> disabled, but I have no clue what I did. Then, of its own, the pad was 
> activated after a few hours. I have rebooted the system as well as give the 
> command as root but no success. The output of synclient is given below.o

Did you create the .fdi file and enable synclient configuration in it?
Otherwise only the options in the .fdi file will take effect; synclient
run-time configuration has to be enabled with a separate option.

> The other command you gave seems to have an error! Not clear if part of a 
> shell script, but the characters -- ","0","*") -- after the last & seem 
> redundant.:


> 
>  syndaemon -i 1 -d -K && xinput set-int-prop \"SynPS/2 Synaptics 
> TouchPad\" \"Synaptics Two-Finger Pressure\" 32 10 &","0","*")
> 
> ignoring them and unquoting the "'s removed the errors, but seemed to have no 
> effect!

Yes, sorry, the quotes were unneeded. Anyhow, here is an alternate version
of the second command to enable two-finger tap, if you need a middle
button on the touchpad:

xinput --set-prop --type=int --format=32 "SynPS/2 Synaptics TouchPad" "Synaptics Two-Finger Pressure" 10


The first one (the syndaemon command) disables the touchpad *temporarily*,
only while you are typing; that's merely one of the options, as most
people just don't want the touchpad to interfere with typing. The value
after -i is the number of seconds the touchpad stays disabled after a
keypress. Maybe you saw the effect of that and thought the touchpad was
disabled permanently?

I believe both syndaemon and synclient will work only when run-time
configuration is enabled in the .fdi file (after which you might need to
reboot or restart hal, as you can't detach and re-attach a touchpad).


If you want to disable the touchpad permanently, you can also use xinput,
like this:
xinput --set-prop --type=int --format=8 "SynPS/2 Synaptics TouchPad" "Device Enabled" 0

By the way, if finding the device by text string doesn't work for you in
xinput, you can call "xinput list" to find the device id of your touchpad
and then use that instead.

In case you absolutely can't get synclient & syndaemon working, just stick
with xinput; it should work regardless of the .fdi settings, I believe.
More xinput examples are here:
https://wiki.ubuntu.com/X/Config/Input


Here is an automated script to toggle the touchpad on & off. Customize it
to your liking and then bind its execution to some key. The script isn't
mine, I'm merely copy-pasting it:

#!/bin/sh
# get touchpad id
XINPUTNUM=`xinput list | grep 'SynPS/2 Synaptics TouchPad' | sed -n -e's/.*id=\([0-9]\+\).*/\1/p'`

# get the current state of the touchpad
TPSTATUS=`xinput list-props $XINPUTNUM | awk '/Device Enabled/ { print $NF }'`

# if getting the status failed, exit
test -z $TPSTATUS && exit 1

if [ $TPSTATUS = 0 ]; then
    xinput set-int-prop $XINPUTNUM "Device Enabled" 8 1
else
    xinput set-int-prop $XINPUTNUM "Device Enabled" 8 0
fi



Also, on a GNOME desktop something as simple as this might work (though
I'm really not sure it works on the SL6.0 desktop):

#!/bin/sh
TOUCHPAD_ENABLED=$(gconftool-2 --get "/desktop/gnome/peripherals/touchpad/touchpad_enabled")

if [ "$TOUCHPAD_ENABLED" = "true" ]
then
   gconftool-2 --set "/desktop/gnome/peripherals/touchpad/touchpad_enabled" --type boolean false
else
   gconftool-2 --set "/desktop/gnome/peripherals/touchpad/touchpad_enabled" --type boolean true
fi




-- 

Vladimir


Re: momentarily disabling synaptic touchpad

2011-09-20 Thread Vladimir Mosgalin
Hi William Shu!

 On 2011.09.20 at 15:34:50 -0700, William Shu wrote next:

> Yes, I did create the fdi file, see below. (I also copied the file from 
> /usr/share/hal/policy/99-synaptics.fdi  to 
> /usr/share/hal/policy/20thirdparty/99-ssynaptics.fdi and renamed originals to 
> *.save)
> 
> 
> Your inititial solution seems the best .
> 
> Dumb question: how is it enabled for synclient?

input.x11_options.SHMConfig = on does that trick, and it is already set in
your config file.

Does synclient/syndaemon configuration still not work at all?

If you can't get syndaemon to turn the touchpad off while typing, I don't
really know what to suggest; still, you might find one of the other
solutions from my last mail acceptable (binding a "touchpad on/off" script
that uses xinput or gconftool, whichever works, to some key)..

Btw, out of curiosity, did you check whether the .fdi file is picked up by
the system and works? For example, try putting some obvious option from
the synaptics(4) manpage there to check whether it has an effect.


-- 

Vladimir


Re: momentarily disabling synaptic touchpad

2011-09-21 Thread Vladimir Mosgalin
Hi William Shu!

 On 2011.09.21 at 03:48:40 -0700, William Shu wrote next:

> A) using synclient and syndaemon (partial success).
> 
> syndaemon works all the time, but synclient only works *sometimes*. However, 
> I'm not sure what I did, as my activities (below) don't seem 
> repeatable/reproducible. My guess is they are being controlled/overidden from 
> two or more independent sources.
> 
> First, I reversed the order of lines in *.fdi file, though I'm not convinced 
> that matters, to:
> 
>        synaptics
>        true

Did you try "on", btw? I believe "true" might be deprecated.

Unfortunately, I can't assist you on gnome interfering issues.. these
are hard to debug and deal with. Can give another idea, though - there
is alternative way to turning touchpad off while typing with syndaemon -
it's to make it ignore palm touch, the synclient setting for that is
described here
https://wiki.archlinux.org/index.php/Synaptics#Disable_Trackpad_while_Typing
(sadly, most of this article contents is for Xorg 8 w/udev, so it won't
work with SL6 which uses Xorg 7.4 w/hal)


> B) gnome desktop manipulation (unsuccessful).
> The script for gnome could not work. complained of not finding 
> "/desktop/gnome/...". find could not trace it (rooted elsewhere) and so I 
> abandon the approach.

There won't be such a file - it's a gconf key (you can browse around with
gconf-editor after installing the corresponding package, for example).
But if there is no such key, touchpad manipulation from gconf probably
isn't supported on SL6.. (this key does exist on Fedora systems, for example).

> 
> Once more, thanks for the assistance.

No problem, I'm glad at least some solution worked :)


-- 

Vladimir


Re: How to improve scp speed ??

2011-09-28 Thread Vladimir Mosgalin
Hi Pablo Cavero!

 On 2011.09.28 at 17:25:26 -0400, Pablo Cavero wrote next:

> I want to know if exist any tips to have a faster scp transfer. In the
> Client, or in the Server Config.
> 
> I'm testing use the standar SCP and the PSCP, a command extra from the Puty
> apps.
> 
> Home page of Putty:
> http://www.chiark.greenend.org.uk/~sgtatham/putty/
> 
> Link to Download the File, to make and compiling the new commands:
> http://the.earth.li/~sgtatham/putty/latest/putty-0.61.tar.gz
> 
> Well, I see a few best performance in the standar scp, but this command,
> don't support run in a batch, add the user and password like parameter.
> And generate and share the RSA public kay is too much work, when some one
> have to many clients.

Well, adding the password as a parameter instead of using a public key
*is* a bad idea and causes compatibility issues, like the one you just ran into.

What is too much work, btw? On the server, you run ssh-keygen once and then
"ssh-copy-id user@machine" for each of your clients; ssh-copy-id is a
simple shell script in which you can replace the ssh invocation with
something else like putty, to supply the passwords for the copying itself
automatically just like you are doing now - and after that you'll have the
public key infrastructure fully set up and no more trouble! (See the
sketch below.)
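
A rough sketch of that, assuming plain OpenSSH; the user name and host
list are placeholders:

# generate a key pair once on the machine that initiates the copies
ssh-keygen -t rsa
# push the public key to every client (prompts for the password one last time)
for host in client1 client2 client3; do
    ssh-copy-id user@"$host"
done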


Anyhow, regarding your question: there is an easy way, though how much it
helps depends on your situation - use blowfish encryption instead of AES.
The server should already support this; on the client, just change the
cipher order so that "blowfish-cbc" comes first, either in ssh_config or
on the command line (consult the manpage). Obviously this applies to
openssh scp; the same thing can be done with putty/pscp, but the syntax is
different.

It mostly helps by increasing throughput by 30-50% if you are CPU-bound,
but that's not always the case - sometimes something else is the problem -
and modern Intel CPUs support hardware AES encryption, which openssl
automatically uses on modern systems (such as SL6), so AES should be
fastest there. Still, give blowfish encryption a try, and of course use
openssh scp. Also, if your files are compressible, use compression (-C on
the command line, or the matching ssh_config option); if they are not,
turn compression off, as it wastes CPU cycles and slows down encryption.
The correct compression setting can increase transfer speed by a lot.
(Example invocations follow below.)
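
For example, with openssh scp (file names and destination are placeholders):

# prefer blowfish over AES for this transfer
scp -c blowfish-cbc bigfile.img user@server:/data/
# compressible data: add -C; incompressible data: leave it off
scp -c blowfish-cbc -C logs.tar user@server:/data/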

Probably the only way to transfer faster, if you are still CPU-bound, is
to use no encryption at all, but that isn't safe and requires recompiling
the openssh client to enable it.


-- 

Vladimir


Re: Scientific-Linux: which version has bigger support date?

2011-09-29 Thread Vladimir Mosgalin
Hi lancebaynes87!

 On 2011.09.29 at 04:28:18 -0700, lancebaynes87 wrote next:

> SOLUTION #2
> if we install the:
> 
> http://www.osst.co.uk/Download/scientific/6rolling/x86_64/iso/SL-61-x86_64-2011-07-27-Install-DVD.iso.torrent
> 
> that's a "rolling release". Now I haven't used any "rolling release" based 
> Linux distros so I don't know what that exactly means. Q: does it mean that 
> if I install it once, then I never have to re-install it again because of a 
> version upgrade, ex.: Scientific-Linux 7 comes out, and neither do I need to 
> "dist-upgrade"? - Because rolling release means that there is no more version 
> numbers?

Well, I'd say the rolling release concept isn't strictly related to your
need, so you can ignore the fact that it's "rolling" for now. It has more
to do with how versions are set and how package updates flow.

There will (most likely) be no automatic migration to SL7 once it's out,
and a manual upgrade isn't recommended - there is no dist-upgrade
equivalent. You will most likely have to reinstall. However, the 6-year
support period is for SL6 alone; it means that for 6 years you can stay on
SL6 (6.1, 6.2, etc. - updating between these is very painless, and also
somewhat optional if there is no need) without reinstalling anything.


The other issue is that many desktop needs call for a fresher distro, not
a 6-year-old one, so using SL6 on a desktop in 2017 might be a bit
annoying, so to speak; therefore, if this is about a desktop, you might
want to reinstall with SL7 soon after it's out, maybe in 2 or 3 years.
However, this depends on the usage patterns of your systems, and you can
ignore it for now, too.

SL6 seems great for your needs, so just install 6.1 and start using it;
you can get into all these rolling release concepts and the rest later,
as they won't affect you for now.

-- 

Vladimir


Re: Scientific-Linux: which version has bigger support date?

2011-09-29 Thread Vladimir Mosgalin
Hi Larry Linder!

 On 2011.09.29 at 09:32:51 -0400, Larry Linder wrote next:

> Major concern about rolling upgrades is that you never really know the side 
> effects.
> We have been on SL 5.6 for a long time and everything works that we use.
> If SL 6.X is going to be supported for the foreseeable future then we will 
> make the upgrade to all of our systems.
> We use SL 5.6 to run our business, & factory.   For what we do it works and 
> its stable.  Runs on relatively old hardware - a little slower but 
> functional.

Well, I don't see how that's relevant: SL "rolling upgrades" only work
within a single major version, like SL5 or SL6. You could say it's an
alternative to completely separating SL 5.5 from 5.6, for example. They
have nothing to do with manual upgrades, SL5->SL6 migrations, or anything
similar..


For users, SL's concept of "rolling upgrades" provides only benefits. If
you compare SL to CentOS: in SL you have the option of staying
version-locked, to 6.0 or 5.5 for example, while still getting the most
important security updates, and you can also get important updates that
usually only come with TUV's next release before the same version of SL is
released. It's up to you which of these options you prefer; SL provides
both. There is no reason to be scared of the words "rolling upgrades" at
all - it's a really nice (and optional) feature.

-- 

Vladimir


Re: KVM Host - Missing In Action

2011-10-05 Thread Vladimir Mosgalin
Hi James Kelly!

 On 2011.10.05 at 22:31:18 +0100, James Kelly wrote next:

> I lost contact with my Scientific Linux 6.1 KVM host earlier today.
> 
> The machine is headless and I don't have any IPMI stuff on the machine so I
> had to plug a monitor into it. However, there was no life from the monitor
> and I pressed the reset button.
> 
> It seems to me that the networking died. The machine is booted first thing
> every morning (so the 9:00am start was missed by two minutes!) and the
> networking error seems to have occurred about 27 minutes after
> the initial boot.

It's unclear to me whether the tg3 driver errors in the second half of
your message are the cause or just a symptom of this situation; however,
if they are the cause, you might be interested in a recent update that
Red Hat has released:
http://rhn.redhat.com/errata/RHEA-2011-1348.html

Try installing kmod-tg3 from the sl-fastbugs repo and rebooting; it should
make your system use the newer version of the network driver mentioned in
those messages. I have no idea whether it will really help, but it
probably won't hurt to try.


A frequent cause of similar problems with network drivers is the interrupt
setup - network cards generate lots of interrupts under load and use
various advanced features to ease the load a bit, and I have seen
situations where kernel panics and warnings appeared due to the hardware
interrupt setup or buggy interrupt code in the network driver under load.
Just in case, you might want to look for the eth entries in
/proc/interrupts to make sure the card uses MSI-X (shown as PCI-MSI-edge
or PCI-MSI-X) and not IO-APIC-level or something like that (a quick check
is shown below). However, I don't think this kind of problem should arise
on such hardware.
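
For example (eth0 is just a placeholder for your interface):

grep eth /proc/interrupts
# the second-to-last column names the mechanism, e.g. PCI-MSI-edge vs IO-APIC-level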

In the worst case, if these problems keep appearing, consider installing a
separate Intel-based network card; these work the most flawlessly under
Linux in my (and other people's) experience. It's kind of sad, but
Marvell, Broadcom and Nvidia products are a bit of second-class citizens
and don't always work flawlessly under load - it might be more of a driver
problem, who knows, but that's just my experience from past years.
(Also, I'd definitely stay away from NICs based on other manufacturers'
chips; beyond these four, nothing else should probably be allowed in the
server market. YMMV.)

These messages could also indicate something other than network problems,
but people with deeper kernel knowledge than mine should answer that. All
I can say is that the NIC + network driver + interrupt settings
combination *can* be a real source of problems, up to kernel panics under
some conditions; it's not rare at all to find that such problems are
caused by the network driver.

-- 

Vladimir


Re: Flash plugin

2011-10-06 Thread Vladimir Mosgalin
Hi jdow!

 On 2011.10.06 at 05:05:05 -0700, jdow wrote next:

> Date: Thu, 06 Oct 2011 05:05:05 -0700
> From: jdow 
> To: scientific-linux-us...@fnal.gov
> X-Original-To: mosgalin@localhost
> Subject: Flash plugin
> User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0.1) Gecko/20110929
> Thunderbird/7.0.1
> 
> I have the elrepo 64 bit beta flash plugin installed. A 32 bit flash update
> is being forced on my system. Here are the error messages.
> 
> Transaction Check Error:
>   file /usr/share/applications/flash-player-properties.desktop from
> install of flash-plugin-11.0.1.152-release.i386 conflicts with file
> from package flash-plugin-11.0.1.129-0.1.el6.rf.x86_64

There is no flash plugin in elrepo. You seem to have one from rpmforge
installed. Either wait until the x86_64 package appears in rpmforge, or
uninstall it, then install the official Adobe yum repository and install
the flash plugin from there..

-- 

Vladimir


Re: Flash plugin

2011-10-06 Thread Vladimir Mosgalin
Hi Dag Wieers!

 On 2011.10.06 at 16:38:04 +0200, Dag Wieers wrote next:

> >There is no flash plugin in elrepo. You seem to have one from rpmforge
> >installed. Either wait until x86_64 package appears in rpmforge, or
> >uninstall it, then install official adobe yum repository and install
> >flash plugin from there..
> 
> RPMforge provides already the (beta) 64bit flash-plugin, so there's
> no need to wait for it. In this case the 64bit is installed, so
> there is no reason to install the 32bit. Unless you want to replace
> the 64bit by the 32bit.

Yes, well, I meant "when the final 11 release appears in rpmforge" (as it
now has in the official repo).

OK, by your account it's best to just wait a bit.

-- 

Vladimir


Re: Flash plugin

2011-10-07 Thread Vladimir Mosgalin
Hi Dag Wieers!

 On 2011.10.07 at 01:34:38 +0200, Dag Wieers wrote next:

> >Evidently, a number of stock end-user applications, such as
> >Firefox, Thunderbird, and the like, have security holes as well as
> >bugs, and thus need regularly kept current.
> 
> Do you have any proof of security problems ? Was there a security
> advisory for this release ?

It's not as simple as that.
There was no supported version of the 64-bit flash 10 plugin.
Information about security problems in betas and RCs of the flash plugin
isn't displayed on the page you saw - it does, however, appear in news
from Adobe and in Adobe blogs; they just don't add those to the list of
problems in final releases.

There *were* various security problems in the 64-bit betas and RCs of the
flash plugin, and it got some updates, but they simply aren't listed on
that page because of Adobe's policy regarding betas.

Now, for 32-bit users there was always a "latest stable flash 10", which,
as you correctly noted, doesn't seem to have any security problems. Those
people can live just fine for now without updating to flash 11.

But 64-bit users of the flash plugin had only the beta, which had known
security problems - they were fixed from time to time as new betas and RCs
were released, and all known problems were fixed by the time of the final
flash 11 release. For 64-bit users, "official" tracking of security
problems starts only now, with the flash 11 release. All 64-bit users
should update to the final flash 11 ASAP; the fact that there are no
problems listed on that page only means that beta problems weren't tracked
there - there *ARE* known security problems in the flash 11 beta series.

Here is an example of security vulnerabilities fixed over the course of
the flash 11 beta/RC releases: http://kb2.adobe.com/cps/916/cpsid_91694.html
You can check out the security bulletins linked from there.


Btw, the 64-bit flash 10 plugin was in an even sorrier state: there were a
lot of known security problems in it, but Adobe stopped developing it, and
the latest known (beta) version was said to be very vulnerable.

-- 

Vladimir


Re: RocketRaid 644 Card

2011-10-11 Thread Vladimir Mosgalin
Hi Jeremy Wellner!

 On 2011.10.11 at 11:14:30 -0700, Jeremy Wellner wrote next:

> Still very linux green but sopping up info quick.  I have a SL 6.1 install 
> updated to current that I am trying to build/install a driver for a 
> RocketRaid 644 eSATA card to run an enclosure off my tower. - 
> http://www.highpoint-tech.com/BIOS_Driver/page/rr644_U.htm
> 
> It looks like they only have drivers for RHEL5.5, but they do have the source 
> that I'm getting a error finding a .him_magni.o file during make.
> 
> Any ideas??  I could use the help! :)
> 

I'm afraid there is no easy way to make it work under Linux. They provide
only binary drivers for old distributions; this so-called "source code" is
merely a thin wrapper around a pre-compiled binary blob which isn't
compatible with newer kernels. You should probably find a way to return
this card and replace it with something else..

It's a good idea to steer clear of HighPoint cards - most of them are
simply overpriced HBAs with a software RAID layer implemented in a
proprietary driver. This card's hardware is worth about $30 - that's what
the SATA chip on it costs - and the rest of the money went into an old
proprietary driver which implements the "RAID" functionality on your own
CPU and has pretty bad performance and compatibility problems..

HighPoint never supports modern Linux systems because they will *only*
provide closed-source drivers compiled for older systems - they don't want
to open the RAID driver code which they value so much and sell for big
money :) This business model - most of the money goes into the driver
code - simply doesn't play well with the way the Linux kernel is developed..


For your needs, you should probably get a very simple 4-port eSATA HBA, or
two 2-port cards if possible (it's cheaper).

-- 

Vladimir


Re: RocketRaid 644 Card

2011-10-11 Thread Vladimir Mosgalin
Hi Jeremy Wellner!

 On 2011.10.11 at 13:58:30 -0700, Jeremy Wellner wrote next:

> 
> Do you have any suggestions on a good card to run a port multiplier SATA 
> enclosure?  I'll have to see about getting these returned :)
> 
> 
> 

For a list of good cards with some descriptions, check out this page:
http://blog.zorinaq.com/?e=10
Or, rather, a list of the chips a card should be based on; for many of
these chips there are lots of cards from various manufacturers which work
more or less the same and differ only in port layout
(internal / internal+external / external).

As for port multiplier support, unfortunately I've never tried it with
SATA solutions, so I can't give any advice or say whether it's even well
supported in Linux.. SAS solutions work with multipliers, but if I
understand correctly that's quite a different technology. If port
multipliers work with, for example, the ahci driver, then they should
generally work with any ahci-supported controller.

Note that this list includes a HighPoint adapter, the Rocket 620 series.
This adapter is a cheap, plain HBA without any of the HighPoint driver
tricks I mentioned in my previous mail; as the blog page notes, it works
with the standard driver - I'm not saying you should get it, just making
this clear so you don't draw the wrong conclusions :)

Anyhow, that page is good and provides lots of details about controllers.


-- 

Vladimir


Re: How to run to launch script when nic interface is up

2011-10-12 Thread Vladimir Mosgalin
Hi carlopmart!

 On 2011.10.12 at 18:42:30 +0200, carlopmart wrote next:

>  Is it possible under SL6.1 to run a script (or insert commands in
> ifcfg-ethX files) when a nic is up, immediatly after network script
> runs?? Like for example it can do with debian/ubuntu: post-up
> post-down.

/sbin/ifup-pre-local and /sbin/ifdown-pre-local are executed (if they
exist) before the interface is brought up or down.

/sbin/ifup-local and /sbin/ifdown-local are executed (if they exist) after
the interface is brought up or down and the routes are set up.

The argument passed to each is the interface name. Note that if you have
aliases, the script will be called once for each alias. (A rough example
follows below.)
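
For illustration, a minimal /sbin/ifup-local sketch; the interface name
and the route added are just examples, and the file must be executable:

#!/bin/sh
# $1 is the interface (or alias) that just came up
if [ "$1" = "eth0" ]; then
    # example action: add an extra static route once eth0 is up
    ip route add 10.0.0.0/24 via 192.168.1.254 dev eth0
fi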



-- 

Vladimir


Re: How to run to launch script when nic interface is up

2011-10-13 Thread Vladimir Mosgalin
Hi jdow!

 On 2011.10.12 at 18:28:02 -0700, jdow wrote next:

> >>  Is it possible under SL6.1 to run a script (or insert commands in
> >>ifcfg-ethX files) when a nic is up, immediatly after network script
> >>runs?? Like for example it can do with debian/ubuntu: post-up
> >>post-down.
> >
> >/sbin/ifup-pre-local and /sbin/ifdown-pre-local are executed (if they exist)
> >before the interface is brought up or down.
> >
> >/sbin/ifup-local and /sbin/ifdown-local are executed (if they exist) after
> >the interface is brought up or down and the routes are set up.
> >
> >The argument passed to each is the interface name. Note that if you have
> >aliases, the script will be called once for each alias.
> 
> OK, I have such a file that I still have to manually run when the system
> reboots. I get no error messages nor is the file run.
> 
> # ls --lcontext /sbin/ifup-local
> -rwxr-xr-x. 1 system_u:object_r:bin_t:s0   root root 329 Jul 25
> 13:30 /sbin/ifup-local
> 
> Any hints what I may have wrong?

Is your network setup managed by the "network" service or by
NetworkManager? If the latter, I don't think it pays any attention to
these local scripts at all..

If you are using "network", try to find out whether the script runs at all
by putting something simple inside, like

#!/bin/sh
echo "$@" > /tmp/ifup-local-test

If you need NetworkManager, I don't think I can give any advice on how to
make this work when NM manages the network..

-- 

Vladimir


Re: KVM guests with time sync issues

2011-11-03 Thread Vladimir Mosgalin
Hi Klaus Steinberger!

 On 2011.11.03 at 13:53:50 +0100, Klaus Steinberger wrote next:

> we run into same trouble and got some hints from TUV (we have some RHEL 
> systems
> with support). There is a ticket open in TUV's supports about the issue.
> 
> Try the following clock settings for your Windows VM's:
> 
>   
> 
>   
> 

This is very interesting! Thanks for the tip.

Do you (or anyone else, maybe) know of a solution to the time jump when
saving/restoring a guest? I'm talking about the most basic case of an
SL6.1 guest on an SL6.1 KVM host. Whether through a manual operation or
during a host reboot, when the guest is restored it has the "old" time -
from the moment it was saved. Such a sudden jump usually kills ntpd in the
guest, which doesn't tolerate time jumps, and without ntpd the time stays
badly shifted and causes various problems.


-- 

Vladimir


Re: KVM guests with time sync issues

2011-11-04 Thread Vladimir Mosgalin
Hi Klaus Steinberger!

 On 2011.11.04 at 07:29:54 +0100, Klaus Steinberger wrote next:

> > Do you (or anyone else, maybe) know any solution to time jump when
> > saving/restoring guest? I'm talking about most basic SL6.1 guest on
> > SL6.1 KVM host situation. Either by manual operation or during reboot,
> 
> No idea yet, our VM's run as cluster services (always).
> 
> But a google search shows this:
> 
> https://build.opensuse.org/package/view_file?file=kvm-QMP-Introduce-RESUME-event.patch&package=qemu-kvm&project=home%3Apdignan&srcmd5=e74ee417a95b014789b98f5905c0eef9
> 
> I do not know if this patch is already in SL6.1 and how to intercept this 
> Event.

It should be, according to
https://bugzilla.redhat.com/show_bug.cgi?id=590102

But unfortunately, as things stand it doesn't help; nothing intercepts the event.

Sadly, I can't find anything useful on Google about this issue either.

-- 

Vladimir


Re: hardware upgrade

2011-11-21 Thread Vladimir Mosgalin
Hi Andrew Z!

 On 2011.11.21 at 01:07:44 -0500, Andrew Z wrote next:

> all this brings me to a simple question - how do i move from i686 SL 6.1
> that was running on Sempron to Phenom ( which is 64 and 4 cores ) system?

There isn't a good way to move from an i686 system to x86-64 via an
upgrade. Sure, it can be done as a big hack, but it's really not something
one should do, and it's not supported in any way.

If you are fine with keeping the i686 system, then you usually don't need
to do anything for such a hardware upgrade - the only typical minor
problem is IDE controller detection (the driver for the new controller is
not present in the old initrd). It can be fixed in various ways - for
example, I prefer booting into rescue mode, going to a console, mounting
the old system, chrooting into it, and running mkinitrd or reinstalling
the kernel (a rough sketch follows below).
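
Roughly, from the rescue environment it looks like this; the kernel
version string is just an example, use the one that's actually installed:

chroot /mnt/sysimage
mkinitrd -f /boot/initramfs-2.6.32-131.0.15.el6.i686.img 2.6.32-131.0.15.el6.i686
exit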

If you want to move to x86-64, then you have to reinstall the system - a
real reinstall, not an upgrade. Depending on your partition scheme, /home
might be preserved, or you might need a full backup, a clean reinstall,
and then a restore of the parts you need. Alternatively, if your VG has
enough free space, you can create new LVs during installation, install
into those, and then copy the data you need from the old LVs.

Installing from USB should be no problem, be it a USB flash drive or a USB
CD-ROM drive.

-- 

Vladimir


Can't access SL repos - DNS problem

2011-12-05 Thread Vladimir Mosgalin
Hello everybody.

My DNS server (bind-9.7.3-2.el6_1.P3.3.x86_64 running on SL 6.1) has
stopped resolving ftp.scientificlinux.org, ftp1.scientificlinux.org and
the like! In the logs it reports trying every upstream DNS server in the
config and then finally gives up.

$ host ftp.scientificlinux.org
Host ftp.scientificlinux.org not found: 3(NXDOMAIN)

in logs:
Dec  5 19:13:49 lime named[2109]: validating @0x7f719007dfe0: fnal.gov DNSKEY: 
no valid signature found (DS)
Dec  5 19:13:49 lime named[2109]: error (no valid RRSIG) resolving 
'fnal.gov/DNSKEY/IN': 8.8.4.4#53
Dec  5 19:13:49 lime named[2109]: validating @0x7f7194018900: fnal.gov DNSKEY: 
no valid signature found (DS)
Dec  5 19:13:49 lime named[2109]: error (no valid RRSIG) resolving 
'fnal.gov/DNSKEY/IN': 8.8.8.8#53
Dec  5 19:13:49 lime named[2109]: validating @0x7f7188067920: fnal.gov DNSKEY: 
no valid signature found (DS)
Dec  5 19:13:49 lime named[2109]: error (no valid RRSIG) resolving 
'fnal.gov/DNSKEY/IN': 198.49.208.71#53
Dec  5 19:13:49 lime named[2109]: validating @0x7f71900ab2b0: fnal.gov DNSKEY: 
no valid signature found (DS)
Dec  5 19:13:49 lime named[2109]: error (no valid RRSIG) resolving 
'fnal.gov/DNSKEY/IN': 198.49.208.70#53
Dec  5 19:13:49 lime named[2109]: validating @0x7f719007dfe0: fnal.gov DNSKEY: 
no valid signature found (DS)
Dec  5 19:13:49 lime named[2109]: error (no valid RRSIG) resolving 
'fnal.gov/DNSKEY/IN': 198.124.252.22#53
Dec  5 19:13:49 lime named[2109]: validating @0x7f7194018900: fnal.gov DNSKEY: 
no valid signature found (DS)
Dec  5 19:13:49 lime named[2109]: error (no valid RRSIG) resolving 
'fnal.gov/DNSKEY/IN': 2001:400:910:1::2#53
Dec  5 19:13:50 lime named[2109]: validating @0x7f7198603180: fnal.gov DNSKEY: 
no valid signature found (DS)
Dec  5 19:13:50 lime named[2109]: error (no valid RRSIG) resolving 
'fnal.gov/DNSKEY/IN': 198.128.2.10#53
Dec  5 19:13:50 lime named[2109]: validating @0x7f71900ba5f0: fnal.gov DNSKEY: 
no valid signature found (DS)
Dec  5 19:13:50 lime named[2109]: error (no valid RRSIG) resolving 
'fnal.gov/DNSKEY/IN': 2001:400:6000::22#53
Dec  5 19:13:50 lime named[2109]: validating @0x7f7194018900: fnal.gov DNSKEY: 
no valid signature found (DS)
Dec  5 19:13:50 lime named[2109]: error (no valid RRSIG) resolving 
'fnal.gov/DNSKEY/IN': 198.129.252.34#53
Dec  5 19:13:50 lime named[2109]: validating @0x7f719862a1a0: fnal.gov DNSKEY: 
no valid signature found (DS)
Dec  5 19:13:50 lime named[2109]: error (no valid RRSIG) resolving 
'fnal.gov/DNSKEY/IN': 2001:400:14:2::10#53
Dec  5 19:13:50 lime named[2109]: error (broken trust chain) resolving 
'linux9.fnal.gov/A/IN': 8.8.8.8#53

I tried restarting it, which didn't help. Is something broken on my side
or on the SL side? My DNS server seems to resolve other names fine,
including DNSSEC-secured ones.
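
For what it's worth, a couple of checks that help tell a DNSSEC validation
failure from an ordinary resolution failure (the resolver addresses are
just examples):

# ask an outside resolver for the DNSKEY with DNSSEC data included
dig +dnssec fnal.gov DNSKEY @8.8.8.8
# query the local bind with validation (checking) disabled; if this works
# while a normal query fails, the trust chain / validation is the problem
dig +cd ftp.scientificlinux.org @127.0.0.1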

-- 

Vladimir


Re: kernel panic - kernel-2.6.32-220.el6 on ASUS

2011-12-16 Thread Vladimir Mosgalin
Hi Dusan Bruncko!

 On 2011.12.16 at 07:52:53 +0100, Dusan Bruncko wrote next:

> I installed new kernel-2.6.32-220.el6.x86_64 
> 
> on ASUS:
> 
> ASUS NX90Jq B1
> Core i7 740QM / 1.73 GHz - RAM 10 GB - HDD 640 GB + 640 GB -
> DVD-Writer / BD-ROM - GF GT 335M - Gigabit Ethernet - WLAN :
> 802.11b/g/n, Bluetooth 2.1 EDR - Windows 7 Ultimate 64-bit - 18.4"
> Widescreen LED backlight Color Shine TFT 1920 x 1080 ( Full HD ) -
> camera - silver - Microsoft Office 2010, US keyboard
> 
> but unfortunately I see kernel panic.
> I went to the old kernel-2.6.32-131.17.1.el6.x86_64. 
> 
> 
> Can you send the experts this problem?

You should attach the kernel output from the panic; guessing about it
without any information is pointless. If you have no means of capturing
text output, a photo of the screen during the panic might do; depending on
the length of the error messages, increasing the resolution with the
vga=... kernel parameter might be required so that the start of the errors
is visible in the photo.

If you can scroll with Shift-PgUp/PgDown after the panic, make sure to
include the lines from just before the abnormal/error messages start
through the final line of the panic.

-- 

Vladimir


WARNING from latest SL6 kernel?

2012-01-12 Thread Vladimir Mosgalin
Hello everybody.

After updating from kernel-2.6.32-131.21.1.el6.x86_64 to the current
kernel-2.6.32-220.2.1.el6.x86_64 on an SL6.1 system and rebooting, I got
the following warning a few minutes after boot:

[ cut here ]
WARNING: at kernel/sched.c:5914 thread_return+0x232/0x79d() (Not tainted)
Hardware name: X7SB4/E
Modules linked in: cryptd aes_x86_64 aes_generic cbc cts ebtable_nat ebtables 
nfnetlink_queue cpufreq_stats xt_CHECKSUM nfsd exportfs w83793 hwmon_vid 
coretemp ipmi_devintf ipmi_si ipmi_msghandler nfs fscache nfs_acl 
rpcsec_gss_krb5 auth_rpcgss des_generic lockd sunrpc cpufreq_ondemand 
acpi_cpufreq freq_table mperf act_police cls_u32 sch_ingress sch_sfq sch_hfsc 
ppp_synctty ppp_async crc_ccitt ppp_generic slhc sit tunnel4 bridge stp llc 
nf_conntrack_netlink nfnetlink nf_conntrack_ftp ipt_LOG ipt_REJECT 
iptable_filter xt_dscp xt_NFQUEUE iptable_mangle ipt_MASQUERADE iptable_nat 
nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_NOTRACK iptable_raw ip_tables 
ip6t_LOG xt_limit ip6t_REJECT xt_TCPMSS nf_conntrack_ipv6 nf_defrag_ipv6 
xt_state nf_conntrack ip6table_filter xt_length xt_CLASSIFY xt_mark xt_MARK 
xt_owner xt_multiport ip6table_mangle ip6_tables ipv6 dm_mirror dm_region_hash 
dm_log vhost_net macvtap macvlan tun kvm_intel kvm tg3 microcode sg serio_raw 
i2c_i801 iTCO_wdt iTCO_vendor_support e1000e shpchp i3200_edac edac_core ext4 
mbcache jbd2 sd_mod crc_t10dif stex ahci video output radeon ttm drm_kms_helper 
drm i2c_algo_bit i2c_core dm_mod [last unloaded: scsi_wait_scan]
Pid: 13, comm: ksoftirqd/2 Not tainted 2.6.32-220.2.1.el6.x86_64 #1
Call Trace:
 [] ? warn_slowpath_common+0x87/0xc0
 [] ? warn_slowpath_null+0x1a/0x20
 [] ? thread_return+0x232/0x79d
 [] ? ksoftirqd+0xd5/0x110
 [] ? ksoftirqd+0x0/0x110
 [] ? kthread+0x96/0xa0
 [] ? child_rip+0xa/0x20
 [] ? kthread+0x0/0xa0
 [] ? child_rip+0x0/0x20
---[ end trace 8029c9a8b796e39a ]---


The system had been running stably for a long time before, without any
such messages from the kernel, and it seems to be working fine now, too.
What does this message mean - is it something bad?


-- 

Vladimir


Re: [SCIENTIFIC-LINUX-USERS] error on yum update sl 6.1

2012-01-12 Thread Vladimir Mosgalin
Hi Pat Riehecky!

 On 2012.01.12 at 13:34:32 -0600, Pat Riehecky wrote next:

> The IPA packages recently released are the ones which came out as
> security updates along with TUV's 6.2 release.  The current 6.2 BETA
> has those packages integrated.  So, to answer your question, SL 6.2
> will ship by default with these IPA packages available and not any
> earlier versions.  Thus, in 6.2 you will not be able to have
> bind-dyndb-ldap and ipa-server installed due to their conflict.
> I've no idea why they conflict as I'm unfamiliar with the specifics
> of the technology, but I trust upstream to get this right.

That's not it, actually. Look at requirements:

> >-->  Processing Conflict: ipa-server-2.1.3-9.el6.i686 conflicts
> >bind-dyndb-ldap<  0.2.0-3

The IPA server manages bind & LDAP and, due to some internal requirements,
needs a fresher version of bind-dyndb-ldap than the one in 6.1. Since it
is essentially the 6.2 IPA package, it requires bind-dyndb-ldap from 6.2
(0.2.0-7) to work properly; so this conflict will disappear with the
update to 6.2.

(Actually, bind-dyndb-ldap is strongly recommended for FreeIPA, so having
that conflict in 6.1 right now because of the FreeIPA update might be
pretty bad. Maybe bind-dyndb-ldap should have been rebuilt for 6.1 too, to
fix it?)

-- 

Vladimir


Re: nfs-utils-1.2.3-15.el6.0.sl6 segfaulting

2012-01-13 Thread Vladimir Mosgalin
Hi Orion Poplawski!

 On 2012.01.13 at 10:18:40 -0700, Orion Poplawski wrote next:

> >It looks like the rebuilt nfs-utils-1.2.3-15.el6.0.sl6 package is incorrectly
> >linked. I am repeatedly getting segfaults in rpc.gssd:
> >
> >Jan 12 14:25:12 vincent kernel: rpc.gssd[12580]: segfault at 600 ip
> >7fcbebd4c1fc sp 7fff1b03d570 error 4 in 
> >libc-2.12.so[7fcbebcd7000+197000]
> >
> >Reverting to the earlier rebuilt nfs-utils-1.2.3-15.el6 fixes the issue. The
> >original nfs-utils-1.2.3-15.el6 however also segfaulted.
> >
> >Jonathan
> 
> Seen here too.

Does anyone use kerberized nfs4 on SL6.1? Ever since 6.1, nfs-utils has
been broken for me; version 1.2.3-6 introduced "Added support for AD style
kerberos to rpc.gssd (bz 671474)", which completely broke the kerberized
nfs4 server when accessed from Linux hosts (no AD involved). All versions
of nfs-utils since then have had the same bug, so I had to roll back to
the freshest nfs-utils that works - the one from the 6.1 beta,
nfs-utils-1.2.3-4.el6.x86_64.rpm - and lock updates of this package in
yum.conf.
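
(Locking it amounts to an exclude line in the [main] section of
/etc/yum.conf; the pattern below is just an example:

exclude=nfs-utils*
)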

It wasn't an rpc.gssd segfault or anything like that; it's just that with
that "AD style support", the krb principal in rpc.gssd appeared with the
wrong realm name or something along those lines, and the server refused to
give access to Linux clients.

I wonder whether recent versions of nfs-utils have overcome this problem
and kerberized nfs works for anyone with the new nfs-utils, making it
worth upgrading?

-- 

Vladimir


Re: How does one change SL 6 boot screen font size

2012-01-14 Thread Vladimir Mosgalin
Hi Yasha Karant!

 On 2012.01.14 at 10:25:43 -0800, Yasha Karant wrote next:

> At present, the only means I have found to force SL 6.x to display
> the actual steps during the boot of SL 6.x is to hit the ESC key;
> otherwise, only a time changing indicator appears, but no actual
> information. However, even so displaying, I have still have the
> issue below.
> 
> The font SL 6.x uses is too small on some platforms.  What controls
> this font size:  a configuration file in SL, a setting in either the
> BIOS or the bootloader (e.g. grub) -- both of which have
> relinquished control to SL at this stage of the boot -- or something
> else, perhaps hard compiled that cannot be changed without
> recompilation?
> 
> I fully understand that this may be set to be identical to TUV and
> other EL clones.  I do not care.  Just as ELRepo or SL additions
> provide (some) functionalities beyond those of TUV, I would like to
> modify this.

Two things control it: the framebuffer resolution and the actual font size.
To get a bigger font, you can either increase the font size or reduce the
resolution.
The font is defined in /etc/sysconfig/i18n with the SYSFONT directive;
mine, for example, is the 8x16 font "cyr-sun16.psfu.gz". You can look in
/lib/kbd/consolefonts for the full list of fonts. An example of a bigger
font is sun12x22.psfu.gz.

Also, SYSFONT is passed as a kernel parameter from grub; that's the font
used by the kernel before the i18n file is parsed and the font is changed.
For consistency, you should probably change the SYSFONT= directive in
/boot/grub/grub.conf too.


To reduce the framebuffer resolution, specify a different video= option as
a kernel parameter from grub. The default varies depending on the video
card and monitor; it can be 1024x768, 1600x900 or anything else. Try
video=800x600, for example. You might need to specify the framebuffer
driver too, like video=radeonfb:800x600 or video=radeondrmfb:800x600, etc.
(Both settings are sketched together below.)
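
Putting the two together, roughly - the font name, kernel version and root
device below are just examples:

# /etc/sysconfig/i18n
SYSFONT="sun12x22"

# /boot/grub/grub.conf, kernel line (excerpt)
kernel /vmlinuz-2.6.32-220.2.1.el6.x86_64 ro root=/dev/mapper/vg_root-lv_root SYSFONT=sun12x22 video=800x600 rhgb quiet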

You can try to find your current framebuffer driver by running something
like "dmesg | grep fbcon" after boot. It really is a complicated topic, as
the exact name can depend on the KMS / nomodeset kernel options and so on.
Documentation/fb/modedb.txt from the kernel-doc package explains the
video= option; as for the mode-setting & fb driver details, I don't know
of precise documentation, so Google is your friend if you have to venture
there..



-- 

Vladimir


debuginfo generation broke few days ago?

2012-02-24 Thread Vladimir Mosgalin
Hello everybody.

There seems to be a problem with debuginfo generation. Packages were
released on 22.02, but debuginfo hasn't been updated since 20.02. Because
of that, the latest squid & kernel packages are missing debuginfo.

-- 

Vladimir


Re: Wine RPM's

2012-02-26 Thread Vladimir Mosgalin
Hi jdow!

 On 2012.02.26 at 16:17:09 -0800, jdow wrote next:

> 
> Todd, doesn't that mean you can look for those versions of Wine in SL7, maybe?

Making plans for SL7 when Red Hat hasn't even announced plans for RHEL7,
and only about 1.5 years have passed since the RHEL6 release?.. (3.5 years
passed between the RHEL5 and RHEL6 releases, btw.)

Well, actually, they have said something about general plans for a 2013
release of RHEL7.

-- 

Vladimir


Re: nfsd stats on SL6

2012-02-28 Thread Vladimir Mosgalin
Hi Wayne Betts!

 On 2012.02.28 at 11:32:17 -0500, Wayne Betts wrote next:

> We have a couple of Scientific Linux 6.1 NFS servers.  I looked at 
> /proc/net/rpc/nfsd and was surprised to see on both of them that 
> the thread histogram is all zero:
> 
> # cat /proc/net/rpc/nfsd | grep th
> th 8 0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

It is the same for me on an nfs4 server on SL6.2.

What statistics are you trying to get? What's wrong with the standard way
of querying an nfs server, the "nfsstat -s" command? It's much better than
looking in /proc, because the file you see there is specific to the
current Linux kernel-mode nfs implementation, while nfsstat -s *is* the
correct and portable way across various systems.

> Here is /proc/net/rpc/nfsd for one of them (sure to get line-wrapped
> mangled):
> 
> # cat /proc/net/rpc/nfsd
> rc 19 70434483 726969598
> fh 352 0 0 0 0
> io 2092104517 3971637111
> th 8 0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
> ra 32 393921819 0 0 0 0 0 0 0 0 0 985424
> net 797404498 0 797447453 155310
> rpc 797455746 0 0 0 0
> proc2 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> proc3 22 74845 228480507 19923612 17460391 64581715 115066 394911350
> 39971881 5313458 37750 36353 0 849734 3589 4288363 8251 267872 1146073
> 22996 140662 0 18601904
> proc4 2 0 0
> proc4ops 59 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

If you want, here is mine - an nfs4-only server on SL6.2:
$ cat /proc/net/rpc/nfsd
rc 0 0 7056013
fh 0 0 0 0 0
io 155895234 1885486900
th 4 0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
ra 32 0 0 0 0 0 0 0 0 0 0 0
net 7060356 0 7060271 262
rpc 7060343 20 0 20 0
proc3 22 10 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 11 15 10 5 0
proc4 2 8 7059356
proc4ops 59 0 0 0 229322 6405 678 105 0 0 788959 254756 0 0 0 0 286720 44 749 
9689 0 3369 0 7089523 0 126 6093069 171024 2113 1630 37624 39 43153 46434 0 
3302 118 118 0 30605 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0



-- 

Vladimir


page allocation failure?

2012-02-29 Thread Vladimir Mosgalin
Hello everybody,

I have a puzzling issue. I see this in dmesg:

ksoftirqd/1: page allocation failure. order:1, mode:0x20
Pid: 9, comm: ksoftirqd/1 Not tainted 2.6.32-220.4.1.el6.x86_64 #1
Call Trace:
   [] ? __alloc_pages_nodemask+0x77f/0x940
 [] ? enqueue_entity+0x390/0x420
 [] ? kmem_getpages+0x62/0x170
 [] ? fallback_alloc+0x1ba/0x270
 [] ? cache_alloc_node+0x99/0x160
 [] ? kmem_cache_alloc+0x11b/0x190
 [] ? sk_prot_alloc+0x48/0x1c0
 [] ? sk_clone+0x22/0x2e0
 [] ? inet_csk_clone+0x16/0xd0
 [] ? tcp_create_openreq_child+0x23/0x450
 [] ? tcp_v4_syn_recv_sock+0x4d/0x2a0
 [] ? tcp_check_req+0x201/0x420
 [] ? nf_ct_deliver_cached_events+0xa2/0xe2 [nf_conntrack]
 [] ? tcp_v4_do_rcv+0x35b/0x430
 [] ? ipv4_confirm+0x87/0x1d0 [nf_conntrack_ipv4]
 [] ? tcp_v4_rcv+0x4e1/0x860
 [] ? ip_local_deliver_finish+0x0/0x2d0
 [] ? ip_local_deliver_finish+0xdd/0x2d0
 [] ? ip_local_deliver+0x98/0xa0
 [] ? ip_rcv_finish+0x12d/0x440
 [] ? ip_rcv+0x275/0x350
 [] ? ingress_enqueue+0x2b/0x94 [sch_ingress]
 [] ? __netif_receive_skb+0x49b/0x6e0
 [] ? delayed_work_timer_fn+0x39/0x50
 [] ? process_backlog+0x9a/0x100
 [] ? net_rx_action+0x103/0x2f0
 [] ? __do_softirq+0xc1/0x1d0
 [] ? call_softirq+0x1c/0x30
 [] ? call_softirq+0x1c/0x30
   [] ? do_softirq+0x65/0xa0
 [] ? ksoftirqd+0x80/0x110
 [] ? ksoftirqd+0x0/0x110
 [] ? kthread+0x96/0xa0
 [] ? child_rip+0xa/0x20
 [] ? kthread+0x0/0xa0
 [] ? child_rip+0x0/0x20


I googled a bit for similar bugs, and, well, I just find it strange to
get these on my system. They are typical for systems under kernel-memory
pressure, but I find it hard to believe that the system in question has
that problem. It's SL6.2 running the 2.6.32-220.4.1.el6.x86_64 kernel, with
8GB RAM and the
vm.min_free_kbytes = 131072
sysctl parameter; typically 150-250 MB are always free (and around 4 GB
in cache+buffers). slabtop in the very same session (nothing bad seems to
have happened after this message) shows

 Active / Total Objects (% used): 1533015 / 2020657 (75.9%)
 Active / Total Slabs (% used)  : 68451 / 68477 (100.0%)
 Active / Total Caches (% used) : 120 / 202 (59.4%)
 Active / Total Size (% used)   : 212940.66K / 274324.79K (77.6%)
 Minimum / Average / Maximum Object : 0.02K / 0.14K / 4096.00K

I wonder if anyone knows what might be going on and can suggest any
tweak for solving this. While this isn't a problem per se, I fear that
it might be a symptom of something worse.
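
Not from the original report, but since this is an order:1 (two contiguous
pages) atomic allocation failing, one quick thing worth looking at is the
buddy allocator state:

# cat /proc/buddyinfo

Each column is the count of free blocks of order 0, 1, 2, ... for that zone;
if the second column for the Normal zone hovers near zero, the failure is
about fragmentation rather than about the total amount of free memory.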

This is what /proc/meminfo looks like (the swapped-out data isn't a
problem, it's just stuff that hasn't been touched in ages and got pushed
there):
MemTotal:7854796 kB
MemFree:  231400 kB
Buffers:  182564 kB
Cached:  3433284 kB
SwapCached:   197872 kB
Active:  4365836 kB
Inactive:2800612 kB
Active(anon):2686648 kB
Inactive(anon):   877976 kB
Active(file):1679188 kB
Inactive(file):  1922636 kB
Unevictable:   0 kB
Mlocked:   0 kB
SwapTotal:  10485744 kB
SwapFree:9942660 kB
Dirty:   328 kB
Writeback: 0 kB
AnonPages:   3467456 kB
Mapped:33580 kB
Shmem: 13880 kB
Slab: 298616 kB
SReclaimable: 129076 kB
SUnreclaim:   169540 kB
KernelStack:5976 kB
PageTables:30128 kB
NFS_Unstable:  0 kB
Bounce:0 kB
WritebackTmp:  0 kB
CommitLimit:14413140 kB
Committed_AS:3952728 kB
VmallocTotal:   34359738367 kB
VmallocUsed:   50208 kB
VmallocChunk:   34359677904 kB
HardwareCorrupted: 0 kB
AnonHugePages:   2140160 kB
HugePages_Total:   0
HugePages_Free:0
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB
DirectMap4k:6912 kB
DirectMap2M: 8380416 kB


-- 

Vladimir


Re: unsigned package -- jdk-1.6.0_31-fcs.i586.rpm

2012-03-01 Thread Vladimir Mosgalin
Hi g!

 On 2012.03.01 at 10:34:41 +, g wrote next:

> during recent update, yumex returned error;
> 
> "Error checking package signatures:
>  Package jdk-1.6.0_31-fcs.i586.rpm is not signed"
> 
> update completed via "yum --nogpgcheck update".

This package isn't part of the SL system. If you installed it manually,
it's up to you to deal with that problem. However, it could be that its
vendor (Oracle) actually signs it, but you didn't install the Oracle
repository files, so its signing key was never imported; I'm not sure.

Unless you have some specific requirements, you can generally use the
SL-supplied java-1.6.0-openjdk and java-1.6.0-openjdk-devel packages
instead of the package above; then you won't have this kind of problem.
But if you are sure you strictly require the Oracle Java implementation,
then, well, either deal with it, or check the Oracle website to see whether
you are using the recommended way of installing it.
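
For example, switching to the openjdk packages is just:

# yum install java-1.6.0-openjdk java-1.6.0-openjdk-devel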

(note that using nogpgcheck all the time or setting it in yum.conf is
a really bad and dangerous practice)


-- 

Vladimir


Re: unsigned package -- jdk-1.6.0_31-fcs.i586.rpm

2012-03-01 Thread Vladimir Mosgalin
Hi g!

 On 2012.03.01 at 18:08:26 +, g wrote next:

> 
> > Unless you have some specific requirements, generally you can use
> > SL-supplied java-1.6.0-openjdk and java-1.6.0-openjdk-devel packages
> > instead of package above, then you won't have this kind of problem. But
> > if you are sure you strictly require Oracle Java implementation, then,
> > well, either deal with it, or try checking oracle website to see if you
> > are using recommended way of installing it.
> -=-
> 
> seems like you are guessing again.

Yes, sorry for providing wrong information. I didn't check that the
package is hosted in the SL repository. Personally, I'm a bit surprised
by it: I thought packages of that kind weren't in the main repository.

It's just that, well, you know, such packages - binary-only additions that
aren't properly built from source - will always be second-class citizens.
Their presence is also not consistent among EL releases (CentOS doesn't have
them; RH puts them into an additional repo, binary-only again; oh, and btw,
SL6 doesn't have these packages either), which is a problem in mixed
environments. They will always lack debuginfo, can lack signatures, and
cause various problems that come from being second-class. Because of that,
I simply don't consider them a proper part of the distribution. But that's
just my point of view; I understand that you might have a different opinion
about it.

Sadly, it looks like Oracle still doesn't provide any public yum repository
with the JDK :-/

-- 

Vladimir


Re: SL 6.x on Mac mini

2012-03-06 Thread Vladimir Mosgalin
Hi Mike Zanker!

 On 2012.03.06 at 09:26:39 +, Mike Zanker wrote next:

> >ps. Apparently I was removed from this email list which I did not
> >notice until I tried to send this mail. Not sure how that happened!
> 
> Sorry, no information about 6.2 on the Mac Mini, but I was also
> mysteriously removed from the list and had to resubscribe a couple
> of weeks ago.

The same happened to me - I just stopped getting mail until I resubscribed. A mystery!

-- 

Vladimir


Re: Red Hat Bug #639280: vte not setting TERM variable for terminal emulators other than gnome-terminal

2012-03-15 Thread Vladimir Mosgalin
Hi Todd And Margo Chester!

 On 2012.03.15 at 11:34:59 -0700, Todd And Margo Chester wrote next:

> On 03/15/2012 10:27 AM, Dennis Schridde wrote:
> >Hello!
> >
> >It appears to me that Red Hat Bug #639280 [1] against Fedora 14 appears also
> >in Scientific Linux 6.2: E.g. in Guake $TERM is "dumb" instead of "xterm". 
> >Can
> >someone confirm this? Is this also a bug in RHEL 6.2? Where should I report
> >it, so it can be fixed soon?
> >
> >Kind regards,
> >Dennis
> >
> >[1] https://bugzilla.redhat.com/show_bug.cgi?id=639280
> 
> 
> $ echo $TERM
> xterm

I, too, have no problems - I'm using the vte-based roxterm. It has always had
TERM set to "xterm" in SL 6.1 and 6.2.
I also checked another vte-based terminal emulator that I used in the past,
sakura; it works just fine out of the box, too.

There are tons of vte-based terminals out there, and from that bug it's
hard to say how many people are affected: the problem, if present, seems
to be confirmed for just a few of them. So maybe most people never notice
this bug at all.

-- 

Vladimir


Re: Anyone fire up w8 preview in KVM?

2012-03-31 Thread Vladimir Mosgalin
Hi Nico Kadel-Garcia!

 On 2012.03.30 at 22:34:37 -0400, Nico Kadel-Garcia wrote next:

> 
> > I haven't, but just wanted to say that both VirtualBox and VMWare Player
> > run fine under SL6.0, and provide a much better user experience than KVM.
> > I've run XP and and W7 with no issues using VBox and VMWare.
> >
> KVM has allegedly gotten better, and with the direct support of our
> favorite upstream vendor it may be a workable enterprise solution. But I
> still find that the built-in management tools for it were designed by
> monkeys actually trying to write Hamlet.

I didn't try the W8 preview, but the earlier W8 beta installed and worked
just fine under KVM in SL6.1, without any problems. I picked "Windows 7" in
virt-manager during installation.

-- 

Vladimir


Re: Multiple terminal windows won't send jobs to individual cores

2012-04-05 Thread Vladimir Mosgalin
Hi Wil Irwin!

 On 2012.04.05 at 09:47:45 -0700, Wil Irwin wrote next:

> I'm convinced it has to be a kernel issue, but that doesn't explain why it
> was working 2 months ago with the previous kernel and then Monday (w/o any
> updates or changes), multiple terminal won't send jobs to multiple cores.
> 
> Does anyone have any idea what might me causing this. Suggest on how to
> further troubleshoot?

This is possible with cgroups (the terminal ends up in a cgroup which is
locked to certain cores by cpuset, and then all its children are locked
too), but it requires someone to actually set it up; it shouldn't happen on
a fresh system or anything like that.
If you suspect someone might have tweaked this system, check the cgroups
documentation
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/ch-Using_Control_Groups.html
and the cpuset documentation
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpuset.html
to see how processes are assigned to cgroups and how cgroups can be
cpuset-locked.
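
As a quick check (my addition; paths assume the default RHEL6 cgconfig
layout under /cgroup), you can see which cgroup a process belongs to and
which CPUs that cgroup is allowed to use:

# cat /proc/<pid>/cgroup
# cat /cgroup/cpuset/<group>/cpuset.cpus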

However, if this happens on a fresh system, it might be something else.
Normally it's not even about the terminal window - every process normally
has a chance of running on any CPU core, and the scheduler spreads them more
or less evenly - even if you just run a few background jobs from a single
shell. It doesn't matter at all to the scheduler whether you run background
processes from the same session or from multiple windows.
You might want to check whether something else (not a cgroup) set the
processes' affinity masks, by calling "taskset -p <pid>" for the interesting
pids.

If they don't have a specially assigned affinity ('f' for a 4-core host,
'ff' for an 8-core one and so on), then, well, maybe something is wrong with
the scheduler, or with the way it picks cores for these processes..


-- 

Vladimir


Re: network manager questions

2012-04-09 Thread Vladimir Mosgalin
Hi Lamar Owen!

 On 2012.04.09 at 13:35:36 -0400, Lamar Owen wrote next:

> In 2004, when NetworkManager first came on the scene, the opposite was true, 
> where any network connectivity required either distribution-specific tools or 
> text editing of config files, regardless of network technology.  While NM is 
> desktop-centric (not laptop-centric, incidentally) it works fine for me for a 
> number of servers, a number of desktop workstations (not laptops), as well as 
> for a handful of laptops.  It's not hard; and if the mouse walks you can't 
> use the older system-config-network GUI either.  As Pat mentions, nmcli works 
> to do network startup by connection name.  There is a somewhat experimental 
> cli networkmanager configurator called 'cnetworkmanager' it isn't yet 
> complete.
> 
> It isn't as buggy as it used to be, and it is being developed by people who 
> aren't just saying that it works on their laptop.

Sadly, the last time I looked, every guide on how to set up virtual machines
(KVM) in a bridged configuration started with "turn off NetworkManager"..
I had to ask people to turn it off on at least 5 systems out of the 15
running Linux at our workplace solely for that reason; desktop systems, I
mean (Ubuntu & Fedora mostly). That's still kind of a big problem, and
definitely a no-go for NM on servers..

Not all servers need bridging, but NM is also incompatible with trunks.
And once you find that half of your servers can't use NM anyway, you turn
it off on the rest of them for consistency, to ease administration..
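
For reference (my addition, not from the original mail; interface names are
just an example), the usual bridged setup with NM kept out of the way looks
like this:

$ cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
ONBOOT=yes
NM_CONTROLLED=no
BRIDGE=br0

$ cat /etc/sysconfig/network-scripts/ifcfg-br0
DEVICE=br0
TYPE=Bridge
ONBOOT=yes
NM_CONTROLLED=no
BOOTPROTO=dhcp
DELAY=0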

-- 

Vladimir


Yum package metadata not updated (debuginfo)?

2012-05-23 Thread Vladimir Mosgalin
Hello everybody.

There were some recent updates for SL6.2; however, the debuginfo packages
weren't updated. The packages are in the repository (on ftp), but the
metadata doesn't know about them.

An example would be
ftp://ftp.scientificlinux.org/linux/scientific/6.2/archive/debuginfo/nfs-utils-debuginfo-1.2.3-15.el6_2.1.x86_64.rpm

# yum list nfs-utils*
...
Installed Packages
nfs-utils.x86_64 1:1.2.3-15.el6_2.1@sl-fastbugs
nfs-utils-debuginfo.x86_64   1:1.2.3-15.el6@sl-debuginfo/6.1
nfs-utils-lib.x86_64 1.1.5-4.el6   installed
nfs-utils-lib-debuginfo.x86_64   1.1.5-4.el6   @sl-debuginfo/6.1
Available Packages
nfs-utils-lib.i686   1.1.5-4.el6   sl
nfs-utils-lib-debuginfo.i686 1.1.5-4.el6   sl-debuginfo
nfs-utils-lib-devel.i686 1.1.5-4.el6   sl
nfs-utils-lib-devel.x86_64   1.1.5-4.el6   sl


-- 

Vladimir


Re: RE: Download servers ftp[x].scientificlinux.org unreachable

2012-06-10 Thread Vladimir Mosgalin
Hi peter.c...@stfc.ac.uk!

 On 2012.06.07 at 18:01:30 +, peter.c...@stfc.ac.uk wrote next:

> 
> My apologies, should have checked with another DNS resolver.
> 
> I shall report this DNS fault to our site admin.
> 
> Thanks for your speedy reply.

I'm pretty sure it was the fault of either the SL hosting provider or
someone else close to it in the DNS chain, not your site admin. This time it
lasted for a day or two, I think.

Exactly the same thing happened before; check out
http://listserv.fnal.gov/scripts/wa.exe?A2=ind1112&L=scientific-linux-users&T=0&P=2757


A few days ago, scientificlinux.org wasn't resolving for me either.
My bind checked Google's DNS servers and all the others, and the situation
was the same everywhere:

validating @0x7f93b01ee450: fnal.gov DNSKEY: no valid signature found (DS)
error (no valid RRSIG) resolving 'fnal.gov/DNSKEY/IN': 8.8.4.4#53
validating @0x7f93bc8865f0: fnal.gov DNSKEY: no valid signature found (DS)
error (no valid RRSIG) resolving 'fnal.gov/DNSKEY/IN': 8.8.8.8#53
validating @0x7f93b0c09f90: fnal.gov DNSKEY: no valid signature found (DS)
error (no valid RRSIG) resolving 'fnal.gov/DNSKEY/IN': 198.49.208.70#53
validating @0x7f93b433e5f0: fnal.gov DNSKEY: no valid signature found (DS)
error (no valid RRSIG) resolving 'fnal.gov/DNSKEY/IN': 198.49.208.71#53
validating @0x7f93ac1e1290: fnal.gov DNSKEY: no valid signature found (DS)
error (no valid RRSIG) resolving 'fnal.gov/DNSKEY/IN': 2001:400:6000::22#53
validating @0x7f93bc8865f0: fnal.gov DNSKEY: no valid signature found (DS)
error (no valid RRSIG) resolving 'fnal.gov/DNSKEY/IN': 2001:400:910:1::2#53
validating @0x7f93b433e5f0: fnal.gov DNSKEY: no valid signature found (DS)
error (no valid RRSIG) resolving 'fnal.gov/DNSKEY/IN': 198.128.2.10#53
[..skipped..]

error (broken trust chain) resolving 'linux21.fnal.gov/A/IN': 8.8.4.4#53
  validating @0x7f93ac1e1290: MLV3I3JULF9HLTIIPF6CQHA1Q51TOGTU.fnal.gov NSEC3: 
bad cache hit (fnal.gov/DNSKEY)
error (broken trust chain) resolving 'linux21.fnal.gov//IN': 8.8.4.4#53
validating @0x7f93b433e5f0: linux01.fnal.gov A: bad cache hit (fnal.gov/DNSKEY)
error (broken trust chain) resolving 'linux01.fnal.gov/A/IN': 8.8.4.4#53
  validating @0x7f93b01284d0: fnal.gov SOA: bad cache hit (fnal.gov/DNSKEY)
  validating @0x7f93b01284d0: 6JGTJCC74FMN7VR86T153U5TDA4MBUDT.fnal.gov NSEC3: 
bad cache hit (fnal.gov/DNSKEY)
error (broken trust chain) resolving 'linux01.fnal.gov//IN': 8.8.8.8#53
validating @0x7f93b433e5f0: linux9.fnal.gov A: bad cache hit (fnal.gov/DNSKEY)
error (broken trust chain) resolving 'linux9.fnal.gov/A/IN': 8.8.8.8#53
  validating @0x7f93b01284d0: fnal.gov SOA: bad cache hit (fnal.gov/DNSKEY)
  validating @0x7f93b01284d0: TSR1OLA6N3BA20AH8OLM0CPQE8LP.fnal.gov NSEC3: 
bad cache hit (fnal.gov/DNSKEY)
[..and so on..]


I believe the fact that it started to work when you changed the DNS
resolver just means that they use an outdated DNS server which doesn't care
about DNSSEC :)

Not that I need DNSSEC to trust the way the SL website resolves; however,
it's somewhat sad that situations like this keep happening.
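
(My addition, not from the original mail: a simple way to check from the
client side whether a given resolver validates DNSSEC is to request DNSSEC
data and look for the "ad" flag in the reply header, e.g.:

$ dig +dnssec scientificlinux.org SOA @8.8.8.8

If the "ad" flag is set, the resolver validated the answer; a resolver that
ignores DNSSEC won't set it.)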


-- 

Vladimir


Re: OT: SAS 29-pin 8482 and 4-pin Molex connector

2012-07-03 Thread Vladimir Mosgalin
Hi Ken Teh!

 On 2012.07.03 at 09:50:05 -0500, Ken Teh wrote next:

> The LTO-5 drive has a 29-pin 8482 connector on it.  I bought at SAS card and 
> a break-out cable that goes from a single 8087 to 4 cables each with a 29-pin 
> 8482 connector.  Each of these 8482 cable connectors has a 4-pin male Molex 
> dangling from it.
> 
> According to the install instructions, it says to "ensure that a 4-pin Molex 
> connector is plugged into the power inputs of the SAS cable as shown in 
> figure 5."  I have attached the figure. It shows the 4-pin Molex connector 
> dangling off the 8482 cable connector but unconnected to anything.  Just like 
> what I have.
> 
> Is the 4-pin Molex supposed to dangle like that or is it supposed to be 
> plugged into a 4-pin female Molex?
> 

It is supposed to be plugged into a female Molex for sure. That's how the
device gets its power.



-- 

Vladimir


Re: server crashing out of memory

2012-07-17 Thread Vladimir Mosgalin
Hi Orion Poplawski!

 On 2012.07.17 at 13:38:50 -0600, Orion Poplawski wrote next:

> > If your atop service is on, you should be able to see something about
> >what was happening shortly before the crash by viewing the appropriate
> >/var/log/atop/ with the atop -r  command.
> > You could just try increasing your swap space; you don't have very much
> >compared with your ram.  Simple 'top' and 'atop' commands show, among other
> >things, current swap usage.  I'd get nervous if most of it gets used up.
> 
> I installed and started atop to see what that shows.  Didn't know
> about that one before, thanks.  I am running sa/sar and that showed:
> 
> 12:00:02 AM kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
> 08:50:01 AM   1989664107480  5.13 25616 23.83
> 
> before the last crash.  and:
> 
> 02:10:01 AM   2097144 0  0.00 0  0.00
> 
> for the previous one.  So I don't particularly suspect lack of swap.
> The machine should have way more RAM than it needs, so it's mainly
> just buffer cache.  I did bump it up to 8GB just for fun though.

For this configuration (lots of RAM and you don't actually plan to use
swap) I suggest lowering vm.swappiness to a very low number, so that the
buffer cache doesn't try to grow so much that it pushes something else out
into swap.

vm.swappiness = 1 should be a fine value for a 48 GB KVM server.
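
For example, to apply it both immediately and persistently (standard sysctl
mechanics, nothing SL-specific):

# sysctl -w vm.swappiness=1
# echo 'vm.swappiness = 1' >> /etc/sysctl.conf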

Not that you actually need much buffer cache on a pure virtualization host:
setting the cache mode to "none" for all storage devices in the guests gives
higher performance / lower latency and prevents nasty problems if guest
storage is accessed from the host (with libguestfs or manual mounting); in
that configuration each guest maintains its own buffer cache and the data
isn't buffered a second time on the host.
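
In libvirt terms that's the cache attribute on the disk's driver element -
something like this in the guest XML (file and target names are just an
illustration), edited via "virsh edit <guest>":

  <disk type='file' device='disk'>
    <driver name='qemu' type='raw' cache='none'/>
    <source file='/var/lib/libvirt/images/guest.img'/>
    <target dev='vda' bus='virtio'/>
  </disk>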


Also, if you are using memory overcommitting with KSM, TUV recommends
(for some very good reasons!) having enough swap that RAM+swap is larger
than the total amount of RAM allocated to all guests. When running lots of
Linux guests I typically see a lot of memory saved by KSM (>10 GB on 48 GB
servers), and you have to increase swap by that amount even if it's never
used, because it might suddenly be required depending on guest activity.

-- 

Vladimir


Re: Different cpu usage between Fedora and SL6 guests

2012-07-17 Thread Vladimir Mosgalin
Hi Orion Poplawski!

 On 2012.07.17 at 16:46:19 -0600, Orion Poplawski wrote next:

> >That depends heavily on what version of Fedora they are (particularly Rawhide
> >instances), and what services they are running.
> 
> Well, I have a Fedora 17 instance running now with nothing but
> kernel processes and the qemu-kvm process still shows using 7-8%
> cpu.


Run "vmstat 1 10" on both guests and compare output; look for
differences in amount of context switches, CPU system usage and such.

Also running powertop and comparing wakeups/sec and general output on
guests might give some hint.


-- 

Vladimir


Re: Different cpu usage between Fedora and SL6 guests - SOLVED

2012-07-18 Thread Vladimir Mosgalin
Hi Orion Poplawski!

 On 2012.07.18 at 09:15:39 -0600, Orion Poplawski wrote next:

> >>Also running powertop and comparing wakeups/sec and general output on
> >>guests might give some hint.
> 
> powertop was a good suggestion.  It appears that toggling the
> Autosuspend for USB device QEMU USB Tablet [QEMU 0.12.1] tunable has
> a big effect and brings it in line with SL6.

Good catch; I'll have to remember that.

> # cat /sys/devices/pci:00/:00:01.2/usb1/power/autosuspend
> 2
> # cat  /sys/devices/pci:00/:00:01.2/usb1/1-1/power/level
> on
> 
> So it appears that SL6 is setting to auto automatically, but Fedora
> is not.  I wonder why.

Different udev rules. Actually, with current udev, autosuspend is
disabled for HID devices in Fedora (*), because autosuspending a
keyboard/mouse/touchpad introduces lag. Most likely this shouldn't
matter for virtual devices.. I guess this should be reported to the Fedora
folks so they'll include QEMU devices in autosuspend.


(*) 
https://lists.fedoraproject.org/pipermail/package-announce/2012-June/081999.html

-- 

Vladimir


Re: SL to 6.3 release

2012-07-18 Thread Vladimir Mosgalin
Hi Semi!

 On 2012.07.18 at 16:37:31 +0300, Semi wrote next:

> 
> 1) When the SL6.3 is expected? Centos 6.3 already exists.

*sigh* I like the SL mailing lists so much because people here are very
thoughtful and there isn't a storm of similar questions, irritated people
and rude answers like during the CentOS 6.0/6.1 "yet to be released" times;
I'd love it if this ML could stay nice and calm, without any more similar
questions until the actual release.

I don't blame you for asking this, I just hope there won't be any more
similar questions.

Anyhow, if you *really-really* feel like trying out something that feels
like SL6.3 and don't mind that it may break your system, you can try the
SL6.3 beta on some non-production machine. This is NOT recommended and NOT
supported, and you really might have problems upgrading to the final 6.3
after these steps - but if you feel like toying with your system a bit, why
not? If you discover some bug you can help the testers, too. Just report it
carefully and don't complain that you discovered it :)

--- (don't do this on production systems) ---
# yum clean all
# yum --releasever=6rolling update
--- (don't do this on production systems) ---

Some external repos are incompatible with this style of upgrade so you
might have to turn them off with --disablerepo=... (in my case, I had to
disable pgdg91).

Read
ftp://ftp.scientificlinux.org/linux/scientific/6rolling/x86_64/os/SL_6_3_ALPHA_BETA_CHANGES
and make sure you really understand the implications of installing a beta
version. There is really no point in doing this on almost any production
system, because the most important security fixes from 6.3 have already been
backported to 6.2. However, if you are eager to get a taste of 6.3, feel
free to do it; it's here and it works (at least for me).

-- 

Vladimir


Re: is the drop to fsck visual fixed in 6.3?

2012-07-18 Thread Vladimir Mosgalin
Hi Todd And Margo Chester!

 On 2012.07.18 at 13:50:13 -0700, Todd And Margo Chester wrote next:

> 
>I don't know the exact number, I think it is 27 reboots,
> your boot will automatically drop to an FSCK.  In RHEL5,
> your would see a status bar showing you progress.  In 6,
> you get no indication that an FSCK is happening and you
> think you are frozen.  The temptation to throw the power
> switch is overwhelming.

Well, if you press 'ESC' you can see the messages about the fsck going on..

-- 

Vladimir


New kernel-debuginfo in repos but no kernel?

2012-07-19 Thread Vladimir Mosgalin
Hello everybody.

I'm getting this list of updates, which doesn't make much sense, as
debuginfo for the 6.3 kernel is offered without the kernel itself. Maybe
some repo update is just lagging, but it could also be a problem with the
repository.

# yum update
Loaded plugins: auto-update-debuginfo, fastestmirror, ps, refresh-packagekit
Loading mirror speeds from cached hostfile
epel/metalink|  17 kB 00:00 
epel-debuginfo/metalink  |  17 kB 00:00 
epel-testing/metalink|  17 kB 00:00 
epel-testing-debuginfo/metalink  |  18 kB 00:00 
 * epel: mirror.yandex.ru
 * epel-debuginfo: mirror.yandex.ru
 * epel-testing: mirror.yandex.ru
 * epel-testing-debuginfo: mirror.yandex.ru
 * sl: ftp2.scientificlinux.org
 * sl-debuginfo: ftp2.scientificlinux.org
 * sl-fastbugs: ftp2.scientificlinux.org
 * sl-security: ftp2.scientificlinux.org
 * sl6x: ftp2.scientificlinux.org
 * sl6x-fastbugs: ftp2.scientificlinux.org
 * sl6x-security: ftp2.scientificlinux.org
sl   | 3.2 kB 00:00 
sl-debuginfo | 1.9 kB 00:00 
sl-debuginfo/primary_db  | 1.3 MB 00:01 
sl-fastbugs  | 1.9 kB 00:00 
sl-security  | 1.9 kB 00:00 
sl-security/primary_db   | 3.7 MB 00:00 
sl6x | 3.2 kB 00:00 
sl6x-fastbugs| 1.9 kB 00:00 
sl6x-security| 1.9 kB 00:00 
sl6x-security/primary_db | 3.7 MB 00:02 
Setting up Update Process
Resolving Dependencies
--> Running transaction check
---> Package firefox.x86_64 0:10.0.5-1.el6_2 will be updated
---> Package firefox.x86_64 0:10.0.6-1.el6_3 will be an update
---> Package kernel-debuginfo.x86_64 0:2.6.32-220.23.1.el6 will be updated
---> Package kernel-debuginfo.x86_64 0:2.6.32-279.1.1.el6 will be an update
---> Package kernel-debuginfo-common-x86_64.x86_64 0:2.6.32-220.23.1.el6 will 
be updated
---> Package kernel-debuginfo-common-x86_64.x86_64 0:2.6.32-279.1.1.el6 will be 
an update
---> Package nspr.x86_64 0:4.9-1.el6 will be updated
---> Package nspr.x86_64 0:4.9.1-2.el6_3 will be an update
---> Package nspr-debuginfo.x86_64 0:4.9-1.el6 will be updated
---> Package nspr-debuginfo.x86_64 0:4.9.1-2.el6_3 will be an update
---> Package nss.x86_64 0:3.13.3-8.el6 will be updated
---> Package nss.x86_64 0:3.13.5-1.el6_3 will be an update
---> Package nss-debuginfo.x86_64 0:3.13.3-8.el6 will be updated
---> Package nss-debuginfo.x86_64 0:3.13.5-1.el6_3 will be an update
---> Package nss-sysinit.x86_64 0:3.13.3-8.el6 will be updated
---> Package nss-sysinit.x86_64 0:3.13.5-1.el6_3 will be an update
---> Package nss-tools.x86_64 0:3.13.3-8.el6 will be updated
---> Package nss-tools.x86_64 0:3.13.5-1.el6_3 will be an update
---> Package nss-util.x86_64 0:3.13.3-2.el6 will be updated
---> Package nss-util.x86_64 0:3.13.5-1.el6_3 will be an update
---> Package nss-util-debuginfo.x86_64 0:3.13.3-2.el6 will be updated
---> Package nss-util-debuginfo.x86_64 0:3.13.5-1.el6_3 will be an update
---> Package perf-debuginfo.x86_64 0:2.6.32-220.23.1.el6 will be updated
---> Package perf-debuginfo.x86_64 0:2.6.32-279.1.1.el6 will be an update
---> Package xulrunner.x86_64 0:10.0.5-1.el6_2 will be updated
---> Package xulrunner.x86_64 0:10.0.6-1.el6_3 will be an update
--> Finished Dependency Resolution

Dependencies Resolved


 PackageArch   Version   RepositorySize

Updating:
 firefoxx86_64 10.0.6-1.el6_3sl-security   20 M
 kernel-debuginfo   x86_64 2.6.32-279.1.1.el6sl-debuginfo 242 M
 kernel-debuginfo-common-x86_64 x86_64 2.6.32-279.1.1.el6sl-debuginfo  36 M
 nspr   x86_64 4.9.1-2.el6_3 sl-security  110 k
 nspr-debuginfo x86_64 4.9.1-2.el6_3 sl-debuginfo 573 k
 nssx86_64 3.13.5-1.el6_3sl-security  762 k
 nss-debuginfo  x86_64 3.13.5-1.el6_3sl-debuginfo 5.3 M
 nss-sysinitx86_64 3.13.5-1.el6_3sl-security   31 k
 nss-tools  x86_64 3.13.5-1.el6_3sl-security  729 k
 nss-util   x86_64 3.13.5-1.el6_3sl-security   52 k
 nss-util-debuginfo x86_64 3.13.5-1.el6_3sl-debuginfo 201 k
 perf-debuginfo x86_64 2.6.32-279.1.1.el6  

Re: SL6.2 no boot from degraded RAID1... with fix...

2012-08-07 Thread Vladimir Mosgalin
Hi Winnie Lacesso!

 On 2012.08.07 at 14:33:43 +0100, Winnie Lacesso wrote next:

> > FYI, as a regression from SL6.0 and SL6.1, SL6.2 does not boot from 
> > degraded RAID1 devices.
> 
> Apologies for the question but is this true of Linux Software RAID1 only, 
> or of hardware RAID1 as well?

Hardware RAID is presented as a single device to the OS, so no such
regressions should affect it.

In my experience there are no problems with booting SL6 from degraded
RAID1 on LSI and HP controllers.

-- 

Vladimir


Re: Curious SL not resolve DNS.

2012-08-13 Thread Vladimir Mosgalin
Hi Pere Casas!

 On 2012.08.13 at 11:18:16 +0200, Pere Casas wrote next:

> [root@intern ~]# ping priona.net
> ping: unknown host priona.net
> 
> 
> [root@intern ~]# nslookup priona.net
> Server:8.8.8.8
> Address:8.8.8.8#53
> Non-authoritative answer:
> Name:priona.net
> Address: 146.255.96.119

Unlike host, dig and nslookup, which are part of the BIND suite and use the
BIND resolver, ping uses the system-wide resolver provided by glibc
(libresolv). They can behave differently.

The first thing worth checking is whether you are running nscd - and if you
are, its configuration for hostname caching.
Check the output of "pgrep nscd" - if you see a number (a pid), you are
running nscd.

If you are, check whether the advice at
http://www.eyrie.org/~eagle/notes/solaris/dns-cache.html
helps.

If you are not, well, such a situation can still happen, but I can't think
of other likely causes off-hand. At the very least, you should check that
your glibc isn't broken ("rpm -qV glibc" will do) and whether
"strace ping priona.net" shows anything suspicious.

-- 

Vladimir


Disable showing "LVM volume" in gnome desktop

2012-08-30 Thread Vladimir Mosgalin
Hello everybody.

I've just installed SL6.3 with the default gnome desktop on a laptop (using
the default partitioning and encryption options, with an encrypted LVM VG),
and it displays the whole-disk LVM PV ("500 GB Hard Disk: 500 GB LVM2
Physical Volume") in the "Places" menu and also when opening "Computer".

Obviously this is useless, and also annoying because it asks for a password
("Authentication is required to mount this device") when clicked. I don't
even want to try entering the password, because I doubt it's a good idea to
"mount" the main system LVM PV!

I need to give this notebook to another person who isn't a Linux expert, so
I need a way to disable this in order not to confuse him. Quick googling
didn't help.. Does anyone know a way to completely get rid of any mention of
this LVM PV?

-- 

Vladimir


Re: Disable showing "LVM volume" in gnome desktop

2012-08-31 Thread Vladimir Mosgalin
Hi Mark Whidby!

 On 2012.08.31 at 09:52:03 +0100, Mark Whidby wrote next:

> This *might* help a bit - it won't prevent the icon from being
> displayed but it won't ask for the password, instead putting up a "Not
> authorised" message. 
> 
> Create a file called what you like with the extension .pkla and stick it
> in one of the directories in /var/lib/polkit-1/localauthority - I used
> 50-local.d. The file should have the following contents:
> 
> [Disable Mount]
> Identity=unix-user:*
> Action=org.freedesktop.udisks.filesystem-mount-system-internal
> ResultAny=no
> ResultInactive=no
> ResultActive=no
> 
> It worked for me. If you find a way of disabling the icon as well
> I'd be grateful if you could let me know.

Thanks, this helped some. After googling some more I couldn't find a way
of disabling the icon, unfortunately - the "old" Fedora way (disabling it in
a hal .fdi file) didn't work, and neither did the "new" way (disabling it in
udev). There are various reports of this being a bug in the udisks
subsystem; bugs are filed at least for Fedora and Ubuntu, and they are
still open everywhere.


-- 

Vladimir


Re: SSD and RAID question

2012-09-03 Thread Vladimir Mosgalin
Hi Todd And Margo Chester!

 On 2012.09.02 at 17:33:24 -0700, Todd And Margo Chester wrote next:

> On several Windows machines lately, I have been using
> Intel's Cherryville enterprise SSD drives.  They work
> very, very well.
> 
> Cherryville drives have a 1.2 million hour MTBF (mean time
> between failure) and a 5 year warranty.
> 
> I have been thinking, for small business servers
> with a low data requirement, what would be the
> risk of dropping RAID in favor of just one of these
> drives?

Personally I wouldn't recommend dropping RAID on *any* server whose loss of
functionality can cause you problems. That is, if its functionality is
duplicated and a load balancer will switch to another server automatically -
sure, deploy it without RAID; but if its loss can cause you business
trouble, it's a bad idea.

We do use SSDs in some kinds of small business servers, but we prefer to buy
at the very least two of them and use software RAID; it's fine to use
cheaper SSDs - a RAID1 of them is still a better idea than a single more
expensive one (unless we are talking about something ultra-reliable like an
SLC drive, but those are like 10 times more expensive).

You should understand that MTBF values or warranty periods are useless for
estimating the chance of a single failure or how soon such a failure is
likely to happen; taking them into account only makes sense when you
calculate cost of ownership or replacement rates for a big fleet of
computers (say, 1000's).

If you want to estimate SSD reliability for a server task, the best
indicator is the amount of allowed writes; in some cases, like database
journals (redo logs in Oracle / WAL in postgresql / etc), you may generate a
really large volume of writes, and if you do the math it's easy to see that
a consumer SSD cannot last for years under such load (you need an SLC drive,
or at least an enterprise-grade MLC drive like the Intel 710). There are
other usage scenarios where SLC SSDs are a must, like the ZFS ZIL.

SSD reliability is okay for mostly-read usage scenarios, but the problem is
that SSDs fail in a completely different way than HDDs. Actually, I've never
seen an SSD that has run out of write cycles - but I've seen quite a few
SSDs that died from a flash controller failure or something similar. That
is, the most likely failure mode of an SSD is "oops, it died". While this
happens with HDDs too, with an HDD it's far more likely that you just get
bad blocks. So MTBF for SSDs != MTBF for HDDs, as we are talking about
completely different types of common failures; you can't even compare these
numbers directly.

> 
> Seems to me the RAID controller would have a worse
> MTBF than a Cherryville SSD drive?
> 
> And, does SL 6 have trim stuff built into it?

Yes, make sure to add "discard" option to ext4 filesystem mounts.

However, it won't work on hardware RAID, and probably won't work on a
software one either - though I'm not 100% sure about the latter.
This means that if you are going to RAID SSDs, Sandforce-based SSDs aren't
recommended, as you will lose a lot of performance over time; use
Marvell-based drives like the Crucial M4, Plextor M3/M3P, Intel 510, OCZ
Vertex 4 and a few others; however, don't use the Marvell-based Corsair
Performance Pro with hardware RAID controllers, as they are often
incompatible.

-- 

Vladimir


Re: Using all 3 of pair bonding, tagged VLAN's, and KVM compatible bridges?

2012-09-25 Thread Vladimir Mosgalin
Hi Nico Kadel-Garcia!

 On 2012.09.25 at 15:23:41 -0400, Nico Kadel-Garcia wrote next:

> On Mon, Sep 24, 2012 at 7:57 PM, Paul Robert Marino  
> wrote:
> > Don't focus on the boonding its transparent once configured.
> > Treat the bonded interface bondx like it was a ethx and the guides will make
> > sense.
> > I beleave off the top of my head the answer is apply the bridge to the vlan
> > interface unless you want the vlan tags to go to the vms and do 802.1Q on
> > the vms. I know running 802.1Q to a vm sounds crazy but I've seen it done to
> 
> "Off the top of my head the answer is apply the bridge" is not
> helpful. I can google as well as the next engineer, better than most.
> I need a precise answer, please. What do I need to tweak in
> /etc/sysconfig/network-scripts/ifcfg-bond0.vlan1 to enable it not
> merely as a VLAN, but as a bridge suitable for KVM virtualization? If
> I'm forced to use the ifcfg-bond0 device as the KVM bridge, I'll be
> forced to set up the VLAN configurations in the guest VM's and
> that. causes real adventures for kickstart and anaconda.

I don't think there should be any serious problem if you add a bond. Here
is how vlan+bridge for KVM works, for example:
$ cat ifcfg-eth0.2 
DEVICE="eth0.2"
ONBOOT="yes"
TYPE="Ethernet"
BRIDGE=br0v2
VLAN=yes

$ cat ifcfg-br0v2
TYPE=Bridge
DEVICE=br0v2
SLAVE=eth0.2
BOOTPROTO="static"
IPADDR=10.77.7.28
NETMASK=255.255.255.0


You probably just add a vlan to the bonded interface and then attach the
bridge to that vlan device - something like the (untested) sketch below.
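
For illustration only (my addition, not tested on a bonded setup):

$ cat ifcfg-bond0.2
DEVICE=bond0.2
ONBOOT=yes
VLAN=yes
BRIDGE=br0v2

and keep ifcfg-br0v2 as above, with SLAVE pointing at bond0.2 instead of
eth0.2.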

-- 

Vladimir


Re: clock factor file

2012-10-03 Thread Vladimir Mosgalin
Hi g!

 On 2012.10.03 at 05:09:32 +, g wrote next:

> in unix, there is a file, name of which i do not recall, used as a
> 'clock factor' and controls the 'tick rate' for the system clock.
> 
> is such a file used in scientific linux and what is it's name?
> 

It's called /etc/adjtime
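
The file normally has three lines - the drift factor plus the time and
status of the last adjustment, the time of the last calibration, and UTC or
LOCAL (the values below are just an illustration):

$ cat /etc/adjtime
0.000990 1349231817 0.000000
1349231817
UTC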

The way it works is documented in the adjtimex(2) call. During system boot,
the contents of the file are used for an adjtimex call to remind the freshly
booted kernel about the correct tick rate (usually by means of calling
"hwclock"), and on shutdown a new correction to the tick rate is written
there.

Additionally, ntp daemons use that call and do the correction job, too.

-- 

Vladimir


Re: Traffic shaping today

2012-10-11 Thread Vladimir Mosgalin
Hi Henrique Junior!

 On 2012.10.11 at 09:23:33 -0300, Henrique Junior wrote next:

> >>> Hello, I'm doing some research about efficient ways of performing
> >>> traffic shaping in a network but all I can see is a lot of outdated
> >>> tools (wondershaper is from 2002, HTB from 2004) and CBQ is quite a bad
> >>> idea because it is "shaping" even transfers in my internal network (pc
> >>> to pc).
> >>> What are people using that is less than 8 year old and in active
> >>> development? Does anyone really compiled and successfully used
> >>> layer7-filter (for content filtering) in any RHEL 6 based system with
> >>> kernel 2.6.32? I know about ClearOS (Clear Foundation is the new
> >>> developer of layer7, but the last release of his layer7 is from 2009).
> >>>
> >>
> >> I use shorewall, although on Fedora. Very active list and developer.
> >>
> >
> > I use wshaper on some RHEL5 boxes but I don't think anything really
> > changed in RHEL6.
> >
> > Works as good today as it did 8 years ago :)
> >
> > Jeff
> >
> 
> Thanks for replying.
> I'm amazed to see that impressive projects (like layer-7) are stagnated or
> dead. Did we have any software to replace layer-7?

l7-filter works for me. I use the latest version of the userspace L7 filter
(with some patch, IIRC). It does use more memory over time, but it's not
unbearable; it can run for months nevertheless.

But it depends on the task you need L7 filtering for. If it's just for
traffic shaping / QoS marking of certain kinds of traffic, it is possible,
but modern P2P protocols have adopted techniques that make them mostly
invisible to this kind of filtering. Also, if you need L7 analysis for
something else (say, load balancing), then userspace solutions simply don't
cut it. There are some kernel solutions like Ultramonkey-L7
http://www.ultramonkey.info/, but I have no idea how good they are.

I'd say that L7 filtering solutions went out of fashion because most people
nowadays can't improve traffic shaping quality much by using them, and these
solutions aren't designed for more serious usage like LB; I mean, I'd love
to have a fully working and supported L7 equivalent of LVS (Linux Virtual
Server), but it's just not here yet.

-- 

Vladimir


(SL6.3) yum picks wrong qemu version upgrade

2012-11-15 Thread Vladimir Mosgalin
This seems quite wrong - qemu wants to update itself to the i686 version!

# yum update
Loaded plugins: auto-update-debuginfo, fastestmirror, ps, refresh-packagekit
Loading mirror speeds from cached hostfile
epel/metalink|  12 kB 00:00 
epel-debuginfo/metalink  |  13 kB 00:00 
epel-testing/metalink|  12 kB 00:00 
epel-testing-debuginfo/metalink  |  12 kB 00:00 
 * epel: fedora-mirror01.rbc.ru
 * epel-debuginfo: fedora-mirror01.rbc.ru
 * epel-testing: fedora-mirror01.rbc.ru
 * epel-testing-debuginfo: fedora-mirror01.rbc.ru
 * sl: ftp2.scientificlinux.org
 * sl-debuginfo: ftp2.scientificlinux.org
 * sl-fastbugs: ftp2.scientificlinux.org
 * sl-security: ftp2.scientificlinux.org
 * sl6x: ftp2.scientificlinux.org
 * sl6x-fastbugs: ftp2.scientificlinux.org
 * sl6x-security: ftp2.scientificlinux.org
epel | 4.3 kB 00:00 
epel/primary_db  | 4.8 MB 00:00 
epel-debuginfo   | 3.1 kB 00:00 
epel-debuginfo/primary_db| 485 kB 00:00 
epel-testing | 4.3 kB 00:00 
epel-testing/primary_db  | 296 kB 00:00 
epel-testing-debuginfo   | 3.1 kB 00:00 
epel-testing-debuginfo/primary_db|  28 kB 00:00 
sl   | 3.2 kB 00:00 
sl-debuginfo | 1.9 kB 00:00 
sl-debuginfo/primary_db  | 1.7 MB 00:03 
sl-fastbugs  | 1.9 kB 00:00 
sl-security  | 1.9 kB 00:00 
sl-security/primary_db   | 3.0 MB 00:03 
sl6x | 3.2 kB 00:00 
sl6x-fastbugs| 1.9 kB 00:00 
sl6x-security| 1.9 kB 00:00 
sl6x-security/primary_db | 3.0 MB 00:03 
Setting up Update Process
Resolving Dependencies
--> Running transaction check
---> Package libproxy.x86_64 0:0.3.0-2.el6 will be updated
---> Package libproxy.x86_64 0:0.3.0-3.el6_3 will be an update
---> Package libproxy-bin.x86_64 0:0.3.0-2.el6 will be updated
---> Package libproxy-bin.x86_64 0:0.3.0-3.el6_3 will be an update
---> Package libproxy-debuginfo.x86_64 0:0.3.0-2.el6 will be updated
---> Package libproxy-debuginfo.x86_64 0:0.3.0-3.el6_3 will be an update
---> Package libproxy-python.x86_64 0:0.3.0-2.el6 will be updated
---> Package libproxy-python.x86_64 0:0.3.0-3.el6_3 will be an update
---> Package mysql.x86_64 0:5.1.61-4.el6 will be updated
---> Package mysql.x86_64 0:5.1.66-1.el6_3 will be an update
---> Package mysql-libs.x86_64 0:5.1.61-4.el6 will be updated
---> Package mysql-libs.x86_64 0:5.1.66-1.el6_3 will be an update
---> Package qemu-img.x86_64 2:0.12.1.2-2.295.el6_3.2 will be updated
---> Package qemu-img.i686 2:1.2.0-19.el6.1 will be an update
--> Processing Dependency: libz.so.1 for package: 2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libuuid.so.1(UUID_1.0) for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libuuid.so.1 for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libssl3.so for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libsmime3.so for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libplds4.so for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libplc4.so for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libnssutil3.so for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libnss3.so(NSS_3.9.3) for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libnss3.so(NSS_3.9.2) for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libnss3.so(NSS_3.5) for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libnss3.so(NSS_3.4) for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libnss3.so(NSS_3.3) for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libnss3.so(NSS_3.2) for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libnss3.so(NSS_3.12) for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libnss3.so for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libnspr4.so for package: 
2:qemu-img-1.2.0-19.el6.1.i686
--> Processing Dependency: libgthread-2.0.so.0 for package: 
2:qemu-img-1.2.0-19.el6.1.

Re: (SL6.3) yum picks wrong qemu version upgrade

2012-11-15 Thread Vladimir Mosgalin
Hi MT Julianto!

 On 2012.11.16 at 01:05:42 +0100, MT Julianto wrote next:

> 
> > This seems quite wrong - qemu wants to update itself to i686 version!
> >
> 
> Have you check out
> http://serverfault.com/questions/356674/why-is-my-rhel6-x86-64-server-trying-to-install-libselinux-i686
> 

This is a useful link; this case doesn't seem to be quite related, though.
I did yum -v update, but there is no more information than "qemu x86_64
wants to update to the i686 version because the version of the i686 package
is newer". Like, it's obvious to yum that 1.2.0-19 >
0.12.1.2-2.295! This isn't a dependency problem like in the question
above, but rather a problem of an incorrect package in the repo.

-- 

Vladimir


Re: (SL6.3) yum picks wrong qemu version upgrade

2012-11-16 Thread Vladimir Mosgalin
Hi MT Julianto!

 On 2012.11.16 at 08:41:20 +0100, MT Julianto wrote next:

> Do you really need to use epel-testing repo?
> This might the source of your problem.
> For me, epel repo is enough.
> 
> 
> 
> > ---> Package qemu-img.x86_64 2:0.12.1.2-2.295.el6_3.2 will be updated
> > ---> Package qemu-img.i686 2:1.2.0-19.el6.1 will be an update
> >
> 
> Do you have i686 version installed before updating?
> What is the result of
>  yum list installed | grep qemu

No, I don't have the i686 version installed, but you are exactly right:
the epel-testing repo is the source of the problem. Silly me for not
noticing it before. Turning off epel-testing helped, thanks!
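
(In case someone hits the same thing - "turning off" here just means setting
enabled=0 in /etc/yum.repos.d/epel-testing.repo, or disabling it for a
single run with something like "yum --disablerepo='epel-testing*' update".)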


I had to use a few packages from epel-testing in the past (I needed newer
versions which weren't making their way into regular epel for months), and
since I use only a small number of epel packages and none of the others came
from testing, I just enabled it in the config so I wouldn't have to turn it
on from the yum command line from time to time.


Anyhow, this really surprised me. I assumed that epel wouldn't publish
packages that are in the base distro! I know that SL has some extra packages
compared to other EL distros, so it is possible to encounter those packages
in epel, but qemu is an upstream package, so I never expected it to be in
epel.

Probably I've read the epel policy on packages wrong; I should re-read it
more carefully. epel has always been the least problematic add-on repo for
me (compared to atrpms or rpmforge).

-- 

Vladimir


Crash in tg3 driver on SL6.3 when interface goes down

2012-12-26 Thread Vladimir Mosgalin
Hello everybody.

For a few months I've been experiencing this problem - it was a bit hard
to track down because it usually happens only during shutdown, when network
interfaces go down, so I just didn't notice it. A kernel panic happens
when one of the interfaces provided by the tg3 driver goes down; "ifdown
eth2" is enough to cause it.

It doesn't matter whether this interface was actively used or even whether
the link was up - I can unplug the cable, boot up (the interface is
configured to use dhcp, so it will attempt to come up and fail), then
execute "ifdown eth2" and the system will crash.

It's a bit hard to get the full crash message, as this happens on the very
machine I use for remote logging.. The best I can get right away are
screenshots; however, some information might be missing from them.

It goes like this on shutdown (or "ifdown eth2", or "service network
restart" etc):
1) interfaces are being brought down; at some point eth2 is being
   brought down
2) nothing happens for about 10 seconds, the system appears to be hung
3) lots of lines with call traces appear and scroll through the
   screen. These are the last lines, which I captured in a screenshot:
   http://img202.imageshack.us/img202/5459/20121225205828.png
4) about a 10 second pause again
5) a kernel panic happens, more lines scroll. Again, here are some of the
   last ones:
   http://img5.imageshack.us/img5/397/20121225205838.png
6) the system hangs completely

This happens on the latest kernel-2.6.32-279.19.1.el6.x86_64. It also
happened on 2.6.32-279.11.1.el6.x86_64 and 2.6.32-279.14.1.el6.x86_64.

It didn't happen in SL6.2 with the (official, not from elrepo)
kmod-tg3-3.122 package installed, which was present in the
6.2-fastbugs repository.

I found some information about tg3 crashes like this
http://elrepo.org/bugs/view.php?id=315
or this
http://bugs.centos.org/view.php?id=5428
but in either case version 3.122 of the tg3 driver solved the problem.
However, I'm already using 3.122 and still experience the crash.

The controller in question is a Broadcom NetXtreme BCM5701, PCI-X version,
inserted into a PCI-X slot of a Supermicro X7SBE. There haven't been any
hardware changes lately and it is working stably. I'm pretty sure this bug
appeared somewhere along the 6.2->6.3 upgrade or in one of the 6.3 kernels.
It's a bit hard to track because it shows up simply as a "hang during reboot
or shutdown", which rarely happens for this system, but I'm sure that a few
months ago it rebooted and powered off just fine.

This is the interface used for the internet connection. VLANs are not used.
There is a sixxs-based IPv6 interface in the system, configured to work
over this interface. This problem doesn't happen with the other (Intel
e1000e) network interfaces.

$ cat /etc/sysconfig/network-scripts/ifcfg-eth2 
DEVICE=eth2
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Ethernet
HWADDR=00:02:A5:E7:0A:10
PEERDNS=no
NOZEROCONF=yes
$ ifconfig eth2
eth2  Link encap:Ethernet  HWaddr 00:02:A5:E7:0A:10  
  inet addr:
  inet6 addr: fe80::202:a5ff:fee7:a10/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:130906804 errors:0 dropped:0 overruns:0 frame:0
  TX packets:178575110 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:100 
  RX bytes:83971053482 (78.2 GiB)  TX bytes:205754543966 (191.6 GiB)
  Interrupt:52 
$ dmesg|grep '\(eth2\|tg3\)'
tg3.c:v3.122 (December 7, 2011)
tg3 :03:02.0: PCI INT A -> GSI 52 (level, low) -> IRQ 52
tg3 :03:02.0: eth2: Tigon3 [partno(253212-001) rev 0105] 
(PCIX:133MHz:64-bit) MAC address 00:02:a5:e7:0a:10
tg3 :03:02.0: eth2: attached PHY is 5701 (10/100/1000Base-T Ethernet) 
(WireSpeed[1], EEE[0])
tg3 :03:02.0: eth2: RXcsums[0] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[0]
tg3 :03:02.0: eth2: dma_rwctrl[76db000f] dma_mask[64-bit]
ADDRCONF(NETDEV_UP): eth2: link is not ready
tg3 :03:02.0: eth2: Link is up at 100 Mbps, full duplex
tg3 :03:02.0: eth2: Flow control is on for TX and on for RX
ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready




Does anyone know of a solution or workaround?
I'm fine with installing another version of this driver from a kmod (if I
knew where to get a better version), but not very comfortable with using
kernel-3.5/3.6/3.7 etc from elrepo.


-- 

Vladimir


Re: Crash in tg3 driver on SL6.3 when interface goes down

2012-12-29 Thread Vladimir Mosgalin
Hi Phil Perry!

 On 2012.12.27 at 13:59:49 +, Phil Perry wrote next:

> 
> Elrepo has an updated kmod package for the tg3 driver you could try.
> 
> With elrepo installed;
> 
> yum install kmod-tg3
> 
> and reboot.
> 
> If it doesn't fix the issue, try giving the elrepo folks a ping to
> see if there is a more recent version you could try that might fix
> the issue.

Thanks for the advice! I completely missed the fact that the version is
newer because I saw the same 3.122 number; actually the in-kernel version is
3.122 and elrepo has 3.122n, and that "n" makes a lot of difference.

I tried it; according to the changelog it includes the fix
---
tg3: Fix tg3_get_stats64 for 5700 / 5701 devs

tg3_get_stats64() takes tp->lock when dealing with non-serdes bcm5700
and bcm5701 devices.  However, functions that call tg3_halt() have
already acquired tp->lock.  When tg3_get_stats64() is called in
tg3_halt(), deadlock will occur.
---

which resolves exactly my problem.

The fix is dated Feb 28; it's sad that the Red Hat maintainers missed it in
their version :(
But at least I have a solution now.


-- 

Vladimir


Re: FQDM = nala.example.com.example.com

2013-01-20 Thread Vladimir Mosgalin
Hi Torpey List!

 On 2013.01.20 at 08:44:00 -0600, Torpey List wrote next:

> 
> I may not be a newbie but this problem has me feeling like one.
> 
> Here is how the hostname is being reported:
> # hostname
> nala.example.com
> # hostname -s
> nala
> # hostname -f
> nala.example.com.example.com
> # domainname
> example.com
> 
> I have tried various ways to get the name changed and have been able to make 
> the change.  However, the Network Manager changes it back to this when 
> “service network restart”.  I have googled and the answer I have seen is to 
> stop the Network Manager.  While this might work, it does not seem to be the 
> most appropriate answer.
> 
> I have decided not to list the things that I have tried because I have 
> obviously missed something.

The old version of NetworkManager that's used in SL is mostly useful for
connecting to random WiFi networks and some kinds of VPNs. Usage beyond
this scope might not lead to the best results, and the upstream
documentation recommends disabling NM in certain cases. Actually, for most
servers, disabling NM sounds like a good idea.

That said, if it's just this issue that's bothering you, it probably can be
solved. NM gets this name either from the global settings in
/etc/sysconfig/network, or from one of the interface files in
/etc/sysconfig/network-scripts/ifcfg-* (check all that don't contain an
explicit NM_CONTROLLED=no line).  I believe NM has other means of storing
configuration (gconf maybe? not sure), but the SL version is set up out of
the box to use these configuration files.

If you don't find traces of example.com in those files, check all files in
/etc containing this string:
grep -r example.com /etc

(note that this is an rfc2606-reserved name, so it can show up even in
default configs, where it won't affect you)

Also check whether your DNS server supplied this name: for all your IPs,
do "host <IP>" - maybe NM just picks the hostname based on what the DNS
server suggested.

DHCP is another possible source of the hostname (IIRC this doesn't usually
happen in Linux, but I might be wrong, or NM might be trying to be smart).
So maybe check the DHCP logs too, if you're using DHCP.

-- 

Vladimir


Re: FQDM = nala.example.com.example.com

2013-01-20 Thread Vladimir Mosgalin
Hi Torpey List!

 On 2013.01.20 at 11:02:38 -0600, Torpey List wrote next:

> /etc/sysconfig/network has only the following:
> NETWORKING=yes
> HOSTNAME=nala.
> DOMAIN=example.com
> 
> I see the trailing period at the end of the Hostname and have removed it.  I 
> have also added "NM_CONTROLLED=no" to ifcfg-eth0.
> 
> This certainly an improvement and fixes the issue that I was having.  
> However, I am not sure if there is another issue that will pop up because 
> there is no domainname.

Glad that it helped!

Output from "domainname" doesn't matter, this is mostly obsolete
command, and unless you're using NIS (and it's almost guaranteed that you
don't) you shouldn't care about it.

What you probably want is "dnsdomainname" command.


-- 

Vladimir


Re: ESATA port multiplier

2013-01-23 Thread Vladimir Mosgalin
Hi Larry Linder!

 On 2013.01.23 at 15:43:26 -0500, Larry Linder wrote next:

> Have a card which supports port multiplier - give you the ability to connect 
> 4 
> SATA 6 drives to system.
> Hardware appears to work but SL 5.7 does not appear to support ESATA.
> Plan is to use a TR4M-BNC for 4  2 T byte drives.   Looking at ports and do 
> not see ports?

From my experience, SATA port multipliers are so unreliable, and getting
them to work properly is so annoying, that if at all possible you really
should look at SAS expander technology instead. Something like a Dell H200
($120) + Intel RES2SV240 ($300) and you're set for 20 ports, at 24 Gbps
total speed (you might need extra cables and internal-to-external SAS
brackets for external hard drives, though). I know it's probably much more
expensive, but it's a very solid configuration - it works without problems
under any server OS. That is, if you need a few dozen drives.

If you need only a few drives, what about something like
http://www.pc-pitstop.com/sata_enclosures/scsat84xb.asp
or
http://www.pc-pitstop.com/sas_cables_enclosures/sas4bay.asp
or a similar solution, combined with a Dell 6Gbps SAS or similar relatively
cheap card? No expander at all in either case, up to 8 drives, a single SAS
cable for every 4 drives - a very simple and reliable setup (I can't vouch
for the quality of that non-hot-swap 8-drive enclosure, it feels too cheap
for what it offers, but there are many alternatives; this is just an
example).

-- 

Vladimir


Re: AMD edac_mce_amd kernel module question

2013-02-06 Thread Vladimir Mosgalin
Hi Yasha Karant!

 On 2013.02.05 at 21:22:21 -0800, Yasha Karant wrote next:

> SL 6x X86-64 on an AMD CPU.  During boot, the dac_mce_amd kernel
> module is indicated as not being loaded.  However, lsmod as well as
> a direct viewing of /proc/modules shows that the module is loaded
> and live. Evidence below.  Is this consistent?  Is the module
> actually active?

The module is normally loaded when the "edac" service is started (from the
edac-utils package). Do you have it enabled at the default runlevel?
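
For example, this will show at which runlevels the service starts:

# chkconfig --list edac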

You can check whether all the correct modules are loaded with the
edac-ctl --status
command.

-- 

Vladimir


Re: AMD edac_mce_amd kernel module question

2013-02-08 Thread Vladimir Mosgalin
Hi Yasha Karant!

 On 2013.02.07 at 22:16:53 -0800, Yasha Karant wrote next:

> For an answer to the question as to which CPU is present on this
> particular machine:
> 
> Processor name string: AMD Phenom(tm) II X4 840 Processor

[..skipped..]

This is a desktop CPU. Are you running it on a desktop board? EDAC is mostly
a server technology and usually won't work on a desktop board. If you are
100% sure that your motherboard declares ECC features (not just the ability
to install ECC RAM - sometimes manufacturers don't route the extra lines
required for ECC support, or don't enable the BIOS code for it) and you are
really using ECC RAM, then you should investigate this matter further. If
not, then your system simply doesn't support EDAC and it's not supposed to
work.

At the very least, you should be able to see ECC-related options in the
BIOS: ECC enable (for RAM, not cache!), ECC type and similar ones. They need
to be enabled for EDAC to work properly. But, like I said, many desktop
boards won't support this..

You can also check it quickly from Linux with the
# dmidecode -t memory|grep 'Error Correction Type'
command.

On non-EDAC supported systems you'll see Error Correction Type: None

On EDAC-supported you'll see Error Correction Type: Multi-bit ECC
or
Error Correction Type: Single-bit ECC


Seeing "None" sometimes means that it just wasn't enabled in BIOS, though, so
check there first if you get that output.


PS: my previous, similar reply to this list got filtered by the
"88.blocklist.zap" filter @messaging.microsoft.com; just how come a
Microsoft filter applies to the SL mailing list??

-- 

Vladimir


Re: AMD edac_mce_amd kernel module question

2013-02-08 Thread Vladimir Mosgalin
Hi Yasha Karant!

 On 2013.02.08 at 09:43:03 -0800, Yasha Karant wrote next:

> Thank you for that clarification -- I was aware that this was
> primarily server technology.  However, the boot diagnostic
> concerning "unsupported CPU" does not appear on my desktop
> workstation (same base hardware platform) -- and thus I assumed that
> something was amiss.  Running the command you specified reveals

Maybe it's produced when the edac service is started? Try disabling it, for
example..

It does indeed feel like the module is being loaded for some reason
(something forces it to load, though normally the edac script checks for
motherboard support before trying to load anything). And after it loads, it
checks and reports that it was loaded for no reason, because the system
doesn't support EDAC.

Maybe something manually forces it to load via /etc/modprobe.d trickery,
/etc/rc.local or some custom service? Try doing
# grep -r edac_mce_amd /etc/

I'd just disable the edac service (better yet, remove the edac-utils
package) and check /etc thoroughly for traces of something that tries to
load the module manually; the problem is likely there somewhere.

-- 

Vladimir


Re: /run/media/todd

2016-01-24 Thread Vladimir Mosgalin
Hi ToddAndMargo!

 On 2016.01.24 at 02:06:38 -0800, ToddAndMargo wrote next:

> Hi All,
>
> I am noticing that Sl 7.2 is mounting flash drives in
> /run/media/todd

There are several reasons for this - the new directory "todd" is created in
tmpfs, not on a real filesystem (in /media); it supports multiple users
using their own devices (one user locks the screen and goes away, another
one logs in on the same PC and inserts a flash drive - he gets his own
"/run/media/newuser" hierarchy for his devices and can't see what's on
yours); and so on.

>
> Any reason why I can't turn /media into a link
> to /run/media/todd
> ?

Well, technically you can do whatever you want, as long as you are sure
that no other users will be logging in on this PC, so it'll always be
"/run/media/todd". Note that since the latter directory is created when the
first device is inserted, the link will be broken until then (not a big
deal).

But if you replace the proper directory "/media" with a link, it will cause
errors when upgrading the filesystem package. You aren't meant to replace
system directories.

However, you can do something like
ln -sf /run/media/todd /media/todd
or
ln -sf /run/media/todd /Media

(as root). If that's convenient for you, why not?

--

Vladimir