Re: No DHCP on boot with a fresh install

2013-12-03 Thread ~Stack~
Doh! I didn't send this to the list. Sorry. Forwarding to the list this
time.

On 12/03/2013 09:16 PM, ~Stack~ wrote:
> On 12/03/2013 08:37 PM, Nico Kadel-Garcia wrote:
>> On Tue, Dec 3, 2013 at 6:36 PM, ~Stack~  wrote:
>>> On 12/01/2013 10:36 AM, olli hauer wrote:
>>>> Have you tried 'service network restart'? Does that bring up your nic?
>>>
>>> Well now. That is interesting. This is consistent even with a fresh
>>> kickstart install.
>>> $ service network restart
>>> Shutting down interface eth0:  [  OK  ]
>>> Shutting down loopback interface:  [  OK  ]
>>> Bringing up loopback interface:[  OK  ]
>>> Bringing up interface eth0:
>>> Determining IP information for eth0... failed; no link present.  Check
>>> cable?  [FAILED]
>>> $ ifup eth0
>>> Determining IP information for eth0... done.
>>>
>>> Errr...what? *scratches head* What exactly is 'ifup eth0' doing that
>>> 'service network restart' isn't?
>>
>> It's running significantly later. Even dumb switches, and supported
>> network drivers, can take time to recognize the available MAC
>> address. This is especially the case with DHCP, which requires
>> communications all the way upstream to whatever DHCP server is in
>> place.
> 
> The weird part for me is that this is after the box is booted and I have
> logged in. When I manually run 'service network restart' it fails in the
> same way _every_ time. Then as soon as I run 'ifup eth0' it works! I
> think I am going to experiment with this a bit.

Also, I have been tinkering with this a bit. In /etc/init.d/functions on
line ~536 (I have been editing a bit but I think that is right) there is
a line like this in the action function:
"$@" && success $"$STRING" || failure $"$STRING"

When I dumped out the variables, it is just running './ifup eth0', but it
is on this line that everything seems to choke. What I find odd though is:
* If I run it on the command line it works; running it as a service, it
fails. Thus I am wondering if it is an environment variable setting? That
is my next investigation (rough test sketched below).

* If I run 'ifup eth0' and get an IP, I can then run 'service network
restart' and get an IP! If I run 'ifdown eth0' or reboot, the service
kicks back the error about a missing cable (which is obviously wrong).

Very very odd.
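
The rough test I have in mind for the environment theory (the /tmp paths
are just placeholders): temporarily add "env | sort > /tmp/env.service"
inside the start) case of /etc/init.d/network, then:

service network restart
env | sort > /tmp/env.shell
diff /tmp/env.shell /tmp/env.service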

>> Try hard-coding the network configuration, temporarily, and see if it
>> comes up consistently. Then you'll know if it's the availability of
>> DHCP that's the issue.
> 
> Yup. Hard coding the IP seems to have consistent results. I haven't done
> extensive testing, but the few reboots and test seem to be good.
> Thanks!
> ~Stack~
> 
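
(For anyone following along: "hard coding" here just means a static
ifcfg, something like the sketch below, with placeholder values standing
in for my network.)

# /etc/sysconfig/network-scripts/ifcfg-eth0 -- static test config
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
IPADDR=192.168.1.50
NETMASK=255.255.255.0
GATEWAY=192.168.1.1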






Re: No DHCP on boot with a fresh install

2013-12-03 Thread ~Stack~
I think we are on to something!

On 12/03/2013 09:41 PM, ~Stack~ wrote:
> On 12/03/2013 09:16 PM, ~Stack~ wrote:
>> On 12/03/2013 08:37 PM, Nico Kadel-Garcia wrote:
>>> On Tue, Dec 3, 2013 at 6:36 PM, ~Stack~  wrote:
>>>> On 12/01/2013 10:36 AM, olli hauer wrote:
>>>>> Have you tried 'service network restart'? Does that bring up your nic?
>>>>
>>>> Well now. That is interesting. This is consistent even with a fresh
>>>> kickstart install.
>>>> $ service network restart
>>>> Shutting down interface eth0:  [  OK  ]
>>>> Shutting down loopback interface:  [  OK  ]
>>>> Bringing up loopback interface:[  OK  ]
>>>> Bringing up interface eth0:
>>>> Determining IP information for eth0... failed; no link present.  Check
>>>> cable?  [FAILED]
>>>> $ ifup eth0
>>>> Determining IP information for eth0... done.
>>>>
>>>> Errr...what? *scratches head* What exactly is 'ifup eth0' doing that
>>>> 'service network restart' isn't?
>>>
>>> It's running significantly later. Even dumb switches, and supported
>>> network drivers, can take time to recognize the available MAC
>>> address. This is especially the case with DHCP, which requires
>>> communications all the way upstream to whatever DHCP server is in
>>> place.
>>
>> The weird part for me is that this is after the box is booted and I have
>> logged in. When I manually run 'service network restart' it fails in the
>> same way _every_ time. Then as soon as I run 'ifup eth0' it works! I
>> think I am going to experiment with this a bit.
> 
> Also, I have been tinkering with this a bit. In /etc/init.d/functions on
> line ~536 (I have been editing a bit but I think that is right) there is
> a line like this in the action function:
> "$@" && success $"$STRING" || failure $"$STRING"
> 
> When I dumped out the variables, it is just running './ifup eth0', but it
> is on this line that everything seems to choke. What I find odd though is:
> * If I run it on the command line it works; running it as a service, it
> fails. Thus I am wondering if it is an environment variable setting? That
> is my next investigation.
> 
> * If I run 'ifup eth0' and get an IP, I can then run 'service network
> restart' and get an IP! If I run 'ifdown eth0' or reboot, the service
> kicks back the error about a missing cable (which is obviously wrong).
> 
> Very very odd.

I checked out the environment variables; that is not it. I tried a few
other things and nothing. I don't understand why '/sbin/ifup eth0' works
by hand but not from within the service command.

So I just started adding '/sbin/ifup eth0' statements into the start
command until it worked. After tweaking, to reliably get a DHCP IP (even
on reboot!) I just add *two* copies of the ifup command in the start
section. I put mine at the end, just before the ";;" of the "start)"
case section. One copy alone will not do it, so the command is
essentially called three times in a row.
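
The edit to /etc/init.d/network ends up looking roughly like this
(sketch; the existing start logic is elided):

start)
    ...
    # workaround: the link seems to need extra time, so kick eth0 again
    /sbin/ifup eth0
    /sbin/ifup eth0
    ;;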

So there *is* a timing issue going on and just hammering it will
eventually get it to work. Now to find the best place to put the timing
delay...
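
If the root cause really is link-negotiation time, a cleaner knob may be
the per-interface LINKDELAY setting that the EL6 initscripts support (my
reading of the sysconfig docs; untested here yet):

# /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
# wait up to 10 seconds for link before DHCP gives up
LINKDELAY=10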

Thanks!







Unexplained Kernel Panic / Hung Task

2013-12-04 Thread ~Stack~
Greetings,

I have a test system I use for testing deployments and when I am not
using it, it runs Boinc. It is a Scientific Linux 6.4 fully updated box.
Recently (last ~3 weeks) I have started getting the same kernel panic.
Sometimes it will be multiple times in a single day and other times it
will be days before the next one (it just had a 5 day uptime). But the
kernel panic looks pretty much the same. It is a complaint about a hung
task plus information about the ext4 file system. I have run the
smartmon tool against both drives (2 drives set up in a hardware RAID
mirror) and both drives check out fine. I ran a fsck against the /
partition and everything looked fine (on this test box there are only /
and swap partitions). I even took out a drive at a time and had the same
crashes (though this could be an indicator that both drives are bad). I
am wondering if my RAID card is going bad.
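
(For reference, the SMART checks on this cciss controller look roughly
like this; smartctl addresses the physical drives behind the RAID by
index:)

smartctl -a -d cciss,0 /dev/cciss/c0d0
smartctl -a -d cciss,1 /dev/cciss/c0d0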

When the crash happens I still have the SSH prompt; however, I can only
do basic things like navigating directories and sometimes reading files.
Writing to a file seems to hang, using tab-autocomplete will frequently
hang, and running most programs (even `init 6` or `top`) will hang.

It crashed again last night, and I am kind of stumped. I would greatly
appreciate others' thoughts and input on what the problem might be.

Thanks!
~Stack~

Dec  4 02:25:09 testbox kernel: INFO: task jbd2/cciss!c0d0:273 blocked for more than 120 seconds.
Dec  4 02:25:09 testbox kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec  4 02:25:09 testbox kernel: jbd2/cciss!c0 D  0  273  2 0x
Dec  4 02:25:09 testbox kernel: 8802142cfb30 0046 8802138b5800 1000
Dec  4 02:25:09 testbox kernel: 8802142cfaa0 81012c59 8802142cfae0 810a2431
Dec  4 02:25:09 testbox kernel: 880214157058 8802142cffd8 fb88 880214157058
Dec  4 02:25:09 testbox kernel: Call Trace:
Dec  4 02:25:09 testbox kernel: [] ? read_tsc+0x9/0x20
Dec  4 02:25:09 testbox kernel: [] ? ktime_get_ts+0xb1/0xf0
Dec  4 02:25:09 testbox kernel: [] ? ktime_get_ts+0xb1/0xf0
Dec  4 02:25:09 testbox kernel: [] ? sync_page+0x0/0x50
Dec  4 02:25:09 testbox kernel: [] io_schedule+0x73/0xc0
Dec  4 02:25:09 testbox kernel: [] sync_page+0x3d/0x50
Dec  4 02:25:09 testbox kernel: [] __wait_on_bit+0x5f/0x90
Dec  4 02:25:09 testbox kernel: [] wait_on_page_bit+0x73/0x80
Dec  4 02:25:09 testbox kernel: [] ? wake_bit_function+0x0/0x50
Dec  4 02:25:09 testbox kernel: [] ? pagevec_lookup_tag+0x25/0x40
Dec  4 02:25:09 testbox kernel: [] wait_on_page_writeback_range+0xfb/0x190
Dec  4 02:25:09 testbox kernel: [] ? submit_bio+0x8d/0x120
Dec  4 02:25:09 testbox kernel: [] filemap_fdatawait+0x2f/0x40
Dec  4 02:25:09 testbox kernel: [] jbd2_journal_commit_transaction+0x7e9/0x1500 [jbd2]
Dec  4 02:25:09 testbox kernel: [] ? __switch_to+0x13d/0x320
Dec  4 02:25:09 testbox kernel: [] ? try_to_del_timer_sync+0x7b/0xe0
Dec  4 02:25:09 testbox kernel: [] kjournald2+0xb8/0x220 [jbd2]
Dec  4 02:25:09 testbox kernel: [] ? autoremove_wake_function+0x0/0x40
Dec  4 02:25:09 testbox kernel: [] ? kjournald2+0x0/0x220 [jbd2]
Dec  4 02:25:09 testbox kernel: [] kthread+0x96/0xa0
Dec  4 02:25:09 testbox kernel: [] child_rip+0xa/0x20
Dec  4 02:25:09 testbox kernel: [] ? kthread+0x0/0xa0
Dec  4 02:25:09 testbox kernel: [] ? child_rip+0x0/0x20
Dec  4 02:25:09 testbox kernel: INFO: task master:1058 blocked for more than 120 seconds.
Dec  4 02:25:09 testbox kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec  4 02:25:09 testbox kernel: master D  0  1058  1 0x0080
Dec  4 02:25:09 testbox kernel: 88021535d948 0082 88021535d8d8 81065c75
Dec  4 02:25:09 testbox kernel: 880028216700 88021396b578 880214336ad8 880028216700
Dec  4 02:25:09 testbox kernel: 88021396baf8 88021535dfd8 fb88 88021396baf8
Dec  4 02:25:09 testbox kernel: Call Trace:
Dec  4 02:25:09 testbox kernel: [] ? enqueue_entity+0x125/0x410
Dec  4 02:25:09 testbox kernel: [] ? ktime_get_ts+0xb1/0xf0
Dec  4 02:25:09 testbox kernel: [] ? sync_buffer+0x0/0x50
Dec  4 02:25:09 testbox kernel: [] io_schedule+0x73/0xc0
Dec  4 02:25:09 testbox kernel: [] sync_buffer+0x40/0x50
Dec  4 02:25:09 testbox kernel: [] __wait_on_bit_lock+0x5a/0xc0
Dec  4 02:25:09 testbox kernel: [] ? sync_buffer+0x0/0x50
Dec  4 02:25:09 testbox kernel: [] out_of_line_wait_on_bit_lock+0x78/0x90
Dec  4 02:25:09 testbox kernel: [] ? wake_bit_function+0x0/0x50
Dec  4 02:25:09 testbox kernel: [] ? __find_get_block+0xa9/0x200
Dec  4 02:25:09 testbox kernel: [] __lock_buffer+0x36/0x40
Dec  4 02:25:09 testbox kernel: [] do_get_write_access+0x493/0x520 [jbd2]
Dec  4 02:25:09 testbox kernel: [] jbd2_journal_get_write_access+0x31/0x50 [jbd2]
Dec  4 02:25:09 testbox kernel: [] __ext4_journal_get_write_access+0x38/0x80 [ext4]
Dec 

Re: No DHCP on boot with a fresh install

2013-12-04 Thread ~Stack~
On 12/04/2013 08:19 AM, Mark Stodola wrote:
> I would suggest trying a NIC that uses a different driver or getting a
> newer driver from ELrepo (kmod-tg3).  Broadcom has been known to have
> issues in my experience.
Hrm. I don't seem to find any package with tg3 in it at all. Even
looking on the EPEL website[1] I don't see kmod-tg3. Is it under a
different name perhaps?
[1]
http://dl.fedoraproject.org/pub/epel/6/x86_64/repoview/letter_k.group.html

> Personally, I try to stick with Intel.
Me too. But these are old cast-aways whose hardware is still good, hence
why they are test boxes. :-)

Thanks for the input!

~Stack~






Re: Unexplained Kernel Panic / Hung Task

2013-12-04 Thread ~Stack~
On 12/04/2013 07:56 AM, Paul Robert Marino wrote:
> Yup that's a hardware problem.

Drats. I was afraid of that.

> It may be a bad firmware on the controller I would check the firmware
> version first and see if there is a patch. I've seen this kind of thing
> with Dell OEMed RAID controllers enough over the years that that's
> almost always the first thing I try.

Will do. I will report back what I find.

Thanks!
~Stack~






Re: No DHCP on boot with a fresh install

2013-12-04 Thread ~Stack~
On 12/04/2013 05:13 PM, Alan Bartlett wrote:
> On 4 December 2013 23:07, ~Stack~  wrote:
>> On 12/04/2013 08:19 AM, Mark Stodola wrote:
>>> I would suggest trying a NIC that uses a different driver or getting a
>>> newer driver from ELrepo (kmod-tg3).  Broadcom has been known to have
>>> issues in my experience.
>> Hrm. I don't seem to find any package with tg3 in it at all. Even
>> looking on the EPEL website[1] I don't see kmod-tg3. Is it under a
>> different name perhaps?
>> [1]
>> http://dl.fedoraproject.org/pub/epel/6/x86_64/repoview/letter_k.group.html
>>
>>> Personally, I try to stick with Intel.
>> Me too. But these are old cast-aways whose hardware is still good,
>> hence why they are test boxes. :-)
>>
>> Thanks for the input!
>>
>> ~Stack~
> 
> The ELRepo Project [1] is not Fedora's Extra Packages for Enterprise
> Linux [2].
> 
> Alan.
> 
> [1] http://elrepo.org
> [2] https://fedoraproject.org/wiki/EPEL
> 

Haha! Right on. It might help if I read things correctly. :-D

Thanks for pointing that out. I will go give that a try now.
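
(For the archives, roughly what I am about to run; the elrepo-release
version that is current when you read this may differ:)

rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
yum install kmod-tg3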

~Stack~






Re: No DHCP on boot with a fresh install

2013-12-04 Thread ~Stack~
On 12/04/2013 05:39 PM, ~Stack~ wrote:
> On 12/04/2013 05:13 PM, Alan Bartlett wrote:
>> On 4 December 2013 23:07, ~Stack~  wrote:
>>> On 12/04/2013 08:19 AM, Mark Stodola wrote:
>>>> I would suggest trying a NIC that uses a different driver or getting a
>>>> newer driver from ELrepo (kmod-tg3).  Broadcom has been known to have
>>>> issues in my experience.
>>> Hrm. I don't seem to find any package with tg3 in it at all. Even
>>> looking on the EPEL website[1] I don't see kmod-tg3. Is it under a
>>> different name perhaps?
>>> [1]
>>> http://dl.fedoraproject.org/pub/epel/6/x86_64/repoview/letter_k.group.html
>>>
>>>> Personally, I try to stick with Intel.
>>> Me too. But these are old cast-aways whose hardware is still good,
>>> hence why they are test boxes. :-)
>>>
>>> Thanks for the input!
>>>
>>> ~Stack~
>>
>> The ELRepo Project [1] is not Fedora's Extra Packages for Enterprise
>> Linux [2].
>>
>> Alan.
>>
>> [1] http://elrepo.org
>> [2] https://fedoraproject.org/wiki/EPEL
>>
> 
> Haha! Right on. It might help if I read things correctly. :-D
> 
> Thanks for pointing that out. I will go give that a try now.

Sadly, the updated driver didn't work for me.

Thanks anyway!

~Stack~






Re: Unexplained Kernel Panic / Hung Task

2013-12-04 Thread ~Stack~
On 12/04/2013 05:51 PM, Paul Robert Marino wrote:
> Well, I tend to discount the driver idea because of another problem he
> has involving multiple, what I think are identical, machines. Also, any
> problems I've ever had with the cciss driver were usually firmware
> related, and an update or roll back usually corrects them.
> Besides, based on what I've heard this is low budget equipment and
> ProLiants aren't cheap. If I had to guess we are talking about Dells.

You are right in that I am experiencing two different issues, and the
vast majority of my test lab is older cast-away parts. The difference is
that the two issues are on very different systems.

The DHCP problem is on a bunch of similar generic Dells. This particular
problem is on an HP ProLiant DL360 G4 whose twin (same hardware specs,
and thanks to Puppet dang-near identical in terms of software) so far
has not displayed this problem.

Because the twin isn't having this problem, and because the problem only
started ~3 weeks ago, I thought for the last few weeks that it was a
disk drive problem.

I am looking up the firmware versions for this box now. I am not hopeful
that I will find a newer firmware for this old of a system though.
Still, totally worth the try! :-)

Thanks!
~Stack~





Re: Unexplained Kernel Panic / Hung Task

2013-12-07 Thread ~Stack~
On 12/04/2013 07:03 PM, Bluejay Adametz wrote:
>> I am looking up the firmware versions for this box now. I am not hopeful
>> that I will find a newer firmware for this old of a system though.
>> Still, totally worth the try! :-)
> 
> I maintain racks of DL380 G4s, and have found recent firmware at
> http://h17007.www1.hp.com/us/en/enterprise/servers/products/service_pack/spp/index.aspx
> 
> Download the DVD, boot it, and see what you got.

Thanks for the link! So I discovered that my DVD drive for the DL360 no
longer works. I called a friend who has quite the graveyard and we dug
up another one. :-)

Got that installed just a short while ago. When I boot from the CD and
select the auto install, it takes a while, kicks back a few errors, then
reboots. I booted into the manual installation instead. It drops me to a
command line prompt with an error about not being able to launch X. I am
guessing it doesn't like this old ATI chipset.

I poked around and didn't see any command line documentation. Any idea
on how to run the driver installation from the command line off this disc?

Thanks!
~Stack~






Assistance in tracking a kernel error

2013-12-18 Thread ~Stack~
Greetings,

I have had my work laptop for well over a year now running SL6 and
everything has been fantastic. Never once have I had a kernel
panic/problem. That is, until the latest updates, in which the kernel
(+ its extras) was the only thing updated. Rolling back to the previous
kernel works perfectly, so I could just remove the new kernel and hope
for the best in the future, but if it is feasible I would like to track
down the problem so that someone who knows what they are doing might be
able to fix it (and who knows, maybe it is already fixed upstream?).

My laptop is a Dell Precision M4600. 80% of the time it is plugged into my
Dell PRO2X laptop dock which has my dual Display Port Dell U2410
monitors (betcha can't tell what vendor my employer frequently buys
from... ;-)

My working kernel is: 2.6.32-358.23.2.el6.x86_64 (and anything before).

The problem kernel is: 2.6.32-431.1.2.el6.x86_64

There are two problems I am seeing.
1) VMware Workstation crashes constantly on the new kernel, and when it
does, full kernel panic. It happens frequently, but it is not something
where I can guarantee a crash if I do $actionA. I have a lot of VMs I do
dev work in and so far it has been a different VM every time it has
crashed. I have never had this problem with Workstation on this laptop
before.

2) This is the biggest one that annoys me. I get horrible color
distortions on the secondary display. This happens every time. At BIOS
both monitors display perfect color in a twin view showing the same
image on both screens. Grub only shows on the primary screen (normal for
this setup). When Grub loads the nouveau display driver, the second
screen turns on and the display looks like a rainbow puked glitter. Not
even exaggerating. Crazy jumbled colors in warped patterns that move
across the screen. The primary monitor is just fine. At first I thought
it was the monitor, but swapping the monitors shows they are fine. Other
testing showed that the hardware is just fine too, and the final proof
was being able to boot into the previous kernel and not having an issue.

I have never run the NVidia drivers on this laptop. It has always been
Nouveau. (I run NVidia drivers on other systems, but this laptop has
previously always been rock solid with SL+Nouveau, so I never bothered.)

As stated, I can reboot into another kernel and everything is fine. In
fact, I already set the old kernel to be default at boot. But if
possible, I would like to track this down.

What I am asking is:

1) Is it worth the time to track down? Anyone know if there have been a
bunch of Nouveau issues/patches pushed upstream that might be worth
investigating first?

2) I can put the kernel into all kinds of debugging and capture GBs of
data, but I would prefer to capture only what is useful. Any
recommendations on what output I should capture? (My rough plan is
sketched after this list.)

3) Any suggestions on narrowing the problem down to something other than
"It's the kernel"?

Thanks! I appreciate the help!





Re: CentOS + RHEL join forces...

2014-01-07 Thread ~Stack~
On 01/07/2014 08:27 PM, Steven Haigh wrote:
> On 8/01/2014 1:08 PM, Steven Miano wrote:
>> So how does that impact Scientific Linux?
> 
> In a nutshell? It doesn't.

I don't think it will hurt Scientific at all and from what I have been
reading it might make things easier and better. I (as a non-dev user, so
take this opinion accordingly) see two things that might help:
1) the hidden process of how CentOS rebuilds the SRPMs is being opened
up, which should bring CentOS even closer to their binary-equivalent goal.
2) the variant ( http://centos.org/variants/ ) might actually make
things easier if Scientific just wanted to start with a core base and
build from there. I am sure there are going to be a dozen different
spin-offs of CentOS for this reason alone.

There are still a TON of details yet to be given, so we will see what is
actually delivered, but this is great news for the community as a whole.
Here is hoping that it makes things easier and better! Cheers!


~Stack~





Re: advice on auto version upgrade with sl6x.repo

2014-03-19 Thread ~Stack~
On 03/18/2014 10:48 AM, Ken Teh wrote:
> I've had 2 successful upgrades from 6.4 to 6.5 with the sl6x.repo
> enabled.  In the past, I've never done upgrades, preferring to re-install.
> 
> I'd like to know what folks are doing with respect to enabling the
> sl6x.repo.  Is it "just enable it!, it's ready from primetime" or are
> you still disabling it, doing a test drive on a test machine before
> reenabling across all machines?

Greetings,

On my personal builds and on the very few desktop systems I manage, I
leave it alone. None of those are very critical and I haven't had a huge
problem. Plus it provides a nice testing ground for possible problems.

On my servers, I control the repo server. I rsync a local mirror from
upstream that the dev boxes all update from. When I am ready and after
the updates seem to all be working, then I rsync from that local mirror
to the production mirror and let all the boxes auto update from it. All
6.X links are just symlinks to 6 which is what I rsync into (leaving
sl6x out completely on production).
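
The promotion step itself is nothing fancy; roughly this, with the host,
module, and paths as placeholders for our real setup:

# dev mirror pulls from an upstream rsync mirror
rsync -avSH --delete rsync://mirror.example.org/scientific/6/ /srv/mirror/dev/sl/6/
# once the dev boxes look healthy, promote dev -> production
rsync -avSH --delete /srv/mirror/dev/sl/6/ /srv/mirror/prod/sl/6/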

So when it was time to update 6.4, well 6.4 was a symlink to 6 which
just got a 6.5 rsync update and a new 6.5 symlink was created. All of
the boxes saw new updates and updated on their own. Except kernel
updates; I have had too many auto-upgrades fail on the kernel, so we
still have a guy who schedules downtime for large chunks of machines at
a time, manually does parallel update/reboots into the new kernel, and
fixes anything that goes wrong. We also don't upgrade the kernel unless
it is a security update.

If you have a few boxes, a local repo probably doesn't help and will
probably just be ~50-60GB of used space and bandwidth. However, when you
are managing multiple hundreds of boxes from Desktops to a variety of
different server types, the local repo saves a lot of time and
bandwidth. It also gives you the flexibility to allow your servers to
only see the packages you want them to see.

Also, if you aren't using something like puppet to manage your repo
files on that many boxes, I found it very nifty to set a local DNS CNAME
to redirect the default baseurl to my local server. :-)

Hope that helps.
~Stack~





fastbugs: yum-autoupdate package broken?

2015-01-07 Thread ~Stack~
Greetings,
Is anyone else having this issue? A bunch of my servers sent me emails
this morning about the yum-autoupdate package. Should I just be patient
and wait for a while or is there an actual issue?

Thanks.
~Stack~

# yum clean all && yum update
Loaded plugins: security
Cleaning repos: epel sl sl-fastbugs sl-security
Cleaning up Everything
Loaded plugins: security
Setting up Update Process
epel/metalink              |  14 kB  00:00
epel                       | 4.4 kB  00:00
epel/primary_db            | 6.4 MB  00:01
sl                         | 3.6 kB  00:00
sl/primary_db              | 4.3 MB  00:03
sl-fastbugs                | 3.0 kB  00:00
sl-fastbugs/primary_db     | 256 kB  00:00
sl-security                | 2.9 kB  00:00
sl-security/primary_db     | 1.2 MB  00:00
Resolving Dependencies
--> Running transaction check
---> Package yum-autoupdate.noarch 5:2-6.3 will be updated
---> Package yum-autoupdate.noarch 5:2-6.6 will be an update
--> Finished Dependency Resolution

Dependencies Resolved

=====================================================================
 Package           Arch      Version    Repository        Size
=====================================================================
Updating:
 yum-autoupdate    noarch    5:2-6.6    sl-fastbugs       27 k

Transaction Summary
=====================================================================
Upgrade       1 Package(s)

Total download size: 27 k
Is this ok [y/N]: y
Downloading Packages:
http://ftp.scientificlinux.org/linux/scientific/6.6/x86_64/updates/fastbugs/yum-autoupdate-2-6.6.noarch.rpm:
[Errno 14] PYCURL ERROR 22 - "The requested URL returned error: 404 Not
Found"
Trying other mirror.
http://ftp1.scientificlinux.org/linux/scientific/6.6/x86_64/updates/fastbugs/yum-autoupdate-2-6.6.noarch.rpm:
[Errno 14] PYCURL ERROR 22 - "The requested URL returned error: 404 Not
Found"
Trying other mirror.
http://ftp2.scientificlinux.org/linux/scientific/6.6/x86_64/updates/fastbugs/yum-autoupdate-2-6.6.noarch.rpm:
[Errno 14] PYCURL ERROR 22 - "The requested URL returned error: 404 Not
Found"
Trying other mirror.
ftp://ftp.scientificlinux.org/linux/scientific/6.6/x86_64/updates/fastbugs/yum-autoupdate-2-6.6.noarch.rpm:
[Errno 14] PYCURL ERROR 19 - "Given file does not exist"
Trying other mirror.


Error Downloading Packages:
  5:yum-autoupdate-2-6.6.noarch: failure:
yum-autoupdate-2-6.6.noarch.rpm from sl-fastbugs: [Errno 256] No more
mirrors to try.





What library is needed to have X11 borders?

2015-02-26 Thread ~Stack~
Greetings,

I have an SL6.6 server I built for a specific application. 99% of this
app is command line; however, one specific aspect of the app requires a
GUI to use.

When I built this server I wanted it as lean as possible so I built it
with a minimal server install. Because the app acts funny when X
forwarded over SSH, I tossed on a vnc server and a wrapper script to
auto-start the app. It works!

The issue is that the app pops up in a tiny window and you can't move it
or scale it or anything else I tried. :-(

Well, I know why. X11 doesn't have a proper window manager running, and
the window manager is what is usually responsible for window borders;
X11 is just the framework. The question is, how do I get borders without
installing a full desktop?

I tried installing the group package "x11" but that didn't help.
I tried matchbox, but that didn't work.
I tried installing the group package for the "Ice Desktop", but that
didn't help.

I tried installing as little of Gnome as I could without installing a
bunch of junk like pulseaudio and NetworkManager (Both are great on my
laptop...but not on this server), and that just added a ton of clutter
and still didn't work.

I spent an hour bumming around on the Internet installing every package
I could find recommended and I still can't get borders around the
application.

I know it isn't my script or my app. If I run the script/app on a system
with a full gnome install, it works. I just don't want a full Gnome (or
any major desktop) install on this server.

Can anyone help me figure out which package I need?

Thanks!
~Stack~





Re: What library is needed to have X11 borders?

2015-02-27 Thread ~Stack~
Greetings,

On 02/26/2015 07:01 PM, Jim McCarthy wrote:
> For very minimal yet classic X11 functionality, I would recommend that
> you investigate installing "mwm" (the Motif window manager).   Other
> alternative are "twm" (Tom's window manager ?), and closer-to-gnome but
> much more minimal (in terms of footprint size on the system) is "icewm"
> (ICE window manager) that it appears you've already attempted using.

I couldn't get mwm or icewm to give me borders. However, twm did! So
thank you very much for that.

The user, however, thought that it looked ugly. I ended up installing
just gnome-panel and 20 of its dependencies. It isn't a full Gnome
install, and when the vnc session isn't active the box doesn't have much
else going on. I was worried there would be a bunch of background stuff,
and there might be with a full Gnome install, but with just gnome-panel
it is still pretty light.
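
(In case anyone wants to reproduce, the whole experiment boiled down to
roughly:)

yum install twm          # gave me borders, but the user called it ugly
yum install gnome-panel  # pulled ~20 deps; what I settled on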

Thanks for the help!

Chris Stackpole





Re: What library is needed to have X11 borders?

2015-02-27 Thread ~Stack~
On 02/27/2015 08:32 AM, Brett Viren wrote:
> ~Stack~  writes:
> 
>> Because the app acts funny when X forwarded over SSH
> 
> What does "funny" mean?

I have plenty of network bandwidth, but the app will occasionally block
up. Sometimes it won't redraw after you select something from the
drop-down menu, rendering everything in that space unreadable. And
occasionally if another window gets placed on top of it you just get a
black window entirely. Ask the vendor and they just whine about how hard
it is to support so many variations of Linux. Point out that it does the
same thing on RHEL and they say it is an unsupported method of use and
don't forward X.

Just another poorly done proprietary app that the users "can't live
without!". :-/

Thanks!









Re: What library is needed to have X11 borders?

2015-02-27 Thread ~Stack~
Greetings,

On 02/27/2015 12:18 AM, Francesco M. Taurino wrote:
> Install x2go server, icewm or openbox and a simple xterminal. With this
> "stack"   your gui apps will be usable even on slow networks.

Wow! Thanks for that! I haven't heard of x2go before. Reading their
site, I am quite excited to give it a try.

I used to be a big fan of NX Machine 3, but within the last couple of
years it has gotten /crazy/ flaky on newer distros (it still works well
with EL6 and Debian 6 Squeeze, but anything newer seems to be a
problem). NX3 also /really/ doesn't like any desktops other than Gnome
and KDE. I managed to get it working on LXDE once, but that wasn't easy
and the latest LXDE update breaks NX Machine. I was a part of the early
beta for NX Machine 4 and provided a ton of feedback, but in my opinion
it is still not ready for real use, with too many things broken. I
support a bunch of users who use the paid-for NX Machine 4 (clients on
Linux/Mac/Windows) for a project and it seems like every update breaks
/something/. I can almost guarantee that when I get a call from that
group, the problem is NX Machine 4 related. Every single time.

I have tried out a few projects for replacing NX Machine and never got
far. I will definitely give x2go a try.
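
(From skimming their docs, the EL6 server side looks like just a couple
of packages out of EPEL; roughly:)

yum install x2goserver x2goserver-xsession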

Thanks!








Re: What library is needed to have X11 borders?

2015-02-27 Thread ~Stack~
On 02/27/2015 08:59 PM, Nico Kadel-Garcia wrote:
> On Fri, Feb 27, 2015 at 9:05 PM, ~Stack~  wrote:
>> Greetings,
>>
>> On 02/27/2015 12:18 AM, Francesco M. Taurino wrote:
>>> Install x2go server, icewm or openbox and a simple xterminal. With this
>>> "stack"   your gui apps will be usable even on slow networks.
>>
>> Wow! Thanks for that! I haven't heard of x2go before. Reading their
>> site, I am quite excited to give it a try.
>>
>> I used to be a big fan of NX Machine 3, but within the last couple of
>> years it has gotten /crazy/ flaky on newer distros (it still works well
>> with EL6 and Debian 6 Squeeze, but anything newer seems to be a
>> problem). NX3 also /really/ doesn't like any desktops other than Gnome
>> and KDE.
[snip]
> 
> It likes twm just fine. And it also likes vtwm
> (https://github.com/nkadel/vtwm-5.5.x-srpm)
> 

Yeah, I _just_ found their compatible Desktops page (I have been reading
some of their other wiki pages for a while).

http://wiki.x2go.org/doku.php/doc:de-compat

The short of it is that many DE's are supported without problem and
several of the DE's may require a bit of tweaking to get working, but at
least they are trying to support them! :-)

That and my fav DE right now (LXDE) is supported! [Too bad I can never
get the libraries to compile right under SL6...it would probably
guarantee I would run SL6 until 2020-09-30! :-D ]

This x2go is really good news for me. I have a few use cases I want to
test it out on. Especially that one project; I am kinda excited to rip
out NX Machine 4 and replace it with x2go. I will have to run it through
my dev environment first and make sure that the clients work well so it
will probably be a few months, but getting rid of that PITA would make
my life a lot easier. :-)





Re: flash alternatives?

2015-05-05 Thread ~Stack~
On 05/05/2015 03:29 PM, Jim Campbell wrote:
> On Tue, May 5, 2015, at 01:02 AM, ToddAndMargo wrote:
[snip]
>> Any other ways around Adobe's outdated flash plugin?

> If you want to watch Flash videos, though, probably the safest bet is to
> install Google Chrome, which embeds a "Pepper" flash-compatible plugin [0].

There is an internal website at my job that /requires/ flash and that I
need to access. I tried all kinds of things from installing from random
repos to rolling my own build for all kinds of alternatives. Chrome was
the only thing that worked for me. Then Google ditched RHEL6 support and
I had people angry at me for running an "outdated" browser. That is when
I stumbled on this:

http://chrome.richardlloyd.org.uk/

I know it has gotten both praise and hate on this list, but it works
really well for me and it keeps my Chrome up to date. I loathe Adobe, I
prefer Firefox, and I can count all the sites I visit w/ Chrome on one
hand...but for those few sites I gotta have flash. :-/
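
(Usage is about what you would expect; the script name is taken from his
site and may change:)

wget http://chrome.richardlloyd.org.uk/install_chrome.sh
chmod u+x install_chrome.sh
sudo ./install_chrome.sh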

Hopefully it works just as well for you.






SL 7.1 on a ECS Liva

2015-07-08 Thread ~Stack~
Greetings,

For Christmas I was given an ECS Liva Mini PC. Six months later I still
can't get it to do anything useful. Between the weird drivers that are
required (all only in certain versions of the kernel) and that stupid
UEFI junk, I can't get anything but the latest version of Ubuntu to
install on it, and I don't have much of a use for 15.04 on this device.
So I have been doomed to piss away a few hours a weekend, get
frustrated, not touch it for a few weeks, then repeat. :-/

I keep hoping one of these days I can find something interesting to do
with this thing. Which brings me to an online chat conversation
yesterday where a guy claims to have crammed CentOS 7.1 onto one!!
Awesome! I haven't messed with SL 7.1 much, and I really prefer SL over
CentOS, so this could be something useful!! Except now I am 2 hours in
and banging my head on the table again...

I first tried PXE booting off a thumb drive for a network install (it is
how I do all my installs). SL 7.1 sees the network card, but won't do
anything with it. Just says the network is unavailable. OK fine. Try the
DVD version on a thumbdrive and Anaconda freaks out and can't find the
USB installation device anymore. Ugh! Dug out a USB DVD drive and burned
the 7.1 DVD. Boots! Awesome...no disks found. Huh?

Dig around in the journalctl (with lots of Internet searching to figure
that out):
mmc0 unrecognized EXT_CSD revision 7

Look online and sure enough, it can't see the internal hard disk. Poke
around a bit more and find several CentOS posts about this exact thing!
mmc-core and mmc-block aren't being loaded. Check lsmod, sure enough not
loaded. Modprobe those two drivers and it seems to be happy!! Alright!

Except the installer refuses to rescan and find the disk!! What?? Why?

OK. Try every trick I can think of to load the drivers in the DVD menu
before boot...nothing. It will not load the drivers until I manually run
modprobe. By that time, the Anaconda installer won't see the disk.

Grrr.

I am getting a little frustrated again, but if I can squeeze SL onto
this device I can actually use it! I have several systems running SL6
right now that I can replace with this thing...but only if I can get SL7
to install.

Does anyone have any clue how I might be able to either force the
drivers to be loaded at the DVD boot menu or get anaconda to rescan for
the disks once I load the module?

Any help or guidance would be /greatly/ appreciated.

Thanks!
~S~





Re: SL 7.1 on a ECS Liva

2015-07-09 Thread ~Stack~
Greetings,

On 07/09/2015 08:43 AM, Jonathan Barber wrote:
> On 8 July 2015 at 23:30, ~Stack~ wrote:
> Does anyone have any clue how I might be able to either force the
> drivers to be loaded at the DVD boot menu or get anaconda to rescan for
> the disks once I load the module?
> 
> Any help or guidance would be /greatly/ appreciated.
> 
> 
> Are you using kickstart? If so, can you put the modprobe in a %pre
> section of the kickstart file:
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Installation_Guide/sect-kickstart-syntax.html#sect-kickstart-preinstall
> 
> you can then put the kickstart file on the DVD:
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Installation_Guide/sect-kickstart-howto.html#sect-kickstart-making-available


I was having issues with the PXE boot originally with the network, but I
didn't think about putting it _on_ the DVD!

I will give it a try.
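
Something like this is what I have in mind (untested sketch; module
names as the kernel lists them):

%pre
modprobe mmc_core
modprobe mmc_block
%end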

Thanks!





Re: SL 7.1 on a ECS Liva

2015-07-10 Thread ~Stack~
Greetings,

Just an update on my saga. I have had no luck getting either SL7 or
CentOS 7 to install on the Liva. It seems that I can add an mSATA drive
to the device and install Linux to that (someone who was installing a
different Linux distro and hit similar issues with the eMMC drivers
claims that is how he got it to work). But I don't really want to sink
any money into this device.

So back on the shelf it goes. I will try again in a few months and see
if things have gotten better for it. :-)

~S~






SL7.1 and eSATA

2015-08-15 Thread ~Stack~
Greetings,

My personal home web server that has been running SL6 for years finally
kicked the bucket. So I replaced it with a Zotac ID41. I don't need a
lot of CPU power, and consuming fewer watts is a good thing. :-)

I decided to take the plunge from 6 to 7 (and it has been quite the ride
learning new things) and nearly everything is up and running! Hooray!

This morning I decided to tackle the last piece of this project and
attach an external eSATA drive. Nothing. And I do mean nothing. Plugging
it in results in no messages from anything. Nothing in /var/log, nothing
in dmesg, nothing with journalctl...just nothing.

Hrm.

Is the drive bad? Nope. Plugs in and works on another system.

Is the port bad? Nope. Reboot off of a Lubuntu live USB and I can see
and interact with the drive. Also, I can see it if I reboot into the
BIOS. It shows as a drive.

Is it something about the EL OS? Nope. Test an older SL6 install and the
eSATA drive shows up and mounts!

Maybe it is this enclosure? Nope. USB enclosures are found and all tests
with eSATA are a bust.

Ok. So it has to be something about my install of SL7. There has to be
something I have done to goof things up. I did do a minimal install.
Maybe I missed a package?

I have checked online and tried everything from ahci to acpi. All of
that is installed and there. Everything about SATA and SL/CentOS/Red Hat
7 I could find, I have tried. I even started reading articles about EL6
and EL5, with no success.

I am stuck. Any help would be greatly appreciated.
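
(The only other trick I know of, a manual rescan of the SCSI/SATA hosts
after plugging the drive in, is next on my list; something like:)

for h in /sys/class/scsi_host/host*/scan; do echo "- - -" > "$h"; done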

lsmod and lspci are below.

Thanks!



# lsmod
Module                  Size  Used by
ip6t_rpfilter          12546  1
ip6t_REJECT            12939  2
ipt_REJECT             12541  2
xt_conntrack           12760  7
ebtable_nat            12807  0
ebtable_broute         12731  0
bridge                115385  1 ebtable_broute
stp                    12976  1 bridge
llc                    14552  2 stp,bridge
ebtable_filter         12827  0
ebtables               30913  3 ebtable_broute,ebtable_nat,ebtable_filter
ip6table_nat           12864  1
nf_conntrack_ipv6      18738  5
nf_defrag_ipv6         34651  1 nf_conntrack_ipv6
nf_nat_ipv6            14131  1 ip6table_nat
ip6table_mangle        12700  1
ip6table_security      12710  1
ip6table_raw           12683  1
ip6table_filter        12815  1
ip6_tables             27025  5 ip6table_filter,ip6table_mangle,ip6table_security,ip6table_nat,ip6table_raw
iptable_nat            12875  1
nf_conntrack_ipv4      14862  4
nf_defrag_ipv4         12729  1 nf_conntrack_ipv4
nf_nat_ipv4            14115  1 iptable_nat
nf_nat                 26146  2 nf_nat_ipv4,nf_nat_ipv6
nf_conntrack          105702  6 nf_nat,nf_nat_ipv4,nf_nat_ipv6,xt_conntrack,nf_conntrack_ipv4,nf_conntrack_ipv6
iptable_mangle         12695  1
iptable_security       12705  1
iptable_raw            12678  1
iptable_filter         12810  1
ip_tables              27239  5 iptable_security,iptable_filter,iptable_mangle,iptable_nat,iptable_raw
iTCO_wdt               13480  0
iTCO_vendor_support    13718  1 iTCO_wdt
snd_hda_codec_hdmi     51925  4
arc4                   12608  2
ath9k                 136983  0
ath9k_common           25638  1 ath9k
ath9k_hw              450617  2 ath9k_common,ath9k
coretemp               13435  0
ath                    29006  3 ath9k_common,ath9k,ath9k_hw
snd_hda_codec_realtek  74707  1
snd_hda_codec_generic  68937  1 snd_hda_codec_realtek
serio_raw              13462  0
mac80211              569655  1 ath9k
snd_hda_intel          30519  0
pcspkr                 12718  0
lpc_ich                21073  0
snd_hda_controller     31921  1 snd_hda_intel
ata_generic            12910  0
snd_hda_codec         139320  5 snd_hda_codec_realtek,snd_hda_codec_hdmi,snd_hda_codec_generic,snd_hda_intel,snd_hda_controller
mfd_core               13435  1 lpc_ich
snd_hwdep              17698  1 snd_hda_codec
pata_acpi              13038  0
snd_seq                63074  0
i2c_i801               18135  0
snd_seq_device         14497  1 snd_seq
snd_pcm               103996  4 snd_hda_codec_hdmi,snd_hda_codec,snd_hda_intel,snd_hda_controller
cfg80211              514740  4 ath,ath9k_common,ath9k,mac80211
snd_timer              29562  2 snd_pcm,snd_seq
rfkill                 26536  1 cfg80211
snd                    75127  10 snd_hda_codec_realtek,snd_hwdep,snd_timer,snd_hda_codec_hdmi,snd_pcm,snd_seq,snd_hda_codec_generic,snd_hda_codec,snd_hda_intel,snd_seq_device
soundcore              15047  2 snd,snd_hda_codec
shpchp                 37032  0
ext4                  562430  2
mbcache                14958  1 ext4
jbd2                  102940  1 ext4
sd_mod                 45499  5
crc_t10dif             12714  1 sd_mod
crct10dif_common       12595  1 crc_t10dif
nouveau              1227034  1
video                  19263  1 nouveau
mxm_wmi                13021  1 nouveau
wmi                    19070  2 mxm_wmi,nouveau
i2c_algo_bit           13413  1 nouveau
drm_kms_helper

Weird user permissions help

2015-08-28 Thread ~Stack~
Greetings,

I have a weird issue with permissions that is really getting to me on SL
6.6. I did a quick name replacement to simplify but most of the other
details are just copy-paste.


Here is the structure of my folders.

/data
drwxr-xr-x. 163 root root  12K Jan 28 16:52 data

/data/share
drwxrws---. 3 root share  12K Jan 28 16:55 share

/data/share/share1
/data/share/share2
drwxrws---. 4 root share1  12K Mar 4 8:20 share1
drwxrws---. 4 root share2  12K Apr 16 12:05 share2

And here are the groups:
share:x:690:user1,user2,user3
share1:x:1220:user1
share2:x:1342:user2

So, one would expect that all three users should be able to access
/data/share. However, only user1 should be able to access share1 and
only user2 should be able to access share2. Right?

Well, let's take a further step. ACL's are not enabled.

# file: share/
# owner: root
# group: share
user::rwx
group::rws
other::---

The other folders match. Nothing special; no ACL's in play.

So again. User3 should not be able to access the other two folders, right?

Except he can access share1...not share2, but he can access share1.
WTF?? Why can he access share1? Why share1 but not share2?? I don't know.

I have been poring over this for an hour. I have asked 3 coworkers. I
can't figure it out. User3 isn't a part of any special group or anything.

In fact, I added user4 with NO other groups and verified that he can't
access /data/share. Then I added him to the share group. Now he has
access to share1, but not share2. Any user that is a part of share, has
access to share1 but not share2. Only users that are both in the share
AND share2 groups can see share2. That is precisely what it should be
for share1!
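
(For anyone who wants to double-check my testing method: 'sudo -u' picks
up the target user's current group set without needing a fresh login, so
I have been checking with something like:)

id user3
sudo -u user3 ls -ld /data/share/share1 /data/share/share2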

Well...maybe I have a weird SELinux rule?? I can't find anything
flagging it.

I took a look at strace while I ran ls on the directory from the user's
perspective. As far as it is concerned, the user has full access to
share1 and gets permission errors on share2.

Fine. Let's take away permissions for everyone.

# chown root:root -R share1
# chmod g-s share1
# chmod a-rwx share1
# ls -ld share1
d-. 4 root root  12K Mar 4 8:20 share1


Let's see them get into that!!

user3 /> cd /data/share/share1
user3 /data/share/share1>

DAH!!! HOW!!?!?!?!???

Maybe a cached credential?? Completely log out the user and back in.
Nope. Still has full access to a folder that NO ONE should be able to
look into!

OK. Fine. Maybe a rename of the folder? Nope.

Delete the folder and create a new folder with the original file
permissions! Still the same result...

Share2 is working perfectly the way I expect it to. Share1 I am stumped on.

Anyone have a suggestion for how I can trace down the /how/ behind a
user having permissions? Something has to be overriding the file system
permissions, but I am stumped as to what. I have never seen such
goofiness in permissions before when ACLs weren't involved, and all of
my internet-search-fu has only returned the opposite problem (a user
should have access but doesn't).

Any suggestions would be greatly appreciated.

Thanks!
~Stack~





Re: Weird user permissions help

2015-08-31 Thread ~Stack~
On 08/31/2015 08:24 PM, Brett Viren wrote:
> ~Stack~  writes:
> 
>> I have been pouring over this for an hour. I have asked 3 coworkers. I
>> can't figure it out. User3 isn't a part of any special group or anything.
> 
> By chance are you falling fowl to user info caching?  Adding a user to a
> group won't affect any sessions that were already started before the
> change.
> 
> Having each user run the "groups" command will tell the story.  Or just
> have them log out/in again.

Greetings!

I have checked caching and it isn't the issue.

I mentioned that the path is /data/share/share{123}. If I "rsync -avHlP"
the directories to /data/temp/, it works perfectly the way it should.
/data is a single partition volume on the same file system.

In fact, I created /data/testing/ and verified that all of the
permissions are working properly. I then 'mv /data/testing
/data/share/.' and those _exact_ permissions that were working, stop.
The exact same problem as the original folders.

Something, somehow, is making the permissions in /data/share more
liberal and I am at a complete loss as to what it is. I am convinced it
is something on the file system but that is about as far as I have got.

SELinux isn't flagging anything. ACL's are not enabled. And every Linux
system I mount this partition on shows the exact same odd behavior once
I copy over the users/groups. It has to be something on the file system,
but I haven't ever seen anything like this that so blatantly refuses to
adhere to the Linux permissions.


Thanks for the suggestion though! I do appreciate it.

~Stack~





Re: Weird user permissions help

2015-09-01 Thread ~Stack~
On 08/31/2015 10:20 PM, Brandon Vincent wrote:
> On Mon, Aug 31, 2015 at 7:42 PM, ~Stack~  wrote:
>> Something, somehow, is making the permissions in /data/share more
>> liberal and I am at a complete loss as to what it is. I am convinced it
>> is something on the file system but that is about as far as I have got.
> 
> Could you post the mount line from /etc/mtab for this filesystem?

Greetings,

Sure.

panfs://panasas/data /data panfs
rw,nodev,noauto,panauto,callback-address-allow=192.168.0.31 0 0

The panauto is a Panasas option for "don't mount this until network is
up" and the callback-address-allow is to keep the Panasas kernel module
from trying to mount over any other network devices. The rest is pretty
standard.

Thanks!






Re: Weird user permissions help

2015-09-04 Thread ~Stack~
Greetings,

Just a follow up.

I never found a way to track the "why" behind a user getting permissions
when they were not supposed to. However, it appears that Panasas enabled
a new form of ACL's in their 5.5 update which we went to a few months
back. The ACL's are apparently always on and enabled outside of the
Linux permission scope. This is how users are getting permissions.

Thanks for all of the off-list suggestions people sent me. I do
appreciate all of the help.

~Stack~





Dual booting while encrypted

2015-09-09 Thread ~Stack~
Greetings,

I have a SL6 laptop that is partitioned like this:
Physical partition 1 - /boot - 5GB
Physical partition 2 - LVM
LVM - / - 40GB (encrypted)
LVM - swap - 5GB (encrypted)
LVM - /home - 300GB (encrypted)

Works great.

However, I would like to play videos when I travel, and SL6 struggles at
this pretty badly. Especially when connecting to strange hotel
DisplayPort / HDMI / etc. TVs (usually the display portion works, but
getting sound is a pain).

What I have been doing is booting Lubuntu 15.04 off of a thumb drive,
configuring the TV/sound, mounting the encrypted home, and playing
videos. But I would like to just move that to my SSD and leave the thumb
drive at home.

For testing purposes, I swapped hard drives (that way I don't lose
data). I reinstalled SL6 with the following:
Physical partition 1 - /boot - 5GB
Physical partition 2 - LVM
LVM - / - 40GB (encrypted)
LVM - swap - 5GB (encrypted)
LVM - /home - 20GB (encrypted)

Pretty much the exact same. Then I installed Lubuntu 15.04 so the drive
now looks like this:
Physical partition 1 - /boot - 5GB
Physical partition 2 - LVM
LVM - / - 40GB (encrypted)
LVM - swap - 5GB (encrypted)
LVM - /home - 300GB (encrypted)
LVM - / - lubuntu 40GB (encrypted)
LVM - swap - lubuntu 5GB (encrypted)

I set up two swaps because Lubuntu /really/ didn't want to share. Fine.
Whatever. Reboot after install and despite it saying it found SL6,
Lubuntu is the only boot option. I can't seem to get SL6 to boot again
(even breaking in via grub wasn't working).

OK. Fine. I will install SL6 again. It doesn't even mention that it
found Lubuntu...it just tosses itself right into /boot. On start up, it
sees all of the kernels, but thinks they are all SL kernels. I can't
boot into Lubuntu any more and if I select any kernel that isn't the SL
kernel, it freaks out (I expected as much but I was really curious).

OK, so neither OS will play nicely with each other. Let's try SL7.
Again, doesn't matter which order I install in, both claim they have to
control /boot for encrypted disks and stomp on each other. At least with
SL7, it sees and recognizes that Lubuntu is there...it just doesn't care...

I am pretty confident that if I removed the encryption piece, both
distros would play well with each other. That just isn't an option for
me though.

I have tried several things with various Virtual Machines (KVM, and
VMware) but the pass through never works properly for video/sound to
Display Port/HDMI.

Has anyone conquered this? Any suggestions? I have done about 5 installs
of both OS's today and I am really close to just going back to the USB
method of booting Lubuntu even if it is ridiculously slow.

Thanks!
~Stack~





Re: SL7 on a tablet

2016-07-15 Thread ~Stack~
Greetings,

On 07/14/2016 07:55 PM, Yasha Karant wrote:
> Are there any tablets in the Apple iPad price range that run SL 7?
> If not SL 7, other distros (say, Ubuntu "enterprise")?

I can say from personal experience that the Microsoft Surface Pro 2 runs
Ubuntu 16.04 really well. The Surface Pro 3 also works but is a bit
rougher (I had some issues with the keyboard that took a while to fix;
it wasn't working out of the box. Also, the screen resolution is so high
that it made touch input difficult to be precise). It wasn't a hard
install at all. Just disable UEFI, select traditional BIOS, and you are
good to go.

I also know that Johnny Hughes over at the CentOS team has done a ton of
work cramming CentOS 7 on a bunch of ARM devices. He may have a lead for
you on a tablet.


> Would such tablets also run VirtualBox or VMWare Player to allow the
> use of MS Win as a guest environment?

The Surface Pro certainly would. It has quite a bit of power. Not sure
about the low-end ARM processors though.

> How much RAM and mass storage should be procured?

I hate to sound like a marketer for the Surface Pro...However, the one I
had was 8GB of RAM w/ a 256GB SSD. Should be plenty I would suspect.

> The unit would need both USB ports as well as 802.11 support
> supported drivers under Linux for Network Manager, her WLAN
> application of choice).

You get wifi and one USB port on the Surface Pro. There is a dock which
worked perfectly under Ubuntu 16.04 for the Pro2, but I had issues with
it crashing on the Pro3. I know there have been improvements made to
Surface Pro support since I tested, but I no longer have the devices, so
my testing is all from a single point in time back in March-April with
the 16.04 pre- and initial release.

Hope that helps. Good luck. And I would love to hear if you find one!

~Stack~





Re: sl6.8 libcgroup

2016-07-27 Thread ~Stack~
On 07/27/2016 03:53 AM, Stijn De Weirdt wrote:
> hi all,
> 
> 
> we have updated an sl67 node to sl68 (but not yet updated the kernel),
> and this updates
> libcgroup-0.40.rc1-17.el6_7.x86_64
> to
> libcgroup-0.40.rc1-18.el6_8.x86_64
> 
> however, it now seems that cgconfigparser even fails to validate the
> distributed /etc/cgconfig.conf
> 
>> [root@test2802 ~]# /sbin/cgconfigparser -l /etc/cgconfig.conf
>> error at line number 17 at {:syntax error
>> Error: failed to parse file /etc/cgconfig.conf
>> /sbin/cgconfigparser; error loading /etc/cgconfig.conf: Have multiple paths 
>> for the same namespace
> 
> 
> the /etc/cgconfig.conf is the same in both rpms
> 
> anyone seeing this? or knows how to fix?

Greetings,

I discovered the exact same thing. I fully updated to 6.8 and rebooted
into the new kernel. I haven't filed a bug report against it yet as I
didn't have time yesterday to really dig into it. My "workaround" was to
"yum downgrade libcgroup" on all my hosts until I could figure it out.

~Stack~






Re: ELRepo Kernel Repository

2016-07-29 Thread ~Stack~
Greetings,

On 07/29/2016 10:42 AM, Yasha Karant wrote:
> There being so many repositories with "sub-repositories" that I do not
> keep track of many; however, yumex reveals many of these. For ELRepo,
> there is an EL7 community kernel repository.  Is anyone using this for
> production machines  (presumably, yes)?

On a limited number of production machines yes. I guess I should really
classify these as "production but we expect them to break" machines.
Why? Because we are crazy enough to run CephFS in "production" on SL6 (I
know. We are a bit nutz on this project but it works really well!! o_O)
which requires the newer kernels.


> If so, how do these differ from
> the "stock" SL (CentOS, RHEL, ...) kernels?

The kernel names are actually different: kernel vs kernel-ml. Obviously,
you get a lot of the perks of newer kernels, but you get occasional
weird issues because some "stock" kernel modules/packages/etc. expect
the old kernel. So far these have all been incredibly trivial/minor for us.

Also, if you keep both, you will probably run into issues where one
kernel (usually the stock kernel) updates something with the headers or
glibc or something important and just smashes everything you have,
forcing itself to be the primary kernel. On reboot, you find that it is
all borked up and you have to go back to an older kernel or recovery
mode to figure out what got smashed in the update.

Personally, I have had it with that. I strip out the old kernel
completely and just run on the ELRepo kernel/debug/header/etc. packages.
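
Roughly, that swap looks like this (sketch; only do the remove after
verifying that the kernel-ml kernel boots!):

yum --enablerepo=elrepo-kernel install kernel-ml kernel-ml-devel kernel-ml-headers
yum remove kernel kernel-devel kernel-headers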

> If one uses an ELRepo
> kernel, and one then does a minor release upgrade of SL (assuming yum
> upgrade or something similar actually works, not requiring smashing the
> system partitions), will the ELRepo kernel "parts" play nicely with such
> a SL upgrade, or are there conflicts resulting in either no-boot (system
> failure) or instabilities?

As long as it isn't updating the kernel (previously mentioned issues), I
have had no issues going from 6.6 to 6.7 to 6.8.

> For "new" laptops/tablets that do not have
> the necessary drivers in stock SL, does the ELRepo kernel repository
> provide additional current drivers (as might be present in Ubuntu or
> even fully enthusiast, not enterprise production, Linux distros)?

Yup! At least in my very limited testing. I only care about a few very
specific parts of the hardware driver updates + CephFS. I really haven't
tested much further than that.

Good luck!
~S~




signature.asc
Description: OpenPGP digital signature


Re: Python 2.7 OS requirements

2016-07-31 Thread ~Stack~
On 07/30/2016 06:36 PM, P. Larry Nelson wrote:
> Hi all,

Greetings!

[snip]
> I have been asked by one of our Professors that one of his grad students
> apparently needs Python 2.7.x installed on our cluster (optimally in
> /usr/local, which is an NFS mounted dir everywhere).
> 
> In my brief Googling, I have not found OS requirements for 2.7.x, but
> have inferred that it probably needs SL7.x.
> 
> Can anyone confirm that?
> Or has anyone installed Python 2.7.x (and which .x?) on an SL6.8 system
> without replacing 2.6.x?

I have the exact same problem. Don't try to replace 2.6.6. It broke ALL
KINDS of things including RPM when I tried it...did not go well at all
(but a good learning experience! :-).

Here are three solutions:

1) Software Collections: https://www.softwarecollections.org/en/
Upside: a certain favored upstream vendor backs a lot of these packages
(just no support). Downside: they run in a special environment which can
be tricky depending on the application. It basically runs in a subshell,
and that confuses my users at least (see the sketch below this list).

2) Inline Upstream Stable (aka IUS): https://ius.io/
Upside: backed by Rackspace (again, no support), easy RPMs, they do not
stomp on Upstream Vendor packages (different names), and they are kept
pretty up-to-date. Downside: they don't have a ton of packages, but they
do have your Python 2.7.

3) Anaconda Python: https://www.continuum.io/downloads
Upside: it runs its own environment which plays nicely with cluster
modules (for the most part). You can update that environment inside
itself. Need a new version of scipy? 'conda update scipy'. Done. Need
Intel Accelerated Python? Or Python 3.5 too? Easy.

Downside: When it breaks or when it gets pissy, it can be a pain to
figure out. The documentation isn't great. It is open source and you can
get official support for it, but it is PRICEY!!!
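
To give a feel for the subshell thing in option 1, the SCL workflow is
roughly (a sketch, assuming the python27 collection):

$ yum install python27
$ scl enable python27 bash   # subshell with 2.7 first in PATH
$ python --version
Python 2.7.x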

We chose Open Source Anaconda when we needed a new python. For the most
part it does its job really well and we are very happy with it. More
importantly, the users are happy with it. We use IUS elsewhere but we
needed more packages than they provide and it was a pain managing all
those packages manually. Same with SCL: we use it elsewhere too, but for
anything the users have to interact with, we found they get frustrated
because of the weird subshell environment it uses.

Good luck!
~Stack~




signature.asc
Description: OpenPGP digital signature


Re: CVE-2016-5195: mad cow disease

2016-10-23 Thread ~Stack~
On 10/22/2016 02:52 PM, Denice wrote:
> As well, the importance of this vulnerability hinges on user access;
> in SANS newsbites yesterday, one of the editors made this remark
> about this kernel vulnerablity (branded by the person(s) who raised
> the issue: "Dirty Cow"):
> 
>This is a privilege escalation vulnerability that was introduced in
> Linux
>about 11 years ago. An exploit has been used in some attacks to take
>advantage of this vulnerability, but the exploit has not been made
>public yet. Systems based on RedHat ES 5 and 6, which are vulnerable,
>appear to be not susceptible to the exploit as this particular exploit
>requires write access to /proc/self/mem. Given that this exploit
>requires user access, and the actual exploit is only in limited
>distribution (but this may change soon), "branding" this exploit is
>hyping a minor and common vulnerability and only serves to distract
>administrators from more important tasks. Deal with patches for this
>vulnerability like you would deal with any other kernel patch.
> 
> https://www.sans.org/newsletters/newsbites/xviii/84

Well said. Thank you for that link.

> 
> cheers, etc.

Cheers!




signature.asc
Description: OpenPGP digital signature


Need help Debugging boot

2017-02-16 Thread ~Stack~
Greetings,

I'm going to keep this "short" because I've just had 6hrs of things I've
tried that didn't work. It would take too long to list them all. :-)

I have a bunch of new SuperMicro servers. Installed 7.3 on them. Reboot,
and they hang at:
"Probing EDD (edd=off to disable)...ok"

And by hangs, I mean there is NO response out of anything. No Caps Lock
light on keyboard, nothing.

However, if I let it sit long enough it will boot (once one sat for an
hour before it continued on, most of the time it is closer to 30-40
minutes).

If I boot into rescue kernel, it instantly boots. Every time. This is so
puzzling to me.

If I wait and let it boot, then check 'systemd-analyze' it says my boot
time is sub 6 seconds (fancy new SSDs too!) and blame tells me that the
longest section to boot was 3 seconds on the networking. Well, that is
worthless because it just SAT THERE FOR THIRTY MINUTES!!! It obviously
starts recording time after the hang.

No matter the amount of logging I do or what debug mode I put it in, it
prints "Probing EDD (edd=off to disable)...ok" then hangs, and
EVERYTHING after that has nothing to do whatsoever with the reason for
the hang.

I have disabled just about everything I can think of from various online
suggestions. I removed the quiet flag (duh) and I've added flags to turn
off intel_pstate, power states, and ACPI, plus nomodeset and loglevel=7
and blah blah blah blah. Seriously, my string of crap tacked on to the
grub prompt is getting rather absurd. (I boot into recovery, modify
/etc/default/grub and run grub2-mkconfig to set the grub prompt; I
checked and this is working to set the grub parameters.)
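
For the curious, the kernel line ended up looking something like this (a
sketch reconstructed from the flags above, not the exact string):

GRUB_CMDLINE_LINUX="nomodeset acpi=off intel_pstate=disable loglevel=7"
$ grub2-mkconfig -o /boot/grub2/grub.cfg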

Still same result. Recovery kernel boots, the other kernel hangs.

Fine. I will install a kernel from ELRepo! I'll get a fancy new 4.x kernel!

Yeah. That doesn't do squat either.

Want to know the thing most infuriating? A single box in the whole
batch shows this problem only once every 10 boots or so. I can't tell that
there is a stinking thing different. BIOS is exactly the same, configs,
install, packages, everything. *shrug*

Are there *any* suggestions at all as to how I can figure out what it is
hanging on? Is there a list of things after EDD that I can just start
disabling till I get a different result?

Thoughts?

Thanks!
~Stack~



signature.asc
Description: OpenPGP digital signature


Re: Need help Debugging boot

2017-02-17 Thread ~Stack~
Greetings,
On 02/17/2017 04:37 AM, Bluejay Adametz wrote:
>> I have a bunch of new SuperMicro servers. Installed 7.3 on it. Reboot
>> and it hangs at:
>> "Probing EDD (edd=off to disable)...ok"
> 
> Did you get rid of the "quiet" option on the kernel line? If not, do
> so, so you're sure to get all the kernel messages. It might not be
> hanging where you think it is.
> 

I did remove 'quiet' as well as 'rhgb'.

Thanks!
~Stack~



signature.asc
Description: OpenPGP digital signature


Re: Need help Debugging boot

2017-02-17 Thread ~Stack~
Greetings,

Thank you all very much for your help. I wasn't in a position to reply
earlier, but I was watching the updates.

The EDD, I think, was a red herring. Turning it off just meant it locked
the screen without anything being printed at all. I installed to a
different hard drive and got the same results. Even when I disabled the
onboard SATA in the BIOS and installed to an external disk, same thing.

I will spare all of the gruesome details of all the many things I tried
that didn't work. Here is what finally did work.

Install 7.2 then update everything but microcode_ctl.

Done. :-)
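
In yum terms, that boils down to something like (a sketch):

$ yum update --exclude=microcode_ctl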

Such a simple statement for such a CRAZY few days of complex debugging.
The short version.

7.3 is fairly new, but many of the servers I have that are nearly the
exact same hardware config have been running for a while and only
recently updated to 7.3. So why not try 7.2? Works no problem. Well that
is odd. Update to 7.3, same problem.

Huh, none of my other boxes have this problem. I wonder what could be
the difference? Same parts. The only things that are different are minor
revision updates (e.g., the BIOS is 2 versions newer, etc.). Then I noticed that
my old boxes are "Intel Xeon E5-2630v3" and the new boxes are "Intel
Xeon E5-2630v4". Well that shouldn't matter...unless there is something
in the microcode or the linux-firmware...

So I started investigating and narrowed it down to the microcode. I'm
throwing this back up to my vendor to chase down with Red Hat, as it is
reproducible on Red Hat Enterprise Linux 7.2/7.3 and I gotta get these
things online (one benefit of paying for support is being able to say
"It's broke! Fix it for me!" :-D ). Besides, looking at the kernel hex
code tracing things out today has given me a headache. :-D

I will find out next week if there is any strange fallout from this,
but for today they seem to be working just fine. I am hoping they
continue to do so until a patch/kernel update rolls down the line.

Thanks again! I really do appreciate the help. The ideas got me thinking
on the right track and helped eliminate variables.

~Stack~



signature.asc
Description: OpenPGP digital signature


Re: Need help Debugging boot

2017-02-18 Thread ~Stack~
On 02/17/2017 09:11 PM, Konstantin Olchanski wrote:
> Hi, there, we told you many things, some hopefully helpful, but you never
> told us what machine you have, and we did ask (twice).
> 
> Please, tell us what motherboard you have (from dmidecode) and
> what CPU you have (from /proc/cpuinfo).

Sorry. You're right. I forgot to tell the motherboard and I typo'd the
processor! Ugh. It was a long day. :-)

The processor is Intel Xeon E5-2637v4 (not the 2630) and it is in a
SuperMicro X10SRW-F board.

~Stack~






signature.asc
Description: OpenPGP digital signature


Re: Strange network device name chages on reboot of SL7 kvm guests

2017-02-26 Thread ~Stack~
On 02/25/2017 10:18 PM, Bill Maidment wrote:
> Hi again
Greetings!

> I have recently rebooted KVM guests with two virtual NICs (e.g. /dev/ens4 and 
> /dev/eth0) only to find that the device name of the eth0 changes to eth1 and 
> so the ifcfg-eth0 doesn't match.
> So I fix the ifcfg file and restart network - all OK.
> On a later reboot eth1 changes back to eth0.
> What is going on? Anyone else observe this phenomenon?

Yes. It is part of the goofy new naming structure. It's nice when it
works, but it seems to not-work more often than it does-work. Then it is
infuriating.


Anyway...

Here's the short version.
en: ethernet prefix
o: onboard
s: slot
p: physical location of connector.

If the device is "unknown" then it gets "eth". And of course the numbers
increment for each new device.
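
So, concretely:

eno1   = onboard port 1
ens4   = (PCIe) slot 4
enp2s0 = physical location, bus 2 slot 0
eth0   = the OS couldn't classify the device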

So what your device names tell me is that you have one card in slot 4
and a second card that the OS can't figure out where it goes.

There are two ways of "fixing" this.
1)
*If* you have your HWADDR= assigned in your
/etc/sysconfig/network-scripts/ifcfg-eth0 then it *should* always get
that interface regardless of what the OS detects it to be on boot. I
have found this isn't always the case with some specialty cards that
take FOREVER to initialize. I tend to throw in the UUID as well and that
seems to resolve the problems for those cards.
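
Something like this, in other words (a sketch; the MAC and UUID below are
made up, pull the real ones from 'ip link' and 'nmcli con show'):

DEVICE=eth0
ONBOOT=yes
HWADDR=52:54:00:aa:bb:cc
UUID=21d47e65-8523-1a06-af22-6f5ac024acbe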

2)
Now, here is the odd part because this next problem is also on my big
KVM host. None of the above worked. I never found a good answer to why
and I am wondering if it is related to KVM...but at this point my sample
size is two so probably not something to make a solid educated guess on...

Anyway..

Read this:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Networking_Guide/ch-Consistent_Network_Device_Naming.html

There are some good tips throughout chapter 8 on how to tweak and/or
disable udev from scanning devices. I had to rewrite the udev rules on
my KVM box right after 7.1 released because *every* *single* reboot
broke my networking. However, after I adjusted the udev rules based on
the RH documentation I haven't had a problem with reboots since.

I _really_ hope that #1 fixes your issues as it is by far the easiest to
do and manage. If not, the udev rules should do the trick.

One last thought. In that documentation there is something called
"biosdevname"; I'm pretty sure that wasn't in the docs when I was having
a problem with 7.1 as I don't recall seeing it before, but it looks
interesting. You might just want to go through that chapter and give
those suggestions a go.

Hope that helps!

~Stack~



signature.asc
Description: OpenPGP digital signature


Re: What does this SELinux command do?

2017-02-27 Thread ~Stack~
On 02/27/2017 07:29 PM, ToddAndMargo wrote:
>  setsebool -P mmap_low_allowed 1
> 
It allows applications to memory-map the low portion of the address
space. That range is normally off-limits because mapping page zero makes
kernel NULL-pointer-dereference bugs far easier to exploit.

Personally, I can't think of a good reason to allow that. Maybe someone
else can?

I know I would need a good reason to enable it.

~Stack~



signature.asc
Description: OpenPGP digital signature


Re: Server going to uncommanded sleep or suspend

2017-03-09 Thread ~Stack~
On 03/09/2017 12:14 PM, Konstantin Olchanski wrote:
> Hi, there. I wonder if anybody else is seeing the same problem with el7:
> 
> The symptoms are: no ping, dead video, dead keyboard. After power cycle,
> syslog shows that the system has attempted to go into sleep or suspend
> or whatever they call it.
> 
> This is very strange, usually a system will go into suspend mode when you
> close the laptop lid, but these are not laptops. They are normal desktop
> machines (and at least in one case, there is no local user to blame for
> pressing the "sleep" button).
> 
> So what's in the syslog:
> - normal activity (systemd spam)
> - network manager reports "sleep requested"
> - some kind of nm_dispatcher activity
> - systemd reaches sleep and suspend targets.
> - continues spewing sundry messages, never recovers (never goes into actual 
> sleep).
> 
> The machine is effectively dead after network manager put the network 
> interfaces to sleep.
> 
> The best google-advice I see it to disable the systemd sleep and suspend 
> targets:
> systemctl mask sleep.target suspend.target hibernate.target 
> hybrid-sleep.target systemd-suspend.service systemd-hybrid-sleep.service
> (now waiting for this machine to go to sleep).

I had something very similar.

First, make sure on your EL7 boxes that your systemd journal is being
saved between reboots (peeves me off that this isn't default, but that
is another matter). Systemd doesn't log everything to syslog and
journalctl can help capture information.

$ mkdir /var/log/journal
$ chown root:systemd-journal /var/log/journal
$ chmod 2755 /var/log/journal

Then I had to increase systemd's log level to spit out everything.

In /etc/default/grub edit GRUB_CMDLINE_LINUX to add
"systemd.log_target=kmsg systemd.log_level=debug"

Don't forget to update grub!
$ grub2-mkconfig -o /boot/grub2/grub.cfg

That should kick up all of the messages from systemd on next reboot. I
can't remember if I turned the log level to debug in the
/etc/systemd/system.conf...Meh, can't hurt. :-)
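
For reference, the system.conf equivalents are just these two lines in
/etc/systemd/system.conf (a sketch):

LogLevel=debug
LogTarget=kmsg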

For me, once I did that the actual trigger stood out. I have had a
handful of systems that trigger systemd's sleep mode. Reported every
single one of them to the systemd-devs and they insist that it's "bad
hardware" every time. Sure, that's why the hardware works perfectly for
years on non-systemd OS's. *rolls eyes*

Anyway, once you get used to debugging and disabling all of the stupid
stuff systemd does, it's not /too/ bad. Just gotta get good at debugging
its weirdness. :-)

~Stack~






signature.asc
Description: OpenPGP digital signature


Compress kdumps over scp

2017-03-09 Thread ~Stack~
Greetings,

There has been some discussion in my group about setting something up to
capture kdump files in a single location. Never messed with it before,
but it seems pretty straight forward as /etc/kdump.conf is very well
documented.

My question: What is the best way to compress for sending to remote scp
in a kdump?

Longer version and configuration:

Set up a couple of SL7 virtual machines to start testing.

Default configuration:
$ sed -e '/^#/d' /etc/kdump.conf
path /var/crash
core_collector makedumpfile -l --message-level 1 -d 31

My changes:
$ sed -e '/^#/d' /etc/kdump.conf
ssh root@10.10.10.12
sshkey /root/.ssh/kdump_id_rsa
path /data/kdump_crash
core_collector scp
$ kdumpctl propagate
$ systemctl restart kdump

Whoo! It works...Huh...My crash seems to take a long time...Oh, look at
that, my vmcore is 4GB in size (the VM's memory size). o_O

I really don't think my other crashes are nearly that large...Reset to
default config and force a crash. 39MB. Ah, the '-l' option of
makedumpfile uses LZO compression. Makes sense.

Looking at the kdump.conf man page, I should be able to use makedumpfile
to compress then send to scp.

$ sed -e '/^#/d' /etc/kdump.conf
ssh root@10.10.10.12
sshkey /root/.ssh/kdump_id_rsa
path /data/kdump_crash
core_collector makedumpfile -F -l --message-level 1 -d 31

Hrm. It will only work if I specify '-F', which gives me a 'vmcore.flat'
instead of a normal 'vmcore', which my crash tools don't seem to like. It
looks like I can convert a vmcore.flat back into something the debug
tools can use, but I don't like that.
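
For anyone who wants the conversion anyway, it is roughly (per the
makedumpfile man page, if I am reading it right):

$ makedumpfile -R ./vmcore < vmcore.flat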

I've tinkered with a few other options unsuccessfully. My options look
like: uncompressed (which will be fun with my 512GB nodes!) or
vmcore.flat files.

Any thoughts on the best way to compress the vmcore file while still
sending over scp?

Thanks!
~Stack~



signature.asc
Description: OpenPGP digital signature


nmcli question

2017-04-08 Thread ~Stack~
Greetings,
I will spare the details, but suffice to say I am in a position where
after many years knowing the 'network' commands I've been tasked to
learn nmcli much better than I do now. This is all on SL7.

I've been reading documents, building and tearing down networks for
hours, and trying to put into practice what I'm learning (still a long
way to go; haven't touched the infiniband parts yet). Something keeps
coming up in documentation that bothers me.

Here is an example of one of *many* documents:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Networking_Guide/sec-Network_Configuration_Using_sysconfig_Files.html

They mention taking down a network with:
$ nmcli dev disconnect interface-name

but bringing it up with:
$ nmcli con up interface-name

That is so infuriating to me. Why use different sub-commands? Especially
when there exist subcommands in the same context? Why not do this?
$ nmcli dev disconnect interface-name
$ nmcli dev connect interface-name

Or even this?
$ nmcli con down interface-name
$ nmcli con up interface-name

As far as I can tell, they are both doing the same thing. In fact the
only difference I can tell comes from the nmcli help documentation where
it says the difference is in the auto-activating:

$ nmcli d disconnect --help

The command disconnects the device and prevents it from auto-activating
further connections without user/manual intervention.

$ nmcli connection down --help

Deactivate a connection from a device (without preventing the device
from further auto-activation). 


If it was just one document, then whatever. But I've seen that in
several of the RH documents as well as on several blogs/webpages. What
am I missing? What is the difference and why should I prefer to take
down a connection with "device disconnect" but bring it up with
"connection up"?

Thank you!
~Stack~



signature.asc
Description: OpenPGP digital signature


Re: nmcli question

2017-04-09 Thread ~Stack~
On 04/08/2017 09:59 PM, Nico Kadel-Garcia wrote:
> On Sat, Apr 8, 2017 at 5:36 PM, ~Stack~  wrote:

> Because they're trying to weld NetworkManager's graphical interface,
> on top of a poorly integrated command line interface, on top of the
> actual underlying bash scripts that do the real work. It's Fugly Out
> There(tm). NetworkManager for RHEL 7, and thus for CentOS 7, even
> introduced the concept of parsing multiple individual ifcfg-* files to
> manage the same actual device, such as multiple files to manage
> eth0 in ifcfg-eth0 and ifcfg-eth0-slave. The result is madness.

I haven't looked at the code, so I can't comment directly. However, I
have it from both the RH documentation and a RH developer I know from a
convention that the majority of the scripts/programs we know and love
from yesteryear are not actually the tools we think they are.
Networking, ifup, ifdown, ifconfig, etc. are all rewritten under the
hood to check in with NetworkManager first before they do anything.

It's part of why I need to learn these new tools better.

>
> In case it's unclear I am *not* happy with NetworkManager for servers
> or stable environments. Laptops that have to wander from environment
> to environment need multiple VPN's, yeah, OK, I can see having a more
> complex tool. But for a  VM? Or a server?

I 100% agree. When NetworkMangler (as I called it for a long time) first
came on the scene, I ripped it out of everything. Then I realized that
it actually did a darn good job of handling my wireless connections and
made it less painful than the manual methods I had been using.
Discovering profiles for my laptop for all the different networks I was
on was awesome! But I still ripped it off of all my servers.


> I'd like to introduce you to one of my favorite settings for
> /etc/sysconfig/network-scripts/ifcfg-* files, or even for
> /etc/sysconfig/network, or if you feel really paranoid, /etc/profile.
> 
>   NM_CONTROLLED=no
> 
> Turn *off* NetworkManager manipulation for anything that doesn't need it.

Yup! That is on just about every single one of my servers. At least
until I understand NetworkManager much much better than I do now. Which,
I am trying. :-)

>> If it was just one document, then whatever. But I've seen that in
>> several of the RH documents as well as on several blogs/webpages. What
>> am I missing? What is the difference and why should I prefer to take
>> down a connection with "device disconnect" but bring it up with
>> "connection up"?
> 
> See above. NetworkManager is a complex management layer on top of the
> actual "ifconfig" tools managed by the various
> /etc/sysconfig/network-scripts, and for many operations it simply adds
> instability and confusion.

Not any more is it on top of those tools. NetworkManager is core and
they check in with it.

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Networking_Guide/sec-NetworkManager_and_the_Network_Scripts.html


Thanks for the feedback!
~Stack~



signature.asc
Description: OpenPGP digital signature


Re: nmcli question

2017-04-09 Thread ~Stack~
On 04/08/2017 10:48 PM, Steven Haigh wrote:
> On 09/04/17 12:59, Nico Kadel-Garcia wrote:
>> In case it's unclear I am *not* happy with NetworkManager for servers
>> or stable environments. Laptops that have to wander from environment
>> to environment need multiple VPN's, yeah, OK, I can see having a more
>> complex tool. But for a  VM? Or a server?
> 
> Yep - I've gone as far as removing NetworkManager completely from my
> servers.
> 
> A few months ago I drank the koolaid and set up nmcli with my Xen server
> - and it was a pain in the backside. Finally got it working, but it
> still decided to drop the bridging interfaces randomly (causing all VMs
> to disconnect from the network) and wouldn't bring them back up.
> 
> I ended up reverting to manually creating ifcfg-* config files and
> scrapping all plans of migrating to anything NetworkManager.
> 
> The down side is that you lose the network-online target for systemd -
> which can cause its own problems - but its worth working around those
> for a stable network config.
> 

At a conference in 2015, I attended a couple of presentations by RH
employees. On the first day, one of them gave a talk I attended. In the
question time at the end, he made a tangent comment about how
NetworkManager should never be disabled because RH was focused on making
it an enterprise-reliable networking tool. The VERY NEXT DAY he gave a
talk regarding OpenStack & Kubernetes, and I kid you not, one of the
first comments was "disable NetworkManager" and use the old networking
tools!

I right then called him out on it. He handled it very well. :-)

The short: he reiterated that RH is committed to NetworkManager first as
an enterprise-reliable networking tool. He admitted there were
shortcomings and that there were known issues where it didn't work well
(virtual networks with OpenStack, for example), but he fully expected
many of the kinks to be worked out "soon". He hoped and was expectant
that NetworkManager would be a full replacement before EL7 stopped
getting feature updates. He made the disclosure that he can't predict
the future and speak completely for RH, but that he was absolutely
confident that EL8 would not have anything but NetworkManager.

That isn't the first, nor the last, time I've heard that. At
Supercomputing 2016 in Salt Lake City, a RH employee made a similar
comment as well (again, not in an official capacity, etc.).

Assuming they are right, it looks like I am going to have it as a future
tool. Add to it a current project in which the client is specifically
asking for integration with it AND the potential for Infiniband (which
NetworkManager is supposed to handle well), and I find myself needing to
learn it now.

At least one good thing has come from me studying this weekend. I
discovered the reason behind one of the oddities in networking on our
KVM server. So that was a plus! :-)

Thanks!
~Stack~



signature.asc
Description: OpenPGP digital signature


Re: nmcli question

2017-04-09 Thread ~Stack~
On 04/08/2017 04:36 PM, ~Stack~ wrote:


> They mention taking down a network with:
> $ nmcli dev disconnect interface-name
>
> but bringing it up with:
> $ nmcli con up interface-name
> That is so infuriating to me. Why use different sub-commands? Especially
> when there exist subcommands in the same context? Why not do this?
> $ nmcli dev disconnect interface-name
> $ nmcli dev connect interface-name
> 
> Or even this?
> $ nmcli con down interface-name
> $ nmcli con up interface-name
> 
> As far as I can tell, they are both doing the same thing. In fact the
> only difference I can tell comes from the nmcli help documentation where
> it says the difference is in the auto-activating:
> 
> $ nmcli d disconnect --help
> 
> The command disconnects the device and prevents it from auto-activating
> further connections without user/manual intervention.
> 
> $ nmcli connection down --help
> 
> Deactivate a connection from a device (without preventing the device
> from further auto-activation). 


Sorry for replying to my own question, but thought I would post some
relevant information.

I pinged a friend who has quite a few more RH certs than I do and asked
him. He didn't have a definitive answer. However, he commented that he
makes the distinction precisely because of what I found. He said that
doing a "con down" means that any automated check or script or anything
may trigger that connection to come back up, which he explicitly doesn't
want happening while he is editing files, testing configurations, etc.
Thus, a "dev disconnect" ensures it stays down till he wants it back up.
However, he also mentioned that he brings the interface up with "dev
connect" and as far as he knows he hasn't had issues. He recommended
just using "dev disconnect" and "dev connect".

It isn't a perfect answer for me as I still want to know why RH and
others recommend the method that they do, but it is at least a more
sensible answer than what I have found on my own searching the internet. :-)

I actually do have the ability to pester RH directly. I may do so and
see what they say. If I do ask, I will share if I find out. ;-)

Thanks!
~Stack~



signature.asc
Description: OpenPGP digital signature


Re: nmcli question

2017-04-11 Thread ~Stack~
On 04/11/2017 04:50 AM, David Sommerseth wrote:
> On 10/04/17 23:49, O'Neal, Miles wrote:
>> There are days I sort of wonder whether the Linux development crews
>> haven't been infiltrated by people trying to drive us into the OSX or
>> Windows camps.
> 
> That is very unfair.  To my knowledge, there exists no "Linux product
> management department" which sets the path forward for how any Linux
> distribution should go.
> 
> A lot of the new stuff happens in many distributions before it hits the
> enterprise Linux distributions - such as SL.  For RHEL and SL that means
> Fedora.  For SUSE Linux that means openSUSE.  For Debian, that's a bit
> different story, as they have their unstable branch and are not a company
> like Red Hat or SUSE.  And many of these package maintainers and
> developers in distributions work with a broad range of upstream
> communities and projects.
> 
> The result is that if someone feels something could be improved, they
> start doing that inside the relevant upstream project involved - or they
> create their own new upstream project.  *Then* the various upstream
> projects and later on Linux distros decides to include these
> improvements.  And then it hits the enterprise Linux distributions.
> 
> So claiming that "Linux development crews" are infiltrated to make Linux
> s**k is just so wrong on every level.  First of all there exists no
> "Linux development crew" at any level, the development work is
> completely distributed and decentralized.  This is the kind of silly
> remarks which actually pays no respect to all the efforts and good faith
> provided by many people in many places.  Claiming those people are
> infiltrators is just beyond any reasonable limits of fairness.
> 
> If you dislike something ... Grab developers responsible for your
> dissatisfaction in IRC, join the mailing lists, go to conferences where
> you can meet these persons face to face or otherwise reach out the
> proper people directly.  *That way* Linux can truly be improved, by
> users actually giving real feedback to the persons who can do something
> about it.  Too much for you?  Get a Red Hat subscription and "outsource"
> that work through the Red Hat support channels.
> 
> And I encourage all of you to pay attention to the devconf.cz conference
> (lots of videos with past presentations on youtube too).  That does have
> a lot of focus on Fedora and RHEL, which is most relevant for SL.  There
> probably exists many other conferences too, where the heading of the
> various projects included in Linux distributions is presented.  And *do*
> provide feedback to those giving talks, that way things may improve.
> But of course, it is up to the developers to decide if to change or not,
> based upon how many gives feedback pointing in the same direction.
> 
> Ranting about the direction of "Linux" on a distribution ML which
> basically just ships what RHEL ships is just completely missing the
> mark.  It is a complete misunderstanding of what kind of distribution
> Scientific Linux is.  And then giving a remark so general it carries no
> argument of _what_ is wrong *and* _how_ you see it could be fixed ...
> How can that improve things?
> 
> 

Well said!




signature.asc
Description: OpenPGP digital signature


Re: tip: Secondary Selection clipboard

2017-07-01 Thread ~Stack~
On 06/27/2017 08:15 AM, Lars Behrens wrote:
> Am 27.06.2017 um 15:05 schrieb Tom H:
> 
>> We still (optionally) support the PRIMARY selection on the X11 backend,
>> and some compatibility layer for it on Wayland, but we have no plans on
>> adding support for the SECONDARY selection, as it's both barely
>> specified and, like the PRIMARY, highly confusing for anybody who is not
>> well-versed in 20+ years of use of textual interfaces on the X Windows
>> System. Personally, I would have jettisoned the PRIMARY selection a long
>> time ago as well, but apparently a very vocal minority is still holding
>> tight to that particular Easter egg. Adding support for the even more
>> esoteric SECONDARY selection on the X11 backend when we're trying to
>> move the Linux world towards the more modern and less legacy-ridden
>> Wayland display system would be problematic to say the least, and an ill
>> fit for the majority of graphical user experiences in use these days.
> 
> 
> Boy. I really dislike his arrogant sound. Already stumbled over
> something similar on the net, obviously written by that same guy.

No kidding! Don't know what his issue is.

> Primary selection for me is a major feature in GUI *and* shell.

Agreed. I use both the Ctrl+C and Middle-click extensively. Having
multiple clipboards that are easy and quick to use is fantastic!
Sometimes I need more in the buffer, which means one of my
multi-monitors usually has a never-saved text application open where I
dump tidbits for quick reuse.

LXER linked me to this article on additional clipboard managers recently.
https://opensource.com/article/17/6/clipboard-managers

I won't spoil it with personal biases as I am still working through the
list on my dev machine testing them out. There are some interesting
pros to a few of them. If this is a need, give a few of them a try.

~Stack~



signature.asc
Description: OpenPGP digital signature


Re: EPEL Download

2017-10-29 Thread ~Stack~
On 10/29/2017 07:49 PM, Bill Maidment wrote:
> Hi
> Today I found out that EPEL had reorganized their repository structure 
> without warning.
> A 26 GB download ensued. Ouch. That's a quarter of my monthly quota.

I too mirror for my site, but I only mirror x86_64 for 6/7. I haven't
checked my mirror yet...what changed? I don't see anything obvious
looking at my upstream mirror.

~Stack~



signature.asc
Description: OpenPGP digital signature


Re: not recovering session after locking SL 7.4

2017-11-02 Thread ~Stack~
Greetings,

I have to preface this because so many like to hate on it just to hate
on it...but I am _not_ trying to start a fight, just point in a
potentially helpful direction. It has its pros and it has its pains.

With that said, take a look at your journald logs. Crank up the
verbosity and take note of what is there before putting the laptop to
sleep / closing the lid / hibernate / whatever.
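
Something like this gets you the tail end of the previous boot's log (a
sketch; it assumes you have made the journal persistent):

$ journalctl -b -1 -e

Running 'journalctl -f' in another terminal while you close the lid is
also handy.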

100% of the time, without fail, every single problem I've had with my
Ubuntu 16.04 laptop and my SL7.4 laptop not sleeping, waking, or
crashing when I close the lid has gone back to systemd doing something
it shouldn't. The vast majority of the old ways of fixing these problems
don't work, you *must* fix it the systemd way. I've fixed almost* all of
my issues. It is a new tougher-challenge road for me, but it's the one
I'm on. :-)

* The only one remaining is that I can manually lock the screen and wake
the laptop up just fine. However, if I have an external monitor plugged
in and I manually lock the screen, for some weird reason systemd puts
the laptop to sleep and won't wake up unless I open/close the lid. YET!
If I just let the screensaver timeout hit, it works perfectly. That one
has stumped me for the last couple of months. *shrug*

If you can figure out what systemd trigger is being tripped, you can
either disable that from running or potentially tweak it to work for you.

Unfortunately, I don't have good resources for you. Of all the times
I've asked the systemd IRC/mailing list, I have yet to get help that
didn't blame me or my hardware (even when I've proved it's neither). The
Ubuntu systemd group has helped several times, and having a proper
Upstream Vendor license that I can install on a spare drive, boot the
laptop, and use their support has helped me with several issues that I
was able to take back to SL7.4 (or wait for the patch to filter down).

It's primarily just been a ton of digging around on the Internet and in
the log files.

Good luck!
~Stack~


On 10/26/2017 09:23 AM, Stefano Vergani wrote:
> Hi all,
> 
> 
> I have just upgraded Scientific Linux to version 7.4 and I have found an
> issue I cannot fix. Anytime I lock my PC (ThinkPad Lenovo T440p)
> pressing the lock icon or simply I close the PC without locking it or
> shutting it down, I am not able anymore to return in my session. The
> screen remains black and I have to press the on/off button for some
> seconds until it reboots. 
> 
> How can I fix this? Anyone else with the same issue?
> 
> 
> thanks,
> 
> Stefano
> 
> 
> p.s. before this upgrade everything worked just fine
> 




signature.asc
Description: OpenPGP digital signature


Re: Best option for php version 7?

2018-01-04 Thread ~Stack~
On 01/04/2018 12:54 PM, Steve Gaarder wrote:
> I want to set up a web server with PHP version 7, preferrably 7.1 or
> newer, which is not available in SL or EPEL.  What is the best (i.e.
> stable and kept up-to-date) place to get it?

I've had lots of luck with https://ius.io/ and prefer it over Software
Collections when I can.

I'm actively running PHP 7.1 from IUS on a production server and I run
several of their packages across my data center.

Hope this helps!
~Stack~



signature.asc
Description: OpenPGP digital signature


Re: Fwd: Services - Server

2018-07-13 Thread ~Stack~
Greetings,

On 07/13/2018 07:22 AM, Manuel Sanchez wrote:
> Greetings to the whole community; I have installed version 6.5 but I
> need to know how to call the different servers or services and where are
> the directories of the configuration files; to facilitate the answer I
> try to simplify the text:
> 
> -Server DNS: Bind?                      Dir.: .
service named
/etc/named

> 
> -Server DHCP: Apache?           Dir.: .
service dhcpd
/etc/dhcp
(Apache is a web server, not a DHCP server; that one is service httpd,
with configs in /etc/httpd)

> 
> -Server FTP/SMB: Samba?         Dir.: .
FTP really depends on what you are using.

service smb
/etc/samba

> 
> -Server SSH: LDap???            Dir.: .
LDAP really depends on what you are using (e.g. IdM, IPA, sssd, etc.)

service sshd
/etc/ssh

> 
> -Server Intranet/Net:???                Dir.: .

I have no idea what you are asking here.

> 
> -Server SQL: MySQL?             Dir.: .
service mysqld

/etc/my.cnf

I think...been too long since I messed with it. If I remember correctly,
mariadb also uses the same config...

> 
> -Server Proxy: Squid?           Dir.: .

service squid

It's been far too long since I did anything with squid. You will have to
hit up the man page.

> 
> -Server Domain: ???                     Dir.: .

service ???
/etc/???

;-)

> Thank you.

You're welcome. :-)

~Stack~



signature.asc
Description: OpenPGP digital signature


Re: SL6 firefox issue.

2018-07-16 Thread ~Stack~
On 07/12/2018 05:34 PM, Akemi Yagi wrote:
> On Thu, Jul 12, 2018 at 3:04 PM, Akemi Yagi  wrote:
>> On Thu, Jul 12, 2018 at 11:19 AM, Jesse Bren  wrote:
>>> I've reports of the same, have not yet had a chance to test a rollback of
>>> firefox.
>>>
>>> Mozilla Firefox 60.1.0
>>>
>>> /var/log/messages at time of crash:
>>> Jul 12 13:15:51 hydra kernel: firefox[58474] trap int3 ip:7fd8e93eb5bf
>>> sp:7ffe5c4556a0 error:0
>>> Jul 12 13:15:51 hydra kernel: Chrome_~dThread[58622]: segfault at 0 ip
>>> 7f53afa4cf9d sp 7f53ad1eeaf0 error 6 in
>>> libxul.so[7f53af561000+532a000]
>>> Jul 12 13:15:51 hydra kernel: Chrome_~dThread[58700]: segfault at 0 ip
>>> 7f593d24cf9d sp 7f593a9eeaf0 error 6 in
>>> libxul.so[7f593cd61000+532a000]
>>> Jul 12 13:15:51 hydra kernel: Chrome_~dThread[58728]: segfault at 0 ip
>>> 7fd05fe4cf9d sp 7fd05d5eeaf0 error 6 in
>>> libxul.so[7fd05f961000+532a000]
>>>
>>> kernel version info:
>>> Linux hydra 2.6.32-754.2.1.el6.x86_64 #1 SMP Tue Jul 10 13:23:59 CDT 2018
>>> x86_64 x86_64 x86_64 GNU/Linux
>>
>>> On Wed, Jul 11, 2018 at 7:46 PM Franchisseur Robert 
>>> wrote:
> 
>> Maybe this upstream bug:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1596852
>>
>> Akemi
> 
> I confirm that the patch provided in the above RHBZ fixes the problem.
> In short, open /usr/bin/firefox and change the last line:
> 
> exec $MOZ_LAUNCHER $script_args $MOZ_PROGRAM "$@"
> 
> to
> 
> exec env XDG_DATA_DIRS="$MOZ_LIB_DIR/firefox/bundled/share"
> $MOZ_LAUNCHER $script_args $MOZ_PROGRAM "$@"
> (note this is one line)
> 
> Akemi


Thanks for posting this. Just had users report this issue to me today.
The fix worked for me. :-)

~Stack~




signature.asc
Description: OpenPGP digital signature


XFS v EXT4 was: After Install last physical disk is not mounted on reboot

2018-10-12 Thread ~Stack~
On 10/12/2018 07:35 PM, Nico Kadel-Garcia wrote:
[snip]
> On SL 7? Why? Is there any reason not to use xfs? I've appreciated the
> ext filesystems, I've known its original author for decades. (He was
> my little brother in my fraternity!) But there's not a compelling
> reason to use it in recent SL releases.


Sure there is. Anyone who has to manage fluctuating disks in an LVM knows
precisely why you avoid XFS - shrink an XFS-formatted LVM partition. Oh,
wait. You can't. ;-)

My server with EXT4 will be back on line with adjusted filesystem sizes
before the XFS partition has even finished backing up! It is a trivial,
well-documented, and quick process to adjust an ext4 file-system.
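
For the record, the whole ext4 dance is roughly (a sketch, assuming an LV
at /dev/vg0/opt with an fstab entry for /opt):

$ umount /opt
$ e2fsck -f /dev/vg0/opt
$ lvreduce --resizefs -L 20G /dev/vg0/opt   # shrinks the fs, then the LV
$ mount /opt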

Granted, I'm in a world where people can't seem to judge how they are
going to use the space on their server and frequently have to come to me
needing help because they did something silly like allocate 50G to /opt
and 1G to /var. *rolls eyes* (sadly that was a true event.) Adjusting
filesystems for others happens far too frequently for me. At least it is
easy for the EXT4 crowd.

Also, I can't think of a single compelling reason to use XFS over EXT4.
Supposedly XFS is great for large files of 30+ GB, but I can promise you
that most of the servers and desktops I support have easily 95% of their
files under 100M (and I would guess ~70% are under 1M). I know this
because I help the backup team on occasion. I've seen the histograms of
file-size distributions.

For all the arguments of performance, well I wouldn't use either XFS or
EXT4. I use ZFS and Ceph on the systems I want performance out of.

Lastly (I know - single data point), I almost never get the "help, my
file system is corrupted" from the EXT4 crowd, but I've long stopped
counting how many times I've heard of XFS eating files. And the few times
it is EXT4, I don't worry, because the recovery tools are mature and
well tested. The best that can be said for XFS recovery tools is "Well,
they are better now than they were."

To me, it still boggles my mind why it is the default FS in the EL world.

But that's me. :-)

~Stack~



signature.asc
Description: OpenPGP digital signature


Re: XFS v EXT4 was: After Install last physical disk is not mounted on reboot

2018-10-14 Thread ~Stack~
On 10/13/2018 04:41 AM, Nico Kadel-Garcia wrote:
> On Fri, Oct 12, 2018 at 11:09 PM ~Stack~  wrote:
>>
>> On 10/12/2018 07:35 PM, Nico Kadel-Garcia wrote:
>> [snip]
>>> On SL 7? Why? Is there any reason not to use xfs? I've appreciated the
>>> ext filesystems, I've known its original author for decades. (He was
>>> my little brother in my fraternity!) But there's not a compelling
>>> reason to use it in recent SL releases.
>>
>>
>> Sure there is. Anyone who has to mange fluctuating disks in an LVM knows
>> precisely why you avoid XFS - Shrink an XFS formated LVM partition. Oh,
>> wait. You can't. ;-)
> 
> I gave up, roughly 10 years ago, on LVM and more partitions than I
> absolutely needed. I cope with it professionally, most recently with
> some tools to replicate a live OS to a new disk image with complex LVM
> layouts for the filesystem. LVM has usually involved complexity I
> do not need. In the modern day of virtualization and virtualization
> disk images, I just use disk images, not LVM, to create new
> filesystems of tuned sizes. Not so helpful for home desktops, I admit,
> but quite feasible in a "VirtualBox" or "Xen" or "VMWAre" set of Linux
> VMs.

Unfortunately, in the world I am in we have an Audit/Security
requirement that we *must* have separate partitions for /, swap, /tmp,
/home, and /var with a recommendation for /opt if it is heavily used.
I'm also in a world where researchers get to pick their layouts to have
the jr admins build the box to their specs. Then when they break
something, we few sr admins have to come in and fix it.


>> My server with EXT4 will be back on line with adjusted filesystem sizes
>> before the XFS partition has even finished backing up! It is a trivial,
>> well-documented, and quick process to adjust an ext4 file-system.
> 
> xfsresize is not working for you? Is that an LVM specific deficit?

Please provide more information. To the best of my knowledge, RH
official support still says shrinking an XFS partition can not be done.
Only growing. I am not familiar with a xfsresize command. Where do I
find it?

$ yum provides */xfsresize
No matches found
$ cat /etc/redhat-release
Scientific Linux release 7.5 (Nitrogen)

> 
>> Granted, I'm in a world where people can't seem to judge how they are
>> going to use the space on their server and frequently have to come to me
>> needing help because they did something silly like allocate 50G to /opt
>> and 1G to /var. *rolls eyes* (sadly that was a true event.) Adjusting
>> filesystems for others happens far too frequently for me. At least it is
>> easy for the EXT4 crowd.
> 
> That's a fairly compelling reason not to use the finely divided
> filesystems. The benefits in protecting a system from corrupting data
> when an application overflows a shared partition and interferes with
> another critical system has, typically, been overwhelmed by the
> wailing and gnashing of teeth when *one* partition overflows and
> screws up simple operations like logging, RPM updates, or SSH for
> anyone other than root.
> 
> If it's eating a lot of your time, there's a point where your time
> spent tuning systems is much more expensive than simply buying more
> storage and consistently overprovisioning. Not saying you should spend
> that money, just something I hope you keep in mind.

I don't disagree with you at all. But those partition regulations come
down from a higher level than I. As for buying more disks, I'm quite glad
that server-class SSDs have fallen in price and they aren't buying
60GB disks anymore. Most of them are getting in the ~200GB range and it
is less of an issue.

> 
>> Also, I can't think of a single compelling reason to use XFS over EXT4.
>> Supposedly XFS is great for large files of 30+ Gb, but I can promise you
>> that most of the servers and desktops I support have easily 95% of their
>> files under 100M (and I would guess ~70% are under 1M). I know this,
>> because I help the backup team on occasion. I've seen the histograms of
>> file size distributions.
> 
> Personally, I found better performance for proxies, which wound up
> with many, many thousands of files in the same directory because the
> developers had never really thought about the cost of the kernel
> "stat" call to get an ordered list of the files in a directory. I
> ran into that one a lot, especially as systems were scaled up, and
> some people got bit *really hard* when they found that some things
> did not scale up linearly.
> 
> Also: if you're running proxies, email archives, or other tools likely
> to support many small

Re: XFS v EXT4 was: After Install last physical disk is not mounted on reboot

2018-10-14 Thread ~Stack~
On 10/13/2018 11:22 AM, Adam Jensen wrote:
> On 10/12/2018 11:09 PM, ~Stack~ wrote:
>> For all the arguments of performance, well I wouldn't use either XFS or
>> EXT4. I use ZFS and Ceph on the systems I want performance out of.
> 
> For a single, modest server that runs everything - email, web, DBMS,
> etc. - I've recently switched from FreeBSD-11.2 with a four disk ZFS
> RAID-10 to SL-7.5 with XFS on a four disk hardware RAID-5. While ZFS was
> very convenient and had a lot of nifty capabilities, the resource
> consumption was enormous and performance didn't seem to be as good as it
> is now. (E3-1245, 32GB RAM, MR9266-4i)
> 

We do a pool of mirrored disks with fast SSD's for our ZFS caching.
Performance is fantastic and, as I mentioned in another reply, the
rebuild time of a failed drive (or a resilvering when I upgraded all of
the drives on the fly without downtime) is way faster than any RAID I've
ever worked on before (which is quite a few in my career).

However, even if performance wasn't great we would still probably be
using it because of the tooling around ZFS. We utilize a lot of the
tools it provides for shared file-systems, backups, compression,
de-dupe, etc.

Never used ZFS on *BSD. I've only used it on SL7 so I can't say anything
about an OS difference.

~Stack~


Re: [SCIENTIFIC-LINUX-USERS] Planning for hypothetical RHEL/CentOS cancellation

2019-01-07 Thread ~Stack~
On 1/7/19 8:06 AM, James M. Pulver wrote:
> I wonder how many Linux users are using something as unreliable and
> untrustworthy as a cloud storage provider that uses a proprietary
> interface?
[snip]

Yeah. There's no way I would trust my data to those vendors. Just look
at the news on just about any given week and at least one of those
vendors has had a data breach.

Most of the people I know who actually care about their data (and aren't
locked into something due to their company) host off a Nextcloud
installation. https://nextcloud.com/

Not only does Nextcloud provide self hosted, but they give a LOT of
details on best practices for securing the server. They even offer a
scanner to check for known issues/weaknesses https://scan.nextcloud.com/.

Works beautifully off of a Scientific Linux 7 system with the php
packages from IUS https://ius.io/ + a free dns domain + Let's Encrypt.
Easy setup and has clients for a ton of devices.

~Stack~



signature.asc
Description: OpenPGP digital signature


Re: Enterprise Linux 8 beta

2019-01-14 Thread ~Stack~
On 1/14/19 6:52 PM, Yasha Karant wrote:
> The following announcement appears [snip]
> 
>   Red Hat Enterprise Linux 8 beta
> 
> Red Hat Enterprise Linux 8 provides a consistent foundation for
> enterprise hybrid cloud, delivering any application on any footprint at
> any time. The public beta is now open.
> 
> End quote.
> 
> How soon may we expect a SL8 beta?
[snip]

Looking at the release history [1], I would expect the SL team to
release SL8 about 3 months after RHEL8. The devs have already said that
any SL 8 betas will probably be internal until RHEL8 releases [2].

[1] https://en.wikipedia.org/wiki/Scientific_Linux#Release_history
[2]
https://listserv.fnal.gov/scripts/wa.exe?A2=ind1812&L=SCIENTIFIC-LINUX-DEVEL&P=421


I don't like FUD anyway, but I also happen to be in the camp that thinks
the fearmongering over IBM is over-hyped. Thus, I won't comment on the
rest. Red Hat is doing some things amazingly well and they are doing
other things quite poorly. Fretting over what may or may not happen does
no one any good. Let's just wait and see, shall we? :-)

~Stack~



signature.asc
Description: OpenPGP digital signature


Re: Can't reboot after latest updates

2019-05-04 Thread ~Stack~
Greetings,

I don't know this one, but my first inclination would be to modify your
grub boot. Remove the bits that you don't absolutely need (e.g. quiet)
and add rd.break to get in as root. If that works, maybe you can explore
the issue a bit more. If not, maybe something in the messages on the
screen will give you a hint as to what to try next.

Do your servers support serial output? Can you possibly connect from
another server/laptop in order to watch the boot process? That helps
sometimes when the messages scroll past the screen way too fast.
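
If they do, something like this on the kernel line mirrors the console to
the serial port (a sketch; port and speed depend on the box):

console=tty0 console=ttyS0,115200n8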

~Stack~





On 5/4/19 3:47 AM, Bill Maidment wrote:
> Correction. Every 2 minutes, not every 2 seconds.
> I've advised my colleague to try a hard reboot to see if that gets us out of 
> trouble.
> 
>  
> -Original message-
>> From:Bill Maidment 
>> Sent: Saturday 4th May 2019 18:16
>> To: SCIENTIFIC-LINUX-USERS@FNAL.GOV
>> Subject: Can't reboot after latest updates
>>
>> Hi
>> Today, a colleague ran yum update on our SL7.6 server and after reboot the 
>> server refuses to start with the following message every 2 seconds:
>>
>> System Journal ID 20156 – failed to send WATCHDOG=1 notification message.  
>> Transport endpoint is not connected.
>>
>> Any ideas how to fix this. Google shows this as an occasional issue, but 
>> with no specific help.
>> .
>>
>> Cheers
>> Bill Maidment
> 




signature.asc
Description: OpenPGP digital signature


Re: No Scientific Linux 8 planned?

2019-08-03 Thread ~Stack~
On 8/3/19 6:45 PM, Nico Kadel-Garcia wrote:
> Looking back, I see that there is no plan to publish a Scientific
> Linux 8 release, and Fermilabs will be using CentOS 8 going forward.
> 
> I hope the developers and support that Scientific Linux has had can be
> migrated to the CentOS community, and that Red Hat is improved by the
> result. I especially hope that the recent purchase of Red Hat by IBM
> is good for the work people have done and appreciated.
> 
> Is there any plan to shut down these mailing lists? It's been really
> quiet out there.

They said they would continue to support 7 until EoL. So I would expect
that the lists would continue to at least that point.

As for the lists being quiet, I guess I've always thought this list was
quiet. Especially compared to the other lists I'm on! :-D

~Stack~




signature.asc
Description: OpenPGP digital signature


Re: Who Uses Scientific Linux, and How/Why?

2020-02-24 Thread ~Stack~
On 2/24/20 8:09 AM, Peter Willis wrote:
> Hello,

Greetings!

> The variation in uses of Scientific Linux is quite interesting.
> 
> As mentioned before, we are using it for fluid dynamics modelling and
> oceanography, in the context of parallel computing with OpenMP and MPICH.
> 
> I am curious to see what everyone else have been using it for.
> 
> Perhaps, if it’s not too much trouble, people on the list might give a
> short blurb about how they use it and why.

I had been using CentOS 5.x (and 4.X before that) for the base image in
a large High Performance Cluster (many hundreds of nodes) with Red Hat
as the infrastructure and important nodes but couldn't afford to pay for
that many nodes, especially since they were so minimalist. It was a
government institution that did everything from weather forecasts to
military parts modeling.

Red Hat released RHEL6 in November of 2010. I had a /really/ strong need
to move packages to EL6 for a new workload but again couldn't afford the
price tag (I can't remember how many nodes we had at that time...I think
over 300...we grew even more over the years). SL6 released 4-ish months
later in March 2011. I had been pestered to move to it but I wanted to
stick with CentOS because I knew it and was a little bit involved in the
community at the time. But there were a lot of issues and they were
re-tooling their build scripts. Finally in May, I gave in and made the
switch with the plan to revert back to CentOS6 when it released.

During the installations, I realized I goofed and my scripts that
figured out if it was on a RH host or a CentOS host were "failing" the
check and defaulting to RH...except they weren't failing...I thought it
was a fluke or something else was really broken so I dug into it. That's
when I realized that SL was actually closer to RH than CentOS was!

A lot of my "proofs" for this claim require quite a bit of setup or
configuration, but the easiest one that anyone can test is simply "yum
update --security". SL publishes a security channel, CentOS doesn't!
Something even that simple means less work, as I no longer needed
to scrape and parse out a massive list of CVEs to determine which
packages I needed to install. (At the time I had to apply security
patches daily, but I didn't like patching/updating packages just because
an update existed; if something broke, I wanted as few things to check
as I could. I'm at a different job now and still have the same
restriction.) Soon I ditched a TON of custom scripts for CentOS
because it all just worked great on SL the same as it did the RH hosts!
Bonus, RH6/SL6 was a lot more stable for us and let us do a few things
even better so I expanded the cluster a few hundred more nodes by the
end of 2011.
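
That security-channel workflow, for reference (a sketch; EL6 needs the
yum-plugin-security package installed first):

$ yum --security check-update   # list only pending security errata
$ yum --security update         # apply only security errata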

By the time CentOS 6 released in July 2011, I had zero desire to go back.

Today, I'm still the admin of big High Performance Clusters for a well
known economic modeling and research institution. The things management
really cares about, where they want to be able to pick up the phone and
yell at someone or get warm fuzzies about support contracts (e.g. Ceph),
are still RH. All the servers I care about being close to RH but
can't justify in the budget for (aka management won't pay for) RH are SL
6/7. At home, I run SL7 for all my servers (Lubuntu for my
desktop/laptops because I really like LXQT). I've even done the CERN
charity donations before where I send thank you notes to the SL devs in
the notes fields (no idea if they got them or not). :-D

We are just now exploring RH8/CentOS8. I've got a single RH8 VM I'm
doing testing in, and I'm building a CentOS 8 VM later this week. There's
little reason for us to move to 8 at this moment...the bigger push is
that we still have a MASSIVE system (~100 nodes and quite important)
that is SL6 based, and we need to get off of it by end of summer (both
hardware support ending and EL6 going EOL in November). So I'm trying to
figure out if I am going to take the easy path to SL7, which I know I can
do, or if I jump it to 8... *shrug*

So that's more than just a short blurb...guess I will shut up now. :-D

~Stack~





Re: Who Uses Scientific Linux, and How/Why?

2020-02-28 Thread ~Stack~
On 2/28/20 5:00 AM, Paddy Doyle wrote:
> We're a university HPC centre.

[snip]

> Plus the stability of the longer RHEL life cycle has been a big
> plus for stable clusters (*).
> 
[snip]
> 
> (*) although more recently some people are asking more and more for the
> latest and greatest.. yes, we're looking at you ML and AI! :)

A combination of Singularity [1] and Spack [2] has kept my ML/AI users
quite happy, though there is a learning curve.

Singularity is pretty easy to grasp; it's just extra steps if the user
wants to build their own images, but it's been great, because once they
do, they jump on board the "reproducible science" aspect pretty quick.
Greg Kurtzer and his team are doing a great job with Singularity.

Spack is trivial if what they want is already there, and a bit more
challenging if they have to add programs to it. Fortunately, they've
offered full-day training at Super Computing every year. The tutorial
section hasn't been posted yet for 2020, but their 2019 tutorial and
resources are online. [3]

[1] https://sylabs.io/singularity/
[2] https://spack.io/
[3]
https://sc19.supercomputing.org/?post_type=page&p=3479&id=tut164&sess=sess194
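
For a taste of each (a sketch; the package, image, and script names here
are just illustrative placeholders):

$ spack install gromacs    # build a package plus its whole dependency tree
$ spack load gromacs       # drop it into the current shell environment

$ singularity pull docker://ubuntu:20.04
$ singularity exec ubuntu_20.04.sif python3 train.py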

Hope this helps. I understand the sysadmin pain in trying to meet the
rapidly evolving ML/AI researcher demands. :-)

~Stack~





CentOS 8 EOL; CentOS Stream?

2020-12-08 Thread ~Stack~

Anyone else on the verge of tears after reading today's CentOS blog post?
https://blog.centos.org/2020/12/future-is-centos-stream/

If you don't know CentOS Stream, it's "upstream RHEL". No, not Fedora.
Yes, that too is "upstream RHEL". CentOS Stream is a rolling release (so
good luck getting long-term steady kernels/packages) that is trying to
be Arch-like but with RHEL flavor. It sits in between RHEL and Fedora.
It isn't and won't track steady releases like RHEL. It will have things
before RHEL, except for security patches, which will still come in
whenever someone gets around to it. And, no, they still won't tag their
security patches as such, because they expect you to apply patches (and
potentially reboot) at their whim.


For those of us in the scientific community who have packages from 
vendors that standardize on RHEL dot releases, I'm not sure what we're 
going to do. We have RHEL licensing on the important infrastructure 
nodes but the hundreds of compute nodes, VM's, dev systems, and misc? 
Going all RHEL would kill our budget. And I don't care if Oracle Linux
is free or how good of a clone it is; you only get burned by Oracle once
(and you are usually too broke to be burned a second time).


I suppose we can shift nearly all of our infrastructure to Ubuntu LTS,
but there's a lot still left that I'm not sure we can move to CentOS
Stream, nor can we afford to go to RHEL. Guess we are freezing our
conversations about moving away from SL7, and we have a year to figure
it out and then make it happen...


*sigh*

~Stack~


Rocky Linux

2020-12-09 Thread ~Stack~

Greetings,

Greg Kurtzer (the guy who started Caos, which became CentOS; he's also
the guy who started Warewulf and Singularity) has announced Rocky
Linux.


https://rockylinux.org/

It's aiming to be the community-backed replacement EL8 distro. And
wow, has he gotten support. It's blowing up on the Slack page with
people wanting to help.


https://hpcng.slack.com


I think you can get an invite through the links on the rockylinux page.

I'm sure if any of the Scientific Linux team wanted to throw some 
pointers over to the group it would be very appreciated. As would any 
support/help from those who want to see a solid EL8 clone. :-)


Thanks!
~Stack~


Re: Rocky Linux

2020-12-09 Thread ~Stack~

On 12/9/20 9:16 PM, Yasha Karant wrote:

One thing does concern me:  having left CentOS (it was all "volunteer" 
effort at that epoch as I recall) for SL, a primary motivator was that 
SL had professional (employed, not volunteer) persons doing the distros, 
and this SL list amounting to support.


If Rocky is to be all volunteer, how reliable and professional will it 
be?  This is not a minor issue, as very few enthusiasts or other 
non-professionals provide a truly reliable deliverable.


I would say, give it time. It wouldn't be the first time Kurtzer started
an open source project and turned it into a company. :-)



For my use, is EL going to continue to be workstation friendly (e.g., 
laptop in which one cannot pick and choose to integrate only Linux 
traditionally supported controllers with appropriate drivers, such as 
sound "cards", but is stuck with whatever the laptop vendor has used -- 
typically MS Win "supported") or is it primarily a server distro? Ubuntu 
LTS still seems to be laptop friendly.


They are aiming for complete RHEL reproducibility. If the goal is to be
an as-true-to-RHEL-as-possible variant, then the answer lies in how you
use RHEL.


But do give it some time. It's only been two days, and the announcement I
just saw said that there are now 750 people actively participating in
the various forms of communication, and they have direction, a plan, and
leaders making it happen. And there are thousands of people who have
noticed and are talking about it on /., reddit, lwn, etc. That's pretty
impressive and it speaks volumes about the number of people who really
do want a true-to-RHEL variant.


~Stack~


Re: Rocky Linux

2020-12-10 Thread ~Stack~

On 12/10/20 4:47 PM, Yasha Karant wrote:
Again, my own needs are such that it is unacceptable to have a volunteer 
(and in many cases, amateur) developer/support arrangement for "mission 
critical" systems and applications software.



Greetings,

Well, there is Red Hat for professional support. Or if you don't like 
that option, Oracle Linux.


For our critical infrastructure, it will probably remain Red Hat proper 
for that reason. However, my small team and I will probably continue to 
pursue RH training classes then feel comfortable enough to maintain the 
hundreds of non-critical servers with a community backed variant.


Give it a month or so. Just today I heard about another project starting
up as yet another variant, and I have a feeling there will be more. The
better ones will emerge, and I think we will all be able to make a more
informed decision about our own directions in January, when the initial
excitement has died down. I do believe it is wise that you are following
and keeping informed about the variants.


~Stack~


Sustainable computing - Re: CentOS EOL - politics?

2020-12-12 Thread ~Stack~

On 12/11/20 10:09 AM, Brett Viren wrote:

My hope is they (we) take this current situation as a lesson and make a
radical change that puts all of our computing on more sustainable
footing as we go into the next decades.


I'm curious about your thoughts on what it means to have that 
sustainable footing going forward.


We have been pushing our users to Singularity images for the last two 
years (we jumped on pretty early). A LOT of our application/code base is 
already Singularity behind the scenes. The users don't know and don't 
care because their applications still run the same on the same HPC 
equipment. However, getting our users to purposefully think in terms of 
Singularity images has been a long hard road and we still have so much 
further to go.


We are on the edge of shifting a few very critical and heavy 
computations to Kubernetes. I'm not yet convinced that it will replace a 
lot of the hard-core traditional HPC workloads anytime soon, but there 
are a surprising amount of workloads that can. Plus, it allows us to 
automate from Code->Gitlab->CI/CD->Kubernetes->results delightfully well.


But one of the absolute greatest things about it from the perspective of 
what CentOS just pulled is that my dev Kubernetes has three OS's. SL7, 
Ubuntu 20.04, and CentOS 8 (I JUST spun this up the Monday before the 
announcement). As an admin, I _don't_ care about the OS at this point of 
the Kubernetes process. I kill a node and rebuild it to anything that 
supports the docker requirements (plus a few other things I need for 
company audit/security) and join it to the cluster. Done! When I killed 
that CentOS 8 node, I suffered no loss of functionality in the slightest,
and only about an hour of time where I had to move the workload and
rebuild the node as Ubuntu.


Bigger shops with decent sized teams, these transitions can be done over 
time. But the vast majority of my career I've supported hundreds of 
compute nodes where the entire HPC team was just me plus my manager and 
we had to support the clusters for 5-8 years (especially when I was in 
the university world). I sympathize with the small HPC teams that just 
don't have the time nor flexibility to migrate. Although I would
HEAVILY suggest that they make the time to learn Singularity, I don't
expect them to make the transition to Kubernetes without some drastic
changes.


I'm just curious what you think it means to have a more sustainable
footing within these clusters, and what we as a community can do to lead
the way so that in the coming decades it matters less what OS is running
on the hardware of these long-term science HPC clusters.


~Stack~


Re: Rhel 8

2021-01-22 Thread ~Stack~

On 1/22/21 10:30 AM, Larry Linder wrote:

We are evaluating Debian at the moment   Since it is a RH derivative we
shall see.


I don't know if this is a typo or not... But Debian is not and never has
been a derivative of Red Hat. If memory serves correctly, Debian
officially kicked off in 1993, almost a year before Red Hat's first
release on Halloween 1994. While there is and has been overlap in open
source packages and tools, they are firmly two very distinct Linux
sub-families.


Maybe you meant you were evaluating Rocky as a derivative of Red Hat or 
evaluating Debian as an alternative to Red Hat....


~Stack~


Re: Rhel 8

2021-01-23 Thread ~Stack~
 the vitriol 
aimed at those of us caught in the middle. Then when it started to get 
crazy, they got ban-hammer-happy. I got banned multiple times from the 
Debian forums (then reinstated every time because they knew me as 
someone who helped out) because I pointed out legitimate bugs. But those 
of us pointing out bugs were lost in the sea of outrage so the SystemD 
guys wouldn't even listen to valid criticism and errors. The systemd 
group itself even went into full lock-down because of the backlash so 
you couldn't submit a bug to them at all.


Then I got "banned" TWICE because someone complained of a legit systemd 
problem, and I responded with the answer to fix their problem. Why was I 
banned for helping? Because, no joke, at that point they just started 
auto-banning based on key words.


The first time, one of the devs personally reached out and apologized 
because he saw that I was trying to help. The second time, I was just 
done. I was sick and tired of the arrogance of the SystemD people and I 
was sick of the rage from the users. I backed out of contributing 
anything at all to the community for over a year. I still haven't had 
the strength to install pure Debian again...I still get sad and angry.


Even reading through the replies others have posted, you can see the pain
and misery caused to others. And most of that is because of politics,
not technical misery. Sure, there is probably that one time that really
irks them. Just like I still remember the time my car sprung an oil leak,
which left me stranded in the middle of nowhere overnight, and I get
ANGRY at that car even though I had 5 years of great reliability from it
before the incident and another 7 after. I'm not faulting nor
diminishing that bad experience, rather reminding us that we often
remember the worst and the best while forgetting the everyday mundane,
when SystemD just did what it needed to and we didn't care that it was
doing its thing.




These days, SystemD is just something that is there. It works well 
enough for most people that they don't know nor care that it is there. 
Even SysAdmins who get low into the internals like me just see it as 
another tool. Sure, there are those who have decided that's their hill 
to die on - more power to them. I do not want to criticize or speak ill 
of their choices. I'm glad that they have a choice. But before I engage 
with much more than what I've stated above, I want legit technical
concerns to talk about. Because honestly, right now the number of
systems I manage is in the hundreds, and when I come across a system that
DOESN'T have the SystemD tools, it's often more frustrating to me than
all of the issues I have with SystemD combined. I don't mind
talking about legit technical problems.


As for talking politics? Well, I threw in the political towel with 
SystemD years ago. I've had enough biased politics the last few years 
shouted at me by the media. I sure as hell don't care to engage in 
SystemD politics any more. :-D


I hope that helps.

~Stack~


Re: Rhel 8

2021-01-23 Thread ~Stack~

On 1/23/21 6:59 PM, Yasha Karant wrote:
Does in fact SystemD provide for encapsulation and isolation, as in a 
proper OO design and implementation?  Could not a "proper" sysadmin 
interface be constructed to configure SystemD, rather than the 
"hodgepodge" of what seem to many as arbitrary and capricious 
incantations?  These questions are not posed as divisive or derisive, 
but instead requesting information.


Hrm. That's an interesting question. I may need to think about that more.

My initial response is that it could be tweaked to allow that, to a
point. I can configure $application1 into a kernel cgroup such
that I can constrain it and isolate it in terms of both resources and
the things it can talk to. However, there is still some level of kernel,
cpu, memory, disk, and logging requirements that will require
$application1 to interact with the low-level services.


However, when I think about how various services work, there needs
to be a lot of communication between applications. For instance, I've
got a bunch of machines at work that use the big Nvidia GPU's. One of 
our researchers has code he's developing that sometimes will dead-lock 
the GPU's. The only way we've been able to get it unstuck is a reboot. 
And because we have had issues with the bleeding-edge development NVidia 
drivers, we've found that it's best to purge the drivers on shutdown, 
then reinstall fresh. Well, I don't want to do that manually. So when I 
tell a system to reboot, I wrote systemd code to ensure that the driver 
is purged before shutdown.


When the server starts, my systemd code checks for nvidia. If it is 
there (a power or hard reset didn't run the code on shutdown), then it 
yanks the driver and reboots. When it starts up and it isn't there, it 
waits on network. Then it waits for DNS services. Then it fetches the 
latest driver in our repo. Then it builds the driver and installs it. 
Then it verifies the driver is successful in loading the GPU. Only then 
does it finish the boot process.
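
Roughly, the shape of that is a oneshot unit with ordering dependencies
doing the waiting for me (a sketch with hypothetical unit and script
names, not my actual code):

# /etc/systemd/system/nvidia-prep.service (hypothetical)
[Unit]
Description=Purge and rebuild the NVidia driver around reboots
Wants=network-online.target
After=network-online.target nss-lookup.target
Before=multi-user.target

[Service]
Type=oneshot
RemainAfterExit=yes
# at boot: fetch the latest driver from our repo, build, install, verify
ExecStart=/usr/local/sbin/nvidia-rebuild.sh
# at shutdown: purge the driver so the next boot starts clean
ExecStop=/usr/local/sbin/nvidia-purge.sh

[Install]
WantedBy=multi-user.target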


In that one systemd script alone, I'm talking to a half dozen services, 
the kernel, and hardware. Could I shove all of that into hard isolation 
and encapsulation? Um...I'm not sure. There has to be a communication 
layer that breaks that isolation somewhere.


Could I do that without systemd? Absolutely. I did similar stuff like 
that long before systemd with scripts that were WAY longer and WAY 
gnarlier because I had to write all the test cases for failure and 
waiting myself instead of just telling systemd "Hey, let me know when 
this service is working."


And that's just not even talking about Desktop... When you get into the 
fancy notifications that you've received an email, or the music changed 
to a new track, or that there was a security alert, or that you have 
packages to be updated, or yadda yadda yadda the list goes on when you 
talk about Desktop level notifications. Everything that the UI has to do 
for all those fancy shiny integrations... I don't think it really could 
be isolated. Although if you allow for systemd to handle all 
communication then isolate the rest then yeah...that's kinda the point. :-)


Maybe someone who still gets down into the internals regularly might be 
able to answer better and more clearly then I did. Hopefully I didn't 
ramble too much! :-D


Oh, also. To your point about a better language choice potentially 
solving some of these issues (I snipped it). I will say it's been ~15 
years since I last did anything with language theory. I'm not someone 
who can really argue for or against these things. However, I do know 
that part of the drive for people to write more Rust code in the Linux
kernel itself is to use Rust's type safety / memory cleanup / etc. as a
measure against some of the issues that have occurred because of the C
foundation. I'm not arguing for or against, just saying that this
conversation is ongoing. Similar for Go, but ohh...I've got some love
and hate for Go and will spare you that rambling rant! :-D


~Stack~


Re: Rhel 8

2021-01-25 Thread ~Stack~
lly a 
huge fan of terminator.


I don't want to count the number of shell currently open nor the number 
of systems I'm connected to...it might scare me... I use multiple 
desktops to help separate the mess into reasonable bins...and there's 
too many of those... :-D


For people like us, the desktop is more of a way to have more shell 
windows open! :-D


~Stack~


Re: Rhel 8

2021-01-25 Thread ~Stack~

On 1/25/21 5:50 PM, Konstantin Olchanski wrote:

On Mon, Jan 25, 2021 at 11:31:08PM +, Miles ONeal wrote:


| For me, the issues are not policital, but technical:

Agreed. One of mine is that the surety of being able to drop to a lower
runlevel and back up is gone. ...




If you ask me, systemd was designed and built to solve one and only one problem:
booting its author's personal 1-core 300 MHz laptop as fast as possible. Today,
with 4-core 3000 MHz laptops and 16-core 4000 MHz "servers", many features
of systemd look quaint. ("Waiting for USB devices to settle", really?)

Benchmarks reporting that the "old" and "slow" SysV initscripts boot as fast
as systemd tend to support this viewpoint.

Each time I look at the systemd boot sequence trace, I see things like
"waiting 10 sec for disks that are not needed for booting" and
"waiting 10 sec for network not needed for booting". If unlucky, also see
"waiting forever for disk that failed and was removed" (hello, booting from 
degraded btrfs raid array).

How this stuff got into "E" linux and why paying customers put up with this,
is a mystery to me. Perhaps said paying customers "never reboot" and never
see systemd shortcomings (and get no benefit from "systemd fast booting").



As I mentioned before, there's a lot more to systemd than what the user
sees or cares about. Most people don't care about fast boot, and I rarely
reboot my servers. Yet I do rely on a lot of things in systemd (see my
previous email about dealing with stuck NVidia GPUs).


It's not that I don't see systemd shortcomings. It has some. But so did 
SysV and the old init.


Again, it's just a tool. How it is used and if it is used well is up to 
the one who needs to use it. :-)


~Stack~


numfmt issue on SL 7.9; possible bug?

2021-02-10 Thread ~Stack~

Greetings,

Curious if anyone else can replicate this. I initially saw this on a
certain upstream vendor's 7.9, but I'm having issues replicating it
there, and it's only in a certain environment (virtual, and I've done
strange and awful things to that as I've been trying to understand an
unrelated project). However, I was trying to figure out if it showed up
in other places as well. Sure enough, I can replicate it on every single
one of my SL 7.9 instances that I've tested.


The short version: numfmt should not return 'nan' when passed a zero.

$ echo 0 | numfmt
nan
$ rpm -q coreutils
coreutils-8.22-24.el7_9.2.x86_64

If I try on any other distro (Ubuntu/Debian/CentOS 8), it returns 0 as 
it should.


$ echo 0 | numfmt
0
$ rpm -q coreutils
coreutils-8.30-8.el8.x86_64

I may not be able to replicate it as reliably as I would prefer on the
upstream vendor, but every single SL 7.9 system I've tried has had
coreutils-8.22-24.el7_9.2.x86_64 and incorrectly returns 'nan'.


I'm hoping the devs can confirm and/or offer suggestions.

Thanks!
~Stack~


Re: numfmt issue on SL 7.9; possible bug?

2021-02-10 Thread ~Stack~

Greetings,

Thanks for checking. That seems to fit, since every other distro is fine.
But I've now checked a dozen different systems on SL 7.9 and all show it.


Thanks!
~Stack~




On 2/10/21 7:37 PM, Konstantin Olchanski wrote:


here goes, all on physical machines.

sl6, macos: no numfmt, sorry

ubuntu lts 20.04, centos-7, rhel-8 (ahem!):

$ echo 0 | numfmt
0

perhaps your 80387 chip is faulty. (happened to us once
on an R3000/R3010 SGI workstation)

K.O.



On Wed, Feb 10, 2021 at 06:17:26PM -0600, ~Stack~ wrote:

[snip]




Re: numfmt issue on SL 7.9; possible bug?

2021-02-11 Thread ~Stack~

On 2/11/21 2:25 AM, Akemi Yagi wrote:

On Wed, Feb 10, 2021 at 11:59 PM Dietrich, Stefan
 wrote:


Hi,

you might be running into this issue: 
bugzilla.redhat.com/show_bug.cgi?id=1925204

This has been introduced with glibc-2.17-322.el7_9 and has been fixed in 
glibc-2.17-323.el7_9.
CentOS 7 already ships the updated version; on SL7 the updated version
does not seem to be available yet.

Regards,
Stefan


Looks like the glibc-2.17-323.el7_9 update (RHBA-2021:0439) is not a
security fix. SL usually publishes non-security updates on Tuesdays.
Unless the devs decide to make it an exception, the update will be out
on next Tue, Feb 16.

Akemi



Awesome! Thank you both. I think you are absolutely onto something. This
also may explain why I can't replicate it reliably on my upstream
vendor OS. Even though they are all on the same coreutils version, I've
got different versions of glibc running on them (we have a rolling update
cycle for certain environments to help catch upgrade errors). Some are
older and some are newer. The one having the problem is on the same
glibc version.


I didn't even think to check glibc.

I will wait till next Tuesday. Thank you!


Re: FWIW: AlmaLinux now available.

2021-04-04 Thread ~Stack~

Greetings,

On 4/3/21 10:39 AM, Lamar Owen wrote:

In the For What It's Worth department:


AlmaLinux stable release is now available.  I've used the 
almalinux-deploy shell script [snip links]


I'm planning to do the same thing with RockyLinux on another couple of 
CentOS 8 VMs, and I'm planning to do the same with Springdale.  This is 
for migrating already deployed CentOS 8 only; new deployments around 
here are Debian 10, soon to be 11.


I've been testing Alma out since they released their first beta. It 
works pretty much like I would expect RHEL 8.3 to work. Honestly, pretty 
boring and routine when I put it through its paces. Pretty much exactly 
what I expect and hope for out of a RHEL clone. So that gets flying 
colors from me.


I did notice that an application that says it works with RHEL + clones 
had issues with Alma, but I'm pretty certain it was because of the app 
being confused by seeing Alma in /etc/release. Although it didn't 
improve any when I copied over a CentOS release file. :shrug: I've let 
the application vendor know as I think it is more of an issue with them 
than Alma.


The one thing I would be curious about is how good do you find their 
support?


Personally, I've reached out a few times and haven't heard a peep back 
from them. Not via email and not via their Reddit "community". Although 
they just had a discourse forum go up four days ago and I've not tried 
that. I was a bit frustrated at them that I couldn't report back a 
problem I found. Instead I spoke to an acquaintance who is one of their
cloud customers. He reported it and they responded instantly.


OK. Fine. They are a company. Makes sense they would listen and care 
about what their paying customers are saying. However, I found it 
frustrating and rude that I didn't get a single response back from 
Reddit and email when I was attempting to help Alma by reporting a problem.


Am I alone? Has anyone else tried reaching out to Alma and gotten 
responses to problems/issues/bugs?


On that front alone, I'm loving the Rocky community. It is a proper 
community.


Just my 2 cents.
~Stack~


Re: sudo - was Re: FWIW: AlmaLinux now available.

2021-04-11 Thread ~Stack~

> On 2021-04-07 9:28 a.m., Teh, Kenneth M. wrote:
>> If you need to run a lot of commands as root, the easiest sudo method
>> is simply 'sudo su -' which makes you into root.  The trailing '-'
>> does a login which replaces your environment with root's.


On 4/7/21 9:37 AM, Gilbert E. Detillieux wrote:
How is that in any way better than "sudo -i" (which I already suggested, 
and which avoids a needless extra command invocation)?




Greetings,
There's history to those commands, but the end result is dang-near
identical these days. There are some distro-dependent differences to be
found, but the vast majority of the time the two can be thought of as
identical.


On *EL systems (RHEL/SL/Rocky/CentOS/etc.) the end goal is damn near the
same, but there are minor differences. Follow along if you want: open up
two shells side by side, and in one run `sudo -i` and in the other `sudo
su -`.


First up, take a look at the process hierarchy. With `sudo su -`, sudo
launches su as a sub-process and your shell runs under that; with
`sudo -i`, sudo launches the shell directly.


Next up, run the command: `env |sort`. You will see that the `sudo su -` 
stripped out all of the SUDO_* environment variables that `sudo -i` has.
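
If you want to see both differences at a glance, something like this
works (a sketch; exact output varies a bit by system):

$ sudo su -
# ps -o comm= -p $PPID          # prints "su": the shell's parent is su
# env | sort | grep '^SUDO_'    # prints nothing: su wiped the SUDO_* variables
# exit
$ sudo -i
# ps -o comm= -p $PPID          # prints "sudo": the shell runs directly under sudo
# env | sort | grep '^SUDO_'    # SUDO_COMMAND, SUDO_GID, SUDO_UID, SUDO_USER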


Ok, so what?

Well... *shrug*

The short history is in how and which bash resources were loaded. Since
the su is a complete reloading of the profile, it's the same as logging
in as root with all the .profile and .bash_profile and .rc and
blahblahblah files read in. The `sudo -i` (a long time ago in a distro
far far away) used to pull in only a select subset of those profile
files, and there were some cross-environment variables that were kept
around. Some of these details used to matter more for things like what
gets listed as the ID in auditd logs, but I'm pretty confident that all
of those things are similar and easily traced now with the newer audit
logging tools. Thus, I *think* it's now identical...but it's too late at
night for me to dig through audit logs to check! :-D


So what is the difference these days? It's one extra process vs a few
shell environment variables. I think there are a few even _more_ minor
details, but I can't remember them. I have yet to hear a convincing
argument for one over the other except for how many characters are
typed. Since I tend to be old school, my fingers just type `sudo su -`
before my brain fully processes the thought. Yes, `sudo -i` is fewer
characters, but muscle memory...it just happens. *shrug* :-D


Not sure that was "helpful" information, but hopefully it answered the 
question. :-D


~Stack~


Re: Code bias video, watch it ASAP

2021-04-27 Thread ~Stack~

On 4/27/21 8:41 AM, LaToya Anderson wrote:
[snip]
Tell me, how many Black people are in this group? And what practices 
have been put into place to ensure you retain Black people? In other 
words, what has been done within this group to check bias to ensure that
you have a diverse group of people working together to improve this OS?


As someone with minority heritage myself, I say it doesn't matter. This 
isn't the place to re-imagine history and play political victim games. 
Take that to social media. I don't subscribe to this list for political bs.


I *do* subscribe to this list as a source of technical information 
around an OS that I use where I can be a part of a community comprised 
of people all over the world that helps others regardless of what their 
beliefs or political views. The only colors on this list I see are the 
black text on white background with blue links in my email.


There's a right time and right place for your conversation, and it isn't 
in this list.


As I've heard the saying: "the best way to lose friends is to talk 
politics". How about we just stay on list-topic about how much we 
appreciate Scientific Linux?


~S~


Re: c++17,17,20... - was Re: [SL-Users] Re: any update on CERN Linux and CentOS-8 situation?

2021-05-05 Thread ~Stack~

On 5/5/21 5:21 PM, Konstantin Olchanski wrote:

On Tue, May 04, 2021 at 11:00:00PM +0100, Andrew C Aitchison wrote:

On Tue, 4 May 2021, Konstantin Olchanski wrote:

[snip]

(OK, C++20 support in g++ 10.2.1 is "experimental).



And so what?

I can take SL-6 and graft modern versions of all important packages,
one does not even need the devtoolset, GCC is easy to build from sources.

But this is no longer "SL-6", it is "SL-6-KO1", at best.

Same thing, "CentOS-7 with devtoolset, php from webtatic, python from pip, kernel 
from ELREPO, etc" is not CentOS-7.

It is an irreproducible Franken-monster-bashed-together-locally thing.

Is this the new standard, the best way to go, "the new thing" for production 
environments?



I would say, no. The way forward is to use something like
https://spack.io/ for reproducible builds of software. Or better yet,
starting the difficult process of moving user applications into
Singularity containers (https://sylabs.io/). And getting Spack to build
the Singularity images is even better! Both of those are fully Open
Source tools with really good community support and free online training.


Once you can get the user applications into a container, you can 
abstract out the operating system (mostly; still needs to be Linux 
kernel - usually). Since Singularity is designed with HPC in mind, 
performance is fantastic.


We took an app that was built for RHEL 6, built it in a Singularity 
container, and can now run it on any Linux distro. As we move more of 
our user apps into Singularity containers we can start upgrading the OS 
and tools underneath the HPC environment without users ever knowing 
something changed (hopefully they notice the improvements).
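
The recipe for that kind of container is short (a hypothetical
definition file; the package names and run command here are placeholders,
not our actual app):

Bootstrap: docker
From: centos:6

%post
    # install the app and its EL6-era dependencies inside the image
    yum -y install epel-release
    yum -y install myapp-deps

%runscript
    # whatever command launches the app
    exec /opt/myapp/bin/myapp "$@"

Build it once with `sudo singularity build myapp.sif myapp.def` and that
same image runs on SL7, Ubuntu, Rocky, or whatever comes next.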


Not saying that there isn't a learning curve for those creating the 
containers. I'm still not there in understanding it all and the 
container world is huge and varied. But it helped to just stick to 
Singularity and well-established formats until I got my head around it.
And we haven't gotten to the point of letting users do it themselves 
yet. It's still an admin-only creation process. But we are getting there 
and the users don't have a clue how the app is installed/tweaked/tuned - 
they just know it works.


~Stack~


Re: In search for a SL replacement - almalinux

2021-05-06 Thread ~Stack~

On 5/6/21 11:07 AM, Larry Linder wrote:
[snip]

I am going to sign up to do testing for "almaLinux", I would like to see
it succeed.


Feel free to test out Rockylinux.org as well. The first RC just happened 
and there's a lot of good tests and updates going on over there.


I think it's good to have multiple distros taking place of CentOS and 
the healthier both communities are the better for everyone.


~Stack~


Re: In search for a SL replacement - almalinux

2021-05-06 Thread ~Stack~

On 5/6/21 3:21 PM, Yasha Karant wrote:

Excerpt from a previous post on this matter:

On Thu, 2021-05-06 at 02:43 -0400, Nico Kadel-Garcia wrote:
The misfeatures you've groused about are not due to AlmaLinux, they're
straight RHEL problems. Let's assign blame and credit where they are
due.
End excerpt.

Presumably, Rocky Linux (IBM RHEL "clone") does have the same 
"misfeatures"?  Would one of the after-EL repos (ElRepo et al.) be 
willing to produce an alternative set of RPMs to address the 
"misfeatures"?  Would Alma or Rocky or ... ?  No point in mentioning 
Fermilab/CERN -- SL8 will never exist.


Since Rocky is aiming for exact 1:1 compatibility, I would say that
there won't be much deviation there.


However, there are a number of Special Interest Groups forming around
Rocky. I know that when you look at the member list of groups like the
Rocky SIG/HPC, some of the names (and the edu's associated with them) are
quite well known. Even though Rocky _just_ released their first RC, there
is already development underway for HPC support packages for Rocky.
Several of the SIGs are gearing up for development, and it won't surprise
me if several are ready at or near the official Rocky release (soon!).


If there was interest in a SIG/HEP, I have no doubt they'd help carve out
a community. I know you've voiced in the past that you wanted an
educational or commercial entity backing your HEP operating system, and
you won't fully get that with Rocky. But if enough in the HEP community
got together to form a Rocky SIG, it might be easier to address the
concerns you have, with the OS tools already built for you.


Hope that helps.
~Stack~


Re: In search for a SL replacement

2021-05-06 Thread ~Stack~

On 5/6/21 7:02 PM, Nico Kadel-Garcia wrote:

On Thu, May 6, 2021 at 5:10 PM ~Stack~  wrote:


On 5/6/21 3:21 PM, Yasha Karant wrote:

[snip]


Since Rocky is aiming for exact 1:1 compatibility, I would say that
there won't be much deviation there.


Yeah. One big issue right now is anaconda, the installer suite, which
is vulnerable to feature creep at the cost of useful options. It's no
longer possible in the GUI to pick and choose individual packages or
bundles of software, and the lack of a default mirror for network
based installation is just plain stupid.



No argument from me there. I'm not a fan of anaconda. But I rarely use 
that for installing these days anyway. Even my home lab is kickstarted.

Unless you're a weasel like me, you're unlikely to deduce that a working
URL for CentOS installation these days is:

   https://mirrors.edge.kernel.org/centos/8.3.2011/BaseOS/x86_64/os/


Notice the "BasOS", then architecture, then "os", instead of the older:

  https://urldefense.proofpoint.com/v2/url?u=https-3A__mirrors.edge.kernel.org_centos_7.9.2009_os_x86-5F64_&d=DwICaQ&c=gRgGjJ3BkIsb5y6s49QqsA&r=gd8BzeSQcySVxr0gDWSEbN-P-pgDXkdyCtaMqdCgPPdW1cyL5RIpaIYrCn8C5x2A&m=CTZq3-snhnq7A-XxoSrTrqRFjeCB_15gt967vVnnavQ&s=b4D24OBMzTPw_Q5_uK8zk9cZhLt3p42h7s5WWyRMVfA&e= 


Notice that "BaseOS" is now considered more important than the
architecture or the OS, even though the channel only really makes
sense as part of the "os". Every RHEL 8 fork is going to have to deal
with this unwelcome segregation of paritally overlapping channels, all
of which are needed for a pretty basic server grade OS installation.



Again. No argument from me there. It still annoys me too.



I think the theory was to split the load at Red Hat, to reduce the
metadata burden of having a single channel and reduce the load on
their servers. It's not helpful.


Actually, it is because of how they wanted to split out certain modules
for certain groups and activities. It would make sense that if a certain
module had conflicting packages, it would be segregated, and those
that need it would know how to use it. At least in theory. There are
plenty of people who will argue the pros and cons of this modularity. It
used to annoy me, but once I realized how to get what I need, I scripted
it and don't really care that much these days. YMMV.




Anaconda has other issues: the difficulty of scripting and making its
behavior consistent isn't burdensome if you're, well, *me* and you can
handwrite ks.cfg files. But the kickstart GUI is pretty bad, and has
no options to include multiple '%pre' or '%post' scripts; it erases
multiple such scripts that may be read from a reference ks.cfg.



Creating kickstarts isn't that bad. Once you do an install with the GUI,
you are given the /root/anaconda-ks.cfg file, which serves as a base.
And I know it is dry reading, especially for those that aren't SysAdmin
Nerds, but RH does provide good documentation for it:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/performing_an_advanced_rhel_installation/index


I've never used a gui for the kickstarts so I can't comment there.

~Stack~


Re: In search for a SL replacement

2021-05-06 Thread ~Stack~

On 5/6/21 7:43 PM, Yasha Karant wrote:
From the Proofpoint URL (that Mozilla Thunderbird Bluhell reports as a
"click through"), which translates to:

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/performing_an_advanced_rhel_installation/index#bht.cd562e2a-c743-442e-afea-4a0a644b567a.7


Converting a RHEL 7 Kickstart file for RHEL 8 installation

You can use the Kickstart Converter tool to convert a RHEL 7 Kickstart 
file for use in a new RHEL 8 installation. For more information about 
the tool and how to use it to convert a RHEL 7 Kickstart file, see 
https://access.redhat.com/labs/kickstartconvert/


End excerpt.

Presumably, those who have a working SL 7 could use the above technique 
to create the Kickstart install for UDEL8 (UD -- Unspecified 
Distribution).  Has anyone tried it?  Does it work?


I have taken SL7 kickstarts, modified them, and then used them with
minimal changes on Rocky, Alma, and CentOS. In fact, on my test system I
switch between all three with a single kickstart, just by commenting out
the `url` line I don't want and uncommenting the `url` line that I do.
It's that easy. I get a consistent build with any of the three OSes
every time.
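
The relevant chunk of the kickstart looks something like this (a sketch;
the mirror URLs are illustrative placeholders, not my exact ones):

# pick the distro by flipping which `url` line is uncommented
url --url="https://dl.rockylinux.org/pub/rocky/8/BaseOS/x86_64/os/"
#url --url="https://repo.almalinux.org/almalinux/8/BaseOS/x86_64/os/"
#url --url="http://mirror.centos.org/centos/8/BaseOS/x86_64/os/"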


~Stack~


Re: Rocky Linux 8.4 General Availability

2021-06-24 Thread ~Stack~

Thanks Dave!

It's been a bit busy for me recently and I've fallen behind on a lot of 
my emails. It seems most of the questions have been addressed, but if 
anyone has Rocky questions, feel free to reach out.


I was doing a lot of testing on Rocky and officially became a part of 
the testing team a few weeks back. I don't have all the answers, but I'm 
happy to try.


Also, I don't want to flood the Scientific Linux list with Rocky related 
conversation. Feel free to reach out to me directly but we do have 
forums and a Mattermost chat where the community is growing.


https://forums.rockylinux.org/
https://chat.rockylinux.org/


Thanks!
~Stack~


On 6/22/21 10:45 AM, Dave Dykstra wrote:
https://forums.rockylinux.org/t/rocky-linux-8-4-available-now


Dave



Re: timeshift

2021-08-09 Thread ~Stack~

On 8/9/21 1:47 AM, Nico Kadel-Garcia wrote:

"rsnapshot". Old, stable, and extremely effective at configuring backs
of both system files and user data.


I second rsnapshot. Been using it for years, and it is easy to set up
yearly/monthly/daily/hourly backups and how many of each you want to
keep. Since it uses hard links, only the data that changed takes up space.
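
The relevant bits of /etc/rsnapshot.conf are only a few lines (a sketch;
the retention names, counts, and paths are examples, and note the real
file is tab-delimited):

snapshot_root   /backup/snapshots/
# keep 6 hourly, 7 daily, and 12 monthly rotations
retain  hourly  6
retain  daily   7
retain  monthly 12
# what to back up, and the subdirectory to store it under
backup  /home/  localhost/
backup  /etc/   localhost/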


Since I'm backing up a LOT of systems, I've got a dedicated server. But 
I've used it before on just a single laptop with an external drive.


The two cautions I'll give are:
* Have an off-site backup too. I have two external drives that I rotate 
weekly to a secure location (it can be your house) that just has the 
most current backup. The way I do it, if I lose /everything/ else then 
worst case scenario I still have my data as of two weeks ago. I have 
lived through a catastrophic failure and I did so with very little data 
loss.


* Backups can be very challenging. The more options you want, the more
devices, the more OSes, and the more things you want to tweak, the more
complex the backups become. Pretty soon you find the only things that
match your requirements are enterprise solutions like Bacula. Rsnapshot
is simple and has several things you can tweak, but don't expect a lot
of bells and whistles beyond the basics. I've found that's true of most
of the simple backup interfaces.


Good luck!
~Stack~


Re: timeshift

2021-08-09 Thread ~Stack~

On 8/9/21 10:48 AM, Yasha Karant wrote:
She wants an incremental backup system that uses a removable external 
drive, and that she can initiate (not time interval daemon driven), and 
that allows her to "find" a deleted file that she needs -- but for which 
she looks both by the file name, but also by scanning content when 
necessary (including viewing an image file such as JPEG or a video file 
such as MP4).


Rsnapshot allows you to run it manually whenever you want.
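
(For example, running `rsnapshot daily` by hand kicks off the daily
rotation on demand, assuming a retain/interval named "daily" in the
config.)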

As for finding files: the snapshots are plain directory trees, so you can
use any utility you want to look through the filesystem.


~Stack~


Re: customize bash promt to full path

2021-08-12 Thread ~Stack~

On 8/12/21 5:58 PM, Ekkard Gerlach wrote:

Hello,

how can I customize the bash/ terminal promt  always

[root@arthur Desktop]#

to

[root@arthur /home/user41/Desktop]#

or root@arthur:/home/user41/Desktop#

?



Try editing your ~/.bashrc

The default EL prompt uses \W, which only shows the last directory:

 PS1='[\u@\h \W]\$ '

Change the uppercase \W to a lowercase \w to get the full path:

 PS1='[\u@\h \w]\$ '

Or, for the root@arthur:/home/user41/Desktop# style:

 PS1='\u@\h:\w\$ '

That should do it or at least get you close enough that a web search can
get you the rest of the way.


Hope that helps.
~Stack~


Re: google-chrome

2021-10-23 Thread ~Stack~

On 10/23/21 11:11 AM, Götz Waschk wrote:

Am 22.10.21 um 16:40 schrieb Stephen Isard:

For the past couple of days, I've been getting

--
/etc/cron.daily/0yum-daily.cron:

Failed to check for updates with the following error message:
Failed to build transaction: google-chrome-stable-95.0.4638.54-1.x86_64
requires libc.so.6(GLIBC_2.18)(64bit)
--

Disabling the google-chrome repo makes the error message go away, of 
course,
and lets check-update proceed.  But is this the end of the road for 
chrome

updates on SL7, or is there some reasonably straightforward work-around?

Stephen Isard


Hi Stephen,

I have seen this. The solution was rpm -e google-chrome-stable . You 
could try to run Google Chrome in an EL8 singularity container, e.g. 
based on CentOS8, AlmaLinux8 or Rocky Linux 8.


I'm going to second Götz. Officially, Chrome isn't supported on RHEL;
they say Fedora 24+...

https://support.google.com/chrome/answer/95346?hl=en

We /MUCH/ prefer Firefox, but we ran into an issue where we needed Chrome
on an SL7 system for a specific workload, and it seemed like it broke
with every other update. It was a pain trying to manage Chrome. We
finally just started using containers for Chrome until the end of the
project, and then we went back to Firefox for everything.


I too recommend Singularity for containerization, but there are other 
containers out there if you look.


~Stack~


Re: Fermilab/CERN recommendation for Linux distribution

2021-10-26 Thread ~Stack~

On 10/25/21 9:27 PM, Patrick J. LoPresti wrote:
The right decision is to restart Scientific Linux. Obviously that is not 
going to happen, which leaves organizations like mine in a bind. I am 
not sure what we will do, but CentOS Stream is definitely not it.





The restart is called https://rockylinux.org/ ;-)

There's a big (and growing fast) group of HPC and scientific computing 
professionals using Rocky 8 already.


~Stack~


Re: Fermilab/CERN recommendation for Linux distribution

2021-10-26 Thread ~Stack~

On 10/26/21 8:21 AM, Mark Stodola wrote:
I haven't checked in on this in quite some time.  Has there been a clear 
preference to Rocky over Alma thus far?  I know I intend to use one or 
the other, but would be curious to know the direction of the wind among 
the scientific community.



First, I want to make it clear that I'm heavily involved with Rocky and 
thus biased. :-)


I don't think there is a "winner". I think it is great we have two 
options replacing CentOS. I've got nothing against Alma. However, I 
don't know what kind of HPC/Scientific community it has so I can't 
comment there.


But I do know that Rocky already has a lot of strong support in the 
scientific and HPC world and it's growing. I know of some /really/ big 
systems running Rocky already with more planned. I know a few have some 
announcements planned for SuperComputing this year and I don't want to 
steal their thunder. :-)
If you are interested in more information, you can create an account at 
chat.rockylinux.org and then add the SIG/HPC channel to chat with a lot 
of people in the Rocky HPC community.


Happy computing!
~Stack~


Re: Repo for updates to an old SL

2021-12-19 Thread ~Stack~

Greetings,

On 12/19/21 2:34 PM, Elio Fabri wrote:

Hi all,
I'm still using SL 6.2 (waiting for the fog around the future of SL,
CentOS... to dissolve).
I need to update Firefox. A version for RHEL6 systems exists
(firefox-91.4.1esr), but yum finds no repo for that package.

Some help? Thx



A few things:
1. If you need packages out of CentOS 6, you need to go to their vault:
https://vault.centos.org/6.10/

(For SL6 packages:
http://ftp1.scientificlinux.org/linux/scientific/obsolete/6.10/ )


2. I really doubt that version of Firefox will be in there. That ESR tag
indicates the Extended Support Release. If RH patched it for customers
paying for extended support, I doubt it's been maintained by the CentOS
community, though you might be able to ask RH for it (or for the source
to build yourself). EL 6.10 is old, but 6.2? That's REALLY old and REALLY
vulnerable to security concerns. Please plan an update sooner rather
than later.


3. The Rocky community has welcomed quite a few people from the SL 
community over. We'd love to have anyone that wants to join.


Mailing list: https://lists.resf.org/archives/
Forum: https://forums.rockylinux.org/
Mattermost chat: https://chat.rockylinux.org/



~Stack~