Re: rsyslogd fails to write after logrotate and grub2 hangs after a restart

2012-08-16 Thread Alvin
On 5/08/2012 7:11, Scott Kitterman wrote:
> 
> Jim Tarvid  wrote:
> 
>> These two bugs are destroying my equanimity.  I wrote a cron kludge to
>> work
>> around the first but I have to drive to the colo to restart my system.
>>
>> Needless to say I am not a happy Ubuntu Server user
>>
> Not everyone is having these problems. Your complaints are far too generic for 
> anyone to take action on.  For example, I can assure you that logging works 
> just fine after logrotate here.  We need some specifics.

Yes, add at least the bug numbers.

I believe you might be talking about bug 407862
https://bugs.launchpad.net/bugs/407862
"Messages not being sent to system logs"
It's a rather old bug, and yes, I see it too on all 10.04 LTS servers.

-- 
ubuntu-server mailing list
ubuntu-server@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server
More info: https://wiki.ubuntu.com/ServerTeam


Re: Kernel Panics Ubuntu Server 11.04 and Samba

2011-05-25 Thread Alvin
On Wednesday 25 May 2011 00:23:22 zongo saiba wrote:
> Hi  Guys,
> 
> I have been running Ubuntu Server 11.04 for a couple of weeks with a Samba
> server and it is running great. The other day a client win7 tried to do a
> backup to the allocated Samba share and the server kernel-panicked on me.
> 
> Since then, every single time a Win7 client tries to do a backup on an
> allocated share over the Samba protocol, the server kernel-panics and I
> have to reboot it.
> 
> The only reason I found in the logs for the KP is: "kernel panic - not
> syncing: fatal exception in interrupt".
> 
> I immediately tested my RAM with memtest86 and all is OK.
> 
> I mounted two external USB drives on the shares and tried to run the
> backups on the USB drives, and the issue is identical. The server KPs every
> single time.
> 
> I then ran HDD tests on both external USB drives and the internal drive of
> the server and all is OK.
> 
> I have searched google for an answer but to  no avail. All the posts
> referred back to a potential hardware issue on the server. Hardware wise,
> my server has no issues.
> 
> Any help is much appreciated on sorting the issue as I have no idea on
> where to look for a solution.

I've given up on using Ubuntu as a file server solution. It can't handle high 
disk I/O. I'd still like to know where the problem lies though.

Have you only been testing with Samba? You could try installing rsync 
(DeltaCopy) on Windows and copying a large amount of data over to the Ubuntu 
server[1]. Make sure you can copy at high speed.
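From a Linux client, a quick way to generate test data and watch the transfer rate would be something like this (host name and destination path are made-up examples):

```shell
# Create ~1 GB of incompressible test data.
dd if=/dev/urandom of=/tmp/testfile bs=1M count=1024

# Push it to the server and watch the throughput reported by --progress.
rsync --progress /tmp/testfile user@ubuntu-server:/srv/share/
```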

Do you have more information about your system?
- What's the memory usage during a copy?
- Do you use LVM? Do you have snapshots? (Snapshots would explain the issue.)
- Are there any more messages in /var/log/kern.log?

[1] https://bugs.launchpad.net/bugs/750359
Kernel panics under load



Re: qcow2 snapshotting: does it influence future IO performance?

2011-03-31 Thread Alvin

On Thursday 31 March 2011 15:15:35 jurgen.depic...@let.be wrote:

> Hi Alvin.
> 
> I read on your site that you stopped work on a project because LVM
> snapshots potentially decreased IO performance, and I also saw that you
> worked on a Perl project to make copies of VMs, so therefore I address my
> question directly to you.
> 
> Your work reminded me of something I read some weeks ago on qcow2 on
> http://people.gnome.org/~markmc/qcow-image-format.html and which raised my
> brows at the time:
> "Snapshots - "real snapshots" - are represented in the original image
> itself. Each snapshot is a read-only record of the image at a past instant.
> The original image remains writable and as modifications are made to it, a
> copy of the original data is made for any snapshots referring to it."
> 
> I tried to find more info about this, but haven't yet.  So here's my
> question: does anybody know whether having made snapshots of a VM during
> several stages of its life (clean install, important service installed,
> ...) affects the IO performance of any writes made after snapshotting, as
> one could suspect from my quote above?


There is a difference between qcow2 snapshots and LVM snapshots. You 
probably want qcow2 snapshots in a typical testing scenario where you 
start from a good working virtual machine, snapshot it, break it, and 
then return to your snapshotted situation.
I haven't done that yet. Last time I tried, the snapshotting itself hung 
libvirt, but that was a long time ago. Things might be better now. I 
have no idea about the resulting performance. They are both 
'copy-on-write', so they might suffer from the same problems. I don't 
know if that's true, but it would be interesting to do some research here.


The performance drop caused by LVM snapshots is the reason why I 
abandoned the project. I originally took the idea from our ZFS setup at 
work. We're using /a lot/ of ZFS snapshots. It's a success story. If I 
ever took that away, the users would be at my door with torches 
and pitchforks. It's the single reason to still have dealings with 
Oracle support. So, I started using LVM snapshots at regular intervals. 
The script I wrote was for easier management and auto-mounting of the 
snapshots. It ended in disaster and halted production on the Ubuntu 
servers. The reason is that ZFS snapshots and LVM snapshots are too 
different. The ZFS FAQ states that you need about 1 GB of RAM for every 
10,000 filesystems. That includes snapshots, so that gives you an idea 
of the possibilities (and the different approach to filesystems). Even a 
single LVM snapshot can bring an Ubuntu server to its knees and can 
lead to a kernel panic given the right (in a manner of speaking) 
circumstances.


It seemed like a good idea at the time, but it will set fire to your daemons, 
hang your kernel, and send your users screaming.


I did file a little bug report[1]. Originally, I thought it was 
qemu-img, but qemu-img is just a good I/O stress test. You can also use 
rsync or simply cp to hang the kernel.


What I currently do in the above testing scenario is copying the 
(offline) virtual machine, but I might try the qcow2 snapshots one day. 
Currently this is not possible because my test VMs run on logical 
volumes on RAID0. It speeds up the virtual machines themselves, but the 
copying process is heavy and slow.


Maybe btrfs will be the answer in the near future?

[1] https://bugs.launchpad.net/bugs/712392



Re: [Oneiric-Topic] Server Boot

2011-03-30 Thread Alvin
On Wednesday 30 March 2011 14:52:14 Serge E. Hallyn wrote:
> Quoting Scott Kitterman (ubu...@kitterman.com):
> > There was a lot of discussion around improving the server boot experience
> > before the UDS-M.  A number of people expressed interest in seeing more
> > useful diagnostic information during boot.  Others expressed concerns
> > with boot reliability on the more complex hardware typically found in
> > servers.
> > 
> > How are we doing on this?  Personally, I can't remember the last time I
> > rebooted a server and it wasn't via SSH and the hardware I use is the
> > sort there were problems with.  Are these still issues for the Ubuntu
> > Server community?
> > 
> > Scott K
> 
> I think right now these issues are overshadowed by the fact that a
> great deal of server software is not yet upstartified.  I think that
> needs to be addressed for O.

Yes, they are certainly still issues (and the primary reason the company I 
work for is abandoning Ubuntu.)

I agree that a lot of servers are not often rebooted, but not every server is 
a webserver. Some are used only during certain hours and can be booted 
automatically (BIOS or WOL) when needed in order to keep the electricity bill 
down. Booting should be a reliable and automated process. Accurate logging is 
important in order to know what went wrong in case the unthinkable happens.

The current boot.log looks like:
> mount.nfs: DNS resolution failed for 192.168.xxx.3: Name or service not known
> mount.nfs4: Failed to resolve server exampleserver: Name or service not known
> mountall: mount /srv/example [1134] terminated with status 32
> mount error(101): Network is unreachable
while in reality the filesystems are mounted. Now, when something goes wrong, 
the log is identical. Conclusion: boot.log is useless. (Actually, the log is 
probably correct; it simply can't resolve server names at that specific time.)
Proper boot logging would be popular[1].

Take the following example of a server boot. Let's also assume that nothing 
goes wrong that could lead to a busybox console. (It certainly can![2][3])
So, you're now sitting in front of a nice prompt. Everything looks OK, but is 
it? The server mounts NFS shares from another server, it runs KVM/libvirt with 
a netfs storage pool for its virtual machines, and a quasselcore for IRC that 
stores its data in a PostgreSQL database on another server. The local 
filesystem uses mdadm for RAID1 and LVM on top of that. Very server-like. (I 
once made this setup to test some things.) In order to keep things under 
control, there are /no/ LVM snapshots. That is another ugly story.

So, what happens now:
- The RAID will be broken! [4][5]
- The NFS shares in /etc/fstab might not be mounted, [6][7]
  even when you told the system to wait with _netdev. [8]
- Your virtual machines on netfs will not be running. [9]
- The quasselcore with external db will not be started. [10]

The array can be assembled by running a command and all of the above daemons 
can be started manually.

I talked about some of those topics on IRC, and the following workarounds came 
up. There are also some workarounds in the bug reports.
- Put NFS shares in /etc/fstab, and don't configure them as netfs storage 
pools.
- Put the IP addresses of your NFS servers in /etc/hosts.
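Sketched out, those two workarounds might look like this (server name, address, 
and export path are only examples):

```text
# /etc/hosts -- pin the NFS server's address so boot doesn't depend on DNS:
192.168.1.3   nfsserver

# /etc/fstab -- mount the share directly instead of via a libvirt netfs pool:
nfsserver:/export/vms  /srv/vms  nfs  _netdev,rw  0  0
```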

For most servers, speeding up the boot process is less important than 
reliability. Why not take a look at how Debian does it? You can disable 
running the boot scripts in parallel with 'CONCURRENCY=none' in 
/etc/default/rcS.
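That is a one-line change (assuming the usual layout of the file):

```text
# /etc/default/rcS (Debian)
# Run the boot scripts sequentially instead of in parallel:
CONCURRENCY=none
```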

Also, think about daemons of commercial software without upstart scripts. You 
never know whether they will start at boot or not.

Links:
[1] "init: support logging of job output"
https://bugs.launchpad.net/bugs/328881

[2] "Gave up waiting for root device after upgrade then busybox console"
https://bugs.launchpad.net/bugs/360378

[3] "karmic rc: root device sometimes not found"
https://bugs.launchpad.net/bugs/460914

[4] mdadm cannot assemble array as cannot open drive with O_EXCL
https://bugs.launchpad.net/bugs/27037

[5] "mdadm cannot assemble array"
https://bugs.launchpad.net/bugs/599135

[6] "nfs mounts specified in fstab is not mounted on boot."
https://bugs.launchpad.net/bugs/275451

[7] "nfs shares are not automounted anymore in intrepid"
https://bugs.launchpad.net/bugs/285013

[8] "_netdev not working"
https://bugs.launchpad.net/bugs/384347

[9] "Libvirt NFS mount on boot."
https://bugs.launchpad.net/bugs/351307

[10] "quasselcore does not connect to database at boot"
 https://bugs.launchpad.net/bugs/612729

-- 
Alvin



Re: run our business's mail server (W2003-DOMINO) as a KVM

2011-03-10 Thread Alvin
On Wednesday 09 March 2011 14:00:23 Serge E. Hallyn wrote:
> Quoting jurgen.depic...@let.be (jurgen.depic...@let.be):
> > Hi all,
> > 
> > I posted a small question on #ubuntu-server: I noticed that ubuntu
> > 10.04.2 LTS has quite an old libvirt version (0.7.5), which lacks
> > many functions, like managedsave or snapshotting.  What's the
> > recommended way to get the latest version there?
> 
> What exactly is your goal?  If you just want some data snapshotted,
> I would think the safest way would be to put that data on a separate
> LVM partition on the host, passed in as a virtio device, and use
> LVM snapshots to back up the data.

Be very careful with that. In theory it should work, but when you back up the 
snapshot, the load can bring down your server. Running virtual machines in 
particular will suffer.

That reminds me I still have to add this information to bug 712392 [1]... 
done.

Links:
[1] https://bugs.launchpad.net/bugs/712392

-- 
Alvin



Re: recovery of virtual machines on KVM

2011-02-10 Thread Alvin
On Wednesday 09 February 2011 19:39:57 Tapas Mishra wrote:
> I  am having a virtualization setup via KVM on a Ubuntu 10.04 64 bit
> server.
> 
> A recent dbus update caused a crash of my host OS. It was a post-install
> script of dbus which ultimately brought everything down.
> 
> Now I have to basically format the host OS. My cause of concern is the
> virtual machines which were running on it when the environment was
> stable, which were in separate LVM partitions.
> 
> Some thing like
> 
> /dev/virtualization/vm1
> /dev/virtualization/vm2
> /dev/virtualization/vm3
> /dev/virtualization/vm4
> If someone has experienced a recovery of this sort in the past, let me know
> what they did to get things back. All my virtual machines were on
> separate partitions in the same volume group, and this volume group was on
> the host OS. Will formatting the host OS clear the virtual machines too in
> my situation, or will I be able to get them back just by reinstalling the
> host and importing the virtual machines via a tool such as virt-manager?

Before your reinstall, dump the configuration of your virtual machines like 
this:
$ virsh dumpxml vm1 > vm1.xml
After reinstall, redefine the machines
$ virsh define vm1.xml

It's that simple, but of course you have to leave your LVM volumes in place 
and make sure your host has the same network interfaces (as defined in the 
XML).
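With several machines, the dump can be scripted. A sketch (it assumes a libvirt 
recent enough to have 'virsh list --name', and VM names without spaces):

```shell
# Dump the XML definition of every defined VM into a backup directory.
mkdir -p /backup/vm-xml
for vm in $(virsh list --all --name); do
    virsh dumpxml "$vm" > "/backup/vm-xml/$vm.xml"
done

# After the reinstall, redefine them all:
for f in /backup/vm-xml/*.xml; do
    virsh define "$f"
done
```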



Re: Server load too high when using qemu-img

2011-02-03 Thread Alvin
On Thursday 03 February 2011 15:01:37 Serge E. Hallyn wrote:
> Quoting Alvin (i...@alvin.be):
> > I have long standing performance problems on Lucid when handling large
> > files.
> > 
> > I notice this on several servers, but here is a detailed example of a
> > scenario I encountered yesterday.
> > 
> > Server (stilgar) is a Quad-core with 8 GB ram. The server has 3 disks. 1
> > Disk contains the operating system. The other two are mdadm RAID0 with
> > LVM. I need to recreate the RAID manually[1] on most boots, but
> > otherwise it is working fine.
> > (Before there are any heart attacks from reading 'raid0': the data on it
> > is NOT important, and only meant for testing.)
> > The server runs 4 virtual machines (KVM).
> > - 2 Lucid servers on qcow, residing on the local (non-raid) disk.
> > - 1 Lucid server on a fstab mounted NFS4 share.
> > - 1 Windows desktop on a logical volume.
> > 
> > I have an NFS mounted backup disk. When I restore the Windows image from
> > the backup (60GB), I encounter bug 658131[2]. All running virtual
> > machines will start showing errors like in bug 522014[3] in their logs
> > (hung_task_timeout_secs) and services on them will no longer be
> > reachable. The load on the server can climb to >30. Libvirt will no
> > longer be able to
> 
> Is it possible for you to use CIFS instead of NFS?
> 
> It's been a few years, but when I had my NAS at home I found CIFS far more
> stable and reliable than NFS.

Yes. I know NFS is somewhat neglected in Ubuntu, but why use MS Windows file 
sharing between Linux machines? That makes no sense. NFS is easier to set up. 
In short: I could try CIFS, but in order to exclude the network share from 
this issue I copied the image file locally first. It is true that NFS (maybe 
CIFS too) has an impact on this. The load gets even higher when using it.

> > shutdown the virtual machines. Nothing else can be done than a reboot of
> > the whole machine.
> > 
> > From the bug report, it looks like this might be NFS related, but I'm not
> > convinced. If I copy the image first and then restore it, the load also
> > climbs insanely high and the virtual machines will be on the verge of
> crashing. Services will be temporarily unavailable.
> 
> (Not trying to be critical)  What do you expect to happen?  I.e what do you
> think is the bug there?  Is it that ionice seems to be insufficient?  I'm
> asking in particular about the conversion by itself, not the copy, as I
> agree the copy pinning CPU must be a (kernel) bug.

Well, I expect a performance hit, but no hung tasks. Especially when using 
ionice.

> > The software used is qemu-img or dd. In all cases I'm running the
> > commands with 'ionice -c 3'.
> > 
> > This is only an example. Any high IO (e.g. rsync with large files) can
> > crash Lucid servers,
> 
> Over NFS, or any rsync?

Both. In the example, NFS/rsync was not used. I only mentioned that because 
I've had the same trouble when using them on other servers.

> For that matter, rsync tries to be smart and slice and dice the file to
> minimize network traffic.  What about a simple ftp/scp?
> 
> > but what should I do? Sometimes it is necessary to copy large
> > files. That should be something that can be done without taking down the
> > entire server. Any thoughts on the matter?
> 
> It might be worth testing other IO schedulers.
> 
> It also might be worth testing a more current kernel.  The kernel team
> does produce backports of newer kernels to lucid which, while surely not
> officially supported, should work and may fix these issues.

I might try those. I see you found my new bug report[1]. You're on to 
something there! I didn't remove a USB drive, but there are similar problems 
I hadn't linked to this before:
- mdadm does not auto-assemble [2]
- I have an LVM snapshot present on that system! Even worse, the snapshot is 
100% full and thus corrupt.

Now, I didn't think of the snapshot. The presence of an LVM snapshot is a huge 
IO performance hit, so that explains the extreme load. In my example I was 
reading the raw image from its parent volume.

Because of your comment I also found a blog post[3] about the issue:
"Non-existent Device Mapper Volumes Causing I/O Errors?"

So, I will first contact all users and find a moment to take the server 
offline for some testing. Then I'll post my findings in the bug report.

Thanks for the tips.

Links:
[1] https://bugs.launchpad.net/bugs/712392
[2] https://bugs.launchpad.net/bugs/27037
[3] http://slated.org/device_mapper_weirdness

-- 
Alvin



Server load too high when using qemu-img

2011-02-01 Thread Alvin
I have long-standing performance problems on Lucid when handling large files.

I notice this on several servers, but here is a detailed example of a scenario 
I encountered yesterday.

The server (stilgar) is a quad-core with 8 GB RAM. The server has 3 disks. One 
disk contains the operating system. The other two are mdadm RAID0 with LVM. I 
need to recreate the RAID manually[1] on most boots, but otherwise it is 
working fine.
(Before there are any heart attacks from reading 'RAID0': the data on it is 
NOT important, and only meant for testing.)
The server runs 4 virtual machines (KVM).
- 2 Lucid servers on qcow, residing on the local (non-raid) disk.
- 1 Lucid server on a fstab mounted NFS4 share.
- 1 Windows desktop on a logical volume.

I have an NFS mounted backup disk. When I restore the Windows image from the 
backup (60GB), I encounter bug 658131[2]. All running virtual machines will 
start showing errors like in bug 522014[3] in their logs 
(hung_task_timeout_secs) and services on them will no longer be reachable. The 
load on the server can climb to >30. Libvirt will no longer be able to 
shut down the virtual machines. Nothing can be done other than rebooting the 
whole machine.

From the bug report, it looks like this might be NFS related, but I'm not 
convinced. If I copy the image first and then restore it, the load also climbs 
insanely high and the virtual machines will be on the verge of crashing. 
Services will be temporarily unavailable.

The software used is qemu-img or dd. In all cases I'm running the commands 
with 'ionice -c 3'.
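For reference, the invocations look like this (the image path and device name 
are made-up examples; the idle class should only get disk time when no other 
process needs it, at least with the CFQ scheduler):

```shell
# Restore the image with qemu-img in the idle I/O scheduling class.
ionice -c 3 qemu-img convert -f qcow2 -O raw /backup/windows.img /dev/vg0/windows

# The same idea with a raw copy via dd:
ionice -c 3 dd if=/backup/windows.img of=/dev/vg0/windows bs=4M
```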

This is only an example. Any high IO (e.g. rsync with large files) can crash 
Lucid servers, but what should I do? Sometimes it is necessary to copy large 
files. That should be something that can be done without taking down the 
entire server. Any thoughts on the matter?

Links:
[1] https://bugs.launchpad.net/bugs/27037
[2] https://bugs.launchpad.net/bugs/658131
[3] https://bugs.launchpad.net/bugs/522014

Example from /var/log/messages (kernel) on the server:

kvm   D  0  9632  1 0x
 8801a4269ca8 0086 00015bc0 00015bc0
 8802004fdf38 8801a4269fd8 00015bc0 8802004fdb80
 00015bc0 8801a4269fd8 00015bc0 8802004fdf38
Call Trace:
 [] __mutex_lock_slowpath+0x107/0x190
 [] mutex_lock+0x23/0x50
 [] generic_file_aio_write+0x59/0xe0
 [] ext4_file_write+0x39/0xb0
 [] do_sync_write+0xfa/0x140
 [] ? autoremove_wake_function+0x0/0x40
 [] ? security_file_permission+0x16/0x20
 [] vfs_write+0xb8/0x1a0
 [] sys_pwrite64+0x82/0xa0
 [] system_call_fastpath+0x16/0x1b
kdmflush  D 0002 0   396  2 0x
 88022eeb3d10 0046 00015bc0 00015bc0
 88022f489a98 88022eeb3fd8 00015bc0 88022f4896e0
 00015bc0 88022eeb3fd8 00015bc0 88022f489a98
Call Trace:
 [] io_schedule+0x47/0x70
 [] dm_wait_for_completion+0xa3/0x160
 [] ? default_wake_function+0x0/0x20
 [] ? __split_and_process_bio+0x127/0x190
 [] dm_flush+0x2a/0x70
 [] dm_wq_work+0x4c/0x1c0
 [] ? dm_wq_work+0x0/0x1c0
 [] run_workqueue+0xc7/0x1a0
 [] worker_thread+0xa3/0x110
 [] ? autoremove_wake_function+0x0/0x40
 [] ? worker_thread+0x0/0x110
 [] kthread+0x96/0xa0
 [] child_rip+0xa/0x20
 [] ? kthread+0x0/0xa0
 [] ? child_rip+0x0/0x20



Re: Hardware Raid Controller Card.

2010-12-29 Thread Alvin
On Wednesday 29 December 2010 13:27:19 Jussi Jaurola wrote:
> > Hi
> > 
> > I have loaded Ubuntu 10.04 LTS server on HP DL 360 G6.
> > Can someone please recommend good Hardware RAID Controller Card for this
> > specific HP Server.
> > 
> > Thanks
> > 
> > Kaushal
> 
> 3ware has several great-performing and well-working RAID cards; I would
> recommend them.

I wouldn't. The cards are well supported, but the performance is really bad.



Re: taking lvm backup

2010-09-28 Thread Alvin
On Tuesday 28 September 2010 10:45:58 Tapas Mishra wrote:
> On Tue, Sep 28, 2010 at 11:11 AM, Clint Byrum  wrote:
> > mount /dev/nintendo/lvm4 /mnt -o ro
> > 
> > If it has a filesystem on it, that should detect it, and mount it
> 
> I get following message when I do as above that you said.
> mount: you must specify the filesystem type

If they are used for virtual machines, the LVM volumes are probably 
partitioned. You can still mount a partition inside them through a loop device 
(with the right offset, or via kpartx), but it's not recommended.

You have to shut down the virtual OS in order to take a safe backup.

One way to take a backup is to convert the LVM volumes to compressed qcow2 
images, like this:
# qemu-img convert -c -f raw -O qcow2 /dev/nintendo/lvm1 /backup/lvm1.img
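The whole cycle could then look like this (the VM name is just an example; 
virsh applies only if the guests run under libvirt):

```shell
# Shut the guest down so the filesystem inside the volume is consistent.
virsh shutdown vm1

# Convert the raw LVM volume to a compressed qcow2 backup image.
qemu-img convert -c -f raw -O qcow2 /dev/nintendo/lvm1 /backup/lvm1.img

# Bring the guest back up.
virsh start vm1

# To restore later, convert the backup back onto the volume:
# qemu-img convert -f qcow2 -O raw /backup/lvm1.img /dev/nintendo/lvm1
```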



double information in motd

2010-07-14 Thread Alvin
If I log in remotely, some servers show double statistics for apt, like this:

0 packages can be updated.
0 updates are security updates.

0 packages can be updated.
0 updates are security updates.

/var/run/motd contains the duplicate information.
/usr/lib/update-notifier/update-motd-updates-available contains the command 
that produces this output (/usr/lib/update-notifier/apt-check 
--human-readable).

Any idea why this output appears twice?

-- 
Alvin



Lucid: LVM snapshots on RAID1

2010-04-28 Thread Alvin
Dustin Kirkland kindly requested that I not hijack his thread for this 
purpose, so this is a new one. I hereby apologize.

I'm looking for people to confirm https://launchpad.net/bugs/563895
What you need is a system running Lucid on mdadm RAID1 with 1 LVM volume 
group, spanning the entire md.
Then create a snapshot and reboot. If I'm correct, you will land in a grub 
rescue shell. In that case, please set the above bug to confirmed.

If it is confirmed, this should make it to the release notes.
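For reference, the steps are something like this (volume group and names are 
examples; do not try this on a machine you care about):

```shell
# WARNING: on an affected system this is expected to leave you in a
# grub rescue shell on the next boot.
lvcreate --snapshot --size 1G --name testsnap /dev/vg0/root

# reboot            # uncomment to test; afterwards remove the snapshot with:
# lvremove /dev/vg0/testsnap
```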



Re: Lucid Root on RAID1

2010-04-27 Thread Alvin
On Monday 26 April 2010 16:22:13 Dustin Kirkland wrote:
> On Mon, Apr 26, 2010 at 8:50 AM, Imre Gergely  wrote:
> > I've tried various things but couldn't reproduce the problem... See my
> > comments in the bugreport.
> 
> Thanks, Imre.
> 
> Please, anyone else looking at this, comment in the bug report:
>  * https://bugs.edge.launchpad.net/ubuntu/+source/mdadm/+bug/569900
> 
> At this point, it appears to be related to the geometry of the disks.
> Still investigating.

This also worksforme(tm), but I'll misuse this thread to ask the people with 
the same setup if they want to take an LVM snapshot and reboot their server.

WARNING: This could render your server unbootable. If it does, please set the 
following bug to confirmed:
https://bugs.launchpad.net/bugs/563895

(You can use the rescue option on the server CD to remove the snapshot and 
reboot in order to restore your system to a working state)



Re: Bug or hardware failure

2010-04-19 Thread Alvin
On Thursday 15 April 2010 10:57:24 Alvin wrote:
> A week ago, a server of mine suddenly started to halt at random moments.
> Blank screen, no input. Drives and memory were fine. I needed frequent
> reboots to be able to finally start the machine (always that blank screen).
> Nothing in the logs hours before a sudden crash, nothing in /var/crash.
> 
> After a BIOS upgrade, the only message I got after a flash of grub was:
> PANIC: early exception 08 rip 246:10 error 810356e6 cr2 0
> (Older kernel gives same message.)
> 
> A replacement board showed the same error on boot, so unless both boards
> are faulty, a software error is more likely.
> 
> I had more success getting the system to boot with 'Intel Trusted
> Execution' disabled.
> 
> So, how would you debug a situation like this?

Well, the unexpected happened. It was a hardware error. Both boards (or CPU?) 
were faulty. I received a replacement board and CPU.



Bug or hardware failure

2010-04-15 Thread Alvin
A week ago, a server of mine suddenly started to halt at random moments. Blank 
screen, no input. Drives and memory were fine. I needed frequent reboots to 
be able to finally start the machine (always that blank screen).
Nothing in the logs hours before a sudden crash, nothing in /var/crash.

After a BIOS upgrade, the only message I got after a flash of grub was:
PANIC: early exception 08 rip 246:10 error 810356e6 cr2 0
(Older kernel gives same message.)

A replacement board showed the same error on boot, so unless both boards are 
faulty, a software error is more likely.

I had more success getting the system to boot with 'Intel Trusted Execution' 
disabled.

So, how would you debug a situation like this?



Re: Problem with no display on 10.04 beta2 server

2010-04-11 Thread Alvin
On Sunday 11 April 2010 12:16:55 James Gray wrote:
> On 11/04/2010, at 8:06 PM, Alvin wrote:
> > On Sunday 11 April 2010 11:59:58 James Gray wrote:
> >> On 11/04/2010, at 7:15 PM, Janåke Rönnblom wrote:
> >>> I have an IBM 3550 server where I have installed the 10.04 beta2 server
> >>> on. On reboot after the BIOS messages all I get is a blinking cursor
> >>> and then it disappears. If I try ALT+F1/F2, ENTER and so nothing
> >>> happens no login prompt nothing!
> >>> 
> >>> Connecting through ssh works but I fear the day when I need to
> >>> troubleshoot the server at the console...
> >>> 
> >>> -J
> >> 
> >> What happens if you ssh in, edit /boot/grub/menu.lst and modify the
> >> default kernel parameters to remove "quiet splash" (or if that fails,
> >> try include "text"), then reboot.  See if that makes any difference to
> >> the result. Might be telling grandma how to suck eggs, but this little
> >> trick has revealed volumes in the past.  I know it should "Just Work"
> >> but this might help isolate the fault.  My initial guess is there is
> >> something a bit hinky with the frame buffer support.
> > 
> > In the past, it might have, but I tried this on a desktop edition of
> > Lucid beta and it was a bad idea. Boot halted (because of the bug where
> > you can't mount more than 4 lvm volumes.)
> > 
> > Without splash and quiet, you can see the messages, but not the buttons
> > you have to press to skip mounting the volume.
> 
> But if it gets as far as mounting the file systems, surely it's well into
> the boot process at that stage...unless you have 4 LVM volumes for the
> root file system (??).  Not 100% sure I'm following you - booting with the
> "quiet splash" you see nothing, and other than that, boots fine.  So how
> would removing them to see the boot process cause the filesystem mounts to
> fail??

Yes, it is starting to mount filesystems, but something could go wrong.

Some examples:
Lucid beta will not succeed if you have more than 4 LVM filesystems defined in 
/etc/fstab. [https://bugs.launchpad.net/bugs/557909]
(Also, if you have defined mount points for snapshots, the upgrade to Lucid 
will have converted them to UUIDs. Bad idea, because snapshots have the same 
UUID as the filesystem they were taken from, so they could get 
mounted in the wrong place. This will not halt boot, but could cause 
confusion.)
Or, you could have defined dhcp in /etc/network/interfaces, which will cause 
resolv.conf to stay empty. If you mount network shares by hostname, this will 
halt your boot process. [https://bugs.launchpad.net/bugs/558384]

With splash, Plymouth will ask what to do (for example, skip mounting that 
filesystem, or drop into a recovery shell).
So without splash, if anything goes wrong, you will just sit there, but at 
least you will see the error messages better.



Re: Problem with no display on 10.04 beta2 server

2010-04-11 Thread Alvin
On Sunday 11 April 2010 11:59:58 James Gray wrote:
> On 11/04/2010, at 7:15 PM, Janåke Rönnblom wrote:
> > I have an IBM 3550 server where I have installed the 10.04 beta2 server
> > on. On reboot after the BIOS messages all I get is a blinking cursor and
> > then it disappears. If I try ALT+F1/F2, ENTER and so nothing happens no
> > login prompt nothing!
> > 
> > Connecting through ssh works but I fear the day when I need to
> > troubleshoot the server at the console...
> > 
> > -J
> 
> What happens if you ssh in, edit /boot/grub/menu.lst and modify the default
> kernel parameters to remove "quiet splash" (or if that fails, try include
> "text"), then reboot.  See if that makes any difference to the result. 
> Might be telling grandma how to suck eggs, but this little trick has
> revealed volumes in the past.  I know it should "Just Work" but this might
> help isolate the fault.  My initial guess is there is something a bit
> henky with the frame buffer support.

In the past, it might have helped, but I tried this on a desktop edition of the 
Lucid beta and it was a bad idea. Boot halted (because of the bug where you 
can't mount more than 4 LVM volumes).

Without splash and quiet, you can see the messages, but not the buttons you 
have to press to skip mounting the volume.



Re: Changes in booting with ubuntu-server 10.04

2010-03-26 Thread Alvin
On Friday 26 March 2010 19:15:09 Etienne Goyer wrote:
> 'Soren Hansen' wrote:
> > On Fri, Mar 26, 2010 at 09:42:19AM +0100, Egbert Jan wrote:
> >> But what heck, nobody asked to have fancy server bootspash screens on
> >> servers.
> > 
> > That's simply not true. /You/ may not have asked for it, but it's
> > certainly been asked for. I myself, for instance, don't mind a pretty
> > boot sequence (brief as it may be).
> 
> I have not seen anybody complaining on the look of the Server Edition
> boot process either.  Was that discussed at a UDS, or something?  If so,
> I must have missed the blueprint.
> 
> Just because of the potential for regressions and unforeseen problems, I
> think it is a terrible idea to introduce that feature in an LTS cycle.
> I hope it get backed out before release.

I haven't seen the Lucid boot process yet, and that is the sole reason I haven't 
complained about it yet. Currently, we lack every form of boot logging. Some 
bugs on Launchpad have pictures of the boot process attached to them, taken 
with a digital camera.

In Karmic, there were several bugs introduced by mountall/upstart/plymouth.
See the latest comments of https://bugs.launchpad.net/bugs/470776
Error messages will be hidden by default. Those error messages are sometimes 
correct and sometimes wrong. Most people would rather see the source of the 
problem fixed than covered up with a nice animation.

As a server administrator, I'm not interested in fast boot times or fancy 
graphics. I'm interested in reliable booting and in knowing what is going on. 
I'd like to have upstart because it eliminates the need to set a specific 
startup order (no more rcX), but I'd never sacrifice reliability for that.



Re: mount.nfs: No such device

2010-02-16 Thread Alvin
On Friday 12 February 2010 11:38:28 Kaushal Shriyan wrote:
> Hi,
> 
> I get this error mount.nfs: No such device inside the domU while
> trying to mount the nfs inside the VM
> 
> "No such device" - NFS is not configured into the client's kernel.
> 
> Is there a workaround for it?

I don't fully understand the situation. You have a virtual Ubuntu server (the 
domU) that needs to mount an NFS share?

- What kernel are you using on that client?
- Is nfs-common installed?
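A few standard diagnostics for that client (nothing Ubuntu-specific; the 
modprobe needs root):

```shell
# "mount.nfs: No such device" usually means the client kernel has no
# NFS support loaded. Check for it, try loading it, and verify the
# userspace tools:
grep nfs /proc/filesystems   # 'nfs' should be listed once support is in
sudo modprobe nfs            # load the module, if the kernel ships it
dpkg -l nfs-common           # mount.nfs itself comes from nfs-common
```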



Re: Earth Computing

2010-01-28 Thread Alvin
On Wednesday 27 January 2010 20:17:00 Joe McDonagh wrote:
> Alvin wrote:
> > A lot of questions in the annual user survey concern cloud computing. I
> > administer some small businesses and use Ubuntu in most of them. Maybe my
> > biggest client will one day use a personal cloud, and I applaud the
> > efforts, but I can 't help but notice that other things are left in the
> > cold. [...]

> IDK what's with the 'hey brah' subject line, but you probably shouldn't
> be using an LTS-interim release like karmic or jaunty for the business.
> You will run into more show-stopping bugs. I prefer RHEL for business,
> but whatever.

This is true in a lot of situations, but Hardy is not an option for us. We're 
using the virtual servers to do (a lot of) assembling of .xml files. That's a 
lot of integer computation and fairly heavy I/O over NFS. I tested Hardy, 
Intrepid, Jaunty and Karmic. Karmic has the only version of KVM that does not 
crash while doing those calculations.

RHEL is actually the supported Linux distribution for the software we're using 
(http://www.sdlxysoft.com/en/products/xpp/), and we seriously considered it. 
The main reason we tried Ubuntu is that I know the Debian way better, and 
that's an important consideration. The problems with LD_LIBRARY_PATH and 
libmotif do not exist in RHEL, but we found workarounds for them.
 
> And for the firewall you might want to think about moving to OpenBSD.

OpenBSD is certainly made for that job, but the out-of-the-box solution we're 
using is doing great for now. (http://www.collax.com/) Besides, they have 
really excellent support.



Re: Earth Computing

2010-01-28 Thread Alvin
On Wednesday 27 January 2010 23:15:37 Dustin Kirkland wrote:
> On Tue, Dec 15, 2009 at 6:40 AM, Alvin  wrote:
> ...
> 
> > This is a real-life scenario. Is it common? I don't know. It's not free
> > of struggles as you can see. So, this is a plea for quality. Cloud
> > Computing might be very important, but please don't lose sight of the
> > little guys who just want some 'classic' servers.
> 
> As one of the Canonical Ubuntu Server Developers working on our Ubuntu
> Enterprise Cloud, I'll respond to this ...
> 
> We absolutely have a focus on quality for the "classic servers", as
> you call them.  Basic Ubuntu Servers are our foundation, truly, for
> all of our Cloud Computing efforts, both as a Virtualization Guest,
> and a Virtualization Hypervisor.
> 
> Your email is well constructed, and having a targeted bug list is an
> excellent start.  My guidance to you would be:
>  * Help get each of those bugs into a triaged state, and with an
> appropriate priority set, and filed against the correct package.
>  * Re-test them with the latest development code, the 10.04 release
> that's under development.
>  * Provide all the information you possibly can to help a developer
> work to solve those problem.
>  * Nominate them for release, probably targeting Ubuntu 10.04 Lucid at
> this point.
>  * If you have hacks, work arounds, fixes, or patches, please, by all
> means, attach them to the bugs, as again, this will help a developer
> get an appropriate fix into Ubuntu sooner than later.
>  * And have just a bit of patience and understanding that we have 100s
> (or 1000s) of bugs, and merely a 5-person Ubuntu Server Team in
> Canonical (plus a few dozen more Ubuntu and Debian community members
> that help out).

Thanks. I appreciate the suggestions and all your work on this. It's about 
time I start testing Lucid. Eventually we will make the switch and then stay 
on Lucid on all non-desktop machines until the next LTS release.



Re: Earth Computing

2010-01-28 Thread Alvin
On Thursday 28 January 2010 09:31:00 Etienne Goyer wrote:
> > Alvin wrote:
> >>   Why not Ubuntu?
> >>   - ZFS (does not need much explanation)
> 
> Not looking to make excuse, but just so you know, ZFS on Linux is
> unlikely to happen due to licensing issue:
> 
> http://en.wikipedia.org/wiki/ZFS#Linux

I know. It's still the most useful filesystem around, and we want to use the 
best tools for the job. LVM is not as flexible as ZFS. Maybe things will 
brighten for Linux when BTRFS comes around.

> >>   These run Jaunty because of the above bugs and because of a regression
> >>   [bug 224138] "No NFS modules in karmic 32-bit"
> 
> Again, not trying to make excuse, and not sure I understand the problem
> correctly, but that sounds like an overstatement.  It seems like the
> -virtual kernel flavor is missing some modules (including those for
> NFS*v4*), but you could just as well use the -generic or -server flavor.
>  Or am I misunderstanding something?

Yes, virtio. The virtual kernels perform faster. In extreme cases, a 
calculation can take as much as 6 hours, and you will run into stability 
issues when using the normal kernel. (Our old Solaris 8 machines do the same 
job in 6 days.)
Besides, when the virtual machines were built, we had no choice due to this 
bug: [kvm guests not using virtio for networking lose network connectivity]
https://bugs.launchpad.net/bugs/286101

> >>   - [bug 374907] "libmotif3 crashes"
> 
> <...snip...>
> 
> >> Sometimes you hear: "it's open source. Don't complain and fix it
> >> yourself." That's partly true. I'm not a programmer, but I was able to
> >> patch libmotif3 to solve the crashes.
> >> The kind people in ubuntu-bugs also managed to convince me that I could
> >> package the new version of openmotif myself and put it in Debian. Maybe
> >> I'll learn how to do that, so that bug can at least be closed. I can
> >> understand that there is not a lot of interest in this package, but we
> >> need it and will probably need it for some time to come.
> 
> IIRC, the Citrix ICA client depends on OpenMotif (not sure which
> version), so that bug would be a biggie indeed if it breaks the ICA
> Client.  We have been using the ICA Client on hardy without any problem
> so far, but I am putting that on my radar.  Thanks for the heads up, I
> will be looking into it.

Thanks! There is also a [needs-packaging] bug report:
https://bugs.launchpad.net/bugs/462182

> >> What I can't understand is that there would be no interest in NFS. Is
> >> everyone using samba between unix machines these days?
> 
> To be honest, yes.  NFS is only really useful for read-only share, as
> NFS  < v4 does not have any form of authentication, where CIFS mount can
> be authenticated.  It is still not good enough, as the file operation
> themselves are not encrypted (supposed to come in Samba any time now),
> but it is a step in the right direction.  NFSv4, because of its reliance
> on Kerberos, is too hairy to set up in most case.
> 
> In general, I try to avoid NFS whenever possible, except for trivial
> things.  CIFS with Unix Extensions has been serving me well so far.

Is LTSP not heavily dependent on NFS? I think it's a mistake to throw the Unix 
way overboard in favour of MS Windows solutions. NFS and CIFS have different 
usage scenarios. There is certainly room for improvement, but that's why NFSv4 
exists. I can't remember if there were questions about file sharing technology 
in the Ubuntu Server Survey, but you should try to put it in a survey. I'm 
curious about the result.

Also, keep in mind that in some situations, security is not important. 
Security often comes at a price (complexity and speed). I admit, central user 
management is only partly functional in this company. Our current security 
system is based on the concept of 'User Private Groups'. That works fine over 
NFS (not v4). In order to use Akonadi on shared /home you do need NFSv4 due to 
locking issues. Implementation is tricky though.
Users need access to the same filesystems anyway. Of course they do make 
mistakes and erase files, or move them to the wrong place. That's why the 
regular ZFS snapshots can be so handy. Our network has been based around NFS 
for well over 12 years, and it was quite a shock to see that computers no 
longer booted after the update to Karmic.

> But thanks for your feedbacks, you are doing the right thing.  I am not
> working in the distro team (I am in Corporate Services), so I cannot do
> anything to help with your bugs directly, but I think it is a very good
> thing that we get this kind of feedback.

I must say I'm quite happy with this attention. Your offer for commercial 
support will certainly be accepted in light of the fact that there is interest 
to know what people are actually doing (or trying to do) with Ubuntu.



Re: Earth Computing

2010-01-28 Thread Alvin
On Thursday 28 January 2010 11:58:31 Etienne Goyer wrote:
> Ante Karamatić wrote:
> > On 15.12.2009 13:40, Alvin wrote:
> >>   - Helios, A commercial application to provide file and print sharing
> >>   for Macintosh.
> > 
> > Is there something wrong with netatalk? It's an open source application
> > that provides file and print sharing. For OSX, AFP is deprecated anyway
> > (and printing works much better with CUPS, which is owned by Apple).
> 
> Indeed, modern MacOS actually speaks CIFS, so Samba is all you need.
> netatalk is only really useful when speaking to ancient Mac, it is
> somewhat obsolete these days.

I told the Samba server to deny access to Macintosh clients due to a multitude 
of problems. MacOS does speak CIFS, and netatalk has probably evolved since I 
last tested it, but there are different reasons to choose Helios here:

- The product has been proven to be reliable.
- Searches are a lot faster (last time I checked).
- CIFS is not an option for Mac in some environments. Have you tried it? 
Resource forks will get lost and Mac users will see files that aren't there or 
will not see files that are there. This is reportedly fixed in MacOS 10.6, but 
we're not willing to replace all those expensive PPC machines and Adobe 
software yet.
- We still have Mac OS9 machines in active use. (I know...)
- Other than that, Helios adds some functionality like OPI, but we're not 
using that anymore.

On the whole, Macs do not play well in a networking environment, and Adobe on 
Mac certainly does not make things easier; they officially do not support 
networked computers. Helios makes life a bit easier for admins.


Re: Commercial support for fixing bugs

2010-01-07 Thread Alvin
On Tuesday 05 January 2010 18:12:39 Ante Karamatić wrote:
> On 05.01.2010 16:50, Alvin wrote:
> > So, the bugs I mentioned in my previous post would receive higher
> > priority. I only wonder whether this specific bug would be solved in
> > karmic, because it needs another solution than the one that will be used
> for lucid. After all, it is already fixed.
> 
> 
> 
> I, and rest of the server people, were hit by this or some other
> upstart/mountall/whatever bug that renders Karmic unusable in NFS
> environment.
> 
> [...]
> 
> 

Let's call that a workaround. There are other possibilities to work around 
this, but I don't like doing a lot of stuff in rc.local that SHOULD be done by 
the boot mechanism. Currently there are 3 things that need to be done after 
boot.

- mount NFS shares (bug 470776)
- restart samba (bug 462169)
- restart libvirtd (bug 491273)
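As a sketch only (this is a workaround, not a fix; adjust service names to 
your release), the rc.local kludge amounts to:

```shell
#!/bin/sh -e
# /etc/rc.local workaround sketch for Karmic: redo the three steps that
# should have happened during boot.
mount -a -t nfs                  # bug 470776: NFS shares missed at boot
/etc/init.d/samba restart        # bug 462169: nmbd died before the network was up
/etc/init.d/libvirt-bin restart  # bug 491273: netfs pools not autostarted
exit 0
```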

Only the samba bug is fixed in karmic proposed. This bug is mentioned in the 
release notes. Shouldn't there be a warning in those release notes about NFS 
too?

I don't care for booting under 10 seconds on a server, but I do care for 
consistent booting. Right now, there are so many things going wrong that I 
don't know where to begin reporting. Even error messages are wrong (504224), 
and without boot logging (328881) those messages are all we have to tell what 
is going wrong.

I've had a talk about all this with my colleague, and we will probably try a 
support contract and see how it turns out.


Commercial support for fixing bugs

2010-01-05 Thread Alvin
I'd like to upgrade a lot of machines under my care to Karmic, for several 
reasons, but there is a bug that 'stops the show'.

retry remote devices when parent is ready after SIGUSR1
https://bugs.launchpad.net/bugs/470776

In short, NFS mounts cannot be trusted. Sometimes you need to go into a 
recovery shell, mount manually, and continue booting (with some other errors). 
That is manageable if you can work at the console, but for the "human beings" 
who will be using some of these systems, it's just too much to ask.
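When it happens, the manual recovery looks roughly like this (hostname and 
export path are examples):

```shell
# From the recovery shell: mount the missed NFS share by hand,
# then exit so booting continues.
mount -t nfs fileserver:/export/home /home
exit
```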

(I mentioned this and other bugs in a previous post (Earth Computing), but 
there were only personal reactions.)

The good news is, that this is fixed in lucid. The bad news is: karmic will 
remain broken.

I contacted Canonical a while ago, and a support contract will give you the 
right to 10 support cases per annum. I'm perfectly willing to pay for support, 
if only to give some people the chance to be paid for the work they do on 
Ubuntu.

The situation with bugs (I quote):
- Support customer bugs receive higher priority by Canonical developers.
- Bugs reported as support cases do not actually count against your allocation 
of support tickets for that subscription.
- Bugs that require assistance from upstream projects are also managed 
directly by Canonical, increasing the likelihood, and speed of successful 
resolution.

So, the bugs I mentioned in my previous post would receive higher priority. I 
only wonder whether this specific bug would be solved in karmic, because it 
needs a different solution than the one that will be used for lucid. After 
all, it is already fixed.

So, will taking a support contract guarantee bootable machines in the next 4 
months?



Earth Computing

2009-12-15 Thread Alvin
A lot of questions in the annual user survey concern cloud computing. I 
administer some small businesses and use Ubuntu in most of them. Maybe my 
biggest client will one day use a personal cloud, and I applaud the efforts, 
but I can't help noticing that other things are left out in the cold.
The survey wants to know how Ubuntu Server is used. I'm curious about the 
results, and I really wonder how many Ubuntu clouds there actually are, 
compared to file, web, terminal and other servers.

I'd like to give an example of how we are using Ubuntu in one company and 
where it could be put to use in the future, along with the issues encountered. 
The reason for this is that I think there is a lot of room for improvement 
outside the cloud.

We're a prepress company with a mixed network.
- 3 Solaris Servers with Helios.
  Why not Ubuntu?
  - ZFS (does not need much explanation)
  - Helios, A commercial application to provide file and print sharing for 
Macintosh.
  - [bug 462169] "nmbd dies on startup when network interfaces are not up yet"
  These run Samba and are NFS servers.
  These machines are an example of what stability should be. No serious bugs.

- 3 Ubuntu Virtual Hosts
  These run Karmic. They are basic installs with ubuntu-virt-server installed.
  They do suffer from some problems.
  - [bug 460914] "root device is sometimes not found"
  - [bug 446031] "static network interfaces do not come up at boot"
  - [bug 470776] "NFS shares do not mount at boot"
  - [bug 491273] "netfs storage pools are not autostarted"
  - [bug 444563] "udev errors all over the place"
  Aside from that, if they manage to find the root drive, are set to DHCP, and 
libvirt-bin is restarted, we can run virtual machines.
  kvm runs well, but I'm scared of reboots.
  When Karmic was just released, we used separate /boot on all servers which 
also rendered them unbootable. [bug 462961, fixed]
  Due to the above problems, I would love to have some sort of boot log [bug 
328881]

- 4 Ubuntu Virtual machines.
  These run Jaunty because of the above bugs and because of a regression [bug 
224138] "No NFS modules in karmic 32-bit"
  2 of these machines run our most important commercial production software.
  kubuntu-desktop is installed on them and the users use XDMCP to work on 
these servers.
  Users also run rdesktop from here to get to Microsoft Word on a MS Windows 
Terminal server.
  They do suffer from some problems.
  (I'm not mentioning Kubuntu stuff. It's not that bad)
  - [bug 366728] "LD_LIBRARY_PATH not loads from .profile"
  - [bug 374907] "libmotif3 crashes"
  - [bug 251709] "Caps Lock does not work in rdesktop"
  - [bug 86021 or 234543] "XDMCP does not work without reverse dns, or with 
the basic /etc/hosts"

- 1 Debian based commercial router/firewall/mailserver
  Ubuntu could do this, but we're pretty happy with this machine.

- There are also a lot of Windows Servers, virtual and physical. These will 
probably never be replaced.

- The clients run Kubuntu, Windows and Mac OS 9/X. The Kubuntu machines are 
XDMCP server and normal workstations.

Sometimes you hear: "it's open source. Don't complain and fix it yourself." 
That's partly true. I'm not a programmer, but I was able to patch libmotif3 to 
solve the crashes.
The kind people in ubuntu-bugs also managed to convince me that I could 
package the new version of openmotif myself and put it in Debian. Maybe I'll 
learn how to do that, so that bug can at least be closed. I can understand 
that there is not a lot of interest in this package, but we need it and will 
probably need it for some time to come.
What I can't understand is that there would be no interest in NFS. Is everyone 
using samba between unix machines these days?

This is a real-life scenario. Is it common? I don't know. It's not free of 
struggles as you can see. So, this is a plea for quality. Cloud Computing 
might be very important, but please don't lose sight of the little guys who 
just want some 'classic' servers.

Links
-
Ubuntu Server user survey:
  http://ubuntu.com/server
Bugs, "In order of appearance":
  https://bugs.launchpad.net/bugs/462169
  https://bugs.launchpad.net/bugs/460914
  https://bugs.launchpad.net/bugs/446031
  https://bugs.launchpad.net/bugs/470776
  https://bugs.launchpad.net/bugs/491273
  https://bugs.launchpad.net/bugs/444563
  https://bugs.launchpad.net/bugs/462961
  https://bugs.launchpad.net/bugs/328881
  https://bugs.launchpad.net/bugs/224138
  https://bugs.launchpad.net/bugs/366728
  https://bugs.launchpad.net/bugs/374907
  https://bugs.launchpad.net/bugs/251709
  https://bugs.launchpad.net/bugs/86021
  https://bugs.launchpad.net/bugs/234543



Re: kvm problem on karmic

2009-12-01 Thread Alvin
On Tuesday 01 December 2009 14:17:12 Aljoša Mohorović wrote:
> i have no idea why kvm is not working, any ideas?
> 
> # /etc/init.d/kvm restart
>  * Loading kvm module kvm_intel FATAL: Error inserting kvm_intel
> (/lib/modules/2.6.31-15-generic-pae/kernel/arch/x86/kvm/kvm-intel.ko):
> Operation not supported
> 
> system info:
> $ uname -a
> Linux  2.6.31-15-generic-pae #50-Ubuntu SMP Tue Nov 10
> 16:12:10 UTC 2009 i686 GNU/Linux
> alj...@muzgavac:~$ egrep '(vmx|svm)' --color=always /proc/cpuinfo
> flags   : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm
> constant_tsc arch_perfmon pebs bts pni dtes64 monitor ds_cpl vmx est
> tm2 ssse3 cx16 xtpr pdcm lahf_lm tpr_shadow
> flags   : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm
> constant_tsc arch_perfmon pebs bts pni dtes64 monitor ds_cpl vmx est
> tm2 ssse3 cx16 xtpr pdcm lahf_lm tpr_shadow

Yes. Your CPU supports VT (the vmx flag is there), so virtualization is most 
likely disabled in the BIOS. Enable it there, and check for BIOS updates.
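For anyone hitting the same error, a minimal checklist (kvm-ok comes from the 
cpu-checker package on later Ubuntu releases and may not exist on Karmic):

```shell
# The CPU advertises vmx, yet kvm_intel refuses to load: VT-x is most
# likely switched off in the BIOS.
egrep -c '(vmx|svm)' /proc/cpuinfo   # non-zero: the CPU supports virtualization
dmesg | grep -i kvm                  # often says "disabled by bios"
sudo kvm-ok                          # if available, gives a direct verdict
```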
