Re: [OpenAFS] Tracking down AFS Fileserver corruption

2011-12-02 Thread Jack Neely
Folks,

To follow up, I was able to solve or work around this particular issue.
It turns out the emcpower devices do need some sort of abstraction layer
to work properly.  That layer can be Linux's LVM or a partition table
with one large partition; see the sketch below.  Neither configuration
exhibits the corruption issues I was having when attempting to use the
raw block device directly.
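For the record, here's roughly what the two working layouts look like.
This is only a sketch; the device, VG, and LV names are examples:

    # Option 1: put LVM between ext4 and the emcpower device
    pvcreate /dev/emcpowera
    vgcreate vg_vicepa /dev/emcpowera
    lvcreate -l 100%FREE -n vicepa vg_vicepa
    mkfs.ext4 /dev/vg_vicepa/vicepa
    mount /dev/vg_vicepa/vicepa /vicepa

    # Option 2: a partition table with one large partition
    parted -s /dev/emcpowera mklabel gpt mkpart primary 0% 100%
    mkfs.ext4 /dev/emcpowera1
    mount /dev/emcpowera1 /vicepa

Presumably the difference is that tools scanning the disk now find a
partition table (or LVM metadata) where they expect one, instead of
poking at a raw, unlabeled device.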

Jack Neely
-- 
Jack Neely jjne...@ncsu.edu
Linux Czar, OIT Campus Linux Services
Office of Information Technology, NC State University
GPG Fingerprint: 1917 5AC1 E828 9337 7AA4  EA6B 213B 765F 3B6A 5B89


[OpenAFS] Tracking down AFS Fileserver corruption

2011-11-28 Thread Jack Neely
Folks,

I'm deploying new OpenAFS 1.6.0 DAFS file servers on fully updated RHEL
6.1 servers and I've stumbled across a data corruption problem.  The ext4
filesystems on the vice mounts are not getting corrupted, just the AFS
volume data.

Our /vicep[ab] mounts are provided by an EMC Clariion SAN array using
the PowerPath driver.  Each of the two vice mounts has 4 paths and is
not partitioned.  I've formatted the /dev/emcpower[ab] block devices
directly as ext4.  Of course, the /dev/emcpowerX device is mounted on
/vicepX.
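In other words, the whole setup per partition is just (illustrative,
using vicepa):

    mkfs.ext4 /dev/emcpowera
    mount /dev/emcpowera /vicepa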

Every hour, our OCS Inventory agent runs and eventually invokes fdisk -l
to gather storage statistics for the server.  When I was moving test
volumes to the new server and the agent ran fdisk -l, the kernel would
print:

Nov 28 13:01:39 xxx kernel: sdc: unknown partition table
Nov 28 13:01:39 xxx kernel: sde: unknown partition table
Nov 28 13:01:49 xxx kernel: sdc: unknown partition table
Nov 28 13:01:49 xxx kernel: sde: unknown partition table

and the volume being moved at that exact time would be corrupt.  Usually
the server would soon detect this and salvage the volume, but the level
of corruption has varied.

The above messages and corruption only seem to happen when volume moves
are in progress.  Running fdisk -l on an idle server produces no
messages.
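To watch this happen, start a volume move and run the trigger by hand
while tailing the kernel log (our inventory agent does the equivalent of
the second command):

    # terminal 1: watch for the partition-rescan messages
    tail -f /var/log/messages

    # terminal 2: while a volume move is in flight, query the
    # device the way the agent does
    fdisk -l /dev/emcpowera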

Other things cause the above messages to be re-printed, such as running
fsck -yf /dev/emcpowera.  They also occur in the early hours of the
morning, triggered by something that appears to be a cron job I've not
tracked down yet.
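A brute-force way to hunt for the offending cron job, for what it's
worth (nothing here is site-specific):

    # search the system cron locations for anything touching fdisk,
    # the emcpower devices, or the vice partitions
    grep -r -e fdisk -e emcpower -e vicep /etc/cron* /var/spool/cron 2>/dev/null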

I need some help in figuring out what is causing the corruption and,
more importantly, how to fix things.

Thanks,
Jack Neely

-- 
Jack Neely jjne...@ncsu.edu
Linux Czar, OIT Campus Linux Services
Office of Information Technology, NC State University
GPG Fingerprint: 1917 5AC1 E828 9337 7AA4  EA6B 213B 765F 3B6A 5B89


Re: [OpenAFS] Tracking down AFS Fileserver corruption

2011-11-28 Thread Stephan Wiesand
Hi Jack,

no help, just a few dumb questions inline:

On Nov 28, 2011, at 19:13, Jack Neely wrote:

 Folks,
 
 I'm deploying new OpenAFS 1.6.0 DAFS file servers on fully updated RHEL
 6.1 servers and I've stumbled across a data corruption problem.  The ext4
 filesystems on the vice mounts are not getting corrupted, just the AFS
 volume data.
 
 Our /vicep[ab] mounts are provided by an EMC Clariion SAN array using
 the PowerPath driver.  Each of the two vice mounts has 4 paths and is
 not partitioned.  I've formatted the /dev/emcpower[ab] block devices
 directly as ext4.  Of course, the /dev/emcpowerX device is mounted on
 /vicepX.

emcpower{a,b} map to sd{c,e}?

 Every hour, our OCS Inventory agent runs and eventually invokes fdisk -l
 to gather storage statistics for the server.  When I was moving test
 volumes to the new server and the agent ran fdisk -l, the kernel would
 print:
 
Nov 28 13:01:39 xxx kernel: sdc: unknown partition table
Nov 28 13:01:39 xxx kernel: sde: unknown partition table
Nov 28 13:01:49 xxx kernel: sdc: unknown partition table
Nov 28 13:01:49 xxx kernel: sde: unknown partition table

If the devices aren't partitioned, why would it ever find a partition table?

This may have changed, but Red Hat used to not support setups with filesystems 
on unpartitioned block devices, I believe.

 and the volume being moved at that exact time would be corrupt.  Usually
 the server would soon detect this and salvage the volume, but the level
 of corruption has varied.

I don't have experience with running 1.6 servers in production yet, but since 
the AFS fileserver is entirely running in userland, it should not cause this 
kind of corruption. That being said, there's an open BZ regarding ext4 
corruption due to Ceph userland processes...

 The above messages and corruption only seem to happen when volume moves
 are in progress.  Running fdisk -l on an idle server produces no
 messages.

Any messages if you run bonnie++ or iozone on the filesystem when the agent 
runs?
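Something along these lines, say, with a working set comfortably past
RAM (illustrative invocations only):

    # bonnie++: sustained write/rewrite/read load on the vice partition
    bonnie++ -d /vicepa -u root -s 32g

    # or iozone in automatic mode with a test file on the same fs
    iozone -a -f /vicepa/iozone.tmp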

 Other things cause the above messages to be re-printed, such as running
 fsck -yf /dev/emcpowera.

Is this safe to do on a mounted ext4 filesystem?

 They also occur in the early hours of the
 morning, triggered by something that appears to be a cron job I've not
 tracked down yet.
 
 I need some help in figuring out what is causing the corruption and,
 more importantly, how to fix things.

If the AFS fileserver could be run under an account other than root, one
could be completely confident it's not the culprit.  As things are, I'm
only 99% confident...

Best regards,
Stephan
 

-- 
Stephan Wiesand
DESY -DV-
Platanenallee 6
15738 Zeuthen, Germany



Re: [OpenAFS] Tracking down AFS Fileserver corruption

2011-11-28 Thread Jack Neely
On Mon, Nov 28, 2011 at 08:34:00PM +0100, Stephan Wiesand wrote:
 Hi Jack,
 
 no help, just a few dumb questions inline:
 
 On Nov 28, 2011, at 19:13, Jack Neely wrote:
 
  Folks,
  
  I'm deploying new OpenAFS 1.6.0 DAFS file servers on fully updated RHEL
  6.1 servers and I've stumbled across a data corruption problem.  The ext4
  filesystems on the vice mounts are not getting corrupted, just the AFS
  volume data.
  
  Our /vicep[ab] mounts are provided by an EMC Clariion SAN array using
  the PowerPath driver.  Each of the two vice mounts has 4 paths and is
  not partitioned.  I've formatted the /dev/emcpower[ab] block devices
  directly as ext4.  Of course, the /dev/emcpowerX device is mounted on
  /vicepX.
 
 emcpower{a,b} map to sd{c,e}?
 

emcpowera is made of the paths: sdc sde sdg sdi

emcpowerb is made of the paths: sdb sdd sdf sdh

Here's the information from the powermt tool:
http://pastebin.com/sfmJX5Kc
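(That listing is presumably the output of PowerPath's standard display
command:)

    powermt display dev=all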

  Every hour, our OCS Inventory agent runs and eventually invokes fdisk -l
  to gather storage statistics for the server.  When I was moving test
  volumes to the new server and the agent ran fdisk -l, the kernel would
  print:
  
 Nov 28 13:01:39 xxx kernel: sdc: unknown partition table
 Nov 28 13:01:39 xxx kernel: sde: unknown partition table
 Nov 28 13:01:49 xxx kernel: sdc: unknown partition table
 Nov 28 13:01:49 xxx kernel: sde: unknown partition table
 
 If the devices aren't partitioned, why would it ever find a partition table?

It shouldn't.  But why does it keep looking (and causing corruption)?
Before I figured out that the corruption was happening at the same time
as these messages, I didn't think there was any connection.

 
 This may have changed, but Red Hat used to not support setups with 
 filesystems on unpartitioned block devices, I believe.
 

I have a support case open with Red Hat as well, and they have not
raised this.  In fact, not partitioning SAN devices (especially large
ones) seems to be accepted practice nowadays.

  and the volume being moved at that exact time would be corrupt.  Usually
  the server would soon detect this and salvage the volume, but the level
  of corruption has varied.
 
 I don't have experience with running 1.6 servers in production yet, but since 
 the AFS fileserver is entirely running in userland, it should not cause this 
 kind of corruption. That being said, there's an open BZ regarding ext4 
 corruption due to Ceph userland processes...
 

The ext4 filesystem is not corrupted, so I think the AFS daemons are
somehow being disturbed and not writing complete data.

  The above messages and corruption only seem to happen when volume moves
  are in progress.  Running fdisk -l on an idle server produces no
  messages.
 
 Any messages if you run bonnie++ or iozone on the filesystem when the agent 
 runs?
 

Haven't tried yet.  Good idea though.

  Other things cause the above messages to be re-printed, such as running
  fsck -yf /dev/emcpowera.
 
 Is this safe to do on a mounted ext4 filesystem?
 

I ran fsck on the unmounted SAN LUN to make sure I didn't have
filesystem corruption.  I was surprised that it seemed to trigger
partition rescans again.
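That is, roughly:

    umount /vicepa
    fsck -yf /dev/emcpowera    # LUN unmounted at this point
    mount /dev/emcpowera /vicepa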

Jack


-- 
Jack Neely jjne...@ncsu.edu
Linux Czar, OIT Campus Linux Services
Office of Information Technology, NC State University
GPG Fingerprint: 1917 5AC1 E828 9337 7AA4  EA6B 213B 765F 3B6A 5B89