Hi

I'm experiencing the following problem while using OCFS2 on top of a DRBD partition.

My config is the following:

2 servers with pacemaker+corosync stack configured

Debian Lenny/Squeeze mixed:

kernel - linux-image-2.6.32-bpo.5-amd64 (2.6.32-26~bpo50+1)
kernel modules - drbd = 8.3.7 (api:88/proto:86-91) ocfs2 = 1.5.0

packages:

pacemaker = 1.0.9.1
corosync = 1.2.1-2
dlm-pcmk = 3.0.12-2
ocfs2-tools-pacemaker (contains the ocfs2_controld.pcmk binary) = 1.4.4-3
ocfs2-tools = 1.4.4-3

Kernel trace follows here:


[ 3128.804789] block drbd0: Handshake successful: Agreed network protocol version 91
[ 3128.805094] block drbd0: Peer authenticated using 20 bytes of 'sha1' HMAC
[ 3128.805176] block drbd0: conn( WFConnection -> WFReportParams )
[ 3128.805274] block drbd0: Starting asender thread (from drbd0_receiver [4776])
[ 3128.805533] block drbd0: data-integrity-alg: <not-used>
[ 3128.805626] block drbd0: drbd_sync_handshake:
[ 3128.805695] block drbd0: self B4F22E41814A97AB:ADC1DEC415E06ACD:0E1A98B5C70EAE0E:578A64518662F9CF bits:202 flags:0
[ 3128.805788] block drbd0: peer ADC1DEC415E06ACC:0000000000000000:0E1A98B5C70EAE0E:578A64518662F9CF bits:0 flags:0
[ 3128.805880] block drbd0: uuid_compare()=1 by rule 70
[ 3128.805953] block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS )
[ 3129.365716] block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Outdated -> Inconsistent )
[ 3129.365816] block drbd0: Began resync as SyncSource (will sync 808 KB [202 bits set]).
[ 3129.441670] block drbd0: Resync done (total 1 sec; paused 0 sec; 808 K/sec)
[ 3129.441746] block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
[ 3154.019560] block drbd0: peer( Secondary -> Primary )
[ 3156.462341] dlm: got connection from 1191233546
[ 3162.458368] (5378,4):ocfs2_truncate_file:465 ERROR: bug expression: le64_to_cpu(fe->i_size) != i_size_read(inode)
[ 3162.458466] (5378,4):ocfs2_truncate_file:465 ERROR: Inode 1714687, inode i_size = 556 != di i_size = 604, i_flags = 0x1
[ 3162.458586] ------------[ cut here ]------------
[ 3162.458654] kernel BUG at /tmp/buildd/linux-2.6-2.6.32/debian/build/source_amd64_none/fs/ocfs2/file.c:465!
[ 3162.458745] invalid opcode: 0000 [#1] SMP
[ 3162.458901] last sysfs file: /sys/kernel/dlm/D9348641B1E04D0E907EFF8D978F348A/control
[ 3162.458988] CPU 4
[ 3162.459095] Modules linked in: ocfs2 jbd2 ocfs2_nodemanager quota_tree ocfs2_stack_user ocfs2_stackglue sha1_generic hmac drbd lru_cache cn dlm configfs ip_vs_rr ip_vs sctp crc32c libcrc32c nfsd exportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc ipip tunnel4 8021q garp stp xt_MARK iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables coretemp w83627hf w83793 hwmon_vid loop snd_pcsp snd_pcm_oss snd_mixer_oss snd_pcm radeon ttm drm_kms_helper snd_timer drm snd i5k_amb soundcore i2c_algo_bit container i5000_edac rng_core snd_page_alloc edac_core evdev button processor ioatdma dca shpchp pci_hotplug i2c_i801 i2c_core ext3 jbd mbcache dm_mod ses enclosure sd_mod crc_t10dif sg sr_mod cdrom ata_piix ata_generic libata aacraid ehci_hcd uhci_hcd scsi_mod thermal thermal_sys usbcore e1000e nls_base [last unloaded: scsi_wait_scan]
[ 3162.462354] Pid: 5378, comm: apache2 Not tainted 2.6.32-bpo.5-amd64 #1 X7DBU
[ 3162.462354] RIP: 0010:[<ffffffffa05e006f>] [<ffffffffa05e006f>] ocfs2_setattr+0x631/0x172a [ocfs2]
[ 3162.462354] RSP: 0018:ffff8801fa71bc28  EFLAGS: 00010292
[ 3162.462354] RAX: 0000000000000081 RBX: ffff8801d5afb000 RCX: 0000000000001977
[ 3162.462354] RDX: 0000000000000000 RSI: 0000000000000092 RDI: 0000000000000246
[ 3162.462354] RBP: 0000000000000000 R08: 000000000000f71d R09: 000000000000000a
[ 3162.462354] R10: 0000000000000000 R11: ffffffff811b7371 R12: 0000000000000000
[ 3162.462354] R13: ffff8801f8fc5ec8 R14: ffff8801f8fc5ec8 R15: ffff8801f8f752a0
[ 3162.462354] FS: 00007fb03993b710(0000) GS:ffff880008d00000(0000) knlGS:0000000000000000
[ 3162.462354] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3162.462354] CR2: 00000000010ccbc8 CR3: 00000001fd16e000 CR4: 00000000000006e0
[ 3162.462354] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3162.462354] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 3162.462354] Process apache2 (pid: 5378, threadinfo ffff8801fa71a000, task ffff8801f9d0b880)
[ 3162.462354] Stack:
[ 3162.462354] 000000000000022c 000000000000025c 0000000000000001 ffff880227649000
[ 3162.462354] <0> ffff8801f8fc5b60 ffff8801fa71bd68 ffff880227649000 0000000100000292
[ 3162.462354] <0> ffff8801fa4bc800 ffff8801f8fc5b78 000000004cee87ac ffff880227649000
[ 3162.462354] Call Trace:
[ 3162.462354]  [<ffffffff81051f59>] ? current_fs_time+0x1e/0x24
[ 3162.462354]  [<ffffffff81100bbb>] ? notify_change+0x180/0x2c5
[ 3162.462354]  [<ffffffff810ed880>] ? do_truncate+0x63/0x7e
[ 3162.462354]  [<ffffffff810f5a18>] ? get_write_access+0x18/0x4b
[ 3162.462354]  [<ffffffff810f7c17>] ? may_open+0x191/0x1c8
[ 3162.462354]  [<ffffffff810f84fa>] ? do_filp_open+0x4bf/0x94b
[ 3162.462354]  [<ffffffff810f1833>] ? cp_new_stat+0xe9/0xfc
[ 3162.462354]  [<ffffffff810ecb5f>] ? do_sys_open+0x55/0xfc
[ 3162.462354]  [<ffffffff81010b42>] ? system_call_fastpath+0x16/0x1b
[ 3162.462354] Code: 89 fb 62 a0 65 8b 14 25 a8 e3 00 00 89 44 24 10 48 8b 43 20 48 63 d2 48 89 44 24 08 49 8b 46 68 48 89 04 24 31 c0 e8 0e 92 d1 e0 <0f> 0b eb fe 49 39 cc 48 8b 05 c3 7b f8 ff 0f 86 b1 00 00 00 a9
[ 3162.462354] RIP  [<ffffffffa05e006f>] ocfs2_setattr+0x631/0x172a [ocfs2]
[ 3162.462354]  RSP <ffff8801fa71bc28>
[ 3162.469653] ---[ end trace 3a74db6ea3c5066f ]---


I don't know exactly how to reproduce this bug. The kernel doesn't stall after hitting it, but it is rather annoying, and I am worried about file system consistency.
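
To make it easier to see which inodes are affected, here is a minimal sketch that scans kernel log text for the size-mismatch message shown in the trace above and pulls out the inode number and the two sizes. The function name find_size_mismatches and the regular expression are my own assumptions, written against the exact message format in this trace only; other kernel versions may word the message differently.

```python
import re

# Matches the ocfs2_truncate_file size-mismatch message as it appears in the
# trace above. The format is assumed from this one log and may differ elsewhere.
MISMATCH = re.compile(
    r"ocfs2_truncate_file:\d+ ERROR: Inode (?P<inode>\d+), "
    r"inode i_size = (?P<inode_size>\d+) != di i_size = (?P<di_size>\d+)"
)


def find_size_mismatches(log_text):
    """Return a list of (inode number, inode i_size, di i_size) tuples."""
    return [
        (int(m.group("inode")), int(m.group("inode_size")), int(m.group("di_size")))
        for m in MISMATCH.finditer(log_text)
    ]


# Sample line copied from the trace above.
sample = (
    "[ 3162.458466] (5378,4):ocfs2_truncate_file:465 ERROR: Inode 1714687, "
    "inode i_size = 556 != di i_size = 604, i_flags = 0x1"
)

for inode, inode_size, di_size in find_size_mismatches(sample):
    print(inode, inode_size, di_size, di_size - inode_size)
```

Running it over saved dmesg output instead of the sample string should list every inode that hit this check, which may help correlate the failures with particular files.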

Any help would be appreciated.


--
Yours Faithfully

Vladimir Kuklin

Network Services Specialist
JSC "SMM"
51/4 build. 1, Shepkina str.
Moscow, 129110
Russia

phone +74952296363 ext. 1514
fax +74952296365
cell +79197848963

e-mail v.kuk...@smm.ru
site http://smm.ru

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
