Re: [Gluster-devel] [ovirt-users] Can we debug some truths/myths/facts about hosted-engine and gluster?
On 07/21/2014 02:08 PM, Jiri Moskovcak wrote:
On 07/19/2014 08:58 AM, Pranith Kumar Karampuri wrote:
On 07/19/2014 11:25 AM, Andrew Lau wrote:
On Sat, Jul 19, 2014 at 12:03 AM, Pranith Kumar Karampuri <pkara...@redhat.com> wrote:
On 07/18/2014 05:43 PM, Andrew Lau wrote:
On Fri, Jul 18, 2014 at 10:06 PM, Vijay Bellur <vbel...@redhat.com> wrote:

[Adding gluster-devel]

On 07/18/2014 05:20 PM, Andrew Lau wrote:

Hi all,

As most of you have gathered from previous messages, hosted engine won't work on gluster. A quote from BZ 1097639:

"Using hosted engine with Gluster-backed storage is currently something we really warn against. I think this bug should be closed or re-targeted at documentation, because there is nothing we can do here. Hosted engine assumes that all writes are atomic and (immediately) available for all hosts in the cluster. Gluster violates those assumptions."

[Vijay:] I tried going through BZ 1097639 but could not find much detail with respect to gluster there. A few questions around the problem:

1. Can somebody please explain in detail the scenario that causes the problem?
2. Is hosted engine performing synchronous writes to ensure that writes are durable?

Also, any documentation that details the hosted-engine architecture would help in enhancing our understanding of its interactions with gluster.

[Andrew:] Now my question: does this theory prevent a scenario of perhaps something like a gluster replicated volume being mounted as a glusterfs filesystem and then re-exported as the native kernel NFS share for the hosted engine to consume? It could then be possible to chuck ctdb in there to provide a last-resort failover solution. I have tried it myself and suggested it to two people who are running a similar setup. They are now using the native kernel NFS server for hosted-engine and haven't reported as many issues. Curious, could anyone validate my theory on this?

If we obtain more details on the use case and obtain gluster logs from the failed scenarios, we should be able to understand the problem better. That could be the first step in validating your theory or evolving further recommendations :).

[Andrew:] I'm not sure how useful this is, but Jiri Moskovcak tracked this down in an off-list message.

Message quote:

==

We were able to track it down to this (thanks Andrew for providing the testing setup):

-b686-4363-bb7e-dba99e5789b6/ha_agent service_type=hosted-engine'
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 165, in handle
    response = success + self._dispatch(data)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 261, in _dispatch
    .get_all_stats_for_service_type(**options)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 41, in get_all_stats_for_service_type
    d = self.get_raw_stats_for_service_type(storage_dir, service_type)
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 74, in get_raw_stats_for_service_type
    f = os.open(path, direct_flag | os.O_RDONLY)
OSError: [Errno 116] Stale file handle: '/rhev/data-center/mnt/localhost:_mnt_hosted-engine/c898fd2a-b686-4363-bb7e-dba99e5789b6/ha_agent/hosted-engine.metadata'

[Pranith:] Andrew/Jiri, would it be possible to post gluster logs of both the mount and the bricks on the BZ? I can take a look at it. If I gather nothing, then I will probably ask for your help in re-creating the issue.

Pranith

[Andrew:] Unfortunately, I don't have the logs for that setup any more. I'll try to replicate it when I get a chance. If I understand the comment from the BZ, I don't think it's a gluster bug per se, more just how gluster does its replication.

[Pranith:] hi Andrew, thanks for that. I couldn't come to any conclusions because no logs were available. It is unlikely that self-heal is involved, because there were no bricks going down/up according to the bug description.

[Jirka:] Hi, I've never had such a setup. I guessed the problem was with gluster based on "OSError: [Errno 116] Stale file handle:", which happens when a file opened by an application on the client gets removed on the server. I'm pretty sure we (hosted-engine) don't remove that file, so I think it's some gluster magic moving the data around.

--Jirka
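The failing call in the traceback above is a plain open of the metadata file with O_DIRECT, which the client answers with ESTALE when its cached file handle no longer resolves. As a purely illustrative aside (this is not the ovirt broker code, which is Python, and the single-retry policy is an assumption), the symptom can be probed at the syscall level in C like this:

    #define _GNU_SOURCE         /* for O_DIRECT */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Open 'path' read-only with O_DIRECT, retrying once if the
     * FUSE/NFS client returns ESTALE (the file was replaced on the
     * server and the cached handle no longer resolves). */
    static int open_metadata(const char *path)
    {
        int fd = open(path, O_RDONLY | O_DIRECT);
        if (fd < 0 && errno == ESTALE) {
            /* a fresh path lookup usually resolves to the new inode */
            fd = open(path, O_RDONLY | O_DIRECT);
        }
        return fd;
    }

    int main(int argc, char **argv)
    {
        /* pass the hosted-engine.metadata path from the traceback */
        int fd = open_metadata(argc > 1 ? argv[1] : "hosted-engine.metadata");
        if (fd < 0) {
            perror("open");
            return 1;
        }
        close(fd);
        return 0;
    }

If a single re-open after a fresh lookup does not clear the ESTALE, something on the server side is still replacing the file; whether that is replication behaviour is exactly what the requested gluster logs would show.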
Re: [Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume
On 07/21/2014 05:03 PM, Anders Blomdell wrote:
On 2014-07-19 04:43, Pranith Kumar Karampuri wrote:
On 07/18/2014 07:57 PM, Anders Blomdell wrote:

During testing of a 3*4 gluster volume (built from master as of yesterday), I encountered two major weirdnesses:

1. A 'rm -rf some_dir' needed several invocations to finish, each time reporting a number of lines like these:
   rm: cannot remove 'a/b/c/d/e/f': Directory not empty

2. After having successfully deleted all files from the volume, I have a single directory that is duplicated in gluster-fuse, like this:

   # ls -l /mnt/gluster
   total 24
   drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/
   drwxr-xr-x 2 root root 12288 18 jul 16.17 work2/

Any idea how to debug this issue?

[Pranith:] What are the steps to recreate? We need to first find what led to this, then probably which xlator leads to this.

[Anders:] Would a pcap network dump plus the result of 'tar -c --xattrs /brick/a/gluster' on all the hosts, taken before and after the following commands are run, be of any help?

   # mount -t glusterfs gluster-host:/test /mnt/gluster
   # mkdir /mnt/gluster/work2
   # ls /mnt/gluster
   work2 work2

[Pranith:] Are you using ext4? Is this on latest upstream?

[Anders:] If so, where should I send them (size is 2*12*31 MB [.tar] + 220 kB [pcap])?

/Anders
Re: [Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume
On 2014-07-21 13:36, Pranith Kumar Karampuri wrote:
[snip: quoted thread, see the previous message]

[Pranith:] Are you using ext4?

[Anders:] Yes.

[Pranith:] Is this on latest upstream?

[Anders:] The kernel is 3.14.9-200.fc20.x86_64; whether that is latest upstream I don't know. Gluster is from master as of the end of last week. If there are known issues with ext4, I could switch to something else, but during the last 15 years or so I have had very few problems with ext2/3/4; that's the reason for choosing it.

/Anders

--
Anders Blomdell                        Email: anders.blomd...@control.lth.se
Department of Automatic Control        Phone: +46 46 222 4625
Lund University                        Fax:   +46 46 138118
P.O. Box 118, SE-221 00 Lund, Sweden
Re: [Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume
On 07/21/2014 05:17 PM, Anders Blomdell wrote:
[snip: quoted thread, see the previous messages]

The problem is afrv2 + dht + ext4 offsets. Soumya and Xavier were working on it last I heard (CCed).

Pranith
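As background to the "ext4 offsets" remark: the directory offsets a local filesystem hands out, which gluster has to pass through its stack, can be observed with a few lines of C. This probe is illustrative only (not gluster code); on ext4 the offsets are large hash values rather than small indexes, which leaves no spare bits for gluster to stash a subvolume id in:

    #include <dirent.h>
    #include <stdio.h>

    /* Print each entry of a directory together with the offset at
     * which the next entry would be read (what seekdir() consumes). */
    int main(int argc, char **argv)
    {
        DIR *d = opendir(argc > 1 ? argv[1] : ".");
        struct dirent *e;

        if (!d) {
            perror("opendir");
            return 1;
        }
        while ((e = readdir(d)) != NULL)
            printf("%20lld  %s\n", (long long)telldir(d), e->d_name);
        closedir(d);
        return 0;
    }

Run against an ext4 brick directory versus an XFS one, the difference in magnitude of the reported offsets is immediately visible.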
Re: [Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume
On 2014-07-21 13:49, Pranith Kumar Karampuri wrote:
[snip: quoted thread, see the previous messages]

[Pranith:] The problem is afrv2 + dht + ext4 offsets. Soumya and Xavier were working on it last I heard (CCed).

[Anders:] Should I switch to xfs, or be a guinea pig for testing a fixed version?

/Anders
Re: [Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume
On Monday 21 July 2014 13:53:19 Anders Blomdell wrote:
[snip: quoted thread, see the previous messages]

[Anders:] Should I switch to xfs, or be a guinea pig for testing a fixed version?

[Xavi:] There is a patch for this [1]. It should work for this particular configuration, but there are some limitations in the general case, especially for future scalability, that we tried to solve but that seem quite difficult. Maybe Soumya has newer information about that. XFS should work without problems if you need it.

Xavi

[1] http://review.gluster.org/8201/
Re: [Gluster-devel] Running single regression tests on jenkins
On 21/07/2014, at 11:03 AM, Anders Blomdell wrote:

Is it possible to run a single regression test on Jenkins (preferably from a suitably crafted rfc, in order not to clutter BZ with random noise [i.e. my feeble test cases])?

Hmm, with a bit of mucking around, that can be done. You've got access in Jenkins to create new jobs, so I'd recommend doing this:

* Clone the existing rackspace-regression-2GB job to a new one named something like rackspace-regression-2GB-anders.
* Look at the script in the Rackspace job used to kick things off (it'll be in the "configure" link after you've created the new job). In this script, where it runs /opt/qa/regression.sh, instead put in some commands to run the single test you're after.

Might take some mucking around, but shouldn't be too hard. :)

+ Justin

--
GlusterFS - http://www.gluster.org
An open source, distributed file system scaling to several petabytes, and handling thousands of clients.
My personal twitter: twitter.com/realjustinclift
Re: [Gluster-devel] [Gluster-users] Random and frequent split brain
On 19-Jul-2014 11:06 pm, Niels de Vos <nde...@redhat.com> wrote:
On Sat, Jul 19, 2014 at 08:23:29AM +0530, Pranith Kumar Karampuri wrote:

[Pranith:] Guys, does anyone know why the device-id can be different even though it is all a single xfs filesystem? We see the following log in the brick log:

[2014-07-16 00:00:24.358628] W [posix-handle.c:586:posix_handle_hard] 0-home-posix: mismatching ino/dev between file /data/gluster/home/techiebuzz/techie-buzz.com/wp-content/cache/page_enhanced/techie-buzz.com/social-networking/facebook-will-permanently-remove-your-deleted-photos.html/_index.html.old (1077282838/2431) and handle /data/gluster/home/.glusterfs/ae/f0/aef0404b-e084-4501-9d0f-0e6f5bb2d5e0 (1077282836/2431)
[2014-07-16 00:00:24.358646] E [posix.c:823:posix_mknod] 0-home-posix: setting gfid on /data/gluster/home/techiebuzz/techie-buzz.com/wp-content/cache/page_enhanced/techie-buzz.com/social-networking/facebook-will-permanently-remove-your-deleted-photos.html/_index.html.old failed

[Niels:] The device-id (major:minor number) of a block device can change, but it will not change while the device is in use. Device-mapper (DM) is part of the stack that includes multipath and LVM (and more, but these are the most common). The stack for the block devices is built dynamically, and the device-id is assigned when the block device is made active. The order in which devices are made active can change, and hence the device-id can too. It is also possible to deactivate some logical volumes and activate them in a different order. (You cannot deactivate a dm-device while it is in use, for example while mounted.) Without device-mapper in the I/O stack, re-ordering disks is possible too, but it requires a little more (advanced sysadmin) work.

So, the main questions I'd ask would be:

1. What kind of block storage is used: LVM, multipath, ...?
   [Nilesh:] A single RAID10 XFS partition.
2. Were there any issues on the block layer, e.g. SCSI errors or reconnects?
   [Nilesh:] Yes, one of the servers had a bad disk that was replaced.
3. Were there changes in the underlying disks or their structure? Disks added or removed, or new partitions created?
   [Nilesh:] No.
4. Were disks deactivated and activated again, for example for creating backups or snapshots at a level below the (XFS) filesystem?
   [Nilesh:] No.

HTH, Niels

[snip: earlier messages in the thread follow]

On 07/17/2014 07:06 PM, Nilesh Govindrajan wrote: log1 was the log from the client of node2. The filesystems are mounted locally. /data is a RAID10 array, and /data/gluster contains 4 volumes, one of which is home, a high read/write one (the log of which was attached here).

On Thu, Jul 17, 2014 at 11:54 AM, Pranith Kumar Karampuri <pkara...@redhat.com> wrote:
On 07/17/2014 08:41 AM, Nilesh Govindrajan wrote:

[Nilesh:] log1 and log2 are brick logs. The others are client logs.

[Pranith:] I see a lot of logs like the ones above in the 'log1' you attached. It seems the device ID of where the file is actually stored and of where the gfid-link of the same file is stored (i.e. inside brick-dir/.glusterfs/) are different. What devices/filesystems are present inside the brick represented by 'log1'?

On Thu, Jul 17, 2014 at 8:08 AM, Pranith Kumar Karampuri <pkara...@redhat.com> wrote:
On 07/17/2014 07:28 AM, Nilesh Govindrajan wrote:
On Thu, Jul 17, 2014 at 7:26 AM, Nilesh Govindrajan <m...@nileshgr.com> wrote:

Hello, I'm having a weird issue. I have this config:

node2 ~ # gluster peer status
Number of Peers: 1

Hostname: sto1
Uuid: f7570524-811a-44ed-b2eb-d7acffadfaa5
State: Peer in Cluster (Connected)

node1 ~ # gluster peer status
Number of Peers: 1

Hostname: sto2
Port: 24007
Uuid: 3a69faa9-f622-4c35-ac5e-b14a6826f5d9
State: Peer in Cluster (Connected)

Volume Name: home
Type: Replicate
Volume ID: 54fef941-2e33-4acf-9e98-1f86ea4f35b7
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: sto1:/data/gluster/home
Brick2: sto2:/data/gluster/home
Options Reconfigured:
performance.write-behind-window-size: 2GB
performance.flush-behind: on
performance.cache-size: 2GB
cluster.choose-local: on
storage.linux-aio: on
transport.keepalive: on
performance.quick-read: on
performance.io-cache: on
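For context, the posix_handle_hard warning quoted above fires when a file and its .glusterfs gfid handle do not resolve to the same (inode, device) pair. A simplified, hypothetical re-creation of that check in C (the real logic lives in the posix xlator and differs in detail):

    #include <stdio.h>
    #include <sys/stat.h>

    /* Return 1 if 'file' and its gfid 'handle' refer to the same inode
     * on the same device, 0 on a mismatch (the condition the brick log
     * warns about), -1 on error. */
    static int same_inode(const char *file, const char *handle)
    {
        struct stat fb, hb;

        if (lstat(file, &fb) != 0 || lstat(handle, &hb) != 0)
            return -1;
        if (fb.st_ino == hb.st_ino && fb.st_dev == hb.st_dev)
            return 1;
        fprintf(stderr, "mismatching ino/dev between %s (%llu/%llu) "
                "and %s (%llu/%llu)\n",
                file, (unsigned long long)fb.st_ino,
                (unsigned long long)fb.st_dev,
                handle, (unsigned long long)hb.st_ino,
                (unsigned long long)hb.st_dev);
        return 0;
    }

    int main(int argc, char **argv)
    {
        if (argc != 3) {
            fprintf(stderr, "usage: %s <file> <gfid-handle>\n", argv[0]);
            return 2;
        }
        return same_inode(argv[1], argv[2]) == 1 ? 0 : 1;
    }

Note that in the quoted log the device number (2431) is identical; it is the inode numbers (...838 vs ...836) that differ, i.e. the handle points at a different inode than the file.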
Re: [Gluster-devel] Cmockery2 in GlusterFS
On 2014-07-20 16:01, Niels de Vos wrote:
On Fri, Jul 18, 2014 at 02:52:18PM -0400, Luis Pabón wrote:

Hi all,

A few months ago, the unit test framework based on cmockery2 was in the repo for a little while, then removed while we improved the packaging method. Now support for cmockery2 ( http://review.gluster.org/#/c/7538/ ) has been merged into the repo again. This will most likely require you to install cmockery2 on your development systems by doing the following:

* Fedora/EPEL: $ sudo yum -y install cmockery2-devel
* All other systems, please visit the following page: https://github.com/lpabon/cmockery2/blob/master/doc/usage.md#installation

Here is also some information about cmockery2 and how to use it:

* Introduction to Unit Tests in C presentation: http://slides-lpabon.rhcloud.com/feb24_glusterfs_unittest.html#/
* Cmockery2 usage guide: https://github.com/lpabon/cmockery2/blob/master/doc/usage.md
* Using Cmockery2 with GlusterFS: https://github.com/gluster/glusterfs/blob/master/doc/hacker-guide/en-US/markdown/unittest.md

When starting out, I would suggest writing unit tests for non-xlator interface files. Once you feel more comfortable writing unit tests, move on to writing them for the xlator interface files.

[Niels:] Awesome, many thanks! I'd like to add some unit tests for the RPC and NFS layers. Several functions (like ip-address/netmask matching for ACLs) look very suitable. Did you have any particular functions in mind that you would like to see unit tests for? If so, maybe you can file some bugs for the different tests so that we won't forget about them? Depending on the tests, these bugs may get the EasyFix keyword if there is a clear description and some pointers to examples.

[Anders:] Looks like parts of cmockery were forgotten in glusterfs.spec.in:

   # rpm -q -f `which gluster`
   glusterfs-cli-3.7dev-0.9.git5b8de97.fc20.x86_64
   # ldd `which gluster`
       linux-vdso.so.1 => (0x74dfe000)
       libglusterfs.so.0 => /lib64/libglusterfs.so.0 (0x7fe034cc4000)
       libreadline.so.6 => /lib64/libreadline.so.6 (0x7fe034a7d000)
       libncurses.so.5 => /lib64/libncurses.so.5 (0x7fe034856000)
       libtinfo.so.5 => /lib64/libtinfo.so.5 (0x7fe03462c000)
       libgfxdr.so.0 => /lib64/libgfxdr.so.0 (0x7fe034414000)
       libgfrpc.so.0 => /lib64/libgfrpc.so.0 (0x7fe0341f8000)
       libxml2.so.2 => /lib64/libxml2.so.2 (0x7fe033e8f000)
       libz.so.1 => /lib64/libz.so.1 (0x7fe033c79000)
       libm.so.6 => /lib64/libm.so.6 (0x7fe033971000)
       libdl.so.2 => /lib64/libdl.so.2 (0x7fe03376d000)
       libcmockery.so.0 => not found
       libpthread.so.0 => /lib64/libpthread.so.0 (0x7fe03354f000)
       libcrypto.so.10 => /lib64/libcrypto.so.10 (0x7fe033168000)
       libc.so.6 => /lib64/libc.so.6 (0x7fe032da9000)
       libcmockery.so.0 => not found
       libcmockery.so.0 => not found
       libcmockery.so.0 => not found
       liblzma.so.5 => /lib64/liblzma.so.5 (0x7fe032b82000)
       /lib64/ld-linux-x86-64.so.2 (0x7fe0351f1000)

Should I file a bug report, or could someone on the fast lane fix this?

/Anders
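As a concrete illustration of the kind of test Niels has in mind, here is a minimal cmockery2 test for an IPv4 address/netmask ACL matcher. The acl_match helper and its semantics are invented for the example; only unit_test, run_tests, UnitTest, and the assert macros come from cmockery2:

    #include <stdarg.h>
    #include <stddef.h>
    #include <setjmp.h>
    #include <cmockery.h>

    #include <stdint.h>

    /* Hypothetical helper under test: does 'addr' fall inside
     * 'net'/'mask'? All values are IPv4 addresses in host byte order. */
    static int acl_match(uint32_t addr, uint32_t net, uint32_t mask)
    {
        return (addr & mask) == (net & mask);
    }

    static void test_acl_match_inside(void **state)
    {
        (void) state;
        /* 192.168.1.42 is inside 192.168.1.0/24 */
        assert_true(acl_match(0xC0A8012A, 0xC0A80100, 0xFFFFFF00));
    }

    static void test_acl_match_outside(void **state)
    {
        (void) state;
        /* 10.0.0.1 is not inside 192.168.1.0/24 */
        assert_false(acl_match(0x0A000001, 0xC0A80100, 0xFFFFFF00));
    }

    int main(void)
    {
        const UnitTest tests[] = {
            unit_test(test_acl_match_inside),
            unit_test(test_acl_match_outside),
        };
        return run_tests(tests);
    }

Build with something like: gcc -o acl_test acl_test.c -lcmockery (link flag assumed from the cmockery2 packaging; see the usage guide linked above for the project's recommended setup).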
[Gluster-devel] glusterfs-3.5.2beta1 released
SRC: http://bits.gluster.org/pub/gluster/glusterfs/src/glusterfs-3.5.2beta1.tar.gz

This release is made off jenkins-release-84.

-- Gluster Build System
[Gluster-devel] NFS directory not empty log messages.
From what I can tell, I incorrectly assumed that I was seeing a problem deleting directories as described in: https://bugzilla.redhat.com/show_bug.cgi?id=1121347#c4

This is definitely different from what Anders is seeing in the "duplicate entries" email thread. Should I close my original issue and split it up as I described in the above comment?

-b
Re: [Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume
On 07/21/2014 07:33 PM, Anders Blomdell wrote:
On 2014-07-21 14:36, Soumya Koduri wrote:
On 07/21/2014 05:35 PM, Xavier Hernandez wrote:
[snip: quoted thread, see the previous messages]

[Xavi:] There is a patch for this [1]. It should work for this particular configuration, but there are some limitations in the general case, especially for future scalability, that we tried to solve but that seem quite difficult. Maybe Soumya has newer information about that. XFS should work without problems if you need it.

[Anders:] As long as it does not start using 64-bit offsets as well :-) Sounds like I should go for XFS right now? Tell me if you need testers.

[Soumya:] Sure, yes :) XFS doesn't have this issue; it still seems to use 32-bit offsets. This patch works fine with the current supported/limited configuration, but we need a much more generalized approach, or maybe a design change as Xavi had suggested, to make it more scalable.

[Anders:] Is that the patch in [1] you are referring to?

[Soumya:] Yes, [1] is a possible solution for the current issue. The change is still under review. The problem, in short: ext4 uses large offsets, including the bits which GlusterFS may need to store a subvolume id along with the offset. This can end up with a few offsets being modified when given back to the filesystem, resulting in missing files etc. Avati has proposed a solution to overcome this issue based on the assumption that both EXT4 and XFS are tolerant in terms of the accuracy of the value presented back in seekdir(), i.e. a seekdir(val) actually seeks to the entry which has the closest true offset. For more info, please check http://review.gluster.org/#/c/4711/.

[Anders:] This is, AFAICT, already in the version that failed, as commit e0616e9314c8323dc59fca7cad6972f08d72b936.

[Soumya:] That's right. That change was done by Anand Avati in the dht translator, and it would work as expected had AFR not come into the picture. When the same change was done in the AFR(v2) translator, it resulted in the loss of the brick-id. [1] is a potential fix for now; it had to change the transform logic in these two translators. But as Xavi mentioned, our goal is to come up with a solution that is uniform across all the translators, without any loss of subvol-id, and that keeps the offset gaps minimal. This offset gap widens as more translators (which need to store a subvol-id) get added to the gluster stack, which may eventually result in an issue similar to the one you are facing now.

Thanks,
Soumya

[1] http://review.gluster.org/8201/

Thanks!
/Anders
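To make the offset discussion concrete, below is a simplified C sketch of the kind of d_off transform being described. This is not the actual GlusterFS itransform code; the bit width and placement are assumptions for illustration. A few bits of the 64-bit readdir offset are reserved for the subvolume id, so an ext4 offset that already uses the full 64 bits comes back changed after the round trip, which is how entries go missing or show up twice:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define SUBVOL_BITS 2   /* room for 4 subvolumes in this sketch */

    /* Encode a subvolume id into the low bits of a backend d_off;
     * the top SUBVOL_BITS bits of the backend offset are lost. */
    static uint64_t doff_encode(uint64_t backend_off, unsigned subvol)
    {
        return (backend_off << SUBVOL_BITS) | subvol;
    }

    /* Recover the (possibly truncated) backend d_off and subvol id. */
    static void doff_decode(uint64_t d_off, uint64_t *backend_off,
                            unsigned *subvol)
    {
        *subvol = (unsigned)(d_off & ((1u << SUBVOL_BITS) - 1));
        *backend_off = d_off >> SUBVOL_BITS;
    }

    int main(void)
    {
        /* ext4 directory offsets are hashes that can use all 64 bits */
        uint64_t ext4_off = 0xF123456789ABCDEFULL;
        uint64_t back;
        unsigned subvol;

        doff_decode(doff_encode(ext4_off, 1), &back, &subvol);

        /* the decoded offset no longer matches what ext4 handed out */
        printf("in:  %" PRIx64 "\nout: %" PRIx64 " (subvol %u)\n",
               ext4_off, back, subvol);
        return 0;
    }

With 32-bit XFS offsets the shifted value still fits in 64 bits and round-trips exactly, which is why XFS is unaffected; each extra id-carrying translator in the stack consumes more bits and narrows the safe range further.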
Re: [Gluster-devel] Cmockery2 in GlusterFS
The cmockery2 rpm is only available for the currently supported Fedora versions, which are 19 and 20. Have you tried installing cmockery2 from source?

Luis

-----Original Message-----
From: Santosh Pradhan [sprad...@redhat.com]
Received: Monday, 21 Jul 2014, 10:45AM
To: Luis Pabón [lpa...@redhat.com]
CC: gluster-devel@gluster.org
Subject: Re: [Gluster-devel] Cmockery2 in GlusterFS

Hi Luis,

I am using Fedora 18 on my laptop, and after your patch I am not able to compile gluster from source. Yum install also does not find the cmockery2 bundle. How do I fix it?

Thanks,
Santosh

On 07/21/2014 07:57 PM, Anders Blomdell wrote:
On 2014-07-21 16:17, Anders Blomdell wrote:
[snip: Luis' cmockery2 announcement, Niels' reply, and the ldd output trimmed; see above]

[Anders:] Should I file a bug report, or could someone on the fast lane fix this?

My bad (installation with --nodeps --force :-()
Re: [Gluster-devel] Cmockery2 in GlusterFS
Niels, you are correct. Let me take a look.

Luis

-----Original Message-----
From: Niels de Vos [nde...@redhat.com]
Received: Monday, 21 Jul 2014, 10:41AM
To: Luis Pabon [lpa...@redhat.com]
CC: Anders Blomdell [anders.blomd...@control.lth.se]; gluster-devel@gluster.org
Subject: Re: [Gluster-devel] Cmockery2 in GlusterFS

On Mon, Jul 21, 2014 at 04:27:18PM +0200, Anders Blomdell wrote:
[snip: Luis' cmockery2 announcement, Niels' reply, and the ldd output trimmed; see above]

[Anders:] Should I file a bug report, or could someone on the fast lane fix this? My bad (installation with --nodeps --force :-()

[Niels:] Actually, I was not expecting a dependency on cmockery2. My understanding was that only some temporary test applications would be linked with libcmockery, and not any binaries that would get packaged in the RPMs. Luis, could you clarify that?

Thanks,
Niels
Re: [Gluster-devel] Cmockery2 in GlusterFS
Cmockery2 is now a hard dependency for compiling GlusterFS upstream master. Could we make it conditional and enable it only where needed, since we know the cmockery2 packages are not available on all systems?

On Mon, Jul 21, 2014 at 10:16 AM, Luis Pabon <lpa...@redhat.com> wrote:
[snip: Luis' "Niels, you are correct" reply and the quoted thread trimmed; see the previous message]

--
Religious confuse piety with mere ritual, the virtuous confuse regulation with outcomes
Re: [Gluster-devel] Duplicate entries and other weirdness in a 3*4 volume
On 2014-07-21 19:14, Jeff Darcy wrote:

[Soumya, quoted:] But this offset gap widens as more translators (which need to store a subvol-id) get added to the gluster stack, which may eventually result in an issue similar to the one you are facing now.

[Jeff:] Perhaps it's time to revisit the idea of making assumptions about d_off values...

[Anders:] +1 :-)

[Jeff:] ...and twiddling them back and forth, vs. maintaining a precise mapping between our values and local-FS values. http://review.gluster.org/#/c/4675/ That patch is old and probably incomplete, but at the time it worked just as well as the one that led us into the current situation.

[Anders:] Seems a lot sounder than: "However, both these filesystems (EXT4 more importantly) are tolerant in terms of the accuracy of the value presented back in seekdir(), i.e. a seekdir(val) actually seeks to the entry which has the closest true offset." Let me know if you revisit this one.

Thanks,
Anders
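For readers unfamiliar with the alternative Jeff refers to, here is a minimal sketch of the mapping idea (purely illustrative; review 4675 is the real proposal): hand clients small synthetic offsets and keep a per-open-directory table that maps them back to the exact local-FS d_off values, instead of twiddling bits inside them:

    #include <stdint.h>
    #include <stdlib.h>

    /* Per-open-directory mapping from synthetic offsets (plain indexes,
     * safe to round-trip through any client) to exact local-FS d_offs. */
    struct doff_map {
        uint64_t *local;   /* local[i] = local d_off for synthetic offset i */
        size_t    used;
        size_t    cap;
    };

    /* Record a local d_off; return the synthetic offset for the client. */
    static uint64_t doff_map_add(struct doff_map *m, uint64_t local_off)
    {
        if (m->used == m->cap) {
            size_t ncap = m->cap ? m->cap * 2 : 64;
            uint64_t *p = realloc(m->local, ncap * sizeof(*p));
            if (!p)
                return UINT64_MAX;   /* caller handles allocation failure */
            m->local = p;
            m->cap = ncap;
        }
        m->local[m->used] = local_off;
        return m->used++;
    }

    /* Translate a synthetic offset from seekdir() back to the local value. */
    static int doff_map_lookup(const struct doff_map *m, uint64_t synth,
                               uint64_t *local_off)
    {
        if (synth >= m->used)
            return -1;
        *local_off = m->local[synth];
        return 0;
    }

    int main(void)
    {
        struct doff_map m = {0};
        uint64_t synth = doff_map_add(&m, 0xF123456789ABCDEFULL);
        uint64_t back = 0;

        doff_map_lookup(&m, synth, &back);  /* back == exact local d_off */
        free(m.local);
        return back == 0xF123456789ABCDEFULL ? 0 : 1;
    }

The trade-off is per-handle memory for the table versus exactness of the returned offsets; there is no bit budget to exhaust as more translators are stacked.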
[Gluster-devel] The Fifth Elephant 2014 - Discount codes for GlusterFS users
The Fifth Elephant is a Big Data and Analytics conference on 23-26 July 2014 in Bangalore, India.
https://fifthelephant.in/2014/conference

We are proud to support the folks there, and they have extended a 10% discount to the GlusterFS community.

Ticket purchase URL: http://hasgeek.us2.list-manage.com/track/click?u=c84c1486d2d6a025417c5d146id=8b9e5676cde=d2c198233a

Highlights
**********
* India's largest gathering of big data and analytics practitioners
* Talks will showcase the latest trends in the field and the opportunities that can be leveraged
* Speakers will share practical experiences with data mining, machine learning, and building technology for analytics

Featured Talks
**************
* The Genome Project: Personalized Medicine and Big Data - Anu Acharya, Mapmygenome
* The ART of Data Mining: Practical Learnings from Real-world Data Mining Applications - Shailesh Kumar, Google
  https://funnel.hasgeek.com/fifthel2014/1166-the-art-of-data-mining-practical-learnings-from-re

There are more listed on the conference website. These two sound interesting to me personally; I'd go to them if I were in Bangalore: :)

* Crafting Visual Stories with Data - Amit Kapoor, narrativeVIZ Consulting
  https://funnel.hasgeek.com/fifthel2014/1098-crafting-visual-stories-with-data
* Scaling Real-time Visualisations for Elections 2014 - S. Anand, Chief Data Scientist at Gramener
  https://funnel.hasgeek.com/fifthel2014/1146-scaling-real-time-visualisations-for-elections-201

Format
******
* Analytics infrastructure: platforms, tools, in-house solutions
* Data mining and machine learning: applications in different domains

Who should attend?
******************
* Data scientists, developers, architects, DBAs, researchers, business - from data enthusiasts to data geeks

For further information
***********************
Website: fifthelephant.in
Schedule: funnel.hasgeek.com
Tickets: fifthel.doattend.com
Send your queries to i...@hasgeek.com or call +91 80 6768 4422

Regards and best wishes,
Justin Clift
Re: [Gluster-devel] Cmockery2 in GlusterFS
Yes, it is a simple bug. I filed https://bugzilla.redhat.com/show_bug.cgi?id=1121822 ; thank you very much for finding this, Anders. I have sent a fix.

- Luis

On 07/21/2014 01:16 PM, Luis Pabon wrote:
[snip: Luis' "Niels, you are correct" reply and the quoted thread trimmed; see the previous messages]