Re: [Gluster-users] Slow read performance
Sorry for the late reply. The call profiles look OK on the server side. I suspect it is still something to do with the client or network.

Have you mounted the FUSE client with any special options, like --direct-io-mode? That can have a significant impact on read performance, as read-ahead in the page cache (which is far more efficient than gluster's read-ahead translator, since no context switch is needed to serve the future page) is effectively turned off. I'm not sure whether your networking (tcp/ip) configuration is good or bad.

Avati

On Mon, Mar 11, 2013 at 9:02 AM, Thomas Wakefield tw...@cola.iges.org wrote:

Is there a way to make a ramdisk support extended attributes?

These are my current sysctl settings (and I have tried many different options):

net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
kernel.panic = 5
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
net.core.netdev_max_backlog = 25
net.ipv4.tcp_congestion_control = htcp
net.ipv4.tcp_mtu_probing = 1

Here is the output from a dd write and dd read:

[root@cpu_crew1 ~]# dd if=/dev/zero of=/shared/working/benchmark/test.cpucrew1 bs=512k count=10000 ; dd if=/shared/working/benchmark/test.cpucrew1 of=/dev/null bs=512k
10000+0 records in
10000+0 records out
5242880000 bytes (5.2 GB) copied, 7.21958 seconds, 726 MB/s
10000+0 records in
10000+0 records out
5242880000 bytes (5.2 GB) copied, 86.4165 seconds, 60.7 MB/s

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
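[Editor's note: a second read of the same file can be served from the client page cache, which hides the slow-read path being discussed. The sketch below re-runs the dd benchmark with the cache dropped first. It is an illustration, not the poster's script: TESTFILE defaults to a temp file and should be pointed at the gluster mount (e.g. the /shared/working/benchmark path above), and COUNT is kept small here; raise it to 10000 for a ~5 GB run.]

```shell
# Cold-cache version of the dd write/read benchmark (sketch; paths and
# COUNT are illustrative -- set TESTFILE to a file on the gluster mount).
TESTFILE=${TESTFILE:-$(mktemp)}
COUNT=${COUNT:-4}                  # raise to 10000 (~5 GB) for a real benchmark

dd if=/dev/zero of="$TESTFILE" bs=512k count="$COUNT"   # write pass
sync                                                    # flush dirty pages
# Drop the page cache so the read hits the bricks, not RAM
# (needs root; silently skipped otherwise):
sh -c 'echo 3 > /proc/sys/vm/drop_caches' 2>/dev/null || true
dd if="$TESTFILE" of=/dev/null bs=512k                  # read pass
```

Without the drop_caches step, the read figure mostly measures local memory bandwidth rather than gluster.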
Re: [Gluster-users] Slow read performance
Joe,

Understood. No problem at all.

Regards,
Rodrigo

On Mon, Mar 11, 2013 at 7:24 PM, Joe Julian j...@julianfamily.org wrote:

I apologize. I normally try to be much more eloquent in my debates. I woke up this morning to learn that the CentOS 6.4 rollout broke all my end-user stations (yes, I have to do automatic updates; I just don't have time to review every package and do everything else I need to do all by myself). Put 200 employees without computers on my shoulders and I tend to stress a little until it's resolved. I took a pot shot and it was uncalled for. Please forgive me.

On 03/11/2013 12:10 PM, Rodrigo Severo wrote:

On Mon, Mar 11, 2013 at 4:04 PM, Joe Julian j...@julianfamily.org wrote:

Which is why we don't run Rodigux

Oh Joe, that remark sounds rather inappropriate to me. Apparently we disagree on more levels than just kernel and application compatibility policies.

Regards,
Rodrigo Severo

On 03/11/2013 12:02 PM, Rodrigo Severo wrote:

On Mon, Mar 11, 2013 at 3:46 PM, Bryan Whitehead dri...@megahappy.net wrote:

This is clearly something Linus should support (forcing an ext4 fix). There is an ethos Linus always champions, and that is *never* break userspace. NEVER. Clearly this ext4 change has broken userspace. GlusterFS is not in the kernel at all, and this change has broken it.

Apparently, one year after the change made it into the kernel, you believe this argument is still relevant. I don't, really don't.

Rodrigo Severo

On Mon, Mar 11, 2013 at 11:34 AM, Rodrigo Severo rodr...@fabricadeideias.com wrote:

If you prefer to say that Linus's recent statement isn't pertinent to the Gluster x ext4 issue (as I do), or that the ext4 developers are being hypocritical / ignoring Linus's orientation (as you do), or anything similar, it isn't really relevant any more. This argument could have been important in March 2012, the month the ext4 change was applied.

Today, in March 2013, either the Gluster devs decide to declare Gluster incompatible with ext4 and state it clearly in its installation and migration documentation, or they fix the current issues with ext4. No matter what is done, it should have been done months ago.

Regards,
Rodrigo Severo

On Mon, Mar 11, 2013 at 2:49 PM, John Mark Walker johnm...@redhat.com wrote:

I know where this statement came from. I believe you are both:

- trying to apply some statement to a context it's not pertinent to, and

No, it's actually quite applicable. I'm aware of the context of that statement by Linus, and it applies to this case. Kernel devs, at least the ext4 maintainers, are being hypocritical. There were a few exchanges between Ted Ts'o and Avati, among other people, on gluster-devel. I highly recommend you read them: http://lists.nongnu.org/archive/html/gluster-devel/2013-02/msg00050.html

- fooling yourself and/or others by arguing that this issue will/should be fixed in the kernel.

This is probably true. I'm *this* close to declaring that, at least for the Gluster community, ext4 is considered harmful. There's a reason Red Hat started pushing XFS over ext4 a few years ago. And Red Hat isn't alone here.

The ext4 hash size change was applied in the kernel a year ago. I don't believe it will be undone. Gluster developers could argue that this change was hard on them, and that it shouldn't have been backported to Enterprise kernels, but after one year, not having fixed it is on the Gluster developers. Arguing otherwise seems rather foolish to me.

I think that's a legitimate argument to make. This is a conversation that is worth taking up on gluster-devel. But I'm not sure what can be done about it, seeing as how the ext4 maintainers are not likely to make the change. Frankly, dropping ext4 as an FS we can recommend solves a lot of headaches.
-JM

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Slow read performance
On 03/11/2013 12:02 PM, Thomas Wakefield wrote:

Is there a way to make a ramdisk support extended attributes?

When I need a ramdisk, I usually do it this way:

* Use dd if=/dev/zero ... to create a large file in tmpfs, which can use swap but will behave like a ramdisk as long as there's memory for it.
* Make the file appear as a disk using losetup.
* Use mkfs on the pseudo-disk and mount it.

The semantics, therefore, are the semantics of whatever filesystem you used to format the pseudo-disk. That includes xattrs if you used ext4, XFS, etc.

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
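[Editor's note: the recipe above can be sketched as follows. The image path, size, loop device, and mount point are illustrative; creating and formatting the file needs no privileges, while the losetup/mount steps need root and are therefore shown commented out.]

```shell
# Sketch of the tmpfs + losetup + mkfs "ramdisk with xattrs" recipe.
# /dev/shm is tmpfs on most Linux distributions.
IMG=${IMG:-/dev/shm/ramdisk.img}
dd if=/dev/zero of="$IMG" bs=1M count=64 status=none   # 64 MB backing file in RAM
mkfs.ext4 -q -F "$IMG"                                 # format the file; ext4 supports xattrs
# The remaining steps need root:
#   losetup /dev/loop0 "$IMG"         # expose the file as a block device
#   mount /dev/loop0 /mnt/ramdisk     # xattr-capable "ramdisk"
```

Once mounted, the loop-backed filesystem honors whatever xattr support the chosen mkfs provides, which is exactly why this works where plain tmpfs may not.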
Re: [Gluster-users] Slow read performance
I know where this statement came from. I believe you are both:

- trying to apply some statement to a context it's not pertinent to, and
- fooling yourself and/or others by arguing that this issue will/should be fixed in the kernel.

The ext4 hash size change was applied in the kernel a year ago. I don't believe it will be undone. Gluster developers could argue that this change was hard on them, and that it shouldn't have been backported to Enterprise kernels, but after one year, not having fixed it is on the Gluster developers. Arguing otherwise seems rather foolish to me.

Regards,
Rodrigo Severo

On Mon, Mar 11, 2013 at 2:00 PM, Joe Julian j...@julianfamily.org wrote:

This bug is in the kernel. "If a change results in user programs breaking, it's a bug in the kernel. We never EVER blame the user programs." - Linus Torvalds, http://lkml.org/lkml/2012/12/23/75

On 03/08/2013 12:42 PM, Stephan von Krawczynski wrote:

I really do wonder why this bug in _glusterfs_ is not fixed. It really makes no sense to do an implementation that breaks on the most used fs on linux. And just as you said: don't wait on btrfs, it will never be production-ready. And xfs is no solution, it is just a bad work-around.

On Fri, 8 Mar 2013 10:43:41 -0800 Bryan Whitehead dri...@megahappy.net wrote:

Here are some details about ext4 changes in the kernel screwing up glusterfs:
http://www.gluster.org/2012/08/glusterfs-bit-by-ext4-structure-change/
https://bugzilla.redhat.com/show_bug.cgi?id=838784

I thought I read there was a work-around in recent versions of gluster, but I think it came at a cost somewhere. I'm not sure, since I've been using xfs since the 1.x days of gluster and only see random ext3/4 problems bubble up on these mailing lists. In general, ext4 was just a stopgap for the wait on btrfs getting fleshed out. That said, I don't see ext4 going away for a long, long time. :-/

NOTE: I don't even know if this is your problem. You might try updating 2 bricks that are replica pairs to use xfs, then do some performance tests on files living on them to confirm. For example, you have 20-some servers/bricks. If hostD and hostE are replica pairs for some subset of files: shut down glusterd on hostD, change the fs to xfs, fire glusterd back up, let it resync and recover all the files, do the same on hostE (once hostD is good), then see if there is a read speed improvement for files living on those two host pairs.

On Fri, Mar 8, 2013 at 6:40 AM, Thomas Wakefield tw...@cola.iges.org wrote:

I am still confused how ext4 is suddenly slow to read when it's behind Gluster, but plenty fast reading stand-alone? And it writes really fast from both the server and the client.

On Mar 8, 2013, at 4:07 AM, Jon Tegner teg...@renget.se wrote:

We had issues with ext4 a bit less than a year ago; at that time I upgraded the servers to CentOS-6.2. But that gave us large problems (more than slow reads). Since I didn't want to reformat the disks at that time (and switch to XFS), I went back to CentOS-5.5 (which we had used before). On some link (I think it was https://bugzilla.redhat.com/show_bug.cgi?id=713546 but can't seem to reach that now) it was stated that the ext4 issue was present even on later versions of CentOS-5 (I _think_ 5.8 was affected).

Is there hope that the ext4 issue will be solved in later kernels/versions of gluster? If not, it seems one is eventually forced to switch to XFS.

Regards,
/jon

On Mar 8, 2013, at 03:27, Thomas Wakefield tw...@iges.org wrote:

inode size is 256. Pretty stuck with these settings and ext4. I missed the memo that Gluster started to prefer xfs; back in the 2.x days xfs was not the preferred filesystem. At this point it's a 340TB filesystem with 160TB used. I just added more space, and was doing some follow-up testing and wasn't impressed with the results. But I am sure I was happier with the performance before. Still running CentOS 5.8.

Anything else I could look at?

Thanks, Tom

On Mar 7, 2013, at 5:04 PM, Bryan Whitehead dri...@megahappy.net wrote:

I'm sure you know, but xfs is the recommended filesystem for glusterfs. Ext4 has a number of issues (particularly on CentOS/Red Hat 6). The default inode size for ext4 (and xfs) is small for the number of extended attributes glusterfs uses. This causes a minor hit in performance on xfs if the extended attributes grow beyond 256 bytes (the xfs default size). In xfs, this is fixed by setting the size of an inode to 512. How big the impact is on ext4 is something I don't know offhand, but looking at a couple of boxes I have, it looks like some ext4 filesystems have a 128-byte inode size and some a 256-byte inode size (both of which are too small for glusterfs). The performance hit is that every time extended attributes need to be read, several inodes need to be seeked and
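[Editor's note: the inode-size point above can be tried out without touching a real brick. The sketch below formats a throwaway ext4 image file (no root needed; paths are illustrative) with 512-byte inodes and reads the setting back. Whether 512-byte inodes help ext4 as much as they help xfs is exactly the open question in the post; the mkfs.xfs equivalent for a real brick device is shown as a comment.]

```shell
# Format a scratch image with 512-byte inodes and verify the setting.
IMG=${IMG:-$(mktemp)}
dd if=/dev/zero of="$IMG" bs=1M count=16 status=none
mkfs.ext4 -q -F -I 512 "$IMG"          # -I 512: extra room for inline xattrs
dumpe2fs -h "$IMG" 2>/dev/null | grep "Inode size"
# On a real brick device, the xfs equivalent (per the post) would be:
#   mkfs.xfs -i size=512 /dev/sdX
```

Checking an existing ext4 brick is the same dumpe2fs command pointed at the brick's block device.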
Re: [Gluster-users] Slow read performance
Joe Julian j...@julianfamily.org writes:

This bug is in the kernel. "If a change results in user programs breaking, it's a bug in the kernel. We never EVER blame the user programs." - Linus Torvalds, http://lkml.org/lkml/2012/12/23/75

Understood. However, there was an update to your post here: http://www.gluster.org/2012/08/glusterfs-bit-by-ext4-structure-change/ :

"It [a patchset for gluster addressing the ext4 changes] is still being actively worked on, though, and is a high priority."

The 'high priority' bit raised a few expectations; that was more than 6 months ago. While I feel like the gluster devs don't owe me anything, this issue does affect pretty much every user with an ext3/4 brick. There (to my recollection) hasn't been any guidance on how the community should address this in their installations. It's pretty clear in the Red Hat Storage docs that Gluster has XFS (and lvm) as a hard dependency... but the 3.3 admin guide doesn't say anything useful (about fs choice), and the 3.1/3.2 docs used to recommend ext4 as a well-tested option.

It doesn't seem like there's any talk of reverting the commit in the mainline kernel. It seems to be very useful for preventing hash collisions for a lot of kNFS+ext3/4 workflows, and it's been backported to the enterprise distributions. It's not a gluster bug, but it's here to stay. What do we do now?

(Rhetorical question; I am removing bricks, reformatting with xfs, and then re-adding/rebalancing... slowly.)

--
Shawn Nock (OpenPGP: 0x65118FA5)

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Slow read performance
- Original Message -

I know where this statement came from. I believe you are both:

* trying to apply some statement to a context it's not pertinent to, and

No, it's actually quite applicable. I'm aware of the context of that statement by Linus, and it applies to this case. Kernel devs, at least the ext4 maintainers, are being hypocritical. There were a few exchanges between Ted Ts'o and Avati, among other people, on gluster-devel. I highly recommend you read them: http://lists.nongnu.org/archive/html/gluster-devel/2013-02/msg00050.html

* fooling yourself and/or others by arguing that this issue will/should be fixed in the kernel.

This is probably true. I'm *this* close to declaring that, at least for the Gluster community, ext4 is considered harmful. There's a reason Red Hat started pushing XFS over ext4 a few years ago. And Red Hat isn't alone here.

The ext4 hash size change was applied in the kernel a year ago. I don't believe it will be undone. Gluster developers could argue that this change was hard on them, and that it shouldn't have been backported to Enterprise kernels, but after one year, not having fixed it is on the Gluster developers. Arguing otherwise seems rather foolish to me.

I think that's a legitimate argument to make. This is a conversation that is worth taking up on gluster-devel. But I'm not sure what can be done about it, seeing as how the ext4 maintainers are not likely to make the change. Frankly, dropping ext4 as an FS we can recommend solves a lot of headaches.

-JM

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Slow read performance
As I read this: https://bugzilla.redhat.com/show_bug.cgi?id=838784 the bug is, from Gluster's POV, between ext4 and the kernel. Is this correct, that Gluster can't safely use ext4 on recent kernels until ext4's relationship with the kernel is fixed?

If Gluster can't be simply patched to fix this, and we want to use ext4 (for which there are good arguments, including a rich toolset), should we be leaning more on the ext4 developers? I see there's a wiki we might document the brokenness on at https://ext4.wiki.kernel.org/ which mentions nothing about Gluster compatibility, and a bugzilla at https://bugzilla.kernel.org/buglist.cgi?product=File+System&bug_status=NEW&bug_status=REOPENED&bug_status=ASSIGNED&component=ext4 Searching for "ext Gluster" there gives "Zarro Boogs found", though. Should someone who can make a fuller report on this bug than I can be making sure that the ext4 project is focused on this problem?

Update: Got interrupted before sending this off. I see from other emails since that Ts'o has been leaned on, and apparently doesn't want to fix ext4?! I know Ted's got rank. But should we collectively be trying to push this to Linus's attention? I'm unclear whether for practical purposes Ted just _is_ ext4, or whether his co-developers count there.

Whit

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Slow read performance
If you prefer to say that Linus's recent statement isn't pertinent to the Gluster x ext4 issue (as I do), or that the ext4 developers are being hypocritical / ignoring Linus's orientation (as you do), or anything similar, it isn't really relevant any more. This argument could have been important in March 2012, the month the ext4 change was applied.

Today, in March 2013, either the Gluster devs decide to declare Gluster incompatible with ext4 and state it clearly in its installation and migration documentation, or they fix the current issues with ext4. No matter what is done, it should have been done months ago.

Regards,
Rodrigo Severo

On Mon, Mar 11, 2013 at 2:49 PM, John Mark Walker johnm...@redhat.com wrote:

I know where this statement came from. I believe you are both:

- trying to apply some statement to a context it's not pertinent to, and

No, it's actually quite applicable. I'm aware of the context of that statement by Linus, and it applies to this case. Kernel devs, at least the ext4 maintainers, are being hypocritical. There were a few exchanges between Ted Ts'o and Avati, among other people, on gluster-devel. I highly recommend you read them: http://lists.nongnu.org/archive/html/gluster-devel/2013-02/msg00050.html

- fooling yourself and/or others by arguing that this issue will/should be fixed in the kernel.

This is probably true. I'm *this* close to declaring that, at least for the Gluster community, ext4 is considered harmful. There's a reason Red Hat started pushing XFS over ext4 a few years ago. And Red Hat isn't alone here.

The ext4 hash size change was applied in the kernel a year ago. I don't believe it will be undone. Gluster developers could argue that this change was hard on them, and that it shouldn't have been backported to Enterprise kernels, but after one year, not having fixed it is on the Gluster developers. Arguing otherwise seems rather foolish to me.

I think that's a legitimate argument to make. This is a conversation that is worth taking up on gluster-devel. But I'm not sure what can be done about it, seeing as how the ext4 maintainers are not likely to make the change. Frankly, dropping ext4 as an FS we can recommend solves a lot of headaches.

-JM

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Slow read performance
This is clearly something Linus should support (forcing an ext4 fix). There is an ethos Linus always champions, and that is *never* break userspace. NEVER. Clearly this ext4 change has broken userspace. GlusterFS is not in the kernel at all, and this change has broken it.

On Mon, Mar 11, 2013 at 11:34 AM, Rodrigo Severo rodr...@fabricadeideias.com wrote:

If you prefer to say that Linus's recent statement isn't pertinent to the Gluster x ext4 issue (as I do), or that the ext4 developers are being hypocritical / ignoring Linus's orientation (as you do), or anything similar, it isn't really relevant any more. This argument could have been important in March 2012, the month the ext4 change was applied.

Today, in March 2013, either the Gluster devs decide to declare Gluster incompatible with ext4 and state it clearly in its installation and migration documentation, or they fix the current issues with ext4. No matter what is done, it should have been done months ago.

Regards,
Rodrigo Severo

On Mon, Mar 11, 2013 at 2:49 PM, John Mark Walker johnm...@redhat.com wrote:

I know where this statement came from. I believe you are both:

- trying to apply some statement to a context it's not pertinent to, and

No, it's actually quite applicable. I'm aware of the context of that statement by Linus, and it applies to this case. Kernel devs, at least the ext4 maintainers, are being hypocritical. There were a few exchanges between Ted Ts'o and Avati, among other people, on gluster-devel. I highly recommend you read them: http://lists.nongnu.org/archive/html/gluster-devel/2013-02/msg00050.html

- fooling yourself and/or others by arguing that this issue will/should be fixed in the kernel.

This is probably true. I'm *this* close to declaring that, at least for the Gluster community, ext4 is considered harmful. There's a reason Red Hat started pushing XFS over ext4 a few years ago. And Red Hat isn't alone here.

The ext4 hash size change was applied in the kernel a year ago. I don't believe it will be undone. Gluster developers could argue that this change was hard on them, and that it shouldn't have been backported to Enterprise kernels, but after one year, not having fixed it is on the Gluster developers. Arguing otherwise seems rather foolish to me.

I think that's a legitimate argument to make. This is a conversation that is worth taking up on gluster-devel. But I'm not sure what can be done about it, seeing as how the ext4 maintainers are not likely to make the change. Frankly, dropping ext4 as an FS we can recommend solves a lot of headaches.

-JM

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Slow read performance
On Mon, Mar 11, 2013 at 3:46 PM, Bryan Whitehead dri...@megahappy.net wrote:

This is clearly something Linus should support (forcing an ext4 fix). There is an ethos Linus always champions, and that is *never* break userspace. NEVER. Clearly this ext4 change has broken userspace. GlusterFS is not in the kernel at all, and this change has broken it.

Apparently, one year after the change made it into the kernel, you believe this argument is still relevant. I don't, really don't.

Rodrigo Severo

On Mon, Mar 11, 2013 at 11:34 AM, Rodrigo Severo rodr...@fabricadeideias.com wrote:

If you prefer to say that Linus's recent statement isn't pertinent to the Gluster x ext4 issue (as I do), or that the ext4 developers are being hypocritical / ignoring Linus's orientation (as you do), or anything similar, it isn't really relevant any more. This argument could have been important in March 2012, the month the ext4 change was applied.

Today, in March 2013, either the Gluster devs decide to declare Gluster incompatible with ext4 and state it clearly in its installation and migration documentation, or they fix the current issues with ext4. No matter what is done, it should have been done months ago.

Regards,
Rodrigo Severo

On Mon, Mar 11, 2013 at 2:49 PM, John Mark Walker johnm...@redhat.com wrote:

I know where this statement came from. I believe you are both:

- trying to apply some statement to a context it's not pertinent to, and

No, it's actually quite applicable. I'm aware of the context of that statement by Linus, and it applies to this case. Kernel devs, at least the ext4 maintainers, are being hypocritical. There were a few exchanges between Ted Ts'o and Avati, among other people, on gluster-devel. I highly recommend you read them: http://lists.nongnu.org/archive/html/gluster-devel/2013-02/msg00050.html

- fooling yourself and/or others by arguing that this issue will/should be fixed in the kernel.

This is probably true. I'm *this* close to declaring that, at least for the Gluster community, ext4 is considered harmful. There's a reason Red Hat started pushing XFS over ext4 a few years ago. And Red Hat isn't alone here.

The ext4 hash size change was applied in the kernel a year ago. I don't believe it will be undone. Gluster developers could argue that this change was hard on them, and that it shouldn't have been backported to Enterprise kernels, but after one year, not having fixed it is on the Gluster developers. Arguing otherwise seems rather foolish to me.

I think that's a legitimate argument to make. This is a conversation that is worth taking up on gluster-devel. But I'm not sure what can be done about it, seeing as how the ext4 maintainers are not likely to make the change. Frankly, dropping ext4 as an FS we can recommend solves a lot of headaches.

-JM

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Slow read performance
Which is why we don't run Rodigux

On 03/11/2013 12:02 PM, Rodrigo Severo wrote:

On Mon, Mar 11, 2013 at 3:46 PM, Bryan Whitehead dri...@megahappy.net wrote:

This is clearly something Linus should support (forcing an ext4 fix). There is an ethos Linus always champions, and that is *never* break userspace. NEVER. Clearly this ext4 change has broken userspace. GlusterFS is not in the kernel at all, and this change has broken it.

Apparently, one year after the change made it into the kernel, you believe this argument is still relevant. I don't, really don't.

Rodrigo Severo

On Mon, Mar 11, 2013 at 11:34 AM, Rodrigo Severo rodr...@fabricadeideias.com wrote:

If you prefer to say that Linus's recent statement isn't pertinent to the Gluster x ext4 issue (as I do), or that the ext4 developers are being hypocritical / ignoring Linus's orientation (as you do), or anything similar, it isn't really relevant any more. This argument could have been important in March 2012, the month the ext4 change was applied.

Today, in March 2013, either the Gluster devs decide to declare Gluster incompatible with ext4 and state it clearly in its installation and migration documentation, or they fix the current issues with ext4. No matter what is done, it should have been done months ago.

Regards,
Rodrigo Severo

On Mon, Mar 11, 2013 at 2:49 PM, John Mark Walker johnm...@redhat.com wrote:

I know where this statement came from. I believe you are both:

* trying to apply some statement to a context it's not pertinent to, and

No, it's actually quite applicable. I'm aware of the context of that statement by Linus, and it applies to this case. Kernel devs, at least the ext4 maintainers, are being hypocritical. There were a few exchanges between Ted Ts'o and Avati, among other people, on gluster-devel. I highly recommend you read them: http://lists.nongnu.org/archive/html/gluster-devel/2013-02/msg00050.html

* fooling yourself and/or others by arguing that this issue will/should be fixed in the kernel.

This is probably true. I'm *this* close to declaring that, at least for the Gluster community, ext4 is considered harmful. There's a reason Red Hat started pushing XFS over ext4 a few years ago. And Red Hat isn't alone here.

The ext4 hash size change was applied in the kernel a year ago. I don't believe it will be undone. Gluster developers could argue that this change was hard on them, and that it shouldn't have been backported to Enterprise kernels, but after one year, not having fixed it is on the Gluster developers. Arguing otherwise seems rather foolish to me.

I think that's a legitimate argument to make. This is a conversation that is worth taking up on gluster-devel. But I'm not sure what can be done about it, seeing as how the ext4 maintainers are not likely to make the change. Frankly, dropping ext4 as an FS we can recommend solves a lot of headaches.

-JM

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Slow read performance
On Mon, Mar 11, 2013 at 4:04 PM, Joe Julian j...@julianfamily.org wrote:

Which is why we don't run Rodigux

Oh Joe, that remark sounds rather inappropriate to me. Apparently we disagree on more levels than just kernel and application compatibility policies.

Regards,
Rodrigo Severo

On 03/11/2013 12:02 PM, Rodrigo Severo wrote:

On Mon, Mar 11, 2013 at 3:46 PM, Bryan Whitehead dri...@megahappy.net wrote:

This is clearly something Linus should support (forcing an ext4 fix). There is an ethos Linus always champions, and that is *never* break userspace. NEVER. Clearly this ext4 change has broken userspace. GlusterFS is not in the kernel at all, and this change has broken it.

Apparently, one year after the change made it into the kernel, you believe this argument is still relevant. I don't, really don't.

Rodrigo Severo

On Mon, Mar 11, 2013 at 11:34 AM, Rodrigo Severo rodr...@fabricadeideias.com wrote:

If you prefer to say that Linus's recent statement isn't pertinent to the Gluster x ext4 issue (as I do), or that the ext4 developers are being hypocritical / ignoring Linus's orientation (as you do), or anything similar, it isn't really relevant any more. This argument could have been important in March 2012, the month the ext4 change was applied.

Today, in March 2013, either the Gluster devs decide to declare Gluster incompatible with ext4 and state it clearly in its installation and migration documentation, or they fix the current issues with ext4. No matter what is done, it should have been done months ago.

Regards,
Rodrigo Severo

On Mon, Mar 11, 2013 at 2:49 PM, John Mark Walker johnm...@redhat.com wrote:

I know where this statement came from. I believe you are both:

- trying to apply some statement to a context it's not pertinent to, and

No, it's actually quite applicable. I'm aware of the context of that statement by Linus, and it applies to this case. Kernel devs, at least the ext4 maintainers, are being hypocritical. There were a few exchanges between Ted Ts'o and Avati, among other people, on gluster-devel. I highly recommend you read them: http://lists.nongnu.org/archive/html/gluster-devel/2013-02/msg00050.html

- fooling yourself and/or others by arguing that this issue will/should be fixed in the kernel.

This is probably true. I'm *this* close to declaring that, at least for the Gluster community, ext4 is considered harmful. There's a reason Red Hat started pushing XFS over ext4 a few years ago. And Red Hat isn't alone here.

The ext4 hash size change was applied in the kernel a year ago. I don't believe it will be undone. Gluster developers could argue that this change was hard on them, and that it shouldn't have been backported to Enterprise kernels, but after one year, not having fixed it is on the Gluster developers. Arguing otherwise seems rather foolish to me.

I think that's a legitimate argument to make. This is a conversation that is worth taking up on gluster-devel. But I'm not sure what can be done about it, seeing as how the ext4 maintainers are not likely to make the change. Frankly, dropping ext4 as an FS we can recommend solves a lot of headaches.

-JM

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Slow read performance
- Original Message -

On Mon, Mar 11, 2013 at 4:04 PM, Joe Julian j...@julianfamily.org wrote:

Which is why we don't run Rodigux

Oh Joe, that remark sounds rather inappropriate to me.

Agreed - rule #1 on gluster.org is 'be respectful'. You made a valid point.

-JM

___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Slow read performance
I apologize. I normally tend to try to be much more eloquent with my debates. I woke up this morning to learn that the CentOS 6.4 rollout broke all my end-user stations (yes, I have to do automatic updates; I just don't have time to review every package and do everything else I need to do all by myself). Put 200 employees without computers on my shoulders and I tend to stress a little until it's resolved. I took a pot shot and it was uncalled for. Please forgive me. On 03/11/2013 12:10 PM, Rodrigo Severo wrote: On Mon, Mar 11, 2013 at 4:04 PM, Joe Julian j...@julianfamily.org wrote: Which is why we don't run Rodigux Oh Joe, that remark sounds rather inappropriate to me. Apparently we disagree on more levels than just kernel and application compatibility policies. Regards, Rodrigo Severo On 03/11/2013 12:02 PM, Rodrigo Severo wrote: On Mon, Mar 11, 2013 at 3:46 PM, Bryan Whitehead dri...@megahappy.net wrote: This is clearly something Linus should support (forcing an ext4 fix). There is an ethos Linus always champions, and that is *never* break userspace. NEVER. Clearly this ext4 change has broken userspace. GlusterFS is not in the kernel at all, and this change has broken it. Apparently, one year after the change made it into the kernel, you believe this argument is still relevant. I don't, really don't. Rodrigo Severo On Mon, Mar 11, 2013 at 11:34 AM, Rodrigo Severo rodr...@fabricadeideias.com wrote: Whether you prefer to say that Linus's recent statement isn't pertinent to the Gluster x ext4 issue (as I do), or that the ext4 developers are being hypocritical/ignoring Linus's orientation (as you do), or anything similar, isn't really relevant any more. This argument could have been important in March 2012, the month the ext4 change was applied. 
Today, in March 2013, the Gluster devs should either declare Gluster incompatible with ext4, stating it clearly in the installation and migration documentation, or fix the current issues with ext4. No matter which is done, it should have been done months ago. Regards, Rodrigo Severo On Mon, Mar 11, 2013 at 2:49 PM, John Mark Walker johnm...@redhat.com wrote: I know where this statement came from. I believe you are both: * trying to apply some statement in a context it's not pertinent to and No, it's actually quite applicable. I'm aware of the context of that statement by Linus, and it applies to this case. Kernel devs, at least the ext4 maintainers, are being hypocritical. There were a few exchanges between Ted Ts'o and Avati, among other people, on gluster-devel. I highly recommend you read them: http://lists.nongnu.org/archive/html/gluster-devel/2013-02/msg00050.html * fooling yourself and/or others arguing that this issue will/should be fixed in the kernel. This is probably true. I'm *this* close to declaring that, at least for the Gluster community, ext4 is considered harmful. There's a reason Red Hat started pushing XFS over ext4 a few years ago. And Red Hat isn't alone here. The ext4 hash size change was applied in the kernel a year ago. I don't believe it will be undone. Gluster developers could argue that this change was hard on them, and that it shouldn't have been backported to Enterprise kernels, but after one year, not having fixed it is on Gluster developers. Arguing otherwise seems rather foolish to me. I think that's a legitimate argument to make. This is a conversation that is worth taking up on gluster-devel. But I'm not sure what can be done about it, seeing as how the ext4 maintainers are not likely to make the change. Frankly, dropping ext4 as an FS we can recommend solves a lot of headaches. -JM ___ Gluster-users mailing list Gluster-users@gluster.org
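The technical core of the dispute above is ext4's switch to returning full 64-bit hash values as directory offsets, which GlusterFS could no longer round-trip through its own offset encoding. A toy Python sketch (the bit widths and packing here are an illustration, not Gluster's actual code) shows why truncating a 64-bit d_off breaks readdir resumption:

```python
# Toy illustration of the ext4 d_off problem (assumed bit layout, not
# Gluster's actual encoding). The distribute layer packs a brick index
# into the high bits of the directory offset it hands back to readdir;
# if the filesystem later returns offsets that already use those bits,
# the packing silently destroys information.

BRICK_BITS = 3                      # room for 8 bricks in this toy
OFF_BITS = 64 - BRICK_BITS
OFF_MASK = (1 << OFF_BITS) - 1

def encode(d_off, brick):
    """Pack a brick index into the high bits of a directory offset."""
    return (brick << OFF_BITS) | (d_off & OFF_MASK)

def decode(packed):
    """Recover (d_off, brick) from a packed offset."""
    return packed & OFF_MASK, packed >> OFF_BITS

# Old ext4: offsets stayed well below the stolen bits -- lossless.
small = 0x7FFFFFFF
assert decode(encode(small, 5)) == (small, 5)

# New ext4: a full 64-bit hash offset loses its top bits, so the offset
# passed back on the next getdents() no longer matches what ext4
# produced -- directory listings can loop or skip entries.
huge = 0xDEADBEEFCAFEF00D
off_back, brick = decode(encode(huge, 5))
assert brick == 5
assert off_back != huge             # information was destroyed
```

The fix Gluster eventually shipped had to make this round trip lossless again rather than assume the filesystem leaves the high bits free.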
Re: [Gluster-users] Slow read performance
Have you run oprofile on the client and server simultaneously to see if there's some race condition developing? Obviously the NFS client is fine, so it's clear that there's nothing wrong with the hardware. oprofile will at least reveal where the bits are vacationing and may point to a specific bottleneck. See oprofile.sf.net for docs and examples (pretty good); it's fairly easy to set up to profile applications, a bit more trouble if you're trying to profile kernel interactions, but it looks like you might not have to. I wouldn't want to forklift 160TB either. My sympathies. hjm On Thursday, March 07, 2013 09:27:42 PM Thomas Wakefield wrote: inode size is 256. Pretty stuck with these settings and ext4. I missed the memo that Gluster started to prefer xfs; back in the 2.x days xfs was not the preferred filesystem. At this point it's a 340TB filesystem with 160TB used. I just added more space, and was doing some follow-up testing, and wasn't impressed with the results. But I am sure I was happier before with the performance. Still running CentOS 5.8. Anything else I could look at? Thanks, Tom ... --- Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine [m/c 2225] / 92697 Google Voice Multiplexer: (949) 478-4487 415 South Circle View Dr, Irvine, CA, 92697 [shipping] MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps) --- ___ Gluster-users mailing list Gluster-users@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] Slow read performance
I really do wonder why this bug in _glusterfs_ is not fixed. It really makes no sense to do an implementation that breaks on the most-used fs on Linux. And just as you said: don't wait on btrfs, it will never be production-ready. And xfs is no solution, it is just a bad work-around. On Fri, 8 Mar 2013 10:43:41 -0800 Bryan Whitehead dri...@megahappy.net wrote: Here are some details about ext4 changes in the kernel screwing up glusterfs: http://www.gluster.org/2012/08/glusterfs-bit-by-ext4-structure-change/ https://bugzilla.redhat.com/show_bug.cgi?id=838784 I thought I read there was a work-around in recent versions of gluster, but I think it came at a cost somewhere. I'm not sure, since I've been using xfs since the 1.x days of gluster and only see random ext3/4 problems bubble up on these mailing lists. In general, ext4 was just a stopgap for the wait on btrfs getting fleshed out. That said, I don't see ext4 going away for a long, long time. :-/ NOTE: I don't even know if this is your problem. You might try updating 2 bricks that are replica pairs to use xfs, then do some performance tests on files living on them to confirm. For example, say you have 20-some servers/bricks. If hostD and hostE are replica pairs for some subset of files, shut down glusterd on hostD, change the fs to xfs, fire glusterd back up - let it resync and recover all the files, do the same on hostE (once hostD is good), then see if there is a read speed improvement for files living on those two host pairs. On Fri, Mar 8, 2013 at 6:40 AM, Thomas Wakefield tw...@cola.iges.org wrote: I am still confused how ext4 is suddenly slow to read when it's behind Gluster, but plenty fast reading standalone? And it writes really fast from both the server and client. On Mar 8, 2013, at 4:07 AM, Jon Tegner teg...@renget.se wrote: We had issues with ext4 a bit less than a year ago; at that time I upgraded the servers to CentOS-6.2. But that gave us large problems (more than slow reads). 
Since I didn't want to reformat the disks at that time (and switch to XFS), I went back to CentOS-5.5 (which we had used before). On some link (I think it was https://bugzilla.redhat.com/show_bug.cgi?id=713546 but can't seem to reach that now) it was stated that the ext4 issue was present even on later versions of CentOS-5 (I _think_ 5.8 was affected). Is there hope that the ext4 issue will be solved in later kernels/versions of gluster? If not, it seems one is eventually forced to switch to XFS. Regards, /jon On Mar 8, 2013 03:27 Thomas Wakefield tw...@iges.org wrote: inode size is 256. Pretty stuck with these settings and ext4. I missed the memo that Gluster started to prefer xfs; back in the 2.x days xfs was not the preferred filesystem. At this point it's a 340TB filesystem with 160TB used. I just added more space, and was doing some follow-up testing, and wasn't impressed with the results. But I am sure I was happier before with the performance. Still running CentOS 5.8. Anything else I could look at? Thanks, Tom On Mar 7, 2013, at 5:04 PM, Bryan Whitehead dri...@megahappy.net wrote: I'm sure you know, but xfs is the recommended filesystem for glusterfs. Ext4 has a number of issues (particularly on CentOS/Red Hat 6). The default inode size for ext4 (and xfs) is small for the number of extended attributes glusterfs uses. This causes a minor hit in performance on xfs if the extended attributes grow beyond 256 bytes (the xfs default inode size). In xfs, this is fixed by setting the size of an inode to 512. How big the impact is on ext4 is something I don't know offhand. But looking at a couple of boxes I have, it looks like some ext4 filesystems have a 128-byte inode size and some have 256 (both of which are too small for glusterfs). The performance hit is that every time extended attributes need to be read, extra seeks are needed to find them outside the inode. Run dumpe2fs -h blockdevice | grep size on your ext4 mountpoints. 
If it is not too much of a bother, I'd try xfs as your filesystem for the bricks: mkfs.xfs -i size=512 blockdevice Please see this for more detailed info: https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Storage/2.0/html-single/Administration_Guide/index.html#chap-User_Guide-Setting_Volumes On Thu, Mar 7, 2013 at 12:08 PM, Thomas Wakefield tw...@cola.iges.org wrote: Everything is built as ext4, no options other than lazy_itable_init=1 when I built the filesystems. Server mount example: LABEL=disk2a /storage/disk2a ext4 defaults 0 0 Client mount: fs-disk2:/shared /shared glusterfs defaults 0 0 Remember, the slow reads are only from gluster clients; the disks are really fast when I am
Re: [Gluster-users] Slow read performance
How are you doing the read/write tests on the fuse/glusterfs mountpoint? Many small files will be slow because all the time is spent coordinating locks. On Wed, Feb 27, 2013 at 9:31 AM, Thomas Wakefield tw...@cola.iges.org wrote: Help please- I am running 3.3.1 on CentOS using a 10GbE network. I get reasonable write speeds, although I think they could be faster. But my read speeds are REALLY slow. Executive summary: On gluster client- Writes average about 700-800MB/s Reads average about 70-80MB/s On server- Writes average about 1-1.5GB/s Reads average about 2-3GB/s Any thoughts? Here are some additional details: Nothing interesting in any of the log files; everything is very quiet. All servers had no other load, and all clients are performing the same way. Volume Name: shared Type: Distribute Volume ID: de11cc19-0085-41c3-881e-995cca244620 Status: Started Number of Bricks: 26 Transport-type: tcp Bricks: Brick1: fs-disk2:/storage/disk2a Brick2: fs-disk2:/storage/disk2b Brick3: fs-disk2:/storage/disk2d Brick4: fs-disk2:/storage/disk2e Brick5: fs-disk2:/storage/disk2f Brick6: fs-disk2:/storage/disk2g Brick7: fs-disk2:/storage/disk2h Brick8: fs-disk2:/storage/disk2i Brick9: fs-disk2:/storage/disk2j Brick10: fs-disk2:/storage/disk2k Brick11: fs-disk2:/storage/disk2l Brick12: fs-disk2:/storage/disk2m Brick13: fs-disk2:/storage/disk2n Brick14: fs-disk2:/storage/disk2o Brick15: fs-disk2:/storage/disk2p Brick16: fs-disk2:/storage/disk2q Brick17: fs-disk2:/storage/disk2r Brick18: fs-disk2:/storage/disk2s Brick19: fs-disk2:/storage/disk2t Brick20: fs-disk2:/storage/disk2u Brick21: fs-disk2:/storage/disk2v Brick22: fs-disk2:/storage/disk2w Brick23: fs-disk2:/storage/disk2x Brick24: fs-disk3:/storage/disk3a Brick25: fs-disk3:/storage/disk3b Brick26: fs-disk3:/storage/disk3c Options Reconfigured: performance.write-behind: on performance.read-ahead: on performance.io-cache: on performance.stat-prefetch: on performance.quick-read: on cluster.min-free-disk: 500GB nfs.disable: off 
sysctl.conf settings for 10GbE # increase TCP max buffer size settable using setsockopt() net.core.rmem_max = 67108864 net.core.wmem_max = 67108864 # increase Linux autotuning TCP buffer limit net.ipv4.tcp_rmem = 4096 87380 67108864 net.ipv4.tcp_wmem = 4096 65536 67108864 # increase the length of the processor input queue net.core.netdev_max_backlog = 25 # recommended default congestion control is htcp net.ipv4.tcp_congestion_control=htcp # recommended for hosts with jumbo frames enabled net.ipv4.tcp_mtu_probing=1 Thomas W. Sr. Systems Administrator COLA/IGES tw...@cola.iges.org Affiliate Computer Scientist GMU ___ Gluster-users mailing list Gluster-users@gluster.org http://supercolony.gluster.org/mailman/listinfo/gluster-users
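The dd figures traded in this thread are easy to skew with the page cache: without a flush on the write side and a cache drop before the read side, the numbers measure RAM, not the storage path. A rough Python equivalent of a cache-honest dd test (path and sizes are placeholders; fsync plays the role of dd's conv=fdatasync):

```python
# Minimal write/read throughput sketch. The fsync makes the write
# timing include the flush to storage; the read, without dropping
# caches first (root-only: echo 3 > /proc/sys/vm/drop_caches), will
# mostly measure the page cache.
import os
import time

path = "/tmp/gluster_bench.bin"          # placeholder path
size = 8 * 1024 * 1024                   # 8 MiB
block = b"\0" * (1024 * 1024)

t0 = time.time()
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
for _ in range(size // len(block)):
    os.write(fd, block)
os.fsync(fd)                             # like dd conv=fdatasync
os.close(fd)
wr = (size / (1024 ** 2)) / max(time.time() - t0, 1e-9)

t0 = time.time()
with open(path, "rb") as f:
    while f.read(1024 * 1024):
        pass
rd = (size / (1024 ** 2)) / max(time.time() - t0, 1e-9)

print("write %.1f MB/s, read %.1f MB/s" % (wr, rd))
os.remove(path)
```

Run against a file on the gluster mount (rather than /tmp) and with caches dropped between passes, this gives numbers comparable to the dd tests quoted earlier in the thread.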
Re: [Gluster-users] Slow read performance
Every time you open/close a file or a directory you will have to wait for locks, which takes time. This is totally expected. Why don't you share what you want to do? iozone benchmarks look like crap, but serving qcow2 files to qemu works fantastically for me. What are you doing? Make a benchmark that does that. If you are going to have many files with a wide variety of sizes, glusterfs/fuse might not be what you are looking for. On Wed, Feb 27, 2013 at 12:56 PM, Thomas Wakefield tw...@cola.iges.org wrote: I have tested everything, small and large files. I have used file sizes ranging from 128k up to multiple-GB files. All the reads are bad. Here is a fairly exhaustive iozone auto test:

                              random  random  bkwd  record  stride
KB  reclen  write  rewrite  read  reread  read  write  read  rewrite  read  fwrite  frewrite  fread  freread

[iozone -a numeric output for file sizes 64 KB through 1 MB and record lengths 4 KB through 512 KB omitted: the columns were fused together in the archive, so the individual figures are not recoverable]
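Bryan's point about per-file lock/metadata overhead can be made concrete with a back-of-envelope model (the 0.5 ms per-file round-trip cost and 1 GB/s wire speed below are illustrative assumptions, not measurements from this thread):

```python
# Effective read throughput when every file pays a fixed metadata cost
# (lookup/open/close round trips) before any data moves over the wire.

def effective_mbps(file_kb, overhead_s=0.0005, wire_mbps=1000.0):
    """Throughput in MB/s for one file of file_kb kilobytes."""
    size_mb = file_kb / 1024.0
    return size_mb / (overhead_s + size_mb / wire_mbps)

for kb in (128, 1024, 64 * 1024, 1024 * 1024):
    print("%8d KB file -> %7.1f MB/s" % (kb, effective_mbps(kb)))
```

With these assumed numbers, a 128 KB file tops out around 200 MB/s while a 1 GB file approaches wire speed, which is why iozone's small-file rows look so much worse than large streaming reads even when the network and disks are fine.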