Re: [Gluster-users] glusterfsd initscript default sequence
On 09/07/2009 12:18 AM, Jeff Evans wrote:
>> I too think S90 is off, although I'm not sure where it should go, or how to make it start glusterfsd before it gets to /etc/fstab mounting?
>
> I think the only way to ensure glusterfsd comes up before fstab mounting (mount -a) is by using the noauto option and then mounting it later in rc.local or whenever you are ready.
>
> In my case, I want glusterfs available ASAP and using S50 was adequate as this is before anything like smb/nfs/httpd starts looking for the mount.

FYI: On Fedora 11, I set glusterfsd to use S20, and have the /etc/fstab mount include the _netdev option. It works. At least on Fedora / RHEL, it looks like _netdev is how you trigger delayed mounting as part of /etc/init.d/netfs (S25, which is later than S20). I'd still like to see the autofs problem addressed - but in the meantime, I have an acceptable solution.

Cheers,
mark

--
Mark Mielke

___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
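A sketch of the fstab approach described above (the volfile path and mountpoint are illustrative, not from the original mail; _netdev is what defers the mount to the netfs initscript on Fedora/RHEL):

```
# /etc/fstab - glusterfs mount deferred until the network
# (and an S20 glusterfsd) is up; paths are examples only
/etc/glusterfs/glusterfs.vol  /mnt/glusterfs  glusterfs  defaults,_netdev  0 0
```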
Re: [Gluster-users] The continuing story ...
> Yep, I experience this exact lock-up state on the 2.x train of GlusterFS with two servers, each with local client, and have so far given up testing :( - I run 1.3 in production which still has problems when one of the servers goes down, and was hoping to move up to 2.x quickly, but can't at the moment.
>
> Every time a new version comes out I update hoping it will be solved.
>
> Because the machine that hangs, hangs so completely one can't ssh in and can't get a proper dump from the process, and any DEBUG log enabled has no information in it either, so I haven't been able to provide anything useful to the team to work from :(

Daniel,

Since you say your machines have glusterfs mounts as well, we would like to know if you can do some debugging by having an open login before you start the filesystem. Once you face the hang, can you tell whether the "hang" is on the backend fs or on the glusterfs mountpoint? You can kill -11 the glusterfsd process and it will dump the pending syscall info in the logfile, which can be of great help.

While the symptoms can be very similar to the issue on this thread, note that this thread is about a system hang where there is no glusterfs mountpoint, and the hang is confirmed to be on the backend fs. We are very much interested in debugging and fixing _glusterfs_ mountpoint hangs.

Avati
Re: [Gluster-users] The continuing story ...
> Although it is clear that the bug itself is a kernel bug, it's also clear that glusterfs is triggering that bug. The same system under the same load but using nfs instead of gluster does not have this problem. This problem also does not happen copying lots of data using scp. Also, I have never seen such hangs in more than 10 years using unix boxes. But the strangest thing is that this is a bug that can make glusterfs totally unusable, and the developers don't seem to worry even about finding what exactly is causing the problem.

I would like to politely disagree with your final statement. In a previous thread we have indeed promised that we will be fixing the timeout techniques to take into consideration the situation where the backend fs is hanging, so that the entire glusterfs volume does not become unusable.

As far as debugging the system hang is concerned, you need to be looking for kernel logs and dmesg output. You really are wasting your time trying to debug a kernel fs hang by looking for logs from a user application. The kernel oops backtrace shows you exactly where the kernel is locking up. Take the backtrace to the kernel developers and they will tell you the next step. It is for this very reason that the kernel supports serial console logging, to extract hints when the system cannot log to files.

It is not that we do not want to help, but there is only so much we can do as a user application. We issue system calls and process the result. The effort needed to programmatically figure out which system call is hanging (with weird and awkwardly implemented ad-hoc timeouts in the code), and the amount of hint you get from that, is far less worthwhile than going directly to the heart of the problem - get the kernel backtrace from a serial console and you will be just one step from your solution.

If you can also post back a link to the thread on the appropriate ML where you post your kernel backtrace, we would be interested to keep a watch on it, or provide more (specific) info if found necessary by those developers. Almost always the kernel backtrace will be sufficient. That is the correct first step for debugging this problem.

Avati
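For the serial console route Avati mentions, the usual setup is a kernel boot parameter plus a second machine capturing the output over a null-modem cable; the device name and baud rate below are common defaults, not Gluster-specific advice:

```
# Append to the kernel line in grub.conf so oops traces also go to serial:
console=ttyS0,115200 console=tty0

# On the capturing machine, e.g.:
# screen /dev/ttyS0 115200
```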
Re: [Gluster-users] running glusterfsd without root
Hi,

On Wed, Sep 9, 2009 at 7:44 AM, Wei Dong wrote:
> Hi All,
>
> Is it possible to run glusterfsd without root? In my machines, without root privilege, glusterfsd always complains that extended attributes are not supported.

You have to be root to run glusterfs.

Regards,
Sachidananda.
[Gluster-users] running glusterfsd without root
Hi All, Is it possible to run glusterfsd without root? In my machines, without root privilege, glusterfsd always complains that extended attributes are not supported. Thanks, - Wei
Re: [Gluster-users] client coherence problem with locks and truncate
Rob, Thanks for reporting this. We are working on it and you can track progress at: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=252 Vikas -- Engineer - http://gluster.com/
Re: [Gluster-users] glusterfsd initscript default sequence
The init script is also wrong if you used a non-default install path. It always points to /usr/sbin/glusterfsd and not your --prefix specified path.

liam

On Sun, Sep 6, 2009 at 9:18 PM, Jeff Evans wrote:
>> In the case that the node is both a server and a client, as I wish to use it (3-node cluster, where each is both a client and server in cluster/replicate configuration), I found that using /etc/fstab to mount and the default glusterfsd initscript of S90 causes the mount to be made before glusterfsd is up.
>
> My scenario exactly.
>
>> In a test I just ran where I restarted all three nodes at the same time, for the server that came up first, it seems the client decided nothing was up.
>
> Yes, and this causes anything that depends upon the glusterfs mount to wait at startup for the FS to become available.
>
>> I too think S90 is off, although I'm not sure where it should go, or how to make it start glusterfsd before it gets to /etc/fstab mounting?
>
> I think the only way to ensure glusterfsd comes up before fstab mounting (mount -a) is by using the noauto option and then mounting it later in rc.local or whenever you are ready.
>
> In my case, I want glusterfs available ASAP and using S50 was adequate as this is before anything like smb/nfs/httpd starts looking for the mount.
>
> Thanks, Jeff.
Re: [Gluster-users] double traffic usage since upgrade?
Any other thoughts on why I'm seeing double the inbound traffic? We've had a large increase in site traffic over the last few weeks, and my outbound traffic has increased to almost 400mbit/sec, which has translated to 800mbit of backend gluster traffic. I'm basically at the limit of gigabit ethernet unless I do bonding. Ideas on how to fix this?

thanks,
liam

On Mon, Aug 17, 2009 at 3:28 PM, Liam Slusser wrote:
> On Mon, Aug 17, 2009 at 7:42 AM, Mark Mielke wrote:
>> On 08/17/2009 08:06 AM, Shehjar Tikoo wrote:
>>> For a start, we've aimed at getting apache and unfs3 to work with booster. The functional support for both in booster is complete in the 2.0.6 release.
>>>
>>> For a list of system calls supported by booster, please see: http://www.gluster.org/docs/index.php/BoosterConfiguration
>>>
>>> There can be applications which need un-boosted syscalls also to be usable over GlusterFS. For such a scenario we have two ways booster can be used. Both approaches are described at the page linked above but in short, you're right in thinking that when the un-supported syscalls are also needed to go over FUSE, we are, as you said, leaking or redirecting calls over the FUSE mount point.
>>
>> Hi Shehjar:
>>
>> That's fine, I think, as long as it is recognized that trapping system call open() as booster is implemented today probably does not trap fopen() on Linux. If apache and unfs3 always call open() directly, and you are trapping this, then your purpose is being served.
>>
>> I was kind of hoping you had found a way around --disable-hidden-plt, so I could steal the idea from you. Too bad. :-)
>>
>> Cheers,
>> mark
>>
>> --
>> Mark Mielke
>
> Just an FYI - I am not using booster at all on our feed boxes; this is just straight fuse and the glusterfs process [with the box we're seeing the traffic doubling on].
>
> liam
[Gluster-users] Is glusterfs replication intended for hard drive failure
After reading all the emails about replication, I have started to worry about hard drive failure. Let's say one hard drive fails and I replace it with a new one. According to the previous discussion, a file is auto-healed only if a client accesses it. I have a lot of files, and it's unlikely that each file will be accessed by some client in a short period of time, so those not accessed will be left unhealed. If the other hard drive dies, then all this unhealed data will be lost. Is that correct? Then what's the right way to deal with hard drive failure? Or is glusterfs simply designed for network failures, assuming reliable underlying storage?

- Wei Dong
Re: [Gluster-users] How does replication work?
On 09/08/2009 01:18 PM, Daniel Maher wrote:
>> For "shared nothing", each node really does need to be fully independent and able to make its own decisions. I think the GlusterFS folk have the model right in this regard. The remaining question is whether they have the *implementation* right. :-)
>
> You're taking my statement too far. :) All i meant was that i don't think the clients should be responsible for replication - that, in my mind, is the job of the servers.

Purposefully so, I think. More like stealing your thread to start one of my own. :-)

But, to stay with yours for a second - shouldn't it be possible to configure GlusterFS such that the server does replication today? That is, the client connects to one of the servers, and the server then has a cluster/replicate volume with one local volume and several remote volumes. Do this on each of the servers. Then, the configuration is for the client to use a cluster/ha volume so that it can connect to multiple servers if one server is down? I haven't tried it myself, but the concept of "servers responsible for replication" seems to be possible to do today. :-)

It also forces the understanding of what replication involves. Ultimately, somebody must do the replication, and ultimately, the client must be able to connect to multiple servers. The real difference between the recommended configuration and the configuration I suggest above is which node is actually responsible for sending (N-1) x each request to the "other" nodes in the replication cluster. Is it client->server bandwidth (client side replication) or server->server bandwidth (server side replication)?

The other questions are which model has the most potential for optimization, and which model has the most potential for automatic failure recovery. I think these answers are a bit grey right now. GlusterFS is pushing the envelope for client side replication. Other solutions such as Lustre give up on one or both of metadata or content replication.

Cheers,
mark

--
Mark Mielke
Re: [Gluster-users] How does replication work?
Mark Mielke wrote:
> For Daniel: For the "seems crazy", compared to what? Every time I look at other solutions such as Lustre and see how they rely on a single metadata server, that itself is supposed to be highly available using other means, I have to ask, are they really solving the high availability problem, or are they just narrowing the scope? If the whole
>
> For "shared nothing", each node really does need to be fully independent and able to make its own decisions. I think the GlusterFS folk have the model right in this regard. The remaining question is whether they have the *implementation* right. :-)

You're taking my statement too far. :) All i meant was that i don't think the clients should be responsible for replication - that, in my mind, is the job of the servers. Basically, i *like* it when the clients are independent, and the servers work together - not the other way around. That's all.

--
Daniel Maher
Re: [Gluster-users] How does replication work?
On 09/08/2009 01:01 PM, Alan Ivey wrote:
> This is for running some "cloud" servers where we want all files available on each machine locally so all servers have the same files but can still get local performance. I don't think I'll need to run a cron like that but it's not my network so I'm trying to figure out how to get all of the gears working together.

Note that unless you are doing mostly reads - or unless you have GigE or better connections to each of these servers - you are not going to "still get local performance". As stat() calls are distributed to multiple machines, and write() operations require replication to multiple machines, these will be significantly slower than local disk, and significantly slower than NFS (which does not do replication). The more machines you have, the worse it will get.

For example, in a test I just did, where we only have 100 Mbit/s between the nodes right now, with a 3-node replication cluster, I was only getting 5 Mbyte/s writes, but 70 - 130 Mbyte/s reads. Why? Because my write to the one server needed to be replicated to 2 other nodes, and if we divide the 100 Mbit/s transmit speed by 2, we get 50 Mbit/s, or roughly 5 Mbyte/s, to each. If you are going to have a "cloud" of 10 servers instead of 3, this means that every write needs to be sent to all 10, or 9 other nodes (depending on whether the client is on the server), which would divide your network upload capacity by 10, or 9. Go up to 100, and it gets even worse.

The machines themselves - every time I do a write, every node in the replication cluster does a write. They're all working in unison. So, if 10 machines in the cluster are all issuing writes, then 10X as many writes are all happening to local disk, which means 10X as many seeks, and 10X less I/O throughput to the disk.

I want to make sure you understand that clustering in this way is not really giving you the ability to have each machine with local disk speeds. For most uses, clustering is really providing failover / redundancy capabilities. However, if your workload is entirely dominated by reads, with very few writes, then this would also provide effective load balancing capabilities.

> I did just discover that changing permissions does not seem to heal. I had two servers up, killed glusterfsd on server2, on the client I changed the permissions of a file, brought server2 back online, ran ls on the client, and server2 still had the old permissions. It's not really a big issue since I don't see myself changing permissions often, esp with any servers down, but interesting nonetheless.

I think those settings are all configurable - I haven't played with them myself. More checks at run time = more expensive at run time.

Cheers,
mark

--
Mark Mielke
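Mark's back-of-envelope arithmetic above can be sketched in a few lines of shell. The 100 Mbit/s link and 3-way replication are his test figures; the 10-bits-per-byte divisor is a common rule of thumb for protocol overhead, and is my assumption, not his:

```shell
# Effective per-replica write throughput when one client uplink must
# carry a copy of every write to each of the other replica nodes.
link_mbit=100                          # client transmit speed (Mbit/s)
replicas=3                             # 3-node replication cluster
copies=$((replicas - 1))               # the client sends 2 remote copies
per_copy_mbit=$((link_mbit / copies))  # 50 Mbit/s toward each replica
per_copy_mbyte=$((per_copy_mbit / 10)) # ~5 Mbyte/s at ~10 bits/byte on the wire
echo "each replica sees ~${per_copy_mbyte} Mbyte/s of writes"
```

With 10 replicas the divisor becomes 9 or 10, which is why the per-replica figure collapses as the cloud grows.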
Re: [Gluster-users] How does replication work?
Daniel Maher wrote,
> Well, auto-healing only needs to happen if one or more of the storage bricks was inaccessible - otherwise files stay synchronised via replication as normal.
>
> If you're considering running a process in order to trigger self-heal all the time, then presumably something is really wrong with your network, and you should probably address that before trying to get Gluster going. :)

This is for running some "cloud" servers where we want all files available on each machine locally so all servers have the same files but can still get local performance. I don't think I'll need to run a cron like that but it's not my network so I'm trying to figure out how to get all of the gears working together.

I did just discover that changing permissions does not seem to heal. I had two servers up, killed glusterfsd on server2, on client I changed the permissions of a file, brought server2 back online, ran ls on client, and server2 still had the old permissions. It's not really a big issue since I don't see myself changing permissions often, esp with any servers down, but interesting nonetheless.
Re: [Gluster-users] How does replication work?
On 09/08/2009 04:14 AM, Daniel Maher wrote:
> Alan Ivey wrote:
>> Like the subject implies, how does replication work exactly?
>>
>> If a client is the only one that has the IP addresses defined for the servers, does that mean that only a client writing a file ensures that it goes to both servers? That would tell me that the servers don't directly communicate with each other for replication.
>>
>> If so, how does healing work? Since the client is the only configuration with the multiple server IP addresses, is it the client's "task" to make sure the server heals itself once it's back online? If not, how do the servers know each other exist if not for the client config file?
>
> You've answered your own question. :) AFAIK, in the recommended simple replication scenario, the client is actually responsible for replication, as each server is functionally independent. (This seems crazy to me, but yes, that's how it works.)

For Alan: Active healing should only be necessary if the system is not working properly. Healing should only be required after a system crash or bug, a GlusterFS server or client crash or bug, or somebody messing around with the backing store file system underneath. For systems that are up and running without problems, healing should be completely unnecessary.

For Daniel: For the "seems crazy", compared to what? Every time I look at other solutions such as Lustre and see how they rely on a single metadata server, that itself is supposed to be highly available using other means, I have to ask, are they really solving the high availability problem, or are they just narrowing the scope? If the whole cluster of 2 to 1000 nodes is relying on a single server being up, this is the weakest link. Sure, having one weakest link to deal with is easier to solve using traditional means than having 1000 weakest links, but it seems clear that Lustre has not SOLVED the problem. They've just reduced it to something that might be more manageable.
Even the "traditional means" of shared disk storage such as GFS and OCFS rely on a single piece of hardware - the shared storage. As a result, they make the shared storage really expensive - dual interfaces, dual power supplies, dual disks, ... but it's still one piece of hardware that everything else is reliant on.

For "shared nothing", each node really does need to be fully independent and able to make its own decisions. I think the GlusterFS folk have the model right in this regard. The remaining question is whether they have the *implementation* right. :-) Right now they seem to be in a compromised position between simplicity, performance, and correctness. It seems it is a difficult problem to have all three no matter which model is selected (shared disk, shared metadata only, shared nothing). The self-healing is a good feature, but they seem to be leaning on it to provide correctness, so that they can provide performance with some amount of simplicity.

An example here is how directory listings come from "the first up server". In theory, we could have correctness through self-healing if directory listing always queried all servers. The combined directory listing would be shown, and self-healing would kick off in the background. But this would cost performance, as all servers in the cluster would be involved in directory listing. This is just one example.

I think GlusterFS has a lot of potential to close off holes such as these. I don't think it would be difficult to add in things like an automatic election model for defining which machines are considered stable and the safest masters to use (the simplest might be 'the one with the highest glusterfsd uptime'?), having clients choose to pull things like directory listings only from the first stable / safest master, and having the non-stable / non-safe machines go into automatic full self-heal until they are back up-to-date with the master.

In such a model, I'd like to see the locks being held against the stable/safe masters used for reads. Just throwing stuff out there...

For me, I'm looking at this as - I have a problem to solve, and very few solutions seem to meet my requirements. GlusterFS looks very close. Do I write my own, which would probably start out only solving my requirements, and since my requirements will probably grow, this would mean eventually writing something the size of GlusterFS? Or do I start looking into this GlusterFS thing - point out the problems, and see if I can help? I'm leaning towards the latter - try it out, point out the problems, see if I can help.

As it is, I think GlusterFS is very stable with sufficient performance for the requirements of most potential users. It's the people who are really trying to push it to its limits that are causing the majority of the breakage being reported here. For these people, which includes me, I've looked around - and the solutions out there that are competitive are eith
Re: [Gluster-users] How does replication work?
Alan Ivey wrote:
> Thanks for the reply Daniel. I've been experimenting with it this morning and specifically the auto-healing feature. I've found out that it really only auto-heals when I ls the client directory. That's the only time I was able to get the second server to catch up with the files that were written while the second server was down. I was hoping that any operation performed on the client would cause the second server to catch all the way up; creating a new file replicated it on both machines, but not the files created in the meantime.
>
> So, my question now is, what operations cause the auto-healing to execute? I was only able to get it to catch up when running ls on the client side. I'm envisioning using this as HA-NFS, and so if I want all servers to always have the correct files and perms, should I create a cron to run ls on the client directory every minute?
>
> According to the documentation at http://www.gluster.com/community/documentation/index.php/Understanding_AFR_Translator#Frequently_Asked_Questions , it looks like indeed running an ls is a sure-fire way to self-heal all directories. Without this intervention, what glusterfs client commands will cause this to happen? Thanks again!

Well, auto-healing only needs to happen if one or more of the storage bricks was inaccessible - otherwise files stay synchronised via replication as normal. If you're considering running a process in order to trigger self-heal all the time, then presumably something is really wrong with your network, and you should probably address that before trying to get Gluster going. :)

--
Daniel Maher
Re: [Gluster-users] How does replication work?
>> Like the subject implies, how does replication work exactly?
>>
>> If a client is the only one that has the IP addresses defined for the servers, does that mean that only a client writing a file ensures that it goes to both servers? That would tell me that the servers don't directly communicate with each other for replication.
>>
>> If so, how does healing work? Since the client is the only configuration with the multiple server IP addresses, is it the client's "task" to make sure the server heals itself once it's back online?
>>
>> If not, how do the servers know each other exist if not for the client config file?
>
> You've answered your own question. :) AFAIK, in the recommended simple replication scenario, the client is actually responsible for replication, as each server is functionally independent.
> (This seems crazy to me, but yes, that's how it works.)

Thanks for the reply Daniel. I've been experimenting with it this morning and specifically the auto-healing feature. I've found out that it really only auto-heals when I ls the client directory. That's the only time I was able to get the second server to catch up with the files that were written while the second server was down. I was hoping that any operation performed on the client would cause the second server to catch all the way up; creating a new file replicated it on both machines, but not the files created in the meantime.

So, my question now is, what operations cause the auto-healing to execute? I was only able to get it to catch up when running ls on the client side. I'm envisioning using this as HA-NFS, and so if I want all servers to always have the correct files and perms, should I create a cron to run ls on the client directory every minute?

According to the documentation at http://www.gluster.com/community/documentation/index.php/Understanding_AFR_Translator#Frequently_Asked_Questions , it looks like indeed running an ls is a sure-fire way to self-heal all directories. Without this intervention, what glusterfs client commands will cause this to happen? Thanks again!
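The cron-driven ls idea above can be made a bit more thorough by stat()ing every entry rather than only listing directories. This is a sketch: the mountpoint path is hypothetical, and whether a plain stat is enough to trigger self-heal on your AFR version is worth verifying against the documentation linked above:

```shell
# Walk a glusterfs mountpoint and stat every entry, so the client
# examines each file and AFR can self-heal stale replicas.
heal_crawl() {
    find "$1" -print0 | xargs -0 stat > /dev/null
}

# Hypothetical usage, e.g. from a cron job:
# heal_crawl /mnt/glusterfs
```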
Re: [Gluster-users] The continuing story ...
Hi

Although it is clear that the bug itself is a kernel bug, it's also clear that glusterfs is triggering that bug. The same system under the same load but using nfs instead of gluster does not have this problem. This problem also does not happen copying lots of data using scp. Also, I have never seen such hangs in more than 10 years using unix boxes. But the strangest thing is that this is a bug that can make glusterfs totally unusable, and the developers don't seem to worry even about finding what exactly is causing the problem.

To make an analogy, suppose that users who have a specific brand of tires on their car complain to the manufacturer about those tires failing too fast where other tires have no problem, and the manufacturer says that this is due to road defects and does nothing to improve the tires. It's clear what will happen to this manufacturer, right?

--
Best regards ...

David Saez Padros
http://www.ols.es

On-Line Services 2000 S.L.
telf +34 902 50 29 75
Re: [Gluster-users] The continuing story ...
On Tue, 8 Sep 2009 05:37:09 -0700 Anand Avati wrote:

>>> I doubt that this can be a real solution. My guess is that glusterfsd runs into some race condition where it locks itself up completely. It is not funny to debug something like this on a production setup. Best would be to have debugging output sent from the servers' glusterfsd directly to a client to save the logs. I would not count on syslog in this case; if it survives, one could use a serial console for syslog output though.
>
> I'm going to iterate through this yet again at the risk of frustrating you. glusterfsd (on the server side) is yet another process running only system calls. If glusterfsd has a race condition and locks itself up, then it locks _only its own process_ up. What you are having is a frozen system. There is no way glusterfsd can lock up your system through just VFS system calls, even if it wanted to, intentionally. It is a pure user space process and has no power to lock up the system. The worst glusterfsd can do to your system is deadlock its own process, resulting in a glusterfs fuse mountpoint hang, or segfault and result in a core dump.
>
> Please consult system/kernel programmers you trust. Or ask on the kernel-devel mailing list. The system freeze you are facing is not something which can be caused by _any_ user space application.

Please read carefully what I said about the system condition. The fact that I can ping the box means that the kernel is not messed up, i.e. this is no freeze. But that I cannot log in nor use any other user-space software to get hands on the box only means that an application should be able to mess up the userspace to an extent that every other application gets few to no timeslices, or that some system resource is eaten up to an extent that others are simply locked out. That does not sound impossible to me, as it is just like a local DoS attack, which is possible. Maybe one only needs some messed up pointers to create such a situation.

What really bothers me more is the fact that you continuously deny to see what several people on the list have described. It is not our intention to waste someone's time; we try to give as much information as possible to go out and find the problem. Unfortunately we cannot do that job, because we don't have the background knowledge about your code. Since it all is userspace, maybe it would be helpful to have a version that just outputs logs to serial, so that we can trace where it went before things blew up. Maybe we can watch it cycling somewhere...

Do you really deny that a local DoS attack is generally possible?

--
Regards,
Stephan
Re: [Gluster-users] The continuing story ...
On Tue, Sep 08, 2009 at 05:37:09AM -0700, Anand Avati wrote: > >> > I doubt that this can be a real solution. My guess is that glusterfsd > >> > runs > >> > into some race condition where it locks itself up completely. > >> > It is not funny to debug something the like on a production setup. Best > >> > would > >> > be to have debugging output sent from the servers' glusterfsd directly > >> > to a > >> > client to save the logs. I would not count on syslog in this case, if it > >> > survives one could use a serial console for syslog output though. > > I'm going to iterate through this yet again at the risk of frustrating > you. glusterfsd (on the server side) is yet another process running > only system calls. If glusterfsd has a race condition and locks itself > up, then it locks _only its own process_ up. What you are having is a > frozen system. There is no way glusterfsd can lock up your system > through just VFS system calls, even if it wanted to, intentionally. It > is a pure user space process and has no power to lock up the system. > The worst glusterfsd can do to your system is deadlock its own process > resulting in a glusterfs fuse mountpoint hang, or segfault and result > in a core dump. It appears OP has no core-dump. It appears OP has no gluster logs. It appears OP cannot log in/ ssh to observe results, but instead must cold boot. Debugging opportunities are getting slim. Are there kernel instrumention utils that OP can use, to determine one or more of: - file descriptors running out - thread deadlock condition occurring - some other kernel level subsystem failure - eg networking, fs, scheduler/memory ??? I have been watching closely. I am potential gluster user, monitoring this situation - thanks to all parties for ongoing analysis and patience in this case. Gluster appears to be a new technology, with excellent potential. 
Regards Zenaan -- Homepage: www.SoulSound.net -- Free Australia: www.UPMART.org Please respect the confidentiality of this email as sensibly warranted.
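One of the failure modes listed above - file descriptors running out - can be checked from userspace while the box is still responsive. Below is a minimal sketch, not from the thread itself: it is Linux-only, and the helper names are my own invention. It samples the system-wide file-handle counters from /proc:

```python
#!/usr/bin/env python
# Hypothetical helper (not from this thread): sample the Linux system-wide
# file-handle counters to check for the "file descriptors running out"
# failure mode. Linux-only; reads /proc/sys/fs/file-nr.

def parse_file_nr(text):
    """Parse the contents of /proc/sys/fs/file-nr.

    The file holds three whitespace-separated integers:
    allocated handles, free handles, and the system-wide maximum."""
    allocated, free, maximum = (int(field) for field in text.split()[:3])
    return allocated, free, maximum

def fd_usage(text):
    """Return the fraction of the system-wide handle limit in use."""
    allocated, free, maximum = parse_file_nr(text)
    return (allocated - free) / float(maximum)

if __name__ == "__main__":
    with open("/proc/sys/fs/file-nr") as proc_file:
        usage = fd_usage(proc_file.read())
    print("system file-handle usage: %.1f%%" % (usage * 100.0))
```

Run it from cron, or in a loop over ssh from a second machine, so the last sample before a freeze survives somewhere other than the frozen host.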
Re: [Gluster-users] The continuing story ...
>> > I doubt that this can be a real solution. My guess is that glusterfsd
>> > runs into some race condition where it locks itself up completely.
>> > It is not fun to debug something like this on a production setup. Best
>> > would be to have debugging output sent from the servers' glusterfsd
>> > directly to a client to save the logs. I would not count on syslog in
>> > this case; if it survives, one could use a serial console for syslog
>> > output though.

I'm going to iterate through this yet again at the risk of frustrating you. glusterfsd (on the server side) is yet another process running only system calls. If glusterfsd has a race condition and locks itself up, then it locks _only its own process_ up. What you are having is a frozen system. There is no way glusterfsd can lock up your system through just VFS system calls, even if it wanted to, intentionally. It is a pure user space process and has no power to lock up the system. The worst glusterfsd can do to your system is deadlock its own process, resulting in a glusterfs fuse mountpoint hang, or segfault and result in a core dump.

Please consult system/kernel programmers you trust, or ask on the kernel-devel mailing list. The system freeze you are facing is not something which can be caused by _any_ user space application. The correlation you see - that the freeze happens only when glusterfsd is running - does NOT make glusterfsd _responsible_ for it. I'm not sure if you understand how user processes and kernels work and interact with each other.

Think of this almost-perfect analogy: if you have an ftp daemon on a system and your system ends up freezing in the way you describe, you blame the kernel, not the ftp daemon. glusterfsd is no different from an ftp daemon in terms of how potentially disastrous it can be. glusterfs has other bugs, we admit it, but what you are describing here is really a problem in the kernel.
I say this confidently because glusterfsd CANNOT freeze a system, even if it wanted to, intentionally. It is a user-space process. If glusterfs has bugs, then it segfaults or the process hangs; that is fundamentally very different from a system lockup. As far as your problem is concerned, we can point you to the right place if you can report kernel/dmesg logs. Please understand that even if we wanted to somehow solve your server lock-up problem by some hypothetical fix in glusterfs, it is just not possible, even theoretically. The fix you need is not in glusterfs; a system lockup is never fixed in a userspace application.

> The system acts as a pure server for both glusterfs and nfs. It has no
> fuse nor nfs client mount points.

However, if you are facing hangs on the glusterfs fuse mountpoint, then it is very likely a glusterfs bug. We are very much interested to hear about those issues.

Avati
Re: [Gluster-users] The continuing story ...
On Tue, 8 Sep 2009 03:23:37 -0700 Anand Avati wrote:
> > I doubt that this can be a real solution. My guess is that glusterfsd
> > runs into some race condition where it locks itself up completely.
> > It is not fun to debug something like this on a production setup. Best
> > would be to have debugging output sent from the servers' glusterfsd
> > directly to a client to save the logs. I would not count on syslog in
> > this case; if it survives, one could use a serial console for syslog
> > output though.
>
> Does the system which is locking up have a fuse mountpoint? Or is it a
> pure glusterfsd export server without a glusterfs mountpoint?
>
> Avati

The system acts as a pure server for both glusterfs and nfs. It has no fuse nor nfs client mount points.

-- Regards, Stephan
Re: [Gluster-users] development occurs and is tested on what environment ?
> I am curious to know what environment the developers of GlusterFS use to
> develop and test themselves? If, for example, releases are being pushed
> from tests done on CentOS 5.2 x64, or Ubuntu 8.04 32bit, or whatever.
>
> If I'm going to set up any new gluster machines, I'd like them to be as
> close to the proven environment as possible. :)

We mostly use CentOS 5.2 for our testing.

Avati
Re: [Gluster-users] The continuing story ...
> I doubt that this can be a real solution. My guess is that glusterfsd
> runs into some race condition where it locks itself up completely.
> It is not fun to debug something like this on a production setup. Best
> would be to have debugging output sent from the servers' glusterfsd
> directly to a client to save the logs. I would not count on syslog in
> this case; if it survives, one could use a serial console for syslog
> output though.

Does the system which is locking up have a fuse mountpoint? Or is it a pure glusterfsd export server without a glusterfs mountpoint?

Avati
[Gluster-users] development occurs and is tested on what environment ?
Hello, I am curious to know what environment the developers of GlusterFS use to develop and test themselves? If, for example, releases are being pushed from tests done on CentOS 5.2 x64, or Ubuntu 8.04 32bit, or whatever.

If I'm going to set up any new gluster machines, I'd like them to be as close to the proven environment as possible. :)

-- Daniel Maher
[Gluster-users] The continuing story ... YAU (Yet Another Update)
Hello all, as we have something like a pseudo-stable base for testing (1 client, 1 server), we tried some performance enhancement and added the following to the client setup:

volume cache
  type performance/io-cache
  option cache-size 64MB
  option priority *.cfg:3,*:1
# option cache-timeout 2
  subvolumes writebehind
end-volume

With this in place the client goes crazy within around 3 hours. We could not save the logs for that run because it produced around 4 GB of them, and we could not save the setup without deleting it. As you can see we also use writebehind, so maybe it's a combined problem. Without writebehind we cannot test the setup at all, because it becomes very slow and the runtime of our scripts no longer fits into its 5-minute slot.

-- Regards, Stephan
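For context, the stanza above sits on top of a write-behind translator. A full client volfile with that stacking might look like the following sketch - the protocol/client volume, its options, and the server address are placeholders I have assumed, not Stephan's actual configuration:

```text
# Hypothetical client volfile showing the translator stack described above:
# protocol/client at the bottom, write-behind in the middle, io-cache on top.
volume client
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.0.1        # placeholder server address
  option remote-subvolume brick         # placeholder export name
end-volume

volume writebehind
  type performance/write-behind
  subvolumes client
end-volume

volume cache
  type performance/io-cache
  option cache-size 64MB
  option priority *.cfg:3,*:1
  subvolumes writebehind
end-volume
```

Each translator names the one below it in `subvolumes`, so the mount sees the top volume (`cache`) and requests flow down the stack.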
Re: [Gluster-users] The continuing story ...
On Tue, 8 Sep 2009 10:13:17 +1000 (EST) "Jeff Evans" wrote:
> > - server was ping'able
> > - glusterfsd was disconnected by the client because of missing
> >   ping-pong - no login possible
> > - no fs action (no lights on the hd-stack)
> > - no screen (was blank, stayed blank)
>
> This is very similar to what I have seen many times (even back on 1.3),
> and have also commented on the list.
>
> It seems that we have quite a few ACKs on this, or similar problems.
>
> The only thing different in my scenario is that the console doesn't
> stay blank. When attempting to log in I get the last-login message, and
> nothing more, no prompt ever. Also, I can see that other processes are
> still listening on sockets etc., so it seems like the kernel just
> can't grab new FDs.
>
> I too found the hang happens more easily if a downed node from a
> replicate pair re-joins after some time.
>
> Following suggestions that this is all kernel related, I have just
> moved up to RHEL 5.4 in the hope that the new kernel will help.
>
> This fix stood out as potentially related for me:
> https://bugzilla.redhat.com/show_bug.cgi?id=44543

This is an ext3 fix; it is unlikely that we run into a similar effect on reiserfs3, as the two are really very different in internals and coding.

> We also have a broadcom network card, which had reports of hangs under
> load, the kernel has a patch for that too.

We used tg3 in this setup, but the load was not very high (below 10 MBit on a 1000 MBit link).

> If I still run into the hangs, I'll try xfs.

I doubt that this can be a real solution. My guess is that glusterfsd runs into some race condition where it locks itself up completely. It is not fun to debug something like this on a production setup. Best would be to have debugging output sent from the servers' glusterfsd directly to a client to save the logs. I would not count on syslog in this case; if it survives, one could use a serial console for syslog output though.

> Thanks, Jeff.
-- Regards, Stephan
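The serial-console fallback Stephan mentions is worth spelling out, since it is the one channel that tends to survive a hard freeze. A sketch follows; the port, baud rate, and file paths are assumptions to adjust for your distribution:

```text
# Kernel command line (e.g. in grub.conf): mirror console messages to the
# first serial port so oops/panic output reaches a second machine even
# after the disks stop responding.
console=tty0 console=ttyS0,115200n8

# /etc/syslog.conf: additionally copy all syslog traffic to the serial
# device; a terminal program on the attached machine records it.
*.*    /dev/ttyS0
```

With a null-modem cable to a second box running a logger, the last messages before a lockup are preserved off-host.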
Re: [Gluster-users] How does replication work?
Alan Ivey wrote:
> Like the subject implies, how does replication work exactly? If a client
> is the only one that has the IP addresses defined for the servers, does
> that mean that only a client writing a file ensures that it goes to both
> servers? That would tell me that the servers don't directly communicate
> with each other for replication. If so, how does healing work? Since the
> client is the only configuration with the multiple server IP addresses,
> is it the client's "task" to make sure the server heals itself once it's
> back online? If not, how do the servers know each other exist if not for
> the client config file?

You've answered your own question. :) AFAIK, in the recommended simple replication scenario, the client is actually responsible for replication, as each server is functionally independent. (This seems crazy to me, but yes, that's how it works.)

-- Daniel Maher
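A sketch of the client-side setup Daniel describes - all hostnames and brick names below are placeholders, not a configuration from this thread. The cluster/replicate (AFR) translator on the client fans each write out to both servers:

```text
volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host server1.example.com   # placeholder
  option remote-subvolume brick            # placeholder export
end-volume

volume remote2
  type protocol/client
  option transport-type tcp
  option remote-host server2.example.com   # placeholder
  option remote-subvolume brick
end-volume

volume replicate
  type cluster/replicate
  subvolumes remote1 remote2
end-volume
```

Healing is likewise client-driven: when a downed server rejoins, the next client access to a file compares the replicas and re-syncs the stale copy.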