Re: [Gluster-users] glusterfsd initscript default sequence

2009-09-08 Thread Mark Mielke

On 09/07/2009 12:18 AM, Jeff Evans wrote:
>> I too think S90 is off,
>> although I'm not sure where it should go, or how to make it start
>> glusterfsd before it gets to /etc/fstab mounting?
>
> I think the only way to ensure glusterfsd comes up before fstab
> mounting (mount -a) is by using the noauto option and then mounting it
> later in rc.local or whenever you are ready.
>
> In my case, I want glusterfs available ASAP and using S50 was adequate,
> as this is before anything like smb/nfs/httpd starts looking for the
> mount.


FYI: On Fedora 11, I set glusterfsd to use S20, and have the /etc/fstab 
mount include the _netdev option. It works.


At least on Fedora / RHEL, it looks like _netdev is how you trigger 
delayed mounting as part of /etc/init.d/netfs (S25, which is later than 
S20).
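
For reference, the relevant pieces on my boxes look roughly like this
(the volfile path and mountpoint are from my setup; adjust to yours):

  # /etc/fstab
  /etc/glusterfs/glusterfs.vol  /mnt/glusterfs  glusterfs  defaults,_netdev  0 0

plus the glusterfsd initscript linked in at S20, e.g.:

  ln -sf ../init.d/glusterfsd /etc/rc3.d/S20glusterfsd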


I'd still like to see the autofs problem addressed - but in the meantime,
I have an acceptable solution.


Cheers,
mark

--
Mark Mielke



Re: [Gluster-users] The continuing story ...

2009-09-08 Thread Anand Avati
> Yep, I experience this exact lock-up state on the 2.x train of GlusterFS
> with two servers, each with a local client, and have so far given up testing :(
> - I run 1.3 in production, which still has problems when one of the servers
> goes down, and was hoping to move up to 2.x quickly, but can't at the moment.
>
>  Every time a new version comes out I update hoping it will be solved.
>
>  Because the machine that hangs does so completely: one can't ssh in,
> can't get a proper dump from the process, and any DEBUG log enabled has
> no information in it either, so I haven't been able to provide anything
> useful to the team to work from :(

Daniel,
Since you say your machines have glusterfs mounts as well, we would
like you to do some debugging: keep a login open from before you start
the filesystem, and once you face the hang, tell us whether the "hang"
is on the backend fs or on the glusterfs mountpoint. You can kill -11
the glusterfsd process and it will dump the pending syscall info in the
logfile, which can be of great help.
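
Something like this (the log path is from a default install and may
differ on your system):

  kill -11 $(pidof glusterfsd)
  tail -f /var/log/glusterfs/glusterfsd.log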

While the symptoms can be very similar to the issue on this thread,
note that this thread is about a system hang where there is no glusterfs
mountpoint and the hang is confirmed to be on the backend fs. We are
very much interested in debugging and fixing _glusterfs_ mountpoint hangs.

Avati


Re: [Gluster-users] The continuing story ...

2009-09-08 Thread Anand Avati
>  Although it is clear that the bug itself is a kernel bug, it is also
>  clear that glusterfs is triggering that bug. The same system under
>  the same load, but using nfs instead of gluster, does not have this
>  problem. The problem also does not happen when copying lots of data
>  using scp. Also, I have never seen hangs like this in more than
>  10 years of using unix boxes. But the strangest thing is that this
>  is a bug that can make glusterfs totally unusable, and the developers
>  do not even seem to care about finding out what exactly is causing the
>  problem.

I would like to politely disagree with your final statement. In a
previous thread we did in fact promise to fix the timeout handling to
take into account the situation where the backend fs is hanging, so
that the entire glusterfs volume does not become unusable.

As far as debugging the system hang is concerned, you need to be
looking for kernel logs and dmesg output. You really are wasting your
time trying to debug a kernel fs hang by looking for logs from a user
application. The kernel oops backtrace shows you exactly where the
kernel is locking up. Take the backtrace to the kernel developers and
they will tell you the next step. It is for this very reason the
kernel supports serial console logging to extract hints when the
system cannot log to files.
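
For example (exact parameters vary by distro), booting with

  console=ttyS0,115200 console=tty0

on the kernel command line sends the oops text out the serial port, and
enabling magic SysRq with

  echo 1 > /proc/sys/kernel/sysrq

lets you dump blocked tasks (Alt+SysRq+w) even when logins hang.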

It is not that we do not want to help, but there is only so much we
can do as a user application. We issue system calls and process the
results. The effort needed to programmatically figure out which system
call is hanging (with weird, awkwardly implemented ad-hoc timeouts in
the code), and the little it would tell you, is worth far less than
going straight to the heart of the problem: get the kernel backtrace
from a serial console and you will be just one step from your solution.

If you post your kernel backtrace to the appropriate mailing list and
send back a link to that thread, we would be interested in keeping a
watch on it, or in providing more specific info if those developers
find it necessary. Almost always the kernel backtrace alone will be
sufficient. That is the correct first step for debugging this problem.

Avati


Re: [Gluster-users] running glustersd without root

2009-09-08 Thread sac
Hi,

On Wed, Sep 9, 2009 at 7:44 AM, Wei Dong wrote:
> Hi All,
>
> Is it possible to run glusterfsd without root? On my machines, without
> root privilege, glusterfsd always complains that extended attributes
> are not supported.

You have to be root to run glusterfs.

Regards,
Sachidananda.


[Gluster-users] running glustersd without root

2009-09-08 Thread Wei Dong

Hi All,

Is it possible to run glusterfsd without root? On my machines, without
root privilege, glusterfsd always complains that extended attributes
are not supported.


Thanks,

- Wei


Re: [Gluster-users] client coherence problem with locks and truncate

2009-09-08 Thread Vikas Gorur
Rob,

Thanks for reporting this. We are working on it and you can track
progress at: http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=252

Vikas
-- 
Engineer - http://gluster.com/



Re: [Gluster-users] glusterfsd initscript default sequence

2009-09-08 Thread Liam Slusser
The init script is also wrong if you used a non-default install path.
It always points to /usr/sbin/glusterfsd rather than the path given by
--prefix.
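
Until that is fixed, a workaround is a symlink (assuming, say,
--prefix=/opt/glusterfs):

  ln -s /opt/glusterfs/sbin/glusterfsd /usr/sbin/glusterfsd

or simply editing the daemon path near the top of the initscript.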

liam


On Sun, Sep 6, 2009 at 9:18 PM, Jeff Evans  wrote:
>
>> In the case that the node is both a server and a client, as I
>> wish to  use it (3-node cluster, where each is both a client and
>> server in  cluster/replicate configuration), I found that using
>> /etc/fstab to mount  and the default glusterfsd initscript of S90
>> causes the mount to be made  before glusterfsd is up.
>
> My scenario exactly.
>
>> In a test I
>> just ran where I restarted all  three nodes at the same time, for
>> the server that came up first, it  seems the client decided
>> nothing was up.
>
> Yes, and this causes anything that depends upon the glusterfs mount to
> wait at startup for the FS to become available.
>
>> I too think S90 is off,
>> although I'm not sure where it should go, or how to make it start
>> glusterfsd before it gets to /etc/fstab mounting?
>
> I think the only way to ensure glusterfsd comes up before fstab
> mounting (mount -a) is by using the noauto option and then mounting it
> later in rc.local or whenever you are ready.
>
> In my case, I want glusterfs available ASAP and using S50 was adequate
> as this is before anything like smb/nfs/httpd starts looking for the
> mount.
>
> Thanks, Jeff.


Re: [Gluster-users] double traffic usage since upgrade?

2009-09-08 Thread Liam Slusser
Any other thoughts on why I'm seeing double the inbound traffic?
We've had a large increase in site traffic over the last few weeks, and
my outbound traffic has increased to almost 400mbit/sec, which has
translated to 800mbit of backend gluster traffic. I'm basically at
the limit of gigabit ethernet unless I do bonding.
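
If I do end up bonding, I assume a RHEL-style setup along these lines
(untested here; interface names are just examples):

  # /etc/modprobe.conf
  alias bond0 bonding
  options bond0 mode=balance-alb miimon=100

  # /etc/sysconfig/network-scripts/ifcfg-eth0 (and likewise eth1)
  DEVICE=eth0
  MASTER=bond0
  SLAVE=yes
  ONBOOT=yes

with the IP address moving to ifcfg-bond0.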

Ideas on how to fix this?

thanks,
liam


On Mon, Aug 17, 2009 at 3:28 PM, Liam Slusser  wrote:
> On Mon, Aug 17, 2009 at 7:42 AM, Mark Mielke wrote:
>> On 08/17/2009 08:06 AM, Shehjar Tikoo wrote:
>>>
>>> For a start, we've aimed at getting apache and unfs3 to work with booster.
>>> The functional support for both in booster is complete in
>>> 2.0.6 release.
>>>
>>> For a list of system calls supported by booster, please see:
>>> http://www.gluster.org/docs/index.php/BoosterConfiguration
>>>
>>> There can be applications which need un-boosted syscalls also to be
>>> usable over GlusterFS. For such a scenario we have two ways booster
>>> can be used. Both approaches are described at the page linked above
>>> but in short, you're right in thinking that when the un-supported
>>> syscalls are also needed to go over FUSE, we are, as you said, leaking
>>> or redirecting calls over the FUSE mount point.
>>>
>>
>> Hi Shehjar:
>>
>> That's fine, I think, as long as it is recognized that trapping system call
>> open() as booster is implemented today probably does not trap fopen() on
>> Linux. If apache and unfs3 always call open() directly, and you are trapping
>> this, then your purpose is being served.
>>
>> I was kind of hoping you had found a way around --disable-hidden-plt, so I
>> could steal the idea from you. Too bad. :-)
>>
>> Cheers,
>> mark
>>
>> --
>> Mark Mielke
>>
>
> Just a FYI - I am not using booster at all on our feed boxes, this is
> just straight fuse and the glusterfs process [with the box we're
> seeing the traffic doubling on].
>
> liam
>


[Gluster-users] Is glusterfs replication intended for hard drive failure

2009-09-08 Thread Wei Dong
After reading all the emails about replication, I have started to worry
about hard drive failure. Let's say one hard drive fails and I replace it
with a new one. According to the previous discussion, a file is
auto-healed only when a client accesses it. I have a lot of files, and
it's unlikely that every file will be accessed by some client within a
short period of time, so those not accessed will be left unhealed. If the
other hard drive then dies, all of this unhealed data will be lost. Is
that correct? If so, what's the right way to deal with hard drive
failure? Or is glusterfs simply designed for network failures only,
assuming reliable underlying storage?

- Wei Dong


Re: [Gluster-users] How does replication work?

2009-09-08 Thread Mark Mielke

On 09/08/2009 01:18 PM, Daniel Maher wrote:
>> For "shared nothing", each node really does need to be fully
>> independent and able to make its own decisions. I think the GlusterFS
>> folk have the model right in this regard.
>>
>> The remaining question is whether they have the *implementation*
>> right. :-)
>
> You're taking my statement too far. :)  All I meant was that I don't
> think the clients should be responsible for replication - that, in my
> mind, is the job of the servers.


Purposefully so, I think. More like stealing your thread to start one of 
my own. :-)


But, to stay with yours for a second -

Shouldn't it be possible to configure GlusterFS such that the server
does replication today? That is, the client connects to one of the
servers, and that server has a cluster/replicate volume with one local
volume and several remote volumes. Do this on each of the servers. Then
configure the client with a cluster/ha volume so that it can connect to
another server if one goes down.
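
A rough, untested sketch of what I mean for each server's volfile
(2.0-style syntax; the host name, export directory, and volume names
here are invented):

volume posix
  type storage/posix
  option directory /data/export
end-volume

volume peer
  type protocol/client
  option transport-type tcp
  option remote-host server2
  option remote-subvolume posix
end-volume

volume replicate
  type cluster/replicate
  subvolumes posix peer
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.posix.allow *
  option auth.addr.replicate.allow *
  subvolumes posix replicate
end-volume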


I haven't tried it myself, but the concept of "servers responsible for 
replication" seems to be possible to do today. :-)


It also forces an understanding of what replication involves.
Ultimately, somebody must do the replication, and ultimately, the client
must be able to connect to multiple servers. The real difference between
the recommended configuration and the one I suggest above is which node
is responsible for sending (N-1) copies of each request to the "other"
nodes in the replication cluster: does it consume client -> server
bandwidth (client-side replication) or server -> server bandwidth
(server-side replication)?


The other questions are which model has the most potential for 
optimization, and which model has the most potential for automatic 
failure recovery. I think these answers are a bit grey right now. 
GlusterFS is pushing the envelope for client side replication. Other 
solutions such as Lustre give up on one or both of metadata or content 
replication.


Cheers,
mark

--
Mark Mielke



Re: [Gluster-users] How does replication work?

2009-09-08 Thread Daniel Maher

Mark Mielke wrote:
> For Daniel: For the "seems crazy", compared to what? Every time I look at
> other solutions such as Lustre and see how they rely on a single
> metadata server, that itself is supposed to be highly available using
> other means, I have to ask: are they really solving the high
> availability problem, or are they just narrowing the scope? If the whole
> [...]
>
> For "shared nothing", each node really does need to be fully independent
> and able to make its own decisions. I think the GlusterFS folk have the
> model right in this regard.
>
> The remaining question is whether they have the *implementation* right. :-)

You're taking my statement too far. :)  All I meant was that I don't
think the clients should be responsible for replication - that, in my
mind, is the job of the servers.

Basically, I *like* it when the clients are independent, and the servers
work together - not the other way around.  That's all.



--
Daniel Maher 


Re: [Gluster-users] How does replication work?

2009-09-08 Thread Mark Mielke

On 09/08/2009 01:01 PM, Alan Ivey wrote:
> This is for running some "cloud" servers where we want all files
> available on each machine locally, so all servers have the same files
> but can still get local performance. I don't think I'll need to run a
> cron like that, but it's not my network, so I'm trying to figure out
> how to get all of the gears working together.


Note that unless you are doing mostly read - or unless you have GigE or 
better connections to each of these servers - you are not going to 
"still get local performance". As stat() calls are distributed to 
multiple machines, and write() operations would require replication to 
multiple machines, these will be significantly slower than local disk, 
and significantly slower than NFS (which does not do replication). The 
more machines you have, the worse it will get.


For example, in a test I just did, where we only have 100 Mbit/s between
the nodes right now, with a 3-node replication cluster, I was only
getting 5 Mbyte/s writes, but 70 - 130 Mbyte/s reads. Why? Because my
write to the one server needed to be replicated to 2 other nodes, and if
we divide the 100 Mbit/s transmit speed by 2, we get 50 Mbit/s, or
roughly 5 Mbyte/s after overhead, to each. If you are going to have a
"cloud" of 10 servers instead of 3, every write needs to be sent to all
10, or 9, other nodes (depending on whether the client is on a server),
which divides your network upload capacity by 10, or 9. Go up to 100,
and it gets even worse.
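
In other words, as a rough rule of thumb:

  client write throughput <= client uplink bandwidth / (number of
  replicas the client must feed)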


As for the machines themselves: every time I do a write, every node in
the replication cluster does a write. They're all working in unison. So
if 10 machines in the cluster are all issuing writes, then 10x as many
writes are hitting each local disk, which means 10x as many seeks and
one tenth of the I/O throughput per disk.


I want to make sure you understand that clustering in this way does not
really give each machine local disk speeds. For most uses, clustering is
really providing failover / redundancy capabilities. However, if your
workload is entirely dominated by reads, with very few writes, then it
would also provide effective load-balancing capabilities.




> I did just discover that changing permissions does not seem to heal. I
> had two servers up, killed glusterfsd on server2, changed the
> permissions of a file on the client, brought server2 back online, ran
> ls on the client, and server2 still had the old permissions. It's not
> really a big issue since I don't see myself changing permissions often,
> especially with any servers down, but it's interesting nonetheless.


I think those settings are all configurable - I haven't played with them 
myself. More checks at run time = more expensive at run time.


Cheers,
mark

--
Mark Mielke



Re: [Gluster-users] How does replication work?

2009-09-08 Thread Alan Ivey
Daniel Maher wrote:

> Well, auto-healing only needs to happen if one or more of the storage bricks 
> was inaccessible - otherwise files stay synchronised via replication as 
> normal.

> If you're considering running a process in order to trigger self-heal all the 
> time, then presumably something is really wrong with your network, and you 
> should probably address that before trying to get Gluster going. :)

This is for running some "cloud" servers where we want all files
available on each machine locally, so all servers have the same files
but can still get local performance. I don't think I'll need to run a
cron like that, but it's not my network, so I'm trying to figure out
how to get all of the gears working together.

I did just discover that changing permissions does not seem to heal. I
had two servers up, killed glusterfsd on server2, changed the
permissions of a file on the client, brought server2 back online, ran
ls on the client, and server2 still had the old permissions. It's not
really a big issue since I don't see myself changing permissions often,
especially with any servers down, but it's interesting nonetheless.


Re: [Gluster-users] How does replication work?

2009-09-08 Thread Mark Mielke

On 09/08/2009 04:14 AM, Daniel Maher wrote:

> Alan Ivey wrote:
>> Like the subject implies, how does replication work exactly?
>>
>> If a client is the only one that has the IP addresses defined for the
>> servers, does that mean that only a client writing a file ensures
>> that it goes to both servers? That would tell me that the servers
>> don't directly communicate with each other for replication.
>>
>> If so, how does healing work? Since the client is the only
>> configuration with the multiple server IP addresses, is it the
>> client's "task" to make sure the server heals itself once it's back
>> online?
>>
>> If not, how do the servers know each other exist if not for the
>> client config file?
>
> You've answered your own question. :)  AFAIK, in the recommended
> simple replication scenario, the client is actually responsible for
> replication, as each server is functionally independent.
>
> (This seems crazy to me, but yes, that's how it works.)


For Alan: Active healing should only be necessary if the system is not 
working properly. Healing should only be required after a system crash 
or bug, a GlusterFS server or client crash or bug, or somebody messing 
around with the backing store file system underneath. For systems that 
are up and running without problems, healing should be completely 
unnecessary.


For Daniel: For the "seems crazy", compared to what? Every time I look at
other solutions such as Lustre and see how they rely on a single
metadata server, that itself is supposed to be highly available using
other means, I have to ask: are they really solving the high
availability problem, or are they just narrowing the scope? If the whole
cluster of 2 to 1000 nodes is relying on a single server being up,
this is the weakest link. Sure, having one weakest link to deal with is
easier to solve using traditional means than having 1000 weakest links,
but it seems clear that Lustre has not SOLVED the problem. They've just
reduced it to something that might be more manageable. Even the
"traditional means" of shared disk storage such as GFS and OCFS rely on
a single piece of hardware: the shared storage. As a result, they make
the shared storage really expensive (dual interfaces, dual power
supplies, dual disks, ...), but it's still one piece of hardware that
everything else is reliant on.


For "shared nothing", each node really does need to be fully independent 
and able to make its own decisions. I think the GlusterFS folk have the 
model right in this regard.


The remaining question is whether they have the *implementation* right. :-)

Right now they seem to be in a compromise position between simplicity,
performance, and correctness. It seems it is a difficult problem to have
all three, no matter which model is selected (shared disk, shared
metadata only, shared nothing). The self-healing is a good feature, but
they seem to be leaning on it to provide correctness, so that they can
provide performance with some amount of simplicity. An example here is
how directory listings come from "the first up server". In theory, we
could have correctness through self-healing if directory listings always
queried all servers. The combined directory listing would be shown, and
self-healing would kick off in the background. But this would cost
performance, as all servers in the cluster would be involved in every
directory listing. This is just one example.


I think GlusterFS has a lot of potential to close off holes such as
these. I don't think it would be difficult to add things like an
automatic election model for defining which machines are considered
stable and the safest masters to use (simplest might be 'the one with
the highest glusterfsd uptime'?), having clients pull things like
directory listings only from the first stable / safest master, and
having the non-stable / non-safe machines go into automatic full
self-heal until they are back up to date with the master. In such a
model, I'd like to see the locks used for reads held against the
stable / safe masters. Just throwing stuff out there...


For me, I'm looking at this as: I have a problem to solve, and very few
solutions seem to meet my requirements. GlusterFS looks very close. Do I
write my own, which would probably start out solving only my
requirements, and, since my requirements will probably grow, eventually
mean writing something the size of GlusterFS? Or do I start looking into
this GlusterFS thing, point out the problems, and see if I can help?


I'm leaning towards the latter - try it out, point out the problems, see 
if I can help.


As it is, I think GlusterFS is very stable with sufficient performance 
for the requirements of most potential users. It's the people who are 
really trying to push it to its limits that are causing the majority of 
the breakage being reported here. For these people, which includes me, 
I've looked around - and the solutions out there that are competitive 
are eith

Re: [Gluster-users] How does replication work?

2009-09-08 Thread Daniel Maher

Alan Ivey wrote:
> Thanks for the reply Daniel. I've been experimenting with it this
> morning, and specifically with the auto-healing feature. I've found
> that it really only auto-heals when I ls the client directory. That's
> the only time I was able to get the second server to catch up with the
> files that were written while it was down. I was hoping that any
> operation performed on the client would cause the second server to
> catch all the way up, but creating a new file only replicated that
> file on both machines, not the files created in the meantime.
>
> So, my question now is: what operations cause the auto-healing to
> execute? I was only able to get it to catch up by running ls on the
> client side. I'm envisioning using this as HA-NFS, so if I want all
> servers to always have the correct files and perms, should I create a
> cron job to run ls on the client directory every minute?
>
> According to the documentation at
> http://www.gluster.com/community/documentation/index.php/Understanding_AFR_Translator#Frequently_Asked_Questions
> it looks like running an ls is indeed a sure-fire way to self-heal all
> directories. Without this intervention, what glusterfs client commands
> will cause this to happen?
>
> Thanks again!


Well, auto-healing only needs to happen if one or more of the storage 
bricks was inaccessible - otherwise files stay synchronised via 
replication as normal.


If you're considering running a process in order to trigger self-heal 
all the time, then presumably something is really wrong with your 
network, and you should probably address that before trying to get 
Gluster going. :)



--
Daniel Maher 


Re: [Gluster-users] How does replication work?

2009-09-08 Thread Alan Ivey
>> Like the subject implies, how does replication work exactly?

>> 
>> If a client is the only one that has the IP addresses defined for the 
>> servers, does that mean that only a client writing a file ensures that it 
>> goes to both servers? That would tell me that the servers don't directly 
>> communicate with each other for replication.
>> 
>> If so, how does healing work? Since the client is the only configuration 
>> with the multiple server IP addresses, is it the client's "task" to make 
>> sure the server heals itself once it's back online?
>> 
>> If not, how do the servers know each other exist if not for the client 
>> config file?

> You've answered your own question. :)  AFAIK, in the recommended simple 
> replication scenario, the client is actually responsible for replication, as 
> each server is functionally independent.

> (This seems crazy to me, but yes, that's how it works.)

Thanks for the reply Daniel. I've been experimenting with it this
morning, and specifically with the auto-healing feature. I've found that
it really only auto-heals when I ls the client directory. That's the
only time I was able to get the second server to catch up with the files
that were written while it was down. I was hoping that any operation
performed on the client would cause the second server to catch all the
way up, but creating a new file only replicated that file on both
machines, not the files created in the meantime.

So, my question now is: what operations cause the auto-healing to
execute? I was only able to get it to catch up by running ls on the
client side. I'm envisioning using this as HA-NFS, so if I want all
servers to always have the correct files and perms, should I create a
cron job to run ls on the client directory every minute?

According to the documentation at
http://www.gluster.com/community/documentation/index.php/Understanding_AFR_Translator#Frequently_Asked_Questions
it looks like running an ls is indeed a sure-fire way to self-heal all
directories. Without this intervention, what glusterfs client commands
will cause this to happen?
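
(The full-traversal recipe I've seen suggested elsewhere for forcing a
complete self-heal is something along the lines of:

  find /mnt/glusterfs -type f -exec head -c1 {} \; > /dev/null

with the mountpoint adjusted to your own.)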

Thanks again!


Re: [Gluster-users] The continuing story ...

2009-09-08 Thread David Saez Padros

Hi

Although it is clear that the bug itself is a kernel bug, it is also
clear that glusterfs is triggering that bug. The same system under
the same load, but using nfs instead of gluster, does not have this
problem. The problem also does not happen when copying lots of data
using scp. Also, I have never seen hangs like this in more than
10 years of using unix boxes. But the strangest thing is that this
is a bug that can make glusterfs totally unusable, and the developers
do not even seem to care about finding out what exactly is causing the
problem.

To make an analogy: suppose that users who have a specific brand of
tires on their cars complain to the manufacturer about those tires
wearing out too fast where other tires have no problem, and the
manufacturer says that this is due to road defects and does nothing to
improve the tires. It's clear what will happen to that manufacturer,
right?

--
Best regards ...


   David Saez Padros            http://www.ols.es
   On-Line Services 2000 S.L.   telf +34 902 50 29 75





Re: [Gluster-users] The continuing story ...

2009-09-08 Thread Stephan von Krawczynski
On Tue, 8 Sep 2009 05:37:09 -0700
Anand Avati  wrote:

> >> > I doubt that this can be a real solution. My guess is that glusterfsd 
> >> > runs
> >> > into some race condition where it locks itself up completely.
> >> > It is not funny to debug something like this on a production setup. Best 
> >> > would
> >> > be to have debugging output sent from the servers' glusterfsd directly 
> >> > to a
> >> > client to save the logs. I would not count on syslog in this case, if it
> >> > survives one could use a serial console for syslog output though.
> 
> I'm going to iterate through this yet again at the risk of frustrating
> you. glusterfsd (on the server side) is yet another process running
> only system calls. If glusterfsd has a race condition and locks itself
> up, then it locks _only its own process_ up. What you are having is a
> frozen system. There is no way glusterfsd can lock up your system
> through just VFS system calls, even if it wanted to, intentionally. It
> is a pure user space process and has no power to lock up the system.
> The worst glusterfsd can do to your system is deadlock its own process
> resulting in a glusterfs fuse mountpoint hang, or segfault and result
> in a core dump.
> 
> Please consult system/kernel programmers you trust. Or ask on the
> kernel-devel mailing list. The system freeze you are facing is not
> something which can be caused by _any_ user space application.

Please read carefully what I said about the system's condition. The fact
that I can ping the box means that the kernel is not messed up, i.e.
this is no freeze. But the fact that I cannot log in, nor use any other
user-space software to get my hands on the box, only means that an
application managed to mess up userspace to the extent that every other
application gets few to no timeslices, or that some system resource is
eaten up to the extent that others are simply locked out. That does not
sound impossible to me, as it is just like a local DoS attack, which is
possible. Maybe one only needs some messed-up pointers to create such a
situation. What really bothers me more is the fact that you continuously
refuse to see what several people on the list have described. It is not
our intention to waste anyone's time; we try to give as much information
as possible to go out and find the problem. Unfortunately we cannot do
that job ourselves, because we don't have the background knowledge of
your code.
Since it is all userspace, maybe it would be helpful to have a version
that just outputs logs to serial, so that we can trace where it went
before things blew up. Maybe we can watch it cycling somewhere...

Do you really deny that a local DoS attack is generally possible? 
-- 
Regards,
Stephan



Re: [Gluster-users] The continuing story ...

2009-09-08 Thread Zenaan Harkness
On Tue, Sep 08, 2009 at 05:37:09AM -0700, Anand Avati wrote:
> >> > I doubt that this can be a real solution. My guess is that glusterfsd 
> >> > runs
> >> > into some race condition where it locks itself up completely.
> >> > It is not funny to debug something like this on a production setup. Best 
> >> > would
> >> > be to have debugging output sent from the servers' glusterfsd directly 
> >> > to a
> >> > client to save the logs. I would not count on syslog in this case, if it
> >> > survives one could use a serial console for syslog output though.
> 
> I'm going to iterate through this yet again at the risk of frustrating
> you. glusterfsd (on the server side) is yet another process running
> only system calls. If glusterfsd has a race condition and locks itself
> up, then it locks _only its own process_ up. What you are having is a
> frozen system. There is no way glusterfsd can lock up your system
> through just VFS system calls, even if it wanted to, intentionally. It
> is a pure user space process and has no power to lock up the system.
> The worst glusterfsd can do to your system is deadlock its own process
> resulting in a glusterfs fuse mountpoint hang, or segfault and result
> in a core dump.

It appears the OP has no core dump.

It appears the OP has no gluster logs.

It appears the OP cannot log in / ssh to observe results, but instead
must cold boot.

Debugging opportunities are getting slim.

Are there kernel instrumentation utils the OP can use, to determine
one or more of:

   -  file descriptors running out
   -  thread deadlock condition occurring
   -  some other kernel level subsystem failure
      -  e.g. networking, fs, scheduler/memory

???
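
(I'd guess at things like watching fd usage via cat
/proc/sys/fs/file-nr, logging vmstat 1 output somewhere that survives a
hang, or Alt+SysRq+w to dump blocked tasks, if the kernel still answers
it.)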

I have been watching closely. I am a potential gluster user, monitoring
this situation. Thanks to all parties for the ongoing analysis and
patience in this case. Gluster appears to be a new technology with
excellent potential.

Regards
Zenaan

-- 
Homepage: www.SoulSound.net -- Free Australia: www.UPMART.org
Please respect the confidentiality of this email as sensibly warranted.


Re: [Gluster-users] The continuing story ...

2009-09-08 Thread Anand Avati
>> > I doubt that this can be a real solution. My guess is that glusterfsd runs
>> > into some race condition where it locks itself up completely.
>> > It is not funny to debug something like this on a production setup. Best 
>> > would
>> > be to have debugging output sent from the servers' glusterfsd directly to a
>> > client to save the logs. I would not count on syslog in this case, if it
>> > survives one could use a serial console for syslog output though.

I'm going to iterate through this yet again at the risk of frustrating
you. glusterfsd (on the server side) is yet another process running
only system calls. If glusterfsd has a race condition and locks itself
up, then it locks _only its own process_ up. What you are having is a
frozen system. There is no way glusterfsd can lock up your system
through just VFS system calls, even if it wanted to, intentionally. It
is a pure user space process and has no power to lock up the system.
The worst glusterfsd can do to your system is deadlock its own process
resulting in a glusterfs fuse mountpoint hang, or segfault and result
in a core dump.

Please consult system/kernel programmers you trust. Or ask on the
kernel-devel mailing list. The system freeze you are facing is not
something which can be caused by _any_ user space application. The
correlation you see that the freeze happens only when glusterfsd is
running does NOT make glusterfsd _responsible_ for it.  I'm not sure
if you understand how user processes and kernels work and interact
with each other. Think of this almost-perfect analogy. If you have an
ftp daemon on a system and your system ends up freezing in the way you
describe, you blame the kernel, not the ftp daemon. glusterfsd is no
different from an ftp daemon in terms of how potentially disastrous it
can be.

glusterfs has other bugs, we admit it, but what you are describing
here is really a problem in the kernel. I say this confidently because
glusterfsd CANNOT freeze a system, even if it wanted to,
intentionally. It is a user-space process. If glusterfs has bugs, then
it segfaults, or the process hangs. That is fundamentally very
different from a system lock up.

As far as your problem is concerned, we can point you to the right
place if you report back with kernel/dmesg logs. Please understand that
even if we wanted to somehow solve your server lock-up problem with
some hypothetical fix in glusterfs, it is just not possible, even
theoretically. The fix you need is not in glusterfs: system lock-ups
are not something you fix in a userspace application.

> The system acts as pure server for both glusterfs and nfs. It has no fuse nor
> nfs client mount points.

However, if you are facing hangs on the glusterfs fuse mountpoint,
then it is very likely that it is a glusterfs bug. We are very much
interested to hear about those issues.

Avati


Re: [Gluster-users] The continuing story ...

2009-09-08 Thread Stephan von Krawczynski
On Tue, 8 Sep 2009 03:23:37 -0700
Anand Avati  wrote:

> > I doubt that this can be a real solution. My guess is that glusterfsd runs
> > into some race condition where it locks itself up completely.
> > It is not funny to debug something like this on a production setup. Best 
> > would
> > be to have debugging output sent from the servers' glusterfsd directly to a
> > client to save the logs. I would not count on syslog in this case, if it
> > survives one could use a serial console for syslog output though.
> 
> Does the system which is locking up have a fuse mountpoint? or is it a
> pure glusterfsd export server without a glusterfs mountpoint?
> 
> Avati

The system acts as a pure server for both glusterfs and nfs. It has
neither fuse nor nfs client mount points.

-- 
Regards,
Stephan


Re: [Gluster-users] development occurs and is tested on what environment ?

2009-09-08 Thread Anand Avati
>
> I am curious to know what environment the developers of GlusterFS use to
> develop and test themselves ?  If, for example, releases are being pushed
> from tests done on CentOS 5.2 x64, or Ubuntu 8.04 32bit, or whatever.
>
> If I'm going to set up any new gluster machines, I'd like them to be as
> close to the proven environment as possible. :)
>

We mostly use CentOS 5.2 for our testing.

Avati


Re: [Gluster-users] The continuing story ...

2009-09-08 Thread Anand Avati
> I doubt that this can be a real solution. My guess is that glusterfsd runs
> into some race condition where it locks itself up completely.
> It is not funny to debug something like this on a production setup. Best would
> be to have debugging output sent from the servers' glusterfsd directly to a
> client to save the logs. I would not count on syslog in this case, if it
> survives one could use a serial console for syslog output though.

Does the system which is locking up have a fuse mountpoint, or is it a
pure glusterfsd export server without a glusterfs mountpoint?

Avati


[Gluster-users] development occurs and is tested on what environment ?

2009-09-08 Thread Daniel Maher

Hello,

I am curious to know what environment the developers of GlusterFS
themselves use to develop and test? If, for example, releases are being
pushed from tests done on CentOS 5.2 x64, or Ubuntu 8.04 32bit, or
whatever.


If I'm going to set up any new gluster machines, I'd like them to be as
close to the proven environment as possible. :)


--
Daniel Maher 


[Gluster-users] The continuing story ... YAU (Yet Another Update)

2009-09-08 Thread Stephan von Krawczynski
Hello all,

as we now have something like a pseudo-stable base for testing (1 client,
1 server), we tried some performance enhancement and added the following
to the client setup:

volume cache
  type performance/io-cache
  option cache-size 64MB
  option priority *.cfg:3,*:1
  # option cache-timeout 2
  subvolumes writebehind
end-volume

If we do that, the client goes crazy within around 3 hours. We could not
keep the logs for that, because it produced around 4 GB of them and we
could not save the setup without deleting them.
As you can see, we also use writebehind, so maybe it is a combined
problem. Without writebehind we cannot test the setup, because it
becomes very slow and the runtime of our scripts no longer fits into a
5 minute slot.
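
For reference, the writebehind volume referred to above is declared
roughly like this (options quoted from memory and may not match our
exact setup; "client" is whatever your protocol/client volume is named):

volume writebehind
  type performance/write-behind
  option window-size 1MB
  option flush-behind on
  subvolumes client
end-volume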

-- 
Regards,
Stephan


Re: [Gluster-users] The continuing story ...

2009-09-08 Thread Stephan von Krawczynski
On Tue, 8 Sep 2009 10:13:17 +1000 (EST)
"Jeff Evans"  wrote:

> > - server was ping'able
> > - glusterfsd was disconnected by the client because of missing
> > ping-pong - no login possible
> > - no fs action (no lights on the hd-stack)
> > - no screen (was blank, stayed blank)
> 
> This is very similar to what I have seen many times (even back on
> 1.3), and have also commented on the list.
> 
> It seems that we have quite a few ACK's on this, or similar problems.
> 
> The only thing different in my scenario, is that the console doesn't
> stay blank. When attempting to login I get the last login message, and
> nothing more, no prompt ever. Also, I can see that other processes are
> still listening on sockets etc.. so it seems like the kernel just
> can't grab new FD's.
> 
> I too found the hang happens more easily if a downed node from a
> replicate pair re-joins after some time.
> 
> Following suggestions that this is all kernel related, I have just
> moved up to RHEL 5.4 in the hope that the new kernel will
> help.
> 
> This fix stood out as potentially related for me:
> https://bugzilla.redhat.com/show_bug.cgi?id=44543

This is an ext3 fix; it is unlikely that we are running into a similar
effect on reiserfs3, as the two are really very different in internals
and coding.
 
> We also have a broadcom network card, which had reports of hangs under
> load, the kernel has a patch for that too.

We used tg3 in this setup, but the load was not very high (below 10 MBit on a
1000MBit link). 

> If I still run into the hangs, I'll try xfs.

I doubt that this can be a real solution. My guess is that glusterfsd
runs into some race condition where it locks itself up completely.
It is not funny to debug something like this on a production setup. Best
would be to have debugging output sent from the servers' glusterfsd
directly to a client to save the logs. I would not count on syslog in
this case; if the system survives, one could use a serial console for
syslog output though.
 
> Thanks, Jeff.

-- 
Regards,
Stephan



Re: [Gluster-users] How does replication work?

2009-09-08 Thread Daniel Maher

Alan Ivey wrote:
> Like the subject implies, how does replication work exactly?
>
> If a client is the only one that has the IP addresses defined for the
> servers, does that mean that only a client writing a file ensures that
> it goes to both servers? That would tell me that the servers don't
> directly communicate with each other for replication.
>
> If so, how does healing work? Since the client is the only
> configuration with the multiple server IP addresses, is it the
> client's "task" to make sure the server heals itself once it's back
> online?
>
> If not, how do the servers know each other exist if not for the client
> config file?


You've answered your own question. :)  AFAIK, in the recommended simple 
replication scenario, the client is actually responsible for 
replication, as each server is functionally independent.


(This seems crazy to me, but yes, that's how it works.)


--
Daniel Maher 