Re: [Gluster-users] double traffic usage since upgrade?

2009-09-08 Thread Liam Slusser
Any other thoughts on why I'm seeing double the inbound traffic?
We've had a large increase in site traffic over the last few weeks, and my
outbound traffic has increased to almost 400mbit/sec, which has
translated to 800mbit of backend gluster traffic.  I'm basically at
the limit of gigabit ethernet unless I do bonding.

Ideas on how to fix this?

thanks,
liam
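
The headroom problem described above can be put into numbers. This is only a rough sketch: the helper name `max_outbound_mbit` is made up for illustration, the 2x ratio is simply the observed 800mbit of backend traffic divided by 400mbit of outbound traffic, and the bonded case assumes two gigabit links scale linearly, which real bonding setups may not achieve.

```python
def max_outbound_mbit(link_mbit, backend_per_outbound):
    """Outbound web traffic at which the backend link would be full.

    link_mbit: capacity of the link between client and bricks.
    backend_per_outbound: backend mbit consumed per mbit served
        (hypothetical parameter; 2.0 matches the doubling reported here).
    """
    return link_mbit / backend_per_outbound


# Ratio taken from the figures in this message: 800mbit backend, 400mbit out.
observed_ratio = 800 / 400

print("single gigabit link, doubled traffic:", max_outbound_mbit(1000, observed_ratio))
print("single gigabit link, no doubling:", max_outbound_mbit(1000, 1.0))
# Assumption: two bonded gigabit links behave like one 2000mbit link.
print("two bonded links, doubled traffic:", max_outbound_mbit(2000, observed_ratio))
```

Under these assumptions the single link fills up at 500mbit of outbound traffic while the doubling persists, so bonding buys room but does not address the doubling itself.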


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] double traffic usage since upgrade?

2009-08-17 Thread Shehjar Tikoo

Mark Mielke wrote:

Possibly relevant here -

At work, we have used a tool which does something similar to
booster to accelerate an extremely slow remote file system. It
works the same way with LD_PRELOAD; however, it also requires GLIBC
to be compiled with --disable-hidden-plt. Searching the Internet
for similar solutions turns up PlasticFS, which has the same
requirement.

Recent versions of GLIBC call open() internally without following
the regular PLT name resolution model. This increases
performance, as the PLT indirect lookup model has an expense
associated. For example, GLIBC fopen() calls open() directly rather
than going through the PLT. So, overriding open() does not
intercept calls to fopen()?

Is this something the booster developers are aware of? Have they
found a way around this, or is it possible that booster is only
boosting *some* types of access, and other types of access are
still falling through to FUSE?

I've asked the developer who wrote our library what he thought of
glusterfs/booster not requiring GLIBC with --disable-hidden-plt,
and he thinks glusterfs/booster cannot be working (or cannot be
intercepting all calls and some calls are leaking through to FUSE).
Comments?

If some calls were leaking through, this might have the double
traffic effect, since FUSE would have its own cache separate from
booster?



I don't know what a PLT is, but I'll attempt to provide some clarity here.

It is true that booster does not support or boost all system calls.
We do not require that glibc be built with --disable-hidden-plt
for those calls which we do support.
For a start, we've aimed at getting apache and unfs3 to work with
booster. The functional support for both in booster is complete in
the 2.0.6 release.

For a list of system calls supported by booster, please see:
http://www.gluster.org/docs/index.php/BoosterConfiguration

Some applications also need syscalls that booster does not support to
be usable over GlusterFS. For such a scenario there are two ways booster
can be used. Both approaches are described on the page linked above,
but in short, you're right in thinking that when the unsupported
syscalls also need to reach GlusterFS, we are, as you said, leaking
or redirecting those calls over the FUSE mount point.

That page is a bit long so feel free to ask any questions here.

Thanks
-Shehjar




Re: [Gluster-users] double traffic usage since upgrade?

2009-08-17 Thread Liam Slusser
On Mon, Aug 17, 2009 at 7:42 AM, Mark Mielke m...@mark.mielke.cc wrote:
 On 08/17/2009 08:06 AM, Shehjar Tikoo wrote:

 For a start, we've aimed at getting apache and unfs3 to work with booster.
 The functional support for both in booster is complete in
 2.0.6 release.

 For a list of system calls supported by booster, please see:
 http://www.gluster.org/docs/index.php/BoosterConfiguration

 Some applications also need syscalls that booster does not support to
 be usable over GlusterFS. For such a scenario there are two ways booster
 can be used. Both approaches are described on the page linked above,
 but in short, you're right in thinking that when the unsupported
 syscalls also need to reach GlusterFS, we are, as you said, leaking
 or redirecting those calls over the FUSE mount point.


 Hi Shehjar:

 That's fine, I think, as long as it is recognized that trapping the open()
 system call, as booster is implemented today, probably does not trap fopen()
 on Linux. If apache and unfs3 always call open() directly, and you are
 trapping this, then your purpose is being served.

 I was kind of hoping you had found a way around --disable-hidden-plt, so I
 could steal the idea from you. Too bad. :-)

 Cheers,
 mark

 --
 Mark Mielke m...@mielke.cc


Just an FYI - I am not using booster at all on our feed boxes. The box
we're seeing the traffic doubling on runs just a straight fuse mount and
the glusterfs process.

liam


Re: [Gluster-users] double traffic usage since upgrade?

2009-08-15 Thread Mark Mielke

Possibly relevant here -

At work, we have used a tool which does something similar to booster to
accelerate an extremely slow remote file system. It works the same way
with LD_PRELOAD; however, it also requires GLIBC to be compiled with
--disable-hidden-plt. Searching the Internet for similar solutions turns
up PlasticFS, which has the same requirement.


Recent versions of GLIBC call open() internally without following the
regular PLT name resolution model. This increases performance, as the
PLT indirect lookup model has an expense associated.
For example, GLIBC fopen() calls open() directly rather than going
through the PLT. So, overriding open() does not intercept calls to fopen()?


Is this something the booster developers are aware of? Have they found a 
way around this, or is it possible that booster is only boosting *some* 
types of access, and other types of access are still falling through 
to FUSE?


I've asked the developer who wrote our library what he thought of
glusterfs/booster not requiring GLIBC with --disable-hidden-plt, and he
thinks glusterfs/booster cannot be working (or cannot be intercepting
all calls and some calls are leaking through to FUSE). Comments?


If some calls were leaking through, this might have the double traffic 
effect, since FUSE would have its own cache separate from booster?


Cheers,
mark



--
Mark Mielke m...@mielke.cc



[Gluster-users] double traffic usage since upgrade?

2009-08-14 Thread Liam Slusser
I've been running 2.0.3 with two backend bricks and a frontend client of
mod_gluster/apache 2.2.11+worker for a few weeks now without much issue.
Last night I upgraded to 2.0.6 only to find out that mod_gluster has been
removed and the booster library is recommended instead - which is fine, but I
didn't have time to test it last night, so I just mounted the whole filesystem
with a fuse mount and figured I'd test the booster config later and then
swap.  I did try running the 2.0.3 mod_gluster module with the 2.0.6 bricks,
but apache kept segfaulting (every 10 seconds) and then would spawn another
process which would reconnect and keep going.  I figured it was dropping a
client request every few seconds, which is why I went with the fuse mount
until I could test the booster library.

Before, with mod_gluster, we would be pushing around 200mbit of web
traffic and it would evenly distribute that 200mbit between our two bricks -
so server1 would be pushing 100mbit and server2 would be pushing another
100mbit.  Inbound traffic from the backend bricks and outbound traffic from
apache were basically identical.  Except, of course, if one of the backend
glusterd processes died for whatever reason, the remaining brick would
take the whole load and its traffic would double, as you would expect.
Perfect, all was happy.

Now, using gluster 2.0.6 and fuse, both server bricks are pushing the full
200mbit of traffic - so I have 400mbit of incoming traffic from
the gluster bricks but the same 200mbit of web traffic.  I can deal, but I
only have a shared gigabit link between my client server and backend bricks
and I'm already eating up basically 50% of that pipe.  It is also putting a
much larger load on both bricks since I have basically doubled the disk IO
time and traffic.  Is this a feature? Bug?

thanks,
liam
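
The before/after figures in this message can be restated as a small model. The sketch below is only an illustration: the function name `backend_traffic` and the `bricks_sending` parameter are invented here, and it says nothing about why 2.0.6 with fuse behaves this way. It only assumes that each request's data is sent either by one brick (load spread evenly) or by every brick.

```python
def backend_traffic(web_mbit, bricks, bricks_sending):
    """Return (per_brick_mbit, total_backend_mbit).

    web_mbit: outbound web traffic served by apache.
    bricks: number of backend bricks.
    bricks_sending: how many bricks send the data for each request
        (hypothetical knob: 1 = reads spread across bricks,
        `bricks` = every brick sends everything).
    """
    total = web_mbit * bricks_sending
    per_brick = total / bricks
    return per_brick, total


before = backend_traffic(200, bricks=2, bricks_sending=1)
after = backend_traffic(200, bricks=2, bricks_sending=2)

print("2.0.3 mod_gluster (per brick, total):", before)
print("2.0.6 fuse (per brick, total):", after)
print("fraction of a 1000mbit link used by backend traffic:", after[1] / 1000)
```

With the figures from this message the model gives 100mbit per brick before the upgrade and 200mbit per brick after it, for 400mbit of backend traffic against the same 200mbit of web traffic.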


Re: [Gluster-users] double traffic usage since upgrade?

2009-08-14 Thread Anand Avati
 I've been running 2.0.3 with two backend bricks and a frontend client of
 mod_gluster/apache 2.2.11+worker for a few weeks now without much issue.
 Last night I upgraded to 2.0.6 only to find out that mod_gluster has been
 removed and the booster library is recommended instead - which is fine, but I
 didn't have time to test it last night, so I just mounted the whole filesystem
 with a fuse mount and figured I'd test the booster config later and then
 swap.  I did try running the 2.0.3 mod_gluster module with the 2.0.6 bricks,
 but apache kept segfaulting (every 10 seconds) and then would spawn another
 process which would reconnect and keep going.  I figured it was dropping a
 client request every few seconds, which is why I went with the fuse mount
 until I could test the booster library.

That would not work, swapping binaries across versions.

 Before, with mod_gluster, we would be pushing around 200mbit of web
 traffic and it would evenly distribute that 200mbit between our two bricks -
 so server1 would be pushing 100mbit and server2 would be pushing another
 100mbit.  Inbound traffic from the backend bricks and outbound traffic from
 apache were basically identical.  Except, of course, if one of the backend
 glusterd processes died for whatever reason, the remaining brick would
 take the whole load and its traffic would double, as you would expect.
 Perfect, all was happy.

 Now, using gluster 2.0.6 and fuse, both server bricks are pushing the full
 200mbit of traffic - so I have 400mbit of incoming traffic from
 the gluster bricks but the same 200mbit of web traffic.  I can deal, but I
 only have a shared gigabit link between my client server and backend bricks
 and I'm already eating up basically 50% of that pipe.  It is also putting a
 much larger load on both bricks since I have basically doubled the disk IO
 time and traffic.  Is this a feature? Bug?

If I understand correctly, 2.0.3 mod_glusterfs = 1x, 2.0.6 fuse = 2x?
Can you describe the files being served? (average file size and number
of files)

Avati