Re: [Gluster-users] Memory leak with a replica 3 arbiter 1 configuration

2016-08-23 Thread Ravishankar N

On 08/24/2016 02:12 AM, Benjamin Edgar wrote:
My test servers have been running for about 3 hours now (with the 
while loop to constantly write and delete files) and it looks like the 
memory usage of the arbiter brick process has not increased in the 
past hour. Before it was constantly increasing, so it looks like 
adding the "GF_FREE (ctx->iattbuf);" line in arbiter.c fixed the 
issue. If anything changes overnight I will post an update, but I 
believe that the fix worked!


Once this patch makes it into the master branch, how long does it 
usually take to get released back to 3.8?



Hi Ben,
Thanks for testing. The minor release schedule [1] for 3.8.x targets the
10th of every month, but an out-of-order 3.8.3 release was just made, so
3.8.4 may take a bit longer.


Thanks,
Ravi

[1] https://www.gluster.org/community/release-schedule/

Thanks!
Ben

On Tue, Aug 23, 2016 at 2:18 PM, Benjamin Edgar wrote:


Hi Ravi,

I saw that you updated the patch today (@
http://review.gluster.org/#/c/15289/). I built an RPM of the
first iteration you had of the patch (just changing the one line
in arbiter.c "GF_FREE (ctx->iattbuf);") and am running that on
some test servers now to see if the memory of the arbiter brick
gets out of control.

Ben

On Tue, Aug 23, 2016 at 3:38 AM, Ravishankar N wrote:

Hi Benjamin

On 08/23/2016 06:41 AM, Benjamin Edgar wrote:

I've attached a statedump of the problem brick process.  Let
me know if there are any other logs you need.


Thanks for the report! I've sent a fix @
http://review.gluster.org/#/c/15289/ . It would be nice if
you can verify if the patch fixes the issue for you.

Thanks,
Ravi



Thanks a lot,
Ben

On Mon, Aug 22, 2016 at 5:03 PM, Pranith Kumar Karampuri wrote:

Could you collect statedump of the brick process by
following:
https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump


That should help us identify which datatype is causing
leaks and fix it.

Thanks!

On Tue, Aug 23, 2016 at 2:22 AM, Benjamin Edgar wrote:

Hi,

I appear to have a memory leak with a replica 3
arbiter 1 configuration of gluster. I have a data
brick and an arbiter brick on one server, and another
server with the last data brick. The more I write
files to gluster in this configuration, the more
memory the arbiter brick process takes up.

I am able to reproduce this issue by first setting up
a replica 3 arbiter 1 configuration and then using
the following bash script to create 10,000 200kB
files, delete those files, and run forever:

while true ; do
  for i in {1..10000} ; do
dd if=/dev/urandom bs=200K count=1
of=$TEST_FILES_DIR/file$i
  done
  rm -rf $TEST_FILES_DIR/*
done

$TEST_FILES_DIR is a location on my gluster mount.

After about 3 days of this script running on one of
my clusters, this is what the output of "top" looks like:
  PID USER  PR  NI    VIRT      RES   SHR S  %CPU %MEM      TIME+ COMMAND
16039 root  20   0 1397220    77720  3948 S  20.6  1.0  860:01.53 glusterfsd
13174 root  20   0 1395824   112728  3692 S  19.6  1.5  806:07.17 glusterfs
19961 root  20   0 2967204 *2.145g*  3896 S  17.3 29.0  752:10.70 glusterfsd

As you can see one of the brick processes is using
over 2 gigabytes of memory.

One work-around for this is to kill the arbiter brick
process and restart the gluster daemon. This restarts the
arbiter brick process and its memory usage goes back
down to a reasonable level. However, I would rather not
have to kill the arbiter brick every week in production
environments.

Has anyone seen this issue before and is there a
known work-around/fix?

Thanks,
Ben

___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Memory leak with a replica 3 arbiter 1 configuration

2016-08-23 Thread Benjamin Edgar
My test servers have been running for about 3 hours now (with the while
loop to constantly write and delete files) and it looks like the memory
usage of the arbiter brick process has not increased in the past hour.
Before it was constantly increasing, so it looks like adding the "GF_FREE
(ctx->iattbuf);" line in arbiter.c fixed the issue. If anything changes
overnight I will post an update, but I believe that the fix worked!
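
For reference, a minimal sketch of how the arbiter brick's resident memory
can be logged during such a test run (the pgrep pattern, interval and log
path below are placeholders, not details from this thread):

# Adjust the pattern so it matches your arbiter brick's glusterfsd process.
BRICK_PID=$(pgrep -f 'glusterfsd.*arbiter' | head -1)
while true ; do
  echo "$(date '+%F %T') $(ps -o rss= -p "$BRICK_PID") kB RSS" >> /tmp/arbiter-rss.log
  sleep 300   # sample every 5 minutes
done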

Once this patch makes it into the master branch, how long does it usually
take to get released back to 3.8?

Thanks!
Ben

On Tue, Aug 23, 2016 at 2:18 PM, Benjamin Edgar  wrote:

> Hi Ravi,
>
> I saw that you updated the patch today (@ http://review.gluster.org/#/c/15289/).
> I built an RPM of the first iteration you had of the patch
> (just changing the one line in arbiter.c "GF_FREE (ctx->iattbuf);") and am
> running that on some test servers now to see if the memory of the arbiter
> brick gets out of control.
>
> Ben
>
> On Tue, Aug 23, 2016 at 3:38 AM, Ravishankar N 
> wrote:
>
>> Hi Benjamin
>>
>> On 08/23/2016 06:41 AM, Benjamin Edgar wrote:
>>
>> I've attached a statedump of the problem brick process.  Let me know if
>> there are any other logs you need.
>>
>>
>> Thanks for the report! I've sent a fix @ http://review.gluster.org/#/c/15289/ .
>> It would be nice if you can verify if the patch fixes the issue
>> for you.
>>
>> Thanks,
>> Ravi
>>
>>
>> Thanks a lot,
>> Ben
>>
>> On Mon, Aug 22, 2016 at 5:03 PM, Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>>> Could you collect statedump of the brick process by following:
>>> https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump
>>>
>>> That should help us identify which datatype is causing leaks and fix it.
>>>
>>> Thanks!
>>>
>>> On Tue, Aug 23, 2016 at 2:22 AM, Benjamin Edgar 
>>> wrote:
>>>
 Hi,

 I appear to have a memory leak with a replica 3 arbiter 1 configuration
 of gluster. I have a data brick and an arbiter brick on one server, and
 another server with the last data brick. The more I write files to gluster
 in this configuration, the more memory the arbiter brick process takes up.

 I am able to reproduce this issue by first setting up a replica 3
 arbiter 1 configuration and then using the following bash script to create
 10,000 200kB files, delete those files, and run forever:

 while true ; do
   for i in {1..10000} ; do
 dd if=/dev/urandom bs=200K count=1 of=$TEST_FILES_DIR/file$i
   done
   rm -rf $TEST_FILES_DIR/*
 done

 $TEST_FILES_DIR is a location on my gluster mount.

 After about 3 days of this script running on one of my clusters, this
 is what the output of "top" looks like:
   PID USER  PR  NI    VIRT      RES   SHR S  %CPU %MEM      TIME+ COMMAND
 16039 root  20   0 1397220    77720  3948 S  20.6  1.0  860:01.53 glusterfsd
 13174 root  20   0 1395824   112728  3692 S  19.6  1.5  806:07.17 glusterfs
 19961 root  20   0 2967204 *2.145g*  3896 S  17.3 29.0  752:10.70 glusterfsd

 As you can see one of the brick processes is using over 2 gigabytes of
 memory.

 One work-around for this is to kill the arbiter brick process and
 restart the gluster daemon. This restarts the arbiter brick process and
 its memory usage goes back down to a reasonable level. However, I would
 rather not have to kill the arbiter brick every week in production
 environments.

 Has anyone seen this issue before and is there a known work-around/fix?

 Thanks,
 Ben

 ___
 Gluster-users mailing list
 Gluster-users@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-users

>>>
>>>
>>>
>>> --
>>> Pranith
>>>
>>
>>
>>
>> --
>> Benjamin Edgar
>> Computer Science
>> University of Virginia 2015
>> (571) 338-0878
>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
>>
>>
>>
>
>
> --
> Benjamin Edgar
> Computer Science
> University of Virginia 2015
> (571) 338-0878
>



-- 
Benjamin Edgar
Computer Science
University of Virginia 2015
(571) 338-0878
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Memory leak with a replica 3 arbiter 1 configuration

2016-08-23 Thread Benjamin Edgar
Hi Ravi,

I saw that you updated the patch today (@ http://review.gluster.org/#/c/15289/).
I built an RPM of the first iteration you had of the patch (just
changing the one line in arbiter.c "GF_FREE (ctx->iattbuf);") and am
running that on some test servers now to see if the memory of the arbiter
brick gets out of control.
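
For anyone wanting to reproduce that kind of test build, a rough sketch of
fetching patchset 1 of change 15289 from Gerrit and rolling RPMs from the
patched tree (the Gerrit ref layout and the extras/LinuxRPM target follow
the usual GlusterFS conventions and are assumptions here, not steps quoted
from this thread):

git clone https://github.com/gluster/glusterfs.git && cd glusterfs
git fetch http://review.gluster.org/glusterfs refs/changes/89/15289/1
git checkout FETCH_HEAD              # tree now contains the proposed fix
./autogen.sh && ./configure
make -C extras/LinuxRPM glusterrpms  # test RPMs land under extras/LinuxRPM/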

Ben

On Tue, Aug 23, 2016 at 3:38 AM, Ravishankar N 
wrote:

> Hi Benjamin
>
> On 08/23/2016 06:41 AM, Benjamin Edgar wrote:
>
> I've attached a statedump of the problem brick process.  Let me know if
> there are any other logs you need.
>
>
> Thanks for the report! I've sent a fix @ http://review.gluster.org/#/c/15289/ .
> It would be nice if you can verify if the patch fixes the issue
> for you.
>
> Thanks,
> Ravi
>
>
> Thanks a lot,
> Ben
>
> On Mon, Aug 22, 2016 at 5:03 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>> Could you collect statedump of the brick process by following:
>> https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump
>>
>> That should help us identify which datatype is causing leaks and fix it.
>>
>> Thanks!
>>
>> On Tue, Aug 23, 2016 at 2:22 AM, Benjamin Edgar 
>> wrote:
>>
>>> Hi,
>>>
>>> I appear to have a memory leak with a replica 3 arbiter 1 configuration
>>> of gluster. I have a data brick and an arbiter brick on one server, and
>>> another server with the last data brick. The more I write files to gluster
>>> in this configuration, the more memory the arbiter brick process takes up.
>>>
>>> I am able to reproduce this issue by first setting up a replica 3
>>> arbiter 1 configuration and then using the following bash script to create
>>> 10,000 200kB files, delete those files, and run forever:
>>>
>>> while true ; do
>>>   for i in {1..10000} ; do
>>> dd if=/dev/urandom bs=200K count=1 of=$TEST_FILES_DIR/file$i
>>>   done
>>>   rm -rf $TEST_FILES_DIR/*
>>> done
>>>
>>> $TEST_FILES_DIR is a location on my gluster mount.
>>>
>>> After about 3 days of this script running on one of my clusters, this is
>>> what the output of "top" looks like:
>>>   PID USER  PR  NI    VIRT      RES   SHR S  %CPU %MEM      TIME+ COMMAND
>>> 16039 root  20   0 1397220    77720  3948 S  20.6  1.0  860:01.53 glusterfsd
>>> 13174 root  20   0 1395824   112728  3692 S  19.6  1.5  806:07.17 glusterfs
>>> 19961 root  20   0 2967204 *2.145g*  3896 S  17.3 29.0  752:10.70 glusterfsd
>>>
>>> As you can see one of the brick processes is using over 2 gigabytes of
>>> memory.
>>>
>>> One work-around for this is to kill the arbiter brick process and
>>> restart the gluster daemon. This restarts the arbiter brick process and
>>> its memory usage goes back down to a reasonable level. However, I would
>>> rather not have to kill the arbiter brick every week in production
>>> environments.
>>>
>>> Has anyone seen this issue before and is there a known work-around/fix?
>>>
>>> Thanks,
>>> Ben
>>>
>>> ___
>>> Gluster-users mailing list
>>> Gluster-users@gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>
>>
>>
>>
>> --
>> Pranith
>>
>
>
>
> --
> Benjamin Edgar
> Computer Science
> University of Virginia 2015
> (571) 338-0878
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>


-- 
Benjamin Edgar
Computer Science
University of Virginia 2015
(571) 338-0878
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Memory leak with a replica 3 arbiter 1 configuration

2016-08-23 Thread Ravishankar N

Hi Benjamin

On 08/23/2016 06:41 AM, Benjamin Edgar wrote:
I've attached a statedump of the problem brick process.  Let me know 
if there are any other logs you need.


Thanks for the report! I've sent a fix @ 
http://review.gluster.org/#/c/15289/ . It would be nice if you can 
verify if the patch fixes the issue for you.


Thanks,
Ravi


Thanks a lot,
Ben

On Mon, Aug 22, 2016 at 5:03 PM, Pranith Kumar Karampuri wrote:


Could you collect statedump of the brick process by following:
https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump


That should help us identify which datatype is causing leaks and
fix it.

Thanks!

On Tue, Aug 23, 2016 at 2:22 AM, Benjamin Edgar wrote:

Hi,

I appear to have a memory leak with a replica 3 arbiter 1
configuration of gluster. I have a data brick and an arbiter
brick on one server, and another server with the last data
brick. The more I write files to gluster in this
configuration, the more memory the arbiter brick process takes up.

I am able to reproduce this issue by first setting up a
replica 3 arbiter 1 configuration and then using the following
bash script to create 10,000 200kB files, delete those files,
and run forever:

while true ; do
  for i in {1..10000} ; do
dd if=/dev/urandom bs=200K count=1 of=$TEST_FILES_DIR/file$i
  done
  rm -rf $TEST_FILES_DIR/*
done

$TEST_FILES_DIR is a location on my gluster mount.

After about 3 days of this script running on one of my
clusters, this is what the output of "top" looks like:
  PID USER  PR  NI    VIRT      RES   SHR S  %CPU %MEM      TIME+ COMMAND
16039 root  20   0 1397220    77720  3948 S  20.6  1.0  860:01.53 glusterfsd
13174 root  20   0 1395824   112728  3692 S  19.6  1.5  806:07.17 glusterfs
19961 root  20   0 2967204 *2.145g*  3896 S  17.3 29.0  752:10.70 glusterfsd


As you can see one of the brick processes is using over 2
gigabytes of memory.

One work-around for this is to kill the arbiter brick process
and restart the gluster daemon. This restarts the arbiter brick
process and its memory usage goes back down to a reasonable
level. However, I would rather not have to kill the arbiter brick
every week in production environments.

Has anyone seen this issue before and is there a known
work-around/fix?

Thanks,
Ben

___
Gluster-users mailing list
Gluster-users@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-users





-- 
Pranith





--
Benjamin Edgar
Computer Science
University of Virginia 2015
(571) 338-0878


___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users



___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Memory leak with a replica 3 arbiter 1 configuration

2016-08-22 Thread Pranith Kumar Karampuri
Could you collect statedump of the brick process by following:
https://gluster.readthedocs.io/en/latest/Troubleshooting/statedump

That should help us identify which datatype is causing leaks and fix it.
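
For reference, a rough sketch of collecting and skimming such a statedump
(the volume name is a placeholder, and the dump directory and field names
can vary between versions; see the linked documentation for specifics):

gluster volume statedump testvol      # dumps state for all brick processes of the volume
# or target the suspect brick process directly:
# kill -USR1 <arbiter-brick-pid>
DUMP=$(ls -t /var/run/gluster/*.dump.* | head -1)   # newest dump file
grep -E 'usage-type|num_allocs' "$DUMP" | head -40  # which allocation types dominate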

Thanks!

On Tue, Aug 23, 2016 at 2:22 AM, Benjamin Edgar  wrote:

> Hi,
>
> I appear to have a memory leak with a replica 3 arbiter 1 configuration of
> gluster. I have a data brick and an arbiter brick on one server, and
> another server with the last data brick. The more I write files to gluster
> in this configuration, the more memory the arbiter brick process takes up.
>
> I am able to reproduce this issue by first setting up a replica 3 arbiter
> 1 configuration and then using the following bash script to create 10,000
> 200kB files, delete those files, and run forever:
>
> while true ; do
>   for i in {1..10000} ; do
> dd if=/dev/urandom bs=200K count=1 of=$TEST_FILES_DIR/file$i
>   done
>   rm -rf $TEST_FILES_DIR/*
> done
>
> $TEST_FILES_DIR is a location on my gluster mount.
>
> After about 3 days of this script running on one of my clusters, this is
> what the output of "top" looks like:
>   PID USER  PR  NI    VIRT      RES   SHR S  %CPU %MEM      TIME+ COMMAND
> 16039 root  20   0 1397220    77720  3948 S  20.6  1.0  860:01.53 glusterfsd
> 13174 root  20   0 1395824   112728  3692 S  19.6  1.5  806:07.17 glusterfs
> 19961 root  20   0 2967204 *2.145g*  3896 S  17.3 29.0  752:10.70 glusterfsd
>
> As you can see one of the brick processes is using over 2 gigabytes of
> memory.
>
> One work-around for this is to kill the arbiter brick process and restart
> the gluster daemon. This restarts the arbiter brick process and its memory
> usage goes back down to a reasonable level. However, I would rather not
> have to kill the arbiter brick every week in production environments.
>
> Has anyone seen this issue before and is there a known work-around/fix?
>
> Thanks,
> Ben
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>



-- 
Pranith
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Memory leak with a replica 3 arbiter 1 configuration

2016-08-22 Thread Benjamin Edgar
Hi,

I appear to have a memory leak with a replica 3 arbiter 1 configuration of
gluster. I have a data brick and an arbiter brick on one server, and
another server with the last data brick. The more I write files to gluster
in this configuration, the more memory the arbiter brick process takes up.

I am able to reproduce this issue by first setting up a replica 3 arbiter 1
configuration and then using the following bash script to create 10,000
200kB files, delete those files, and run forever:

while true ; do
  for i in {1..10000} ; do   # 10,000 files of 200kB each
dd if=/dev/urandom bs=200K count=1 of=$TEST_FILES_DIR/file$i
  done
  rm -rf $TEST_FILES_DIR/*
done

$TEST_FILES_DIR is a location on my gluster mount.
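
For completeness, a minimal sketch of the client-side setup this assumes
(server name, volume name and mount point are placeholders, not details
from the report):

mount -t glusterfs server1:/testvol /mnt/gluster   # FUSE mount of the volume
export TEST_FILES_DIR=/mnt/gluster/leak-test
mkdir -p "$TEST_FILES_DIR"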

After about 3 days of this script running on one of my clusters, this is
what the output of "top" looks like:
  PID USER  PR  NI    VIRT      RES   SHR S  %CPU %MEM      TIME+ COMMAND
16039 root  20   0 1397220    77720  3948 S  20.6  1.0  860:01.53 glusterfsd
13174 root  20   0 1395824   112728  3692 S  19.6  1.5  806:07.17 glusterfs
19961 root  20   0 2967204 *2.145g*  3896 S  17.3 29.0  752:10.70 glusterfsd

As you can see one of the brick processes is using over 2 gigabytes of
memory.
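
A quick way to confirm which brick the growing glusterfsd serves, sketched
here with a placeholder volume name and the PID from the output above:

gluster volume status testvol   # maps each brick path to its port and PID
ps -o pid,rss,args -p 19961     # the brick path in the args shows which brick this PID serves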

One work-around for this is to kill the arbiter brick process and restart
the gluster daemon. This restarts the arbiter brick process and its memory
usage goes back down to a reasonable level. However, I would rather not
have to kill the arbiter brick every week in production environments.
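
A rough sketch of that workaround, assuming a systemd host and a placeholder
volume name (the PID again comes from 'gluster volume status'):

kill 19961                     # stop only the leaking arbiter brick process
systemctl restart glusterd     # glusterd respawns the brick on restart
# alternative that avoids restarting glusterd itself:
# gluster volume start testvol force   # respawns any bricks that are down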

Has anyone seen this issue before and is there a known work-around/fix?

Thanks,
Ben
___
Gluster-users mailing list
Gluster-users@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users