Re: [Gluster-users] Geo-Replication memory leak on slave node

2018-06-20 Thread Mark Betham
Hi Kotresh,

Many thanks for your prompt response.  No need to apologise; any help you
can provide is greatly appreciated.

I look forward to receiving your update next week.

Many thanks,

Mark Betham

On Wed, 20 Jun 2018 at 10:55, Kotresh Hiremath Ravishankar <
khire...@redhat.com> wrote:

> Hi Mark,
>
> Sorry, I was busy and could not take a serious look at the logs. I can
> update you on Monday.
>
> Thanks,
> Kotresh HR
>
> On Wed, Jun 20, 2018 at 12:32 PM, Mark Betham <
> mark.bet...@performancehorizon.com> wrote:
>
>> Hi Kotresh,
>>
>> I was wondering if you had made any progress with regard to the issue I
>> am currently experiencing with geo-replication.
>>
>> For info, the fault remains and effectively requires a restart of the
>> geo-replication service on a daily basis to reclaim the used memory on the
>> slave node.
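
Until the root cause is found, that daily restart can at least be
automated.  Below is a minimal sketch (not something from this thread) of a
cron-able Python script that cycles the session with the usual "gluster
volume geo-replication <master-vol> <slave-host>::<slave-vol> stop/start"
commands; the volume and host names are placeholders and would need to
match the real session.

#!/usr/bin/env python
# Minimal sketch: cycle a geo-replication session from a master node as a
# stop-gap for the daily memory build-up.  MASTER_VOL and SLAVE are
# placeholders; substitute the real session names before use.
import subprocess
import sys

MASTER_VOL = "mastervol"           # placeholder master volume
SLAVE = "slavehost::slavevol"      # placeholder <slave-host>::<slave-volume>

def georep(action):
    # Runs: gluster volume geo-replication <master> <slave> <action>
    cmd = ["gluster", "volume", "geo-replication", MASTER_VOL, SLAVE, action]
    print("running: " + " ".join(cmd))
    return subprocess.call(cmd)

if __name__ == "__main__":
    # Stop, then start, the session; skip the start if the stop failed.
    if georep("stop") != 0:
        sys.exit("geo-replication stop failed; not restarting")
    sys.exit(georep("start"))

Run once a day from cron on a master node, this would mimic the manual
restart described above.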
>>
>> If you require any further information then please do not hesitate to ask.
>>
>> Many thanks,
>>
>> Mark Betham
>>
>>
>> On Mon, 11 Jun 2018 at 08:24, Mark Betham <
>> mark.bet...@performancehorizon.com> wrote:
>>
>>> Hi Kotresh,
>>>
>>> Many thanks.  I will shortly set up a share on my GDrive and send the
>>> link directly to you.
>>>
>>> For info:
>>> The Geo-Rep slave failed again over the weekend, but it did not recover
>>> this time.  It looks to have become unresponsive at around 14:40 UTC on 9th
>>> June.  I have attached an image showing the memory usage; you can see from
>>> this when the system failed.  The system was totally unresponsive and
>>> required a cold power off and then power on in order to recover the server.
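
Because the node can end up completely unresponsive, it may be worth
sampling the memory usage of the gsyncd processes to disk at a short
interval so the growth curve survives a hang.  A rough sketch, assuming the
leaking processes can be picked out by "gsyncd" appearing in their command
line and that /var/tmp is an acceptable place for the output:

#!/usr/bin/env python
# Rough sketch: every 60 seconds append the RSS of each gsyncd process to a
# CSV, so the growth trend is preserved even if the node later locks up.
# The "gsyncd" match and the output path are assumptions for this setup.
import os
import time

def gsyncd_rss_kib():
    # Return {pid: rss_in_kB} for processes whose cmdline mentions gsyncd.
    result = {}
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open("/proc/%s/cmdline" % pid, "rb") as f:
                cmdline = f.read().replace(b"\x00", b" ")
            if b"gsyncd" not in cmdline:
                continue
            with open("/proc/%s/status" % pid) as f:
                for line in f:
                    if line.startswith("VmRSS:"):
                        result[int(pid)] = int(line.split()[1])
        except (IOError, OSError):
            pass  # process exited between listing and reading it
    return result

if __name__ == "__main__":
    with open("/var/tmp/gsyncd-rss.csv", "a") as out:
        while True:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            for pid, rss_kib in sorted(gsyncd_rss_kib().items()):
                out.write("%s,%d,%d\n" % (stamp, pid, rss_kib))
            out.flush()
            time.sleep(60)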
>>>
>>> Many thanks for your help.
>>>
>>> Mark Betham.
>>>
>>> On 11 June 2018 at 05:53, Kotresh Hiremath Ravishankar <
>>> khire...@redhat.com> wrote:
>>>
>>>> Hi Mark,
>>>>
>>>> Google drive works for me.
>>>>
>>>> Thanks,
>>>> Kotresh HR
>>>>
>>>> On Fri, Jun 8, 2018 at 3:00 PM, Mark Betham <
>>>> mark.bet...@performancehorizon.com> wrote:
>>>>
>>>>> Hi Kotresh,
>>>>>
>>>>> The memory issue has re-occurred.  This indicates it occurs around
>>>>> once a day.
>>>>>
>>>>> Again, no traceback was listed in the log; the only update in the log
>>>>> was as follows:
>>>>> [2018-06-08 08:26:43.404261] I [resource(slave):1020:service_loop]
>>>>> GLUSTER: connection inactive, stopping timeout=120
>>>>> [2018-06-08 08:29:19.357615] I [syncdutils(slave):271:finalize] :
>>>>> exiting.
>>>>> [2018-06-08 08:31:02.432002] I [resource(slave):1502:connect] GLUSTER:
>>>>> Mounting gluster volume locally...
>>>>> [2018-06-08 08:31:03.716967] I [resource(slave):1515:connect] GLUSTER:
>>>>> Mounted gluster volume duration=1.2729
>>>>> [2018-06-08 08:31:03.717411] I [resource(slave):1012:service_loop]
>>>>> GLUSTER: slave listening
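
Those entries look like the slave worker's normal stop/remount cycle (the
connection goes inactive, gsyncd exits, and the auxiliary mount is
recreated).  To line such events up against the memory graph, something
like the sketch below could pull the remount/exit lines and any tracebacks
out of the slave log; the default log path is only an example, since the
real file name varies per session:

#!/usr/bin/env python
# Sketch: print remount/exit events and tracebacks from a geo-rep slave log
# with their timestamps, to correlate against the memory-usage graph.  The
# default path below is an example only; the real log name varies per
# session.
import re
import sys

MARKERS = ("connection inactive", "exiting.", "Mounting gluster volume",
           "Mounted gluster volume", "slave listening", "Traceback")

# Geo-rep log lines begin with a timestamp like [2018-06-08 08:26:43.404261]
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")

def scan(path):
    with open(path) as f:
        for line in f:
            if any(marker in line for marker in MARKERS):
                match = STAMP.match(line)
                when = match.group(1) if match else "unknown-time"
                print(when + "  " + line.rstrip())

if __name__ == "__main__":
    scan(sys.argv[1] if len(sys.argv) > 1 else
         "/var/log/glusterfs/geo-replication-slaves/example-session.gluster.log")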
>>>>>
>>>>> I have attached an image showing the latest memory usage pattern.
>>>>>
>>>>> Can you please advise how I can pass the log data across to you?  As
>>>>> soon as I know this I will get the data uploaded for your review.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Mark Betham
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 7 June 2018 at 08:19, Mark Betham <
>>>>> mark.bet...@performancehorizon.com> wrote:
>>>>>
>>>>>> Hi Kotresh,
>>>>>>
>>>>>> Many thanks for your prompt response.
>>>>>>
>>>>>> Below are my responses to your questions;
>>>>>>
>>>>>> 1. Is this trace back consistently hit? I just wanted to confirm
>>>>>> whether it's transient which occurs once in a while and gets back to 
>>>>>> normal?
>>>>>> It appears not.  As soon as the geo-rep recovered yesterday from the
>>>>>> high memory usage it immediately began rising again until it consumed all
>>>>>> of the available RAM.  But this time nothing was committed to the log
>>>>>> file.
>>>>>> I would like to add here that this current instance of geo-rep was

Re: [Gluster-users] Geo-Replication memory leak on slave node

2018-06-20 Thread Mark Betham
Hi Kotresh,

I was wondering if you had made any progress with regard to the issue I am
currently experiencing with geo-replication.

For info, the fault remains and effectively requires a restart of the
geo-replication service on a daily basis to reclaim the used memory on the
slave node.

If you require any further information then please do not hesitate to ask.

Many thanks,

Mark Betham


On Mon, 11 Jun 2018 at 08:24, Mark Betham <
mark.bet...@performancehorizon.com> wrote:

> Hi Kotresh,
>
> Many thanks.  I will shortly set up a share on my GDrive and send the link
> directly to you.
>
> For info:
> The Geo-Rep slave failed again over the weekend, but it did not recover
> this time.  It looks to have become unresponsive at around 14:40 UTC on 9th
> June.  I have attached an image showing the memory usage; you can see from
> this when the system failed.  The system was totally unresponsive and
> required a cold power off and then power on in order to recover the server.
>
> Many thanks for your help.
>
> Mark Betham.
>
> On 11 June 2018 at 05:53, Kotresh Hiremath Ravishankar <
> khire...@redhat.com> wrote:
>
>> Hi Mark,
>>
>> Google drive works for me.
>>
>> Thanks,
>> Kotresh HR
>>
>> On Fri, Jun 8, 2018 at 3:00 PM, Mark Betham <
>> mark.bet...@performancehorizon.com> wrote:
>>
>>> Hi Kotresh,
>>>
>>> The memory issue has re-occurred.  This indicates it occurs around once
>>> a day.
>>>
>>> Again, no traceback was listed in the log; the only update in the log was
>>> as follows:
>>> [2018-06-08 08:26:43.404261] I [resource(slave):1020:service_loop]
>>> GLUSTER: connection inactive, stopping timeout=120
>>> [2018-06-08 08:29:19.357615] I [syncdutils(slave):271:finalize] :
>>> exiting.
>>> [2018-06-08 08:31:02.432002] I [resource(slave):1502:connect] GLUSTER:
>>> Mounting gluster volume locally...
>>> [2018-06-08 08:31:03.716967] I [resource(slave):1515:connect] GLUSTER:
>>> Mounted gluster volume duration=1.2729
>>> [2018-06-08 08:31:03.717411] I [resource(slave):1012:service_loop]
>>> GLUSTER: slave listening
>>>
>>> I have attached an image showing the latest memory usage pattern.
>>>
>>> Can you please advise how I can pass the log data across to you?  As
>>> soon as I know this I will get the data uploaded for your review.
>>>
>>> Thanks,
>>>
>>> Mark Betham
>>>
>>>
>>>
>>>
>>> On 7 June 2018 at 08:19, Mark Betham <
>>> mark.bet...@performancehorizon.com> wrote:
>>>
>>>> Hi Kotresh,
>>>>
>>>> Many thanks for your prompt response.
>>>>
>>>> Below are my responses to your questions;
>>>>
>>>> 1. Is this trace back consistently hit? I just wanted to confirm
>>>> whether it's transient which occurs once in a while and gets back to 
>>>> normal?
>>>> It appears not.  As soon as the geo-rep recovered yesterday from the
>>>> high memory usage it immediately began rising again until it consumed all
>>>> of the available RAM.  But this time nothing was committed to the log file.
>>>> I would like to add here that this current instance of geo-rep was only
>>>> brought online at the start of this week due to the issues with glibc on
>>>> CentOS 7.5.  This is the first time I have had geo-rep running with Gluster
>>>> ver 3.12.9; both storage clusters at each physical site were only rebuilt
>>>> approx. 4 weeks ago, due to the previous version in use going EOL.  Prior
>>>> to this I had been running 3.13.2 (3.13.X is now EOL) at each of the sites,
>>>> and it is worth noting that the same behaviour was also seen on that
>>>> version of Gluster.  Unfortunately I do not have any of the log data from
>>>> then, but I do not recall seeing any instances of the trace back message
>>>> mentioned.
>>>>
>>>> 2. Please upload the complete geo-rep logs from both master and slave.
>>>> I have the log files; I am just checking to make sure there is no
>>>> confidential info inside.  The logfiles are too big to send via email, even
>>>> when compressed.  Do you have a preferred method to allow me to share this
>>>> data with you, or would a share from my Google drive be sufficient?
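
If it is any use for the size problem, the geo-replication log directories
could be bundled into a single gzip-compressed tarball before uploading.  A
minimal sketch; the directory list is an assumption and should be pointed
at the actual master and slave geo-replication log locations:

#!/usr/bin/env python
# Minimal sketch: bundle geo-replication log directories into one
# gzip-compressed tarball for upload.  LOG_DIRS is an assumption; point it
# at the real master/slave geo-replication log paths on each node.
import tarfile
import time

LOG_DIRS = ["/var/log/glusterfs/geo-replication",
            "/var/log/glusterfs/geo-replication-slaves"]

def bundle(dirs, prefix="georep-logs"):
    name = "%s-%s.tar.gz" % (prefix, time.strftime("%Y%m%d-%H%M%S"))
    with tarfile.open(name, "w:gz") as tar:
        for directory in dirs:
            tar.add(directory)  # recursive by default
    return name

if __name__ == "__main__":
    print("wrote " + bundle(LOG_DIRS))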
>>>>
>>>> 3. Are the gluster versions same across master and slave?
>>>> Yes, all gluster versions are the same across the two sites for all
>>>> storage nodes.  See below for version info taken f

[Gluster-users] Geo-Replication memory leak on slave node

2018-06-06 Thread Mark Betham
rvol1.log*

If any further information is required in order to troubleshoot this issue
then please let me know.

I would be very grateful for any help or guidance received.

Many thanks,

Mark Betham.

-- 
This email may contain confidential material; unintended recipients must not
disseminate, use, or act upon any information in it. If you received this
email in error, please contact the sender and permanently delete the email.

Performance Horizon Group Limited | Registered in England & Wales 07188234 |
Level 8, West One, Forth Banks, Newcastle upon Tyne, NE1 3PA

40e9e77a-034c-44a2-896e-59eec47e8a84:storage-server.%2Fdata%2Fbrick0.glustervol1.log
Description: Binary data
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users