Re: [Gluster-users] Geo-replication stops after 4-5 hours

2018-08-13 Thread Sunny Kumar
Hi Marcus,

Can you please share the mount log from the slave? (You can find it at
"/var/log/glusterfs/geo-replication-slaves/hostname/mnt.log".)

- Sunny
On Tue, Aug 14, 2018 at 12:48 AM Marcus Pedersén  wrote:
>
> Hi again,
>
> New changes in behaviour: both master nodes that are active toggle to 
> failure, and the logs repeat the same entries over and over again.
>
>
> Part of log, node1:
>
> [2018-08-13 18:24:44.701711] I [gsyncdstatus(worker 
> /urd-gds/gluster):276:set_active] GeorepStatus: Worker Status Change
> status=Active
> [2018-08-13 18:24:44.704360] I [gsyncdstatus(worker 
> /urd-gds/gluster):248:set_worker_crawl_status] GeorepStatus: Crawl Status 
> Changestatus=History Crawl
> [2018-08-13 18:24:44.705162] I [master(worker /urd-gds/gluster):1448:crawl] 
> _GMaster: starting history crawlturns=1 stime=(1523907056, 0)   
> entry_stime=Noneetime=1534184684
> [2018-08-13 18:24:45.717072] I [master(worker /urd-gds/gluster):1477:crawl] 
> _GMaster: slave's time  stime=(1523907056, 0)
> [2018-08-13 18:24:45.904958] E [repce(worker /urd-gds/gluster):197:__call__] 
> RepceClient: call failed   call=5919:140339726538560:1534184685.88 
> method=entry_opserror=GsyncdError
> [2018-08-13 18:24:45.905111] E [syncdutils(worker 
> /urd-gds/gluster):298:log_raise_exception] : execution of "gluster" 
> failed with ENOENT (No such file or directory)
> [2018-08-13 18:24:45.919265] I [repce(agent 
> /urd-gds/gluster):80:service_loop] RepceServer: terminating on reaching EOF.
> [2018-08-13 18:24:46.553194] I [monitor(monitor):272:monitor] Monitor: worker 
> died in startup phase brick=/urd-gds/gluster
> [2018-08-13 18:24:46.561784] I [gsyncdstatus(monitor):243:set_worker_status] 
> GeorepStatus: Worker Status Change status=Faulty
> [2018-08-13 18:24:56.581748] I [monitor(monitor):158:monitor] Monitor: 
> starting gsyncd worker   brick=/urd-gds/gluster  slave_node=urd-gds-geo-000
> [2018-08-13 18:24:56.655164] I [gsyncd(worker /urd-gds/gluster):297:main] 
> : Using session config file  
> path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
> [2018-08-13 18:24:56.655193] I [gsyncd(agent /urd-gds/gluster):297:main] 
> : Using session config file   
> path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
> [2018-08-13 18:24:56.655889] I [changelogagent(agent 
> /urd-gds/gluster):72:__init__] ChangelogAgent: Agent listining...
> [2018-08-13 18:24:56.664628] I [resource(worker 
> /urd-gds/gluster):1348:connect_remote] SSH: Initializing SSH connection 
> between master and slave...
> [2018-08-13 18:24:58.347415] I [resource(worker 
> /urd-gds/gluster):1395:connect_remote] SSH: SSH connection between master and 
> slave established.duration=1.6824
> [2018-08-13 18:24:58.348151] I [resource(worker 
> /urd-gds/gluster):1067:connect] GLUSTER: Mounting gluster volume locally...
> [2018-08-13 18:24:59.463598] I [resource(worker 
> /urd-gds/gluster):1090:connect] GLUSTER: Mounted gluster volume 
> duration=1.1150
> [2018-08-13 18:24:59.464184] I [subcmds(worker 
> /urd-gds/gluster):70:subcmd_worker] : Worker spawn successful. 
> Acknowledging back to monitor
> [2018-08-13 18:25:01.549007] I [master(worker 
> /urd-gds/gluster):1534:register] _GMaster: Working dir
> path=/var/lib/misc/gluster/gsyncd/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/urd-gds-gluster
> [2018-08-13 18:25:01.549606] I [resource(worker 
> /urd-gds/gluster):1253:service_loop] GLUSTER: Register time 
> time=1534184701
> [2018-08-13 18:25:01.593946] I [gsyncdstatus(worker 
> /urd-gds/gluster):276:set_active] GeorepStatus: Worker Status Change
> status=Active
>
>
> Part of log, node2:
>
> [2018-08-13 18:25:14.554233] I [gsyncdstatus(monitor):243:set_worker_status] 
> GeorepStatus: Worker Status Change status=Faulty
> [2018-08-13 18:25:24.568727] I [monitor(monitor):158:monitor] Monitor: 
> starting gsyncd worker   brick=/urd-gds/gluster  slave_node=urd-gds-geo-000
> [2018-08-13 18:25:24.609642] I [gsyncd(agent /urd-gds/gluster):297:main] 
> : Using session config file   
> path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
> [2018-08-13 18:25:24.609678] I [gsyncd(worker /urd-gds/gluster):297:main] 
> : Using session config file  
> path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
> [2018-08-13 18:25:24.610362] I [changelogagent(agent 
> /urd-gds/gluster):72:__init__] ChangelogAgent: Agent listining...
> [2018-08-13 18:25:24.621551] I [resource(worker 
> /urd-gds/gluster):1348:connect_remote] SSH: Initializing SSH connection 
> between master and slave...
> [2018-08-13 18:25:26.164855] I [resource(worker 
> /urd-gds/gluster):1395:connect_remote] SSH: SSH connection between master and 
> slave established.duration=1.5431
> [2018-08-13 18:25:26.165124] I [resource(worker 
> 

Re: [Gluster-users] Geo-replication stops after 4-5 hours

2018-08-13 Thread Marcus Pedersén
Hi again,

New changes in behaviour: both master nodes that are active toggle to 
failure, and the logs repeat the same entries over and over again.


Part of log, node1:

[2018-08-13 18:24:44.701711] I [gsyncdstatus(worker 
/urd-gds/gluster):276:set_active] GeorepStatus: Worker Status Change
status=Active
[2018-08-13 18:24:44.704360] I [gsyncdstatus(worker 
/urd-gds/gluster):248:set_worker_crawl_status] GeorepStatus: Crawl Status 
Changestatus=History Crawl
[2018-08-13 18:24:44.705162] I [master(worker /urd-gds/gluster):1448:crawl] 
_GMaster: starting history crawlturns=1 stime=(1523907056, 0)   
entry_stime=Noneetime=1534184684
[2018-08-13 18:24:45.717072] I [master(worker /urd-gds/gluster):1477:crawl] 
_GMaster: slave's time  stime=(1523907056, 0)
[2018-08-13 18:24:45.904958] E [repce(worker /urd-gds/gluster):197:__call__] 
RepceClient: call failed   call=5919:140339726538560:1534184685.88 
method=entry_opserror=GsyncdError
[2018-08-13 18:24:45.905111] E [syncdutils(worker 
/urd-gds/gluster):298:log_raise_exception] : execution of "gluster" failed 
with ENOENT (No such file or directory)
[2018-08-13 18:24:45.919265] I [repce(agent /urd-gds/gluster):80:service_loop] 
RepceServer: terminating on reaching EOF.
[2018-08-13 18:24:46.553194] I [monitor(monitor):272:monitor] Monitor: worker 
died in startup phase brick=/urd-gds/gluster
[2018-08-13 18:24:46.561784] I [gsyncdstatus(monitor):243:set_worker_status] 
GeorepStatus: Worker Status Change status=Faulty
[2018-08-13 18:24:56.581748] I [monitor(monitor):158:monitor] Monitor: starting 
gsyncd worker   brick=/urd-gds/gluster  slave_node=urd-gds-geo-000
[2018-08-13 18:24:56.655164] I [gsyncd(worker /urd-gds/gluster):297:main] 
: Using session config file  
path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
[2018-08-13 18:24:56.655193] I [gsyncd(agent /urd-gds/gluster):297:main] : 
Using session config file   
path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
[2018-08-13 18:24:56.655889] I [changelogagent(agent 
/urd-gds/gluster):72:__init__] ChangelogAgent: Agent listining...
[2018-08-13 18:24:56.664628] I [resource(worker 
/urd-gds/gluster):1348:connect_remote] SSH: Initializing SSH connection between 
master and slave...
[2018-08-13 18:24:58.347415] I [resource(worker 
/urd-gds/gluster):1395:connect_remote] SSH: SSH connection between master and 
slave established.duration=1.6824
[2018-08-13 18:24:58.348151] I [resource(worker /urd-gds/gluster):1067:connect] 
GLUSTER: Mounting gluster volume locally...
[2018-08-13 18:24:59.463598] I [resource(worker /urd-gds/gluster):1090:connect] 
GLUSTER: Mounted gluster volume duration=1.1150
[2018-08-13 18:24:59.464184] I [subcmds(worker 
/urd-gds/gluster):70:subcmd_worker] : Worker spawn successful. 
Acknowledging back to monitor
[2018-08-13 18:25:01.549007] I [master(worker /urd-gds/gluster):1534:register] 
_GMaster: Working dir
path=/var/lib/misc/gluster/gsyncd/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/urd-gds-gluster
[2018-08-13 18:25:01.549606] I [resource(worker 
/urd-gds/gluster):1253:service_loop] GLUSTER: Register time time=1534184701
[2018-08-13 18:25:01.593946] I [gsyncdstatus(worker 
/urd-gds/gluster):276:set_active] GeorepStatus: Worker Status Change
status=Active
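
The 'execution of "gluster" failed with ENOENT' line above suggests gsyncd
tried to run the gluster CLI and could not find it. A rough check, assuming a
4.1-style geo-rep session (volume and host names below are placeholders):

# On the master and slave nodes: confirm where the gluster binary lives
command -v gluster
ls -l /usr/sbin/gluster

# On the master: see which command directories the session is configured with
gluster volume geo-replication <mastervol> <slavehost>::<slavevol> config | grep -i command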


Part of log, node2:

[2018-08-13 18:25:14.554233] I [gsyncdstatus(monitor):243:set_worker_status] 
GeorepStatus: Worker Status Change status=Faulty
[2018-08-13 18:25:24.568727] I [monitor(monitor):158:monitor] Monitor: starting 
gsyncd worker   brick=/urd-gds/gluster  slave_node=urd-gds-geo-000
[2018-08-13 18:25:24.609642] I [gsyncd(agent /urd-gds/gluster):297:main] : 
Using session config file   
path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
[2018-08-13 18:25:24.609678] I [gsyncd(worker /urd-gds/gluster):297:main] 
: Using session config file  
path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
[2018-08-13 18:25:24.610362] I [changelogagent(agent 
/urd-gds/gluster):72:__init__] ChangelogAgent: Agent listining...
[2018-08-13 18:25:24.621551] I [resource(worker 
/urd-gds/gluster):1348:connect_remote] SSH: Initializing SSH connection between 
master and slave...
[2018-08-13 18:25:26.164855] I [resource(worker 
/urd-gds/gluster):1395:connect_remote] SSH: SSH connection between master and 
slave established.duration=1.5431
[2018-08-13 18:25:26.165124] I [resource(worker /urd-gds/gluster):1067:connect] 
GLUSTER: Mounting gluster volume locally...
[2018-08-13 18:25:27.331969] I [resource(worker /urd-gds/gluster):1090:connect] 
GLUSTER: Mounted gluster volume duration=1.1667
[2018-08-13 18:25:27.335560] I [subcmds(worker 
/urd-gds/gluster):70:subcmd_worker] : Worker spawn successful. 
Acknowledging back to monitor
[2018-08-13 18:25:37.768867] I [master(worker 

Re: [Gluster-users] gluster 3.12 memory leak

2018-08-13 Thread Jim Kinney
Did this get released yet? The fuse client mounted on a computational
cluster that writes a few thousand files a day, on a slow day, is causing
many OOM-killer problems.
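
In the meantime, watching the fuse client's resident memory gives a feel for
how fast a given mount leaks (a sketch; match the process for your own mount):

# Find the glusterfs fuse client process for the affected mount
pgrep -af glusterfs

# Watch its resident set size over time
ps -o pid,rss,vsz,etime,cmd -p <pid>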

On Tue, 2018-08-07 at 11:33 +0530, Hari Gowtham wrote:
> Hi,
> The reason for the memory leak was found. The patch
> (https://review.gluster.org/#/c/20437/) will fix the leak. It should be
> made available with the next release. You can keep an eye on it. For
> more info, refer to the above-mentioned bug.
> Regards,
> Hari.
>
> On Fri, Aug 3, 2018 at 7:36 PM Alex K <rightkickt...@gmail.com> wrote:
>
> Thank you Hari. Hope we get a fix soon to put us out of our misery :)
> Alex
>
> On Fri, Aug 3, 2018 at 4:58 PM, Hari Gowtham wrote:
> 
> Hi,
> It is a known issue. This bug will give more insight into the memory
> leak: https://bugzilla.redhat.com/show_bug.cgi?id=1593826
>
> On Fri, Aug 3, 2018 at 6:15 PM Alex K wrote:
> 
> Hi,
> I was using 3.8.12-1 up to 3.8.15-2. I did not have issues with these
> versions. I still have systems running on those with no such memory
> leaks.
> Thanx, Alex
> 
> On Fri, Aug 3, 2018 at 3:13 PM, Nithya Balachandran <nbala...@redhat.com> wrote:
> 
> Hi,
> What version of gluster were you using before you upgraded?
> Regards, Nithya
> On 3 August 2018 at 16:56, Alex K  wrote:
> 
> Hi all,
> I am using gluster 3.12.9-1 on ovirt 4.1.9 and I have observed
> consistent high memory use which at some point renders the hosts
> unresponsive. This behavior is observed also while using 3.12.11-1
> with ovirt 4.2.5. I did not have this issue prior to upgrading
> gluster.
> I have seen a relevant bug reporting memory leaks in gluster and it
> seems that this is the case for my trouble. To temporarily resolve
> the high memory issue, I put hosts in maintenance and then activate
> them back again. This indicates that the memory leak is caused by the
> gluster client. Ovirt is using fuse mounts.
> Is there any bug fix available for this? This issue is hitting us hard
> with several production installations.
> Thanx, Alex
> 
> 
> --
> Regards,
> Hari Gowtham.
> 
> 
> 
> 
-- 
James P. Kinney III

Every time you stop a school, you will have to build a jail. What you
gain at one end you lose at the other. It's like feeding a dog on his
own tail. It won't fatten the dog.
- Speech 11/23/1900 Mark Twain

http://heretothereideas.blogspot.com/

___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] create volume with glusterd2 fails with "HTTP/1.1 500 Internal Server Error"

2018-08-13 Thread Davide Obbi
Hi,

I'm trying to create a volume using glusterd2; however, curl does not return
anything, and in verbose mode I get the above-mentioned error. The only
gluster service running on the hosts is "glusterd2.service".

/var/log/glusterd2/glusterd2.log:
time="2018-08-13 18:05:15.395479" level=info msg="runtime error: index out
of range"

JSON FILE:
{
  "name": "test01",
  "subvols": [
    {
      "type": "replicate",
      "bricks": [
        {"peerid": "2f468973-8de9-4dbf-90c2-7236e229697d",
         "path": "/srv/gfs/test01/brk01"},
        {"peerid": "ad228d0e-e7b9-4d02-a863-94c74bd3d843",
         "path": "/srv/gfs/test01/brk01"},
        {"peerid": "de7ea1d7-5566-40f4-bc34-0d68bfd48193",
         "path": "/srv/gfs/test01/brk01"}
      ],
      "replica": 3
    }
  ],
  "force": false
}

COMMAND:
curl -v -X POST http://localhost:24007/v1/volumes --data @test01.json -H
'Content-Type: application/json'
# I have also tried with the FQDN instead of localhost
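
Two generic checks that might help narrow this down (not glusterd2-specific;
the file name matches the JSON above):

# Make sure the request body is valid JSON before posting
python -m json.tool test01.json

# Re-run the request with the status line and headers printed inline
curl -sS -i -X POST http://localhost:24007/v1/volumes \
 -H 'Content-Type: application/json' --data @test01.json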

ERROR:
* About to connect() to localhost port 24007 (#0)
*   Trying ::1...
* Connected to localhost (::1) port 24007 (#0)
> POST /v1/volumes HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:24007
> Accept: */*
> Content-Type: application/json
> Content-Length: 525
>
* upload completely sent off: 525 out of 525 bytes
< HTTP/1.1 500 Internal Server Error
< X-Gluster-Cluster-Id: 04385f96-4b7e-4afa-9366-8d1a3b30f36e
< X-Gluster-Peer-Id: ad228d0e-e7b9-4d02-a863-94c74bd3d843
< X-Request-Id: fea765a5-1014-48f7-919c-3431278d14ab
< Date: Mon, 13 Aug 2018 15:57:24 GMT
< Content-Length: 0
< Content-Type: text/plain; charset=utf-8
<
* Connection #0 to host localhost left intact

VERSIONS:
glusterd2.x86_64          4.1.0-1.el7
glusterfs-server.x86_64   4.1.2-1.el7

PEERS:
+--------------------------------------+--------------+-------------------+--------------------+--------+
|                  ID                  |     NAME     | CLIENT ADDRESSES  |   PEER ADDRESSES   | ONLINE |
+--------------------------------------+--------------+-------------------+--------------------+--------+
| 2f468973-8de9-4dbf-90c2-7236e229697d | gluster-1009 | 127.0.0.1:24007   | gluster-1009:24008 | yes    |
|                                      |              | 10.10.10.19:24007 | 10.10.10.19:24008  |        |
| ad228d0e-e7b9-4d02-a863-94c74bd3d843 | gluster-1005 | 127.0.0.1:24007   | 10.10.10.18:24008  | yes    |
|                                      |              | 10.10.10.18:24007 |                    |        |
| de7ea1d7-5566-40f4-bc34-0d68bfd48193 | gluster-1008 | 127.0.0.1:24007   | gluster-1008:24008 | yes    |
|                                      |              | 10.10.10.23:24007 | 10.10.10.23:24008  |        |
+--------------------------------------+--------------+-------------------+--------------------+--------+
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gluster fuse comsumes huge memory

2018-08-13 Thread Darrell Budic
Probably this known issue in 3.12.9 and later versions:
https://bugzilla.redhat.com/show_bug.cgi?id=1593826
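
If you want to confirm you are hitting the same leak, a statedump of the fuse
client is the usual evidence. A rough sketch (the process match is a
placeholder, and the dump directory assumes defaults):

# Find the fuse client process for the affected mount
pgrep -af glusterfs

# SIGUSR1 asks it to write a statedump, normally under /var/run/gluster/
kill -USR1 <pid>
ls -lt /var/run/gluster/ | head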

> From: huting3 
> Subject: [Gluster-users] gluster fuse comsumes huge memory
> Date: August 12, 2018 at 9:51:31 PM CDT
> To: gluster-users@gluster.org
> 
> Hi experts,
> 
> I meet a problem when I use glusterfs: the fuse client consumes huge
> amounts of memory when writing a lot of files (more than a million) to the
> gluster volume, eventually getting killed by the OS OOM killer. The memory
> the fuse process consumes can grow up to 100G! I wonder if there is a
> memory leak in the gluster fuse process, or some other cause. Thanks!
> 
> This is the statedump file's link:
> 
> https://drive.google.com/file/d/1ZlttTzt4E56Qtk9j7b4I9GkZC2W3mJgp/view?usp=sharing
>  
> 
> 
> My gluster version is 3.13.2, the gluster volume info is listed as following:
> 
> Volume Name: gv0
> Type: Distributed-Replicate
> Volume ID: 4a6f96f8-b3fb-4550-bd19-e1a5dffad4d0
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 19 x 3 = 57
> Transport-type: tcp
> Options Reconfigured:
> performance.cache-size: 10GB
> performance.parallel-readdir: on
> performance.readdir-ahead: on
> network.inode-lru-limit: 20
> performance.md-cache-timeout: 600
> performance.cache-invalidation: on
> performance.stat-prefetch: on
> features.cache-invalidation-timeout: 600
> features.cache-invalidation: on
> features.inode-quota: off
> features.quota: off
> cluster.quorum-reads: on
> cluster.quorum-count: 2
> cluster.quorum-type: fixed
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
> cluster.server-quorum-ratio: 51%
> 
> 
> huting3
> huti...@corp.netease.com
> (Signature customized by NetEase Mail Master / 网易邮箱大师)
> 
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster Outreachy

2018-08-13 Thread Amye Scavarda
This is great!
One thing I'm noticing is that most proposed projects do not have
mentors at this time, and mentors are pretty crucial to the success of
these projects.
Will mentor signups also close on 20 August?
- amye


On Thu, Aug 9, 2018 at 11:16 PM Bhumika Goyal  wrote:

> Hi all,
>
> *Gentle reminder!*
>
> The doc[1] for adding project ideas for Outreachy will be open for editing
> till August 20th. Please feel free to add your project ideas :).
> [1]:
> https://docs.google.com/document/d/16yKKDD2Dd6Ag0tssrdoFPojKsF16QI5-j7cUHcR5Pq4/edit?usp=sharing
>
> Thanks,
> Bhumika
>
>
>
> On Wed, Jul 4, 2018 at 4:51 PM, Bhumika Goyal  wrote:
>
>> Hi all,
>>
>> Gnome has been working on an initiative known as Outreachy[1] since 2010.
>> Outreachy is a three-month remote internship program. It aims to increase
>> the participation of women and members of under-represented groups in
>> open source. The program is held twice a year. During the internship
>> period, interns contribute to a project under the guidance of one or more
>> mentors.
>>
>> For the next round (Dec 2018 - March 2019) we are planning to submit
>> projects from Gluster. We would like you to propose project ideas and/or
>> come forward as mentors/volunteers.
>> Please feel free to add project ideas in this doc[2]. The doc[2] will be
>> open for editing till the end of July.
>>
>> [1]: https://www.outreachy.org/
>> [2]:
>> https://docs.google.com/document/d/16yKKDD2Dd6Ag0tssrdoFPojKsF16QI5-j7cUHcR5Pq4/edit?usp=sharing
>>
>> Outreachy timeline:
>> Pre-Application Period - Late August to early September
>> Application Period - Early September to mid-October
>> Internship Period -  December to March
>>
>> Thanks,
>> Bhumika
>>
>



-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users