Re: [Gluster-users] Run away memory with gluster mount

2018-02-05 Thread Raghavendra Gowdappa
I missed your reply :). Sorry about that.

- Original Message -
> From: "Raghavendra Gowdappa" 
> To: "Dan Ragle" 
> Cc: "Csaba Henk" , "gluster-users" 
> 
> Sent: Tuesday, February 6, 2018 1:14:10 AM
> Subject: Re: [Gluster-users] Run away memory with gluster mount
> 
> Hi Dan,
> 
> I had a suggestion and a question in my previous response. Let us know
> whether the suggestion helps, and please tell us about your data-set
> (how many directories/files there are and how they are organised) so that
> we can understand the problem better.
> 
> 
> 
> > In the meantime, can you remount glusterfs with the options
> > --entry-timeout=0 and --attribute-timeout=0? This will make sure
> > that the kernel won't cache inodes/attributes of files and should
> > bring down the memory usage.
> >
> > I am curious to know what your data-set is like. Is it a case of
> > too many directories and files present in deep directories? I am
> > wondering whether a significant number of the inodes cached by the
> > kernel are there just to hold dentry structures.
> 
> 
> 
> regards,
> Raghavendra
> 
> - Original Message -
> > From: "Dan Ragle" 
> > To: "Nithya Balachandran" 
> > Cc: "gluster-users" , "Csaba Henk"
> > 
> > Sent: Saturday, February 3, 2018 7:28:15 PM
> > Subject: Re: [Gluster-users] Run away memory with gluster mount
> > 
> > 
> > 
> > On 2/2/2018 2:13 AM, Nithya Balachandran wrote:
> > > Hi Dan,
> > > 
> > > It sounds like you might be running into [1]. The patch has been posted
> > > upstream and the fix should be in the next release.
> > > In the meantime, I'm afraid there is no way to get around this without
> > > restarting the process.
> > > 
> > > Regards,
> > > Nithya
> > > 
> > > [1]https://bugzilla.redhat.com/show_bug.cgi?id=1541264
> > > 
> > 
> > Much appreciated. Will watch for the next release and retest then.
> > 
> > Cheers!
> > 
> > Dan
> > 
> > > 
> > > On 2 February 2018 at 02:57, Dan Ragle wrote:
> > > 
> > > 
> > > 
> > > On 1/30/2018 6:31 AM, Raghavendra Gowdappa wrote:
> > > 
> > > 
> > > 
> > > - Original Message -
> > > 
> > > From: "Dan Ragle" 
> > > To: "Raghavendra Gowdappa"  > > >, "Ravishankar N"
> > > >
> > > Cc: gluster-users@gluster.org
> > > , "Csaba Henk"
> > > >, "Niels de Vos"
> > > >, "Nithya
> > > Balachandran"  > > >
> > > Sent: Monday, January 29, 2018 9:02:21 PM
> > > Subject: Re: [Gluster-users] Run away memory with gluster
> > > mount
> > > 
> > > 
> > > 
> > > On 1/29/2018 2:36 AM, Raghavendra Gowdappa wrote:
> > > 
> > > 
> > > 
> > > - Original Message -
> > > 
> > > From: "Ravishankar N"  > > >
> > > To: "Dan Ragle" ,
> > > gluster-users@gluster.org
> > > 
> > > Cc: "Csaba Henk"  > > >, "Niels de Vos"
> > > >,
> > > "Nithya Balachandran"  > > >,
> > > "Raghavendra Gowdappa"  > > >
> > > Sent: Saturday, January 27, 2018 10:23:38 AM
> > > Subject: Re: [Gluster-users] Run away memory with
> > > gluster mount
> > > 
> > > 
> > > 
> > > On 01/27/2018 02:29 AM, Dan Ragle wrote:
> > > 
> > > 
> > > On 1/25/2018 8:21 PM, Ravishankar N wrote:
> > > 
> > > 
> > > 
> > > On 01/25/2018 11:04 PM, Dan Ragle wrote:
> > > 
> > > *sigh* trying again to correct
> > > formatting ... apologize for the
> > > earlier mess.
> > > 
> > > Having a memory issue with Gluster
> > > 3.12.4 and not sure how to
> > > 

Re: [Gluster-users] Run away memory with gluster mount

2018-02-05 Thread Raghavendra Gowdappa
Hi Dan,

I had a suggestion and a question in my previous response. Let us know whether 
the suggestion helps, and please tell us about your data-set (how many 
directories/files there are and how they are organised) so that we can 
understand the problem better.



> In the meantime, can you remount glusterfs with the options
> --entry-timeout=0 and --attribute-timeout=0? This will make sure
> that the kernel won't cache inodes/attributes of files and should
> bring down the memory usage.
>
> I am curious to know what your data-set is like. Is it a case of
> too many directories and files present in deep directories? I am
> wondering whether a significant number of the inodes cached by the
> kernel are there just to hold dentry structures.
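For reference, a remount along these lines might look like the following (a
sketch; the server name and mount point are placeholders, and GlusterWWW is the
volume named later in this thread):

# umount /mnt/glusterwww
# mount -t glusterfs -o entry-timeout=0,attribute-timeout=0 server1:/GlusterWWW /mnt/glusterwww

or, passing the options to the FUSE client directly:

# glusterfs --volfile-server=server1 --volfile-id=GlusterWWW --entry-timeout=0 --attribute-timeout=0 /mnt/glusterwww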



regards,
Raghavendra

- Original Message -
> From: "Dan Ragle" 
> To: "Nithya Balachandran" 
> Cc: "gluster-users" , "Csaba Henk" 
> 
> Sent: Saturday, February 3, 2018 7:28:15 PM
> Subject: Re: [Gluster-users] Run away memory with gluster mount
> 
> 
> 
> On 2/2/2018 2:13 AM, Nithya Balachandran wrote:
> > Hi Dan,
> > 
> > It sounds like you might be running into [1]. The patch has been posted
> > upstream and the fix should be in the next release.
> > In the meantime, I'm afraid there is no way to get around this without
> > restarting the process.
> > 
> > Regards,
> > Nithya
> > 
> > [1]https://bugzilla.redhat.com/show_bug.cgi?id=1541264
> > 
> 
> Much appreciated. Will watch for the next release and retest then.
> 
> Cheers!
> 
> Dan
> 
> > 
> > On 2 February 2018 at 02:57, Dan Ragle wrote:
> > 
> > 
> > 
> > On 1/30/2018 6:31 AM, Raghavendra Gowdappa wrote:
> > 
> > 
> > 
> > - Original Message -
> > 
> > From: "Dan Ragle" 
> > To: "Raghavendra Gowdappa"  > >, "Ravishankar N"
> > >
> > Cc: gluster-users@gluster.org
> > , "Csaba Henk"
> > >, "Niels de Vos"
> > >, "Nithya
> > Balachandran"  > >
> > Sent: Monday, January 29, 2018 9:02:21 PM
> > Subject: Re: [Gluster-users] Run away memory with gluster mount
> > 
> > 
> > 
> > On 1/29/2018 2:36 AM, Raghavendra Gowdappa wrote:
> > 
> > 
> > 
> > - Original Message -
> > 
> > From: "Ravishankar N"  > >
> > To: "Dan Ragle" ,
> > gluster-users@gluster.org
> > 
> > Cc: "Csaba Henk"  > >, "Niels de Vos"
> > >,
> > "Nithya Balachandran"  > >,
> > "Raghavendra Gowdappa"  > >
> > Sent: Saturday, January 27, 2018 10:23:38 AM
> > Subject: Re: [Gluster-users] Run away memory with
> > gluster mount
> > 
> > 
> > 
> > On 01/27/2018 02:29 AM, Dan Ragle wrote:
> > 
> > 
> > On 1/25/2018 8:21 PM, Ravishankar N wrote:
> > 
> > 
> > 
> > On 01/25/2018 11:04 PM, Dan Ragle wrote:
> > 
> > *sigh* trying again to correct
> > formatting ... apologize for the
> > earlier mess.
> > 
> > Having a memory issue with Gluster
> > 3.12.4 and not sure how to
> > troubleshoot. I don't *think* this is
> > expected behavior.
> > 
> > This is on an updated CentOS 7 box. The
> > setup is a simple two node
> > replicated layout where the two nodes
> > act as both server and
> > client.
> > 
> > The volume in question:
> > 
> > Volume Name: GlusterWWW
> > Type: Replicate
> > 

Re: [Gluster-users] geo-replication command rsync returned with 3

2018-02-05 Thread Florian Weimer

On 02/05/2018 01:33 PM, Florian Weimer wrote:

Do you have strace output going further back, at least to the preceding 
getcwd call?  It would be interesting to see which path the kernel 
reports, and if it starts with "(unreachable)".


I got the strace output now, but it is very difficult to read (chdir in a 
multi-threaded process …).


My current inclination is to blame rsync because it does an 
unconditional getcwd during startup, which now fails if the current 
directory is unreachable.
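If the unconditional getcwd described above is indeed the culprit, the failure
should be reproducible outside of geo-replication by running rsync from a
deleted working directory (a sketch; the paths are arbitrary):

$ mkdir /tmp/gone && cd /tmp/gone && rmdir /tmp/gone
$ rsync -a /etc/hostname /tmp/out/
# expected: rsync: getcwd(): No such file or directory (2), as in the strace
# output quoted in the earlier mail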


Further references:

https://sourceware.org/ml/libc-alpha/2018-02/msg00152.html
https://bugzilla.redhat.com/show_bug.cgi?id=1542180

Andreas Schwab agrees that rsync is buggy:

https://sourceware.org/ml/libc-alpha/2018-02/msg00153.html

Thanks,
Florian
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Release 4.0: RC0 packages

2018-02-05 Thread Shyam Ranganathan
Hi,

We have tagged and created RC0 packages for the 4.0 release of Gluster.
Details of the packages are given below.

We request community feedback during the RC stages so that the final
release can be better. Towards this, please test the packages and direct any
feedback to the lists for us to take a look at.

CentOS packages:

CentOS7:
  # yum install
http://cbs.centos.org/kojifiles/work/tasks/1548/311548/centos-release-gluster40-0.9-1.el7.centos.x86_64.rpm
  # yum install glusterfs-server
  (only the testing repository is enabled with this c-r-gluster40)

There is no glusterd2 packaged for the Storage SIG *yet*.

CentOS6:
CentOS-6 builds will not be done until the glusterfs-server sub-package
can be prevented from getting created (BZ 1074947). CentOS-6 will *only*
get the glusterfs-client packages from 4.0 release onwards.

Other distributions:

Packages for Fedora 27 and Fedora 28/rawhide are at [1].

Packages for Debian stretch/9 and buster/10 are coming soon. They will
also be at [1].

GlusterFS 4.0 packages, including these 4.0RC0 packages, are signed with
a new signing key. The public key is at [2].

Packages for glusterd2 will be added later.

Thanks,
Gluster community

[1] https://download.gluster.org/pub/gluster/glusterfs/qa-releases/4.0rc0/
[2]
https://download.gluster.org/pub/gluster/glusterfs/qa-releases/4.0rc0/rsa.pub
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] Very slow rsync to gluster volume UNLESS `ls` or `find` scan dir on gluster volume first

2018-02-05 Thread Tom Fite
Hi all,

I have seen this issue as well, on Gluster 3.12.1 (3 bricks per box, 2
boxes, distributed-replicate). My testing shows the same thing -- running a
find on a directory dramatically increases lstat performance. To add
another clue, the performance degrades again after issuing a call to reset
the system's cache of dentries and inodes:

# sync; echo 2 > /proc/sys/vm/drop_caches

I think this shows that it's the system cache that's actually doing
the heavy lifting here. There are a couple of sysctl tunables that I've
found help out with this.

See here:

http://docs.gluster.org/en/latest/Administrator%20Guide/Linux%20Kernel%20Tuning/

Contrary to what that doc says, I've found that setting
vm.vfs_cache_pressure to a low value increases performance by allowing more
dentries and inodes to be retained in the cache.

# Set the swappiness to avoid swap when possible.
vm.swappiness = 10

# Set the cache pressure to prefer inode and dentry cache over file cache.
# This is done to keep as many dentries and inodes in cache as possible,
# which dramatically improves gluster small file performance.
vm.vfs_cache_pressure = 25
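For reference, these settings can be applied at runtime with sysctl and
persisted in a sysctl.d drop-in (the file name below is just one convention):

# sysctl -w vm.swappiness=10
# sysctl -w vm.vfs_cache_pressure=25
# printf 'vm.swappiness = 10\nvm.vfs_cache_pressure = 25\n' > /etc/sysctl.d/99-gluster-tuning.conf
# sysctl -p /etc/sysctl.d/99-gluster-tuning.conf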

For comparison, my config is:

Volume Name: gv0
Type: Tier
Volume ID: d490a9ec-f9c8-4f10-a7f3-e1b6d3ced196
Status: Started
Snapshot Count: 13
Number of Bricks: 8
Transport-type: tcp
Hot Tier :
Hot Tier Type : Replicate
Number of Bricks: 1 x 2 = 2
Brick1: gluster2:/data/hot_tier/gv0
Brick2: gluster1:/data/hot_tier/gv0
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 3 x 2 = 6
Brick3: gluster1:/data/brick1/gv0
Brick4: gluster2:/data/brick1/gv0
Brick5: gluster1:/data/brick2/gv0
Brick6: gluster2:/data/brick2/gv0
Brick7: gluster1:/data/brick3/gv0
Brick8: gluster2:/data/brick3/gv0
Options Reconfigured:
performance.cache-max-file-size: 128MB
cluster.readdir-optimize: on
cluster.watermark-hi: 95
features.ctr-sql-db-cachesize: 262144
cluster.read-freq-threshold: 5
cluster.write-freq-threshold: 2
features.record-counters: on
cluster.tier-promote-frequency: 15000
cluster.tier-pause: off
cluster.tier-compact: on
cluster.tier-mode: cache
features.ctr-enabled: on
performance.cache-refresh-timeout: 60
performance.stat-prefetch: on
server.outstanding-rpc-limit: 2056
cluster.lookup-optimize: on
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
features.barrier: disable
client.event-threads: 4
server.event-threads: 4
performance.cache-size: 1GB
network.inode-lru-limit: 9
performance.md-cache-timeout: 600
performance.cache-invalidation: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
performance.quick-read: on
performance.io-cache: on
performance.nfs.write-behind-window-size: 4MB
performance.write-behind-window-size: 4MB
performance.nfs.io-threads: off
network.tcp-window-size: 1048576
performance.rda-cache-limit: 64MB
performance.flush-behind: on
server.allow-insecure: on
cluster.tier-demote-frequency: 18000
cluster.tier-max-files: 100
cluster.tier-max-promote-file-size: 10485760
cluster.tier-max-mb: 64000
features.ctr-sql-db-wal-autocheckpoint: 2500
cluster.tier-hot-compact-frequency: 86400
cluster.tier-cold-compact-frequency: 86400
performance.readdir-ahead: off
cluster.watermark-low: 50
storage.build-pgfid: on
performance.rda-request-size: 128KB
performance.rda-low-wmark: 4KB
cluster.min-free-disk: 5%
auto-delete: enable


On Sun, Feb 4, 2018 at 9:44 PM, Amar Tumballi  wrote:

> Thanks for the report Artem,
>
> Looks like the issue is about cache warming up. Specifically, I suspect rsync
> is doing a 'readdir(), stat(), file operations' loop, whereas when a find or
> ls is issued we get a 'readdirp()' request, which contains the stat
> information along with the entries and also makes sure the cache is
> up-to-date (at the md-cache layer).
>
> Note that this is just an off-the-top-of-my-head hypothesis; we surely need
> to analyse and debug more thoroughly for a proper explanation. Someone on my
> team will look at it soon.
>
> Regards,
> Amar
>
> On Mon, Feb 5, 2018 at 7:25 AM, Vlad Kopylov  wrote:
>
>> Are you mounting it on the local bricks?
>>
>> I am struggling with the same performance issues.
>> Try using this volume setting:
>> http://lists.gluster.org/pipermail/gluster-users/2018-January/033397.html
>> performance.stat-prefetch: on might be it.
>>
>> It seems that once it gets into the cache it is fast - those stat fetches,
>> which seem to come from .gluster, are slow.
>>
>> On Sun, Feb 4, 2018 at 3:45 AM, Artem Russakovskii 
>> wrote:
>> > An update, and a very interesting one!
>> >
>> > After I started stracing rsync, all I could see was lstat calls, quite
>> > slow ones, over and over, which is expected.
>> >
>> > For example: lstat("uploads/2016/10/nexus2cee_DSC05339_thumb-161x107.jpg",
>> > {st_mode=S_IFREG|0664, st_size=4043, ...}) = 0
>> >
>> > I googled around and found
>> > https://gist.github.com/nh2/1836415489e2132cf85ed3832105fcc1, which is
>> > seeing this exact issue with 

Re: [Gluster-users] Dir split brain resolution

2018-02-05 Thread Karthik Subrahmanya
On 05-Feb-2018 7:12 PM, "Alex K"  wrote:

Hi Karthik,

I tried to delete one file on one node and that is probably the reason.
After several deletes it seems that I deleted some files that I shouldn't have,
and the oVirt engine hosted on this volume was not able to start.
Now I am setting up the engine from scratch...
In case I see this kind of split brain again I will get back before I start
deleting :)

Sure. Thanks for the update.

Regards,
Karthik



Alex


On Mon, Feb 5, 2018 at 2:34 PM, Karthik Subrahmanya 
wrote:

> Hi,
>
> I am wondering why the other brick is not showing any entry in split brain
> in the heal info split-brain output.
> Can you give the output of stat & getfattr -d -m . -e hex
>  from both the bricks.
>
> Regards,
> Karthik
>
> On Mon, Feb 5, 2018 at 5:03 PM, Alex K  wrote:
>
>> After stopping/starting the volume I have:
>>
>> gluster volume heal engine  info split-brain
>> Brick gluster0:/gluster/engine/brick
>> 
>> Status: Connected
>> Number of entries in split-brain: 1
>>
>> Brick gluster1:/gluster/engine/brick
>> Status: Connected
>> Number of entries in split-brain: 0
>>
>> gluster volume heal engine split-brain latest-mtime
>> gfid:bb675ea6-0622-4852-9e59-27a4c93ac0f8
>> Healing gfid:bb675ea6-0622-4852-9e59-27a4c93ac0f8 failed:Operation not
>> permitted.
>> Volume heal failed.
>>
>> I will appreciate any help.
>> thanx,
>> Alex
>>
>> On Mon, Feb 5, 2018 at 1:11 PM, Alex K  wrote:
>>
>>> Hi all,
>>>
>>> I have a split brain issue and have the following situation:
>>>
>>> gluster volume heal engine  info split-brain
>>>
>>> Brick gluster0:/gluster/engine/brick
>>> /ad1f38d7-36df-4cee-a092-ab0ce1f98ce9/ha_agent
>>> Status: Connected
>>> Number of entries in split-brain: 1
>>>
>>> Brick gluster1:/gluster/engine/brick
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>> cd ha_agent/
>>> [root@v0 ha_agent]# ls -al
>>> ls: cannot access hosted-engine.metadata: Input/output error
>>> ls: cannot access hosted-engine.lockspace: Input/output error
>>> total 8
>>> drwxrwx--- 2 vdsm kvm 4096 Feb  5 10:52 .
>>> drwxr-xr-x 5 vdsm kvm 4096 Jan 18 01:17 ..
>>> l? ? ??  ?? hosted-engine.lockspace
>>> l? ? ??  ?? hosted-engine.metadata
>>>
>>> I tried to delete the directory from one node but it gives Input/output
>>> error.
>>> How would one proceed to resolve this?
>>>
>>> Thanx,
>>> Alex
>>>
>>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Dir split brain resolution

2018-02-05 Thread Alex K
Hi Karthik,

I tried to delete one file on one node and that is probably the reason.
After several deletes it seems that I deleted some files that I shouldn't have,
and the oVirt engine hosted on this volume was not able to start.
Now I am setting up the engine from scratch...
In case I see this kind of split brain again I will get back before I start
deleting :)

Alex


On Mon, Feb 5, 2018 at 2:34 PM, Karthik Subrahmanya 
wrote:

> Hi,
>
> I am wondering why the other brick is not showing any entry in split brain
> in the heal info split-brain output.
> Can you give the output of stat & getfattr -d -m . -e hex
>  from both the bricks.
>
> Regards,
> Karthik
>
> On Mon, Feb 5, 2018 at 5:03 PM, Alex K  wrote:
>
>> After stopping/starting the volume I have:
>>
>> gluster volume heal engine  info split-brain
>> Brick gluster0:/gluster/engine/brick
>> 
>> Status: Connected
>> Number of entries in split-brain: 1
>>
>> Brick gluster1:/gluster/engine/brick
>> Status: Connected
>> Number of entries in split-brain: 0
>>
>> gluster volume heal engine split-brain latest-mtime
>> gfid:bb675ea6-0622-4852-9e59-27a4c93ac0f8
>> Healing gfid:bb675ea6-0622-4852-9e59-27a4c93ac0f8 failed:Operation not
>> permitted.
>> Volume heal failed.
>>
>> I will appreciate any help.
>> thanx,
>> Alex
>>
>> On Mon, Feb 5, 2018 at 1:11 PM, Alex K  wrote:
>>
>>> Hi all,
>>>
>>> I have a split brain issue and have the following situation:
>>>
>>> gluster volume heal engine  info split-brain
>>>
>>> Brick gluster0:/gluster/engine/brick
>>> /ad1f38d7-36df-4cee-a092-ab0ce1f98ce9/ha_agent
>>> Status: Connected
>>> Number of entries in split-brain: 1
>>>
>>> Brick gluster1:/gluster/engine/brick
>>> Status: Connected
>>> Number of entries in split-brain: 0
>>>
>>> cd ha_agent/
>>> [root@v0 ha_agent]# ls -al
>>> ls: cannot access hosted-engine.metadata: Input/output error
>>> ls: cannot access hosted-engine.lockspace: Input/output error
>>> total 8
>>> drwxrwx--- 2 vdsm kvm 4096 Feb  5 10:52 .
>>> drwxr-xr-x 5 vdsm kvm 4096 Jan 18 01:17 ..
>>> l? ? ??  ?? hosted-engine.lockspace
>>> l? ? ??  ?? hosted-engine.metadata
>>>
>>> I tried to delete the directory from one node but it gives Input/output
>>> error.
>>> How would one proceed to resolve this?
>>>
>>> Thanx,
>>> Alex
>>>
>>>
>>
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Dir split brain resolution

2018-02-05 Thread Karthik Subrahmanya
Hi,

I am wondering why the other brick is not showing any entry in split brain
in the heal info split-brain output.
Can you give the output of stat & getfattr -d -m . -e hex
 from both the bricks.
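For reference, on this setup that would be something along these lines, run on
both gluster0 and gluster1 (a sketch using the brick path and directory from
the earlier heal info output):

# stat /gluster/engine/brick/ad1f38d7-36df-4cee-a092-ab0ce1f98ce9/ha_agent
# getfattr -d -m . -e hex /gluster/engine/brick/ad1f38d7-36df-4cee-a092-ab0ce1f98ce9/ha_agent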

Regards,
Karthik

On Mon, Feb 5, 2018 at 5:03 PM, Alex K  wrote:

> After stopping/starting the volume I have:
>
> gluster volume heal engine  info split-brain
> Brick gluster0:/gluster/engine/brick
> 
> Status: Connected
> Number of entries in split-brain: 1
>
> Brick gluster1:/gluster/engine/brick
> Status: Connected
> Number of entries in split-brain: 0
>
> gluster volume heal engine split-brain latest-mtime
> gfid:bb675ea6-0622-4852-9e59-27a4c93ac0f8
> Healing gfid:bb675ea6-0622-4852-9e59-27a4c93ac0f8 failed:Operation not
> permitted.
> Volume heal failed.
>
> I will appreciate any help.
> thanx,
> Alex
>
> On Mon, Feb 5, 2018 at 1:11 PM, Alex K  wrote:
>
>> Hi all,
>>
>> I have a split brain issue and have the following situation:
>>
>> gluster volume heal engine  info split-brain
>>
>> Brick gluster0:/gluster/engine/brick
>> /ad1f38d7-36df-4cee-a092-ab0ce1f98ce9/ha_agent
>> Status: Connected
>> Number of entries in split-brain: 1
>>
>> Brick gluster1:/gluster/engine/brick
>> Status: Connected
>> Number of entries in split-brain: 0
>>
>> cd ha_agent/
>> [root@v0 ha_agent]# ls -al
>> ls: cannot access hosted-engine.metadata: Input/output error
>> ls: cannot access hosted-engine.lockspace: Input/output error
>> total 8
>> drwxrwx--- 2 vdsm kvm 4096 Feb  5 10:52 .
>> drwxr-xr-x 5 vdsm kvm 4096 Jan 18 01:17 ..
>> l? ? ??  ?? hosted-engine.lockspace
>> l? ? ??  ?? hosted-engine.metadata
>>
>> I tried to delete the directory from one node but it gives Input/output
>> error.
>> How would one proceed to resolve this?
>>
>> Thanx,
>> Alex
>>
>>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] halo not work as desired!!!

2018-02-05 Thread atris adam
I have mounted the halo glusterfs volume in debug mode, and the output is
as follows:
.
.
.
[2018-02-05 11:42:48.282473] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk]
0-test-halo-client-1: Ping latency is 0ms
[2018-02-05 11:42:48.282502] D [MSGID: 0]
[afr-common.c:5025:afr_get_halo_latency] 0-test-halo-replicate-0: Using
halo latency 10
[2018-02-05 11:42:48.282525] D [MSGID: 0]
[afr-common.c:4820:__afr_handle_ping_event] 0-test-halo-client-1: Client
ping @ 140032933708544 ms
.
.
.
[2018-02-05 11:42:48.393776] D [MSGID: 0]
[afr-common.c:4803:find_worst_up_child] 0-test-halo-replicate-0: Found
worst up child (1) @ 140032933708544 ms latency
[2018-02-05 11:42:48.393803] D [MSGID: 0]
[afr-common.c:4903:__afr_handle_child_up_event] 0-test-halo-replicate-0:
Marking child 1 down, doesn't meet halo threshold (10), and >
halo_min_replicas (2)
.
.
.

I think this debug output means:
as the ping time for test-halo-client-1 (brick2) is 0.5 ms and it is not
under the halo threshold (10 ms), this false brick-selection decision
happens in halo.
I cannot set the halo threshold to 0 because:

#gluster vol set test-halo cluster.halo-max-latency 0
volume set: failed: '0' in 'option halo-max-latency 0' is out of range [1 -
9]

So I think the range [1 - 9] should change to [0 - 9] so that I can get
the desired brick selection for the halo feature. Am I right? If not, why does
halo decide to mark down the best brick, which has a ping time below 0.5 ms?
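For reference, a debug-level FUSE mount like the one above can be obtained with
something along these lines (the mount point is a placeholder):

# mount -t glusterfs -o log-level=DEBUG 10.0.0.1:/test-halo /mnt/test-halo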

On Sun, Feb 4, 2018 at 2:27 PM, atris adam  wrote:

> I have 2 data centers in two different regions, and each DC has 3 servers. I
> have created a glusterfs volume with 4 replicas; this is the gluster volume
> info output:
>
>
> Volume Name: test-halo
> Type: Replicate
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 4 = 4
> Transport-type: tcp
> Bricks:
> Brick1: 10.0.0.1:/mnt/test1
> Brick2: 10.0.0.3:/mnt/test2
> Brick3: 10.0.0.5:/mnt/test3
> Brick4: 10.0.0.6:/mnt/test4
> Options Reconfigured:
> cluster.halo-shd-max-latency: 5
> cluster.halo-max-latency: 10
> cluster.quorum-count: 2
> cluster.quorum-type: fixed
> cluster.halo-enabled: yes
> transport.address-family: inet
> nfs.disable: on
>
> bricks with ip 10.0.0.1 & 10.0.0.3 are in region A and bricks with ip
> 10.0.0.5 & 10.0.0.6 are in region B
>
>
> When I mount the volume in region A, I expect the data to be stored first on
> brick1 & brick2 and then asynchronously copied to region B, on brick3 &
> brick4.
>
> Am I right? Is this what halo claims?
>
> If yes, unfortunately this does not happen for me: no matter whether I mount
> the volume in region A or in region B, all the data is copied to brick3 &
> brick4 and no data is copied to brick1 & brick2.
>
> Ping times to the brick IPs from region A are as follows:
> ping 10.0.0.1 & 10.0.0.3: below time=0.500 ms
> ping 10.0.0.5 & 10.0.0.6: more than time=20 ms
>
> What is the logic by which halo selects the bricks to write to? If it is the
> access time, then when I mount the volume in region A the ping time to
> brick1 & brick2 is below 0.5 ms, but halo still selects brick3 & brick4.
>
> glusterfs version is:
> glusterfs 3.12.4
>
> I really need to work with the halo feature, but I have not been able to get
> this case working. Can anyone help me soon?
>
>
> Thx alot
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] geo-replication command rsync returned with 3

2018-02-05 Thread Florian Weimer

(resending, sorry for duplicates)

On 01/24/2018 05:59 PM, Dietmar Putz wrote:

strace rsync :

30743 23:34:47 newfstatat(3, "6737", {st_mode=S_IFDIR|0755, 
st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
30743 23:34:47 newfstatat(3, "6741", {st_mode=S_IFDIR|0755, 
st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0

30743 23:34:47 getdents(3, /* 0 entries */, 131072) = 0
30743 23:34:47 munmap(0x7fa4feae7000, 135168) = 0
30743 23:34:47 close(3) = 0
30743 23:34:47 write(2, "rsync: getcwd(): No such file or directory 
(2)", 46) = 46

30743 23:34:47 write(2, "\n", 1)    = 1
30743 23:34:47 rt_sigaction(SIGUSR1, {SIG_IGN, [], SA_RESTORER, 
0x7fa4fdf404b0}, NULL, 8) = 0
30743 23:34:47 rt_sigaction(SIGUSR2, {SIG_IGN, [], SA_RESTORER, 
0x7fa4fdf404b0}, NULL, 8) = 0
30743 23:34:47 write(2, "rsync error: errors selecting input/output 
files, dirs (code 3) at util.c(1056) [Receiver=3.1.1]", 96) = 96

30743 23:34:47 write(2, "\n", 1)    = 1
30743 23:34:47 exit_group(3)    = ?
30743 23:34:47 +++ exited with 3 +++


Do you have strace output going further back, at least to the preceding 
getcwd call?  It would be interesting to see which path the kernel 
reports, and if it starts with "(unreachable)".
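Two things that might help narrow this down (sketches; the PID and the rsync
arguments are placeholders): check what the kernel reports for the rsync
process's working directory, and capture only the directory-related syscalls:

# readlink /proc/<rsync-pid>/cwd
# strace -f -tt -e trace=chdir,fchdir,getcwd -o /tmp/rsync-cwd.trace rsync <args>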


Thanks,
Florian
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Dir split brain resolution

2018-02-05 Thread Alex K
After stopping/starting the volume I have:

gluster volume heal engine  info split-brain
Brick gluster0:/gluster/engine/brick

Status: Connected
Number of entries in split-brain: 1

Brick gluster1:/gluster/engine/brick
Status: Connected
Number of entries in split-brain: 0

gluster volume heal engine split-brain latest-mtime
gfid:bb675ea6-0622-4852-9e59-27a4c93ac0f8
Healing gfid:bb675ea6-0622-4852-9e59-27a4c93ac0f8 failed:Operation not
permitted.
Volume heal failed.

I will appreciate any help.
thanx,
Alex

On Mon, Feb 5, 2018 at 1:11 PM, Alex K  wrote:

> Hi all,
>
> I have a split brain issue and have the following situation:
>
> gluster volume heal engine  info split-brain
>
> Brick gluster0:/gluster/engine/brick
> /ad1f38d7-36df-4cee-a092-ab0ce1f98ce9/ha_agent
> Status: Connected
> Number of entries in split-brain: 1
>
> Brick gluster1:/gluster/engine/brick
> Status: Connected
> Number of entries in split-brain: 0
>
> cd ha_agent/
> [root@v0 ha_agent]# ls -al
> ls: cannot access hosted-engine.metadata: Input/output error
> ls: cannot access hosted-engine.lockspace: Input/output error
> total 8
> drwxrwx--- 2 vdsm kvm 4096 Feb  5 10:52 .
> drwxr-xr-x 5 vdsm kvm 4096 Jan 18 01:17 ..
> l? ? ??  ?? hosted-engine.lockspace
> l? ? ??  ?? hosted-engine.metadata
>
> I tried to delete the directory from one node but it gives Input/output
> error.
> How would one proceed to resolve this?
>
> Thanx,
> Alex
>
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Dir split brain resolution

2018-02-05 Thread Alex K
Hi all,

I have a split brain issue and have the following situation:

gluster volume heal engine  info split-brain

Brick gluster0:/gluster/engine/brick
/ad1f38d7-36df-4cee-a092-ab0ce1f98ce9/ha_agent
Status: Connected
Number of entries in split-brain: 1

Brick gluster1:/gluster/engine/brick
Status: Connected
Number of entries in split-brain: 0

cd ha_agent/
[root@v0 ha_agent]# ls -al
ls: cannot access hosted-engine.metadata: Input/output error
ls: cannot access hosted-engine.lockspace: Input/output error
total 8
drwxrwx--- 2 vdsm kvm 4096 Feb  5 10:52 .
drwxr-xr-x 5 vdsm kvm 4096 Jan 18 01:17 ..
l? ? ??  ?? hosted-engine.lockspace
l? ? ??  ?? hosted-engine.metadata

I tried to delete the directory from one node but it gives Input/output
error.
How would one proceed to resolve this?

Thanx,
Alex
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Fwd: Troubleshooting glusterfs

2018-02-05 Thread Nithya Balachandran
On 5 February 2018 at 15:40, Nithya Balachandran 
wrote:

> Hi,
>
>
> I see a lot of the following messages in the logs:
> [2018-02-04 03:22:01.56] I [glusterfsd-mgmt.c:1821:mgmt_getspec_cbk]
> 0-glusterfs: No change in volfile,continuing
> [2018-02-04 07:41:16.189349] W [MSGID: 109011]
> [dht-layout.c:186:dht_layout_search] 48-gv0-dht: no subvolume for hash
> (value) = 122440868
> [2018-02-04 07:41:16.244261] W [fuse-bridge.c:2398:fuse_writev_cbk]
> 0-glusterfs-fuse: 3615890: WRITE => -1 
> gfid=c73ca10f-e83e-42a9-9b0a-1de4e12c6798
> fd=0x7ffa3802a5f0 (Input/output error)
> [2018-02-04 07:41:16.254503] W [fuse-bridge.c:1377:fuse_err_cbk]
> 0-glusterfs-fuse: 3615891: FLUSH() ERR => -1 (Input/output error)
> The message "W [MSGID: 109011] [dht-layout.c:186:dht_layout_search]
> 48-gv0-dht: no subvolume for hash (value) = 122440868" repeated 81 times
> between [2018-02-04 07:41:16.189349] and [2018-02-04 07:41:16.254480]
> [2018-02-04 10:50:27.624283] W [MSGID: 109011]
> [dht-layout.c:186:dht_layout_search] 48-gv0-dht: no subvolume for hash
> (value) = 116958174
> [2018-02-04 10:50:27.752107] W [fuse-bridge.c:2398:fuse_writev_cbk]
> 0-glusterfs-fuse: 3997764: WRITE => -1 
> gfid=18e2adee-ff52-414f-aa37-506cff1472ee
> fd=0x7ffa3801d7d0 (Input/output error)
> [2018-02-04 10:50:27.762331] W [fuse-bridge.c:1377:fuse_err_cbk]
> 0-glusterfs-fuse: 3997765: FLUSH() ERR => -1 (Input/output error)
> The message "W [MSGID: 109011] [dht-layout.c:186:dht_layout_search]
> 48-gv0-dht: no subvolume for hash (value) = 116958174" repeated 147 times
> between [2018-02-04 10:50:27.624283] and [2018-02-04 10:50:27.762292]
> [2018-02-04 10:55:35.256018] W [MSGID: 109011]
> [dht-layout.c:186:dht_layout_search] 48-gv0-dht: no subvolume for hash
> (value) = 28918667
> [2018-02-04 10:55:35.387073] W [fuse-bridge.c:2398:fuse_writev_cbk]
> 0-glusterfs-fuse: 4006263: WRITE => -1 
> gfid=54e6f8ea-27d7-4e92-ae64-5e198bd3cb42
> fd=0x7ffa38036bf0 (Input/output error)
> [2018-02-04 10:55:35.407554] W [fuse-bridge.c:1377:fuse_err_cbk]
> 0-glusterfs-fuse: 4006264: FLUSH() ERR => -1 (Input/output error)
> [2018-02-04 10:55:59.677734] W [MSGID: 109011]
> [dht-layout.c:186:dht_layout_search] 48-gv0-dht: no subvolume for hash
> (value) = 69319528
> [2018-02-04 10:55:59.827012] W [fuse-bridge.c:2398:fuse_writev_cbk]
> 0-glusterfs-fuse: 4014645: WRITE => -1 
> gfid=ce700d9b-ef55-4e55-a371-9642e90555cb
> fd=0x7ffa38036bf0 (Input/output error)
>
>
>
> This is the reason for the I/O errors you are seeing. Gluster cannot find
> the subvolume for the file in question so it will fail the write with I/O
> error. It looks like some bricks may not have been up at the time the
> volume tried to get the layout.
>
> This is a problem as this is a pure distributed volume. For some reason
> the layout is not set on some bricks, or some bricks are unreachable.
>
> There are a lot of graph changes in the logs - I would recommend against
> so many changes in such a short interval. There aren't logs for the
> interval before to find out why. Can you send me the rebalance logs from
> the nodes?
>

To clarify, I see multiple graph changes in a few minutes. I would
recommend adding/removing multiple bricks at a time when
expanding/shrinking the volume instead of one at a time.

>
>
> > In case we have too much capacity that's not needed at the moment we are
> > going to remove-brick and fix-layout again in order to shrink storage.
>
>
> I do see the number of bricks reducing in the graphs. Are you sure a
> remove-brick has not been run?  There is no need to run a fix-layout after
> using "remove-brick start" as that will automatically rebalance data.
>
>
>
> Regards,
> Nithya
>
> On 5 February 2018 at 14:06, Nikita Yeryomin  wrote:
>
>> Attached the log. There are some errors in it like
>>
>> [2018-02-04 18:50:41.112962] E [fuse-bridge.c:903:fuse_getattr_resume]
>> 0-glusterfs-fuse: 9613852: GETATTR 140712792330896
>> (7d39d329-c0e0-4997-85e6-0e66e0436315) resolution failed
>>
>> But when it occurs it does not seem to affect current file I/O operations.
>> I've already re-created the volume yesterday and I was not able to
>> reproduce the error during file download after that, but still there are
>> errors like the above in the logs, and the system seems a bit unstable.
>> Let me share some more details on how we are trying to use glusterfs.
>> So it's distributed NOT replicated volume with sharding enabled.
>> We have many small servers (20GB each) in a cloud and a need to work with
>> rather large files (~300GB).
>> We start volume with one 15GB brick which is a separate XFS partition on
>> each server and then add bricks one by one to reach needed capacity.
>> After each brick is added we do rebalance fix-layout.
>> In case we have too much capacity that's not needed at the moment we are
>> going to remove-brick and fix-layout again in order to shrink storage. But
>> we have not yet been able to test removing bricks as system behaves not
>> 

Re: [Gluster-users] Fwd: Troubleshooting glusterfs

2018-02-05 Thread Nithya Balachandran
Hi,


I see a lot of the following messages in the logs:
[2018-02-04 03:22:01.56] I [glusterfsd-mgmt.c:1821:mgmt_getspec_cbk]
0-glusterfs: No change in volfile,continuing
[2018-02-04 07:41:16.189349] W [MSGID: 109011]
[dht-layout.c:186:dht_layout_search]
48-gv0-dht: no subvolume for hash (value) = 122440868
[2018-02-04 07:41:16.244261] W [fuse-bridge.c:2398:fuse_writev_cbk]
0-glusterfs-fuse: 3615890: WRITE => -1
gfid=c73ca10f-e83e-42a9-9b0a-1de4e12c6798
fd=0x7ffa3802a5f0 (Input/output error)
[2018-02-04 07:41:16.254503] W [fuse-bridge.c:1377:fuse_err_cbk]
0-glusterfs-fuse: 3615891: FLUSH() ERR => -1 (Input/output error)
The message "W [MSGID: 109011] [dht-layout.c:186:dht_layout_search]
48-gv0-dht: no subvolume for hash (value) = 122440868" repeated 81 times
between [2018-02-04 07:41:16.189349] and [2018-02-04 07:41:16.254480]
[2018-02-04 10:50:27.624283] W [MSGID: 109011]
[dht-layout.c:186:dht_layout_search]
48-gv0-dht: no subvolume for hash (value) = 116958174
[2018-02-04 10:50:27.752107] W [fuse-bridge.c:2398:fuse_writev_cbk]
0-glusterfs-fuse: 3997764: WRITE => -1
gfid=18e2adee-ff52-414f-aa37-506cff1472ee
fd=0x7ffa3801d7d0 (Input/output error)
[2018-02-04 10:50:27.762331] W [fuse-bridge.c:1377:fuse_err_cbk]
0-glusterfs-fuse: 3997765: FLUSH() ERR => -1 (Input/output error)
The message "W [MSGID: 109011] [dht-layout.c:186:dht_layout_search]
48-gv0-dht: no subvolume for hash (value) = 116958174" repeated 147 times
between [2018-02-04 10:50:27.624283] and [2018-02-04 10:50:27.762292]
[2018-02-04 10:55:35.256018] W [MSGID: 109011]
[dht-layout.c:186:dht_layout_search]
48-gv0-dht: no subvolume for hash (value) = 28918667
[2018-02-04 10:55:35.387073] W [fuse-bridge.c:2398:fuse_writev_cbk]
0-glusterfs-fuse: 4006263: WRITE => -1
gfid=54e6f8ea-27d7-4e92-ae64-5e198bd3cb42
fd=0x7ffa38036bf0 (Input/output error)
[2018-02-04 10:55:35.407554] W [fuse-bridge.c:1377:fuse_err_cbk]
0-glusterfs-fuse: 4006264: FLUSH() ERR => -1 (Input/output error)
[2018-02-04 10:55:59.677734] W [MSGID: 109011]
[dht-layout.c:186:dht_layout_search]
48-gv0-dht: no subvolume for hash (value) = 69319528
[2018-02-04 10:55:59.827012] W [fuse-bridge.c:2398:fuse_writev_cbk]
0-glusterfs-fuse: 4014645: WRITE => -1
gfid=ce700d9b-ef55-4e55-a371-9642e90555cb
fd=0x7ffa38036bf0 (Input/output error)



This is the reason for the I/O errors you are seeing. Gluster cannot find
the subvolume for the file in question so it will fail the write with I/O
error. It looks like some bricks may not have been up at the time the
volume tried to get the layout.

This is a problem as this is a pure distributed volume. For some reason the
layout is not set on some bricks, or some bricks are unreachable.
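For reference, the DHT layout that a brick holds for a directory can be
inspected directly on the brick with something like this (a sketch; the brick
path and directory are placeholders); a missing trusted.glusterfs.dht xattr
would be consistent with the "no subvolume for hash" messages above:

# getfattr -n trusted.glusterfs.dht -e hex /path/to/brick/dir/inside/volume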

There are a lot of graph changes in the logs - I would recommend against so
many changes in such a short interval. There aren't logs for the interval
before to find out why. Can you send me the rebalance logs from the nodes?


> In case we have too much capacity that's not needed at the moment we are
> going to remove-brick and fix-layout again in order to shrink storage.


I do see the number of bricks reducing in the graphs. Are you sure a
remove-brick has not been run?  There is no need to run a fix-layout after
using "remove-brick start" as that will automatically rebalance data.



Regards,
Nithya

On 5 February 2018 at 14:06, Nikita Yeryomin  wrote:

> Attached the log. There are some errors in it like
>
> [2018-02-04 18:50:41.112962] E [fuse-bridge.c:903:fuse_getattr_resume]
> 0-glusterfs-fuse: 9613852: GETATTR 140712792330896
> (7d39d329-c0e0-4997-85e6-0e66e0436315) resolution failed
>
> But when it occurs it does not seem to affect current file I/O operations.
> I've already re-created the volume yesterday and I was not able to
> reproduce the error during file download after that, but still there are
> errors like the above in the logs, and the system seems a bit unstable.
> Let me share some more details on how we are trying to use glusterfs.
> So it's distributed NOT replicated volume with sharding enabled.
> We have many small servers (20GB each) in a cloud and a need to work with
> rather large files (~300GB).
> We start volume with one 15GB brick which is a separate XFS partition on
> each server and then add bricks one by one to reach needed capacity.
> After each brick is added we do rebalance fix-layout.
> In case we have too much capacity that's not needed at the moment we are
> going to remove-brick and fix-layout again in order to shrink storage. But
> we have not yet been able to test removing bricks, as the system does not
> behave stably after scaling out.
>
> From what I've found here, https://bugzilla.redhat.com/show_bug.cgi?id=875076,
> it seems starting with one brick is not a good idea, so we are going to try
> starting with 2 bricks.
> Please let me know if there are anything else we should consider changing
> in our strategy.
>
> Many thanks in advance!
> Nikita Yeryomin
>
> 2018-02-05 7:53 GMT+02:00 Nithya Balachandran :
>
>> Hi,
>>

Re: [Gluster-users] Error - Disk Full - No Space Left

2018-02-05 Thread Nithya Balachandran
Hi,

I have already replied to your earlier email. Did you not receive it?

Regards,
Nithya


On 5 February 2018 at 14:52, Taste-Of-IT  wrote:

> Hi to all,
>
> It's sad that no one can help. I tested the 3 bricks and created a new
> volume. But when I want to create a folder, I get the same error message.
> The disk is full? Everything looks OK to me, and I need help.
> thx
>
> Taste
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Error - Disk Full - No Space Left

2018-02-05 Thread Taste-Of-IT
Hi to all,

It's sad that no one can help. I tested the 3 bricks and created a new volume.
But when I want to create a folder, I get the same error message. The disk is
full? Everything looks OK to me, and I need help.
thx

Taste
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users