Re: [Gluster-devel] NSR: Suggestions for a new name

2016-01-20 Thread Avra Sengupta
Thanks for the suggestion Pranith. To make things interesting, we have 
created an etherpad where people can put their suggestions. Around 
mid-February, we will look at all the suggestions we have received, hold 
a community vote, and zero in on one. The suggester of the winning name 
gets a goody.


Feel free to add more than one entry.

Regards,
Avra

On 01/21/2016 10:08 AM, Pranith Kumar Karampuri wrote:



On 01/19/2016 08:00 PM, Avra Sengupta wrote:

Hi,

The leader election based replication has been called NSR or "New 
Style Replication" for a while now. We would like to have a new name 
for the same that's less generic. It can be something like "Leader 
Driven Replication" or something more specific that would make sense 
a few years down the line too.


We would love to hear more suggestions from the community. Thanks


If I had a chance to name AFR (Automatic File Replication) I would 
have named it Automatic Data replication. Feel free to use it if you 
like it.


Pranith


Regards,
Avra
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NSR: Suggestions for a new name

2016-01-20 Thread Pranith Kumar Karampuri



On 01/19/2016 08:00 PM, Avra Sengupta wrote:

Hi,

The leader election based replication has been called NSR or "New 
Style Replication" for a while now. We would like to have a new name 
for the same that's less generic. It can be something like "Leader 
Driven Replication" or something more specific that would make sense a 
few years down the line too.


We would love to hear more suggestions from the community. Thanks


If I had a chance to name AFR (Automatic File Replication) I would have 
named it Automatic Data replication. Feel free to use it if you like it.


Pranith


Regards,
Avra
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] rm -r problem on FreeBSD port

2016-01-20 Thread Rick Macklem
Sakshi Bansal wrote:
> The directory deletion is failing with ENOTEMPTY since not all the files
> inside it have been deleted. It looks like lookup is not listing all the files.
> It is possible that cluster.lookup-optimize could be the culprit here. When
> did you turn this option 'on'? Was it during the untarring of the source
> tree?
> Also, once this option is turned 'off', does explicitly doing an ls on the
> problematic files still throw an error?
> 
Good suggestion. I had disabled it, but only after I had created the tree
(unrolled the tarball and created the directory tree that the build goes in).

I ran a test where I disabled all three of:
performance.readdir-ahead
cluster.lookup-optimize
cluster.readdir-optimize
right after I created the volume with 2 bricks.

Then I ran a test and everything worked. I didn't get any directory with files
missing when doing an "ls" and the "rm -r" worked too.
So, it looks like it is one or more of these settings and they have to be
disabled when the files/directories are created to fix the problem.
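(For reference, a rough sketch of the sequence described above -- the volume
name and brick paths here are placeholders, not the actual test setup:)

    gluster volume create testvol server1:/bricks/b1 server2:/bricks/b2
    gluster volume start testvol
    # disable all three options before creating any files on the volume
    gluster volume set testvol performance.readdir-ahead off
    gluster volume set testvol cluster.lookup-optimize off
    gluster volume set testvol cluster.readdir-optimize off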

It will take a while, but I will run tests with them individually disabled
to see which one(s) need to be disabled. Once I know that I'll email and
try to get the other information you requested to see if we can isolate the 
problem further.

Thanks, I feel this is progress, rick
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Netbsd regressions are failing because of connection problems?

2016-01-20 Thread Emmanuel Dreyfus
Vijay Bellur  wrote:

> Does not look like a DNS problem. It is happening to me outside of
> rackspace too.

I mean I have already seen rackspace VMs failing to initiate connections
because rackspace DNS failed to answer DNS requests. This was the cause
of failed regressions at some point.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] heal hanging

2016-01-20 Thread Pranith Kumar Karampuri

hey,
   Which process is consuming so much cpu? I went through the logs 
you gave me. I see that the following files are in gfid mismatch state:


<066e4525-8f8b-43aa-b7a1-86bbcecc68b9/safebrowsing-backup>,
<1d48754b-b38c-403d-94e2-0f5c41d5f885/recovery.bak>

Could you give me the output of "ls /indices/xattrop | wc -l" on all the 
bricks which are acting this way? This will tell us the number of pending 
self-heals on the system.
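(The brick path seems to have been dropped from the command above; on a
typical brick the pending-heal index lives under .glusterfs, so the count
would be gathered per brick roughly like this, using one of the brick paths
from the volume info below as an example:)

    ls /data/brick01a/homegfs/.glusterfs/indices/xattrop | wc -l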


Pranith

On 01/20/2016 09:26 PM, David Robinson wrote:

resending with parsed logs...
I am having issues with 3.6.6 where the load will spike up to 800% 
for one of the glusterfsd processes and the users can no longer 
access the system.  If I reboot the node, the heal will finish 
normally after a few minutes and the system will be responsive, 
but a few hours later the issue will start again.  It looks like it 
is hanging in a heal and spinning up the load on one of the bricks.  
The heal gets stuck and says it is crawling and never returns.  
After a few minutes of the heal saying it is crawling, the load 
spikes up and the mounts become unresponsive.
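(A generic way to narrow down which brick process and which threads are
spinning -- these commands are illustrative, not taken from the original
report:)

    top -b -n 1 | grep glusterfsd        # find the glusterfsd with ~800% CPU
    top -H -p <pid-of-that-glusterfsd>   # show per-thread CPU usage for it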
Any suggestions on how to fix this?  It has us stopped cold as the 
user can no longer access the systems when the load spikes... Logs 
attached.

System setup info is:
[root@gfs01a ~]# gluster volume info homegfs

Volume Name: homegfs
Type: Distributed-Replicate
Volume ID: 1e32672a-f1b7-4b58-ba94-58c085e59071
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: gfsib01a.corvidtec.com:/data/brick01a/homegfs
Brick2: gfsib01b.corvidtec.com:/data/brick01b/homegfs
Brick3: gfsib01a.corvidtec.com:/data/brick02a/homegfs
Brick4: gfsib01b.corvidtec.com:/data/brick02b/homegfs
Brick5: gfsib02a.corvidtec.com:/data/brick01a/homegfs
Brick6: gfsib02b.corvidtec.com:/data/brick01b/homegfs
Brick7: gfsib02a.corvidtec.com:/data/brick02a/homegfs
Brick8: gfsib02b.corvidtec.com:/data/brick02b/homegfs
Options Reconfigured:
performance.io-thread-count: 32
performance.cache-size: 128MB
performance.write-behind-window-size: 128MB
server.allow-insecure: on
network.ping-timeout: 42
storage.owner-gid: 100
geo-replication.indexing: off
geo-replication.ignore-pid-check: on
changelog.changelog: off
changelog.fsync-interval: 3
changelog.rollover-time: 15
server.manage-gids: on
diagnostics.client-log-level: WARNING
[root@gfs01a ~]# rpm -qa | grep gluster
gluster-nagios-common-0.1.1-0.el6.noarch
glusterfs-fuse-3.6.6-1.el6.x86_64
glusterfs-debuginfo-3.6.6-1.el6.x86_64
glusterfs-libs-3.6.6-1.el6.x86_64
glusterfs-geo-replication-3.6.6-1.el6.x86_64
glusterfs-api-3.6.6-1.el6.x86_64
glusterfs-devel-3.6.6-1.el6.x86_64
glusterfs-api-devel-3.6.6-1.el6.x86_64
glusterfs-3.6.6-1.el6.x86_64
glusterfs-cli-3.6.6-1.el6.x86_64
glusterfs-rdma-3.6.6-1.el6.x86_64
samba-vfs-glusterfs-4.1.11-2.el6.x86_64
glusterfs-server-3.6.6-1.el6.x86_64
glusterfs-extra-xlators-3.6.6-1.el6.x86_64



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] [Gluster-users] GlusterFS FUSE client hangs on rsyncing lots of file

2016-01-20 Thread Pranith Kumar Karampuri



On 01/18/2016 02:28 PM, Oleksandr Natalenko wrote:

XFS. Server side works OK, I'm able to mount volume again. Brick is 30% full.


Oleksandr,
  Will it be possible to get a statedump of the client and the bricks 
next time it happens?

https://github.com/gluster/glusterfs/blob/master/doc/debugging/statedump.md#how-to-generate-statedump
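(As a rough sketch of what generating those statedumps involves -- the
process pattern and dump directory below are typical defaults; see the
link above for the authoritative steps:)

    # on the client: send SIGUSR1 to the glusterfs mount process
    kill -USR1 $(pgrep -f 'glusterfs --volfile-server')
    # on the servers: send SIGUSR1 to the brick (glusterfsd) processes
    kill -USR1 <pid-of-glusterfsd>
    # dumps are typically written under /var/run/gluster/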

Pranith



On Monday, 18 January 2016 at 15:07:18 EET, baul jianguo wrote:

What is your brick file system? And what is the status of the glusterfsd
process and all its threads?
I met the same issue when a client app such as rsync stayed in D status, and
the brick process and related threads were also in D status.
And the brick device's disk utilization was 100%.

On Sun, Jan 17, 2016 at 6:13 AM, Oleksandr Natalenko

 wrote:

Wrong assumption, rsync hung again.

On Saturday, 16 January 2016 at 22:53:04 EET, Oleksandr Natalenko wrote:

One possible reason:

cluster.lookup-optimize: on
cluster.readdir-optimize: on

I've disabled both optimizations, and at least as of now rsync still does
its job with no issues. I would like to find out which option causes such
behavior and why. Will test more.

On Friday, 15 January 2016 at 16:09:51 EET, Oleksandr Natalenko wrote:

Another observation: if rsyncing is resumed after a hang, rsync itself
hangs a lot faster because it does a stat of already-copied files. So the
reason may not be the writing itself, but massive stat on the GlusterFS
volume as well.

On 15.01.2016 09:40, Oleksandr Natalenko wrote:

While doing rsync over millions of files from an ordinary partition to a
GlusterFS volume, just after approximately the first 2 million files the
rsync hang happens, and the following info appears in dmesg:

===
[17075038.924481] INFO: task rsync:10310 blocked for more than 120
seconds.
[17075038.931948] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[17075038.940748] rsync   D 88207fc13680 0 10310
10309 0x0080
[17075038.940752]  8809c578be18 0086 8809c578bfd8
00013680
[17075038.940756]  8809c578bfd8 00013680 880310cbe660
881159d16a30
[17075038.940759]  881e3aa25800 8809c578be48 881159d16b10
88087d553980
[17075038.940762] Call Trace:
[17075038.940770]  [] schedule+0x29/0x70
[17075038.940797]  []
__fuse_request_send+0x13d/0x2c0
[fuse]
[17075038.940801]  [] ?
fuse_get_req_nofail_nopages+0xc0/0x1e0 [fuse]
[17075038.940805]  [] ? wake_up_bit+0x30/0x30
[17075038.940809]  [] fuse_request_send+0x12/0x20
[fuse]
[17075038.940813]  [] fuse_flush+0xff/0x150 [fuse]
[17075038.940817]  [] filp_close+0x34/0x80
[17075038.940821]  [] __close_fd+0x78/0xa0
[17075038.940824]  [] SyS_close+0x23/0x50
[17075038.940828]  []
system_call_fastpath+0x16/0x1b
===

rsync blocks in D state, and to kill it, I have to do umount --lazy on the
GlusterFS mountpoint, and then kill the corresponding client glusterfs
process. Then rsync exits.
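(A sketch of that recovery sequence, with a placeholder mount point:)

    umount -l /mnt/volume           # lazy-unmount the hung GlusterFS mount
    ps ax | grep '[g]lusterfs'      # find the client process serving that mount
    kill <pid-of-that-glusterfs>    # after this, rsync exits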

Here is GlusterFS volume info:

===
Volume Name: asterisk_records
Type: Distributed-Replicate
Volume ID: dc1fe561-fa3a-4f2e-8330-ec7e52c75ba4
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: server1:/bricks/10_megaraid_0_3_9_x_0_4_3_hdd_r1_nolvm_hdd_storage_01/asterisk/records
Brick2: server2:/bricks/10_megaraid_8_5_14_x_8_6_16_hdd_r1_nolvm_hdd_storage_01/asterisk/records
Brick3: server1:/bricks/11_megaraid_0_5_4_x_0_6_5_hdd_r1_nolvm_hdd_storage_02/asterisk/records
Brick4: server2:/bricks/11_megaraid_8_7_15_x_8_8_20_hdd_r1_nolvm_hdd_storage_02/asterisk/records
Brick5: server1:/bricks/12_megaraid_0_7_6_x_0_13_14_hdd_r1_nolvm_hdd_storage_03/asterisk/records
Brick6: server2:/bricks/12_megaraid_8_9_19_x_8_13_24_hdd_r1_nolvm_hdd_storage_03/asterisk/records
Options Reconfigured:
cluster.lookup-optimize: on
cluster.readdir-optimize: on
client.event-threads: 2
network.inode-lru-limit: 4096
server.event-threads: 4
performance.client-io-threads: on
storage.linux-aio: on
performance.write-behind-window-size: 4194304
performance.stat-prefetch: on
performance.quick-read: on
performance.read-ahead: on
performance.flush-behind: on
performance.write-behind: on
performance.io-thread-count: 2
performance.cache-max-file-size: 1048576
performance.cache-size: 33554432
features.cache-invalidation: on
performance.readdir-ahead: on
===

The issue reproduces each time I rsync such an amount of files.

How could I debug this issue better?
___
Gluster-users mailing list
gluster-us...@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel



Re: [Gluster-devel] Netbsd regressions are failing because of connection problems?

2016-01-20 Thread Vijay Bellur


- Original Message -
> From: "Emmanuel Dreyfus" 
> To: "Vijay Bellur" , "Pranith Kumar Karampuri" 
> 
> Cc: "Gluster Devel" , "Gluster Infra" 
> 
> Sent: Wednesday, January 20, 2016 9:10:10 PM
> Subject: Re: [Gluster-devel] Netbsd regressions are failing because of
> connection problems?
> 
> Vijay Bellur  wrote:
> 
> > There is some problem with review.gluster.org now. git clone/pull fails
> > for me consistently.
> 
> First, check that DNS is working. I recall seeing rackspace DNS failing to
> answer.
> 


Does not look like a DNS problem. It is happening to me outside of rackspace 
too.

-Vijay
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Netbsd regressions are failing because of connection problems?

2016-01-20 Thread Emmanuel Dreyfus
Vijay Bellur  wrote:

> There is some problem with review.gluster.org now. git clone/pull fails
> for me consistently.

First, check that DNS is working. I recall seeing rackspace DNS failing to
answer.
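(For example, something along these lines from the affected machine would
confirm whether name resolution is the culprit -- standard tools, not
commands taken from this thread:)

    host review.gluster.org
    dig +short review.gluster.org
    ping -c 3 review.gluster.org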

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] rm -r problem on FreeBSD port

2016-01-20 Thread Sakshi Bansal
I would expect cluster.lookup-optimize to be creating the problem here, so maybe 
you could first try with this option off. Another thing that would be helpful 
is to get an strace when rm fails with "no such file", as this would help us 
identify whether readdir is not returning the entry or whether it is the 
unlink that is failing.
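(A rough sketch of capturing that trace, with placeholder paths:)

    # on the FUSE client, trace the failing rm to see which syscall errors out
    strace -f -o /tmp/rm-r.trace rm -r /mnt/glustervol/problem-dir
    grep -E 'getdents|unlink|rmdir' /tmp/rm-r.trace | tail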

- Original Message -
From: "Rick Macklem" 
To: "Sakshi Bansal" 
Cc: "Raghavendra Gowdappa" , "Gluster Devel" 

Sent: Thursday, January 21, 2016 10:03:37 AM
Subject: Re: [Gluster-devel] rm -r problem on FreeBSD port

Sakshi Bansal wrote:
> The directory deletion is failing with ENOTEMPTY since not all the files
> inside it have been deleted. It looks like lookup is not listing all the files.
> It is possible that cluster.lookup-optimize could be the culprit here. When
> did you turn this option 'on'? Was it during the untarring of the source
> tree?
> Also, once this option is turned 'off', does explicitly doing an ls on the
> problematic files still throw an error?
> 
Good suggestion. I had disabled it, but only after I had created the tree
(unrolled the tarball and created the directory tree that the build goes in).

I ran a test where I disabled all three of:
performance.readdir-ahead
cluster.lookup-optimize
cluster.readdir-optimize
right after I created the volume with 2 bricks.

Then I ran a test and everything worked. I didn't get any directory with files
missing when doing an "ls" and the "rm -r" worked too.
So, it looks like it is one or more of these settings and they have to be
disabled when the files/directories are created to fix the problem.

It will take a while, but I will run tests with them individually disabled
to see which one(s) need to be disabled. Once I know that I'll email and
try to get the other information you requested to see if we can isolate the 
problem further.

Thanks, I feel this is progress, rick
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NSR: Suggestions for a new name

2016-01-20 Thread Joe Julian
The two favorite current marketing buzzwords seem to be "Hyperconverged" 
and "Technology", so if we could work those in somewhere it might make 
it seem more hip. Maybe "Hyperconverged Replication with Leader Technology".


On 01/20/16 20:38, Pranith Kumar Karampuri wrote:



On 01/19/2016 08:00 PM, Avra Sengupta wrote:

Hi,

The leader election based replication has been called NSR or "New 
Style Replication" for a while now. We would like to have a new name 
for the same that's less generic. It can be something like "Leader 
Driven Replication" or something more specific that would make sense 
a few years down the line too.


We would love to hear more suggestions from the community. Thanks


If I had a chance to name AFR (Automatic File Replication) I would 
have named it Automatic Data replication. Feel free to use it if you 
like it.


Pranith


Regards,
Avra
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NSR: Suggestions for a new name

2016-01-20 Thread Atin Mukherjee
Etherpad link please?

On 01/21/2016 12:19 PM, Avra Sengupta wrote:
> Thanks for the suggestion Pranith. To make things interesting, we have
> created an etherpad where people can put their suggestions. Around
> mid-February, we will look at all the suggestions we have received, hold
> a community vote, and zero in on one. The suggester of the winning name
> gets a goody.
> 
> Feel free to add more than one entry.
> 
> Regards,
> Avra
> 
> On 01/21/2016 10:08 AM, Pranith Kumar Karampuri wrote:
>>
>>
>> On 01/19/2016 08:00 PM, Avra Sengupta wrote:
>>> Hi,
>>>
>>> The leader election based replication has been called NSR or "New
>>> Style Replication" for a while now. We would like to have a new name
>>> for the same that's less generic. It can be something like "Leader
>>> Driven Replication" or something more specific that would make sense
>>> a few years down the line too.
>>>
>>> We would love to hear more suggestions from the community. Thanks
>>
>> If I had a chance to name AFR (Automatic File Replication) I would
>> have named it Automatic Data replication. Feel free to use it if you
>> like it.
>>
>> Pranith
>>>
>>> Regards,
>>> Avra
>>> ___
>>> Gluster-devel mailing list
>>> Gluster-devel@gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>
> 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-infra] NetBSD regression fixes

2016-01-20 Thread Rajesh Joseph


- Original Message -
> From: "Emmanuel Dreyfus" 
> To: "Niels de Vos" 
> Cc: gluster-in...@gluster.org, gluster-devel@gluster.org
> Sent: Sunday, January 17, 2016 10:23:16 AM
> Subject: Re: [Gluster-devel] [Gluster-infra] NetBSD regression fixes
> 
> Niels de Vos  wrote:
> 
> > > 2) Spurious failures
> > > I added a retry-failed-test-once feature so that we get less regression
> > > failures because of spurious failures. It is not used right now because
> > > it does not play nicely with bad tests blacklist.
> > > 
> > > This will be fixed by that changes:
> > > http://review.gluster.org/13245
> > > http://review.gluster.org/13247
> > > 
> > > I have been looping failure-free regression for a while with that trick.
> > 
> > Nice, thanks for these improvements!
> 
> But I just realized the change is wrong, since running tests the "new way"
> stops on the first failed test. My change just retries the failed test and
> considers the regression run to be good on success, without running the next
> tests.
> 
> I will post an update shortly.
> 

I think we should not take this approach. If the tests are not reliable then
there is no guarantee that they will pass on the next retry. In fact, we should
not rely on luck here. Let's not run those tests which are spurious in nature.
Anyway, we don't consider the result of those tests. Therefore I think we should
consider the patch sent by Talur (http://review.gluster.org/13173).

> > Could you send a pull request for the regression.sh script on
> > https://github.com/gluster/glusterfs-patch-acceptance-tests/ ? Or, if
> > you dont use GitHub, send the patch by email and we'll take care of
> > pushing it for you.
> 
> Sure, but let me settle on something that works first.
> 
> --
> Emmanuel Dreyfus
> http://hcpnet.free.fr/pubz
> m...@netbsd.org
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] NSR: Suggestions for a new name

2016-01-20 Thread Avra Sengupta

On 01/21/2016 12:20 PM, Atin Mukherjee wrote:

Etherpad link please?

Oops, my bad. Here it is: https://public.pad.fsfe.org/p/NSR_name_suggestions


On 01/21/2016 12:19 PM, Avra Sengupta wrote:

Thanks for the suggestion Pranith. To make things interesting, we have
created an etherpad where people can put their suggestions. Around
mid-February, we will look at all the suggestions we have received, hold
a community vote, and zero in on one. The suggester of the winning name
gets a goody.

Feel free to add more than one entry.

Regards,
Avra

On 01/21/2016 10:08 AM, Pranith Kumar Karampuri wrote:


On 01/19/2016 08:00 PM, Avra Sengupta wrote:

Hi,

The leader election based replication has been called NSR or "New
Style Replication" for a while now. We would like to have a new name
for the same that's less generic. It can be something like "Leader
Driven Replication" or something more specific that would make sense
a few years down the line too.

We would love to hear more suggestions from the community. Thanks

If I had a chance to name AFR (Automatic File Replication) I would
have named it Automatic Data replication. Feel free to use it if you
like it.

Pranith

Regards,
Avra
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Netbsd regressions are failing because of connection problems?

2016-01-20 Thread Vijay Bellur


- Original Message -
> From: "Pranith Kumar Karampuri" 
> To: "Gluster Devel" , "Gluster Infra" 
> , "Emmanuel Dreyfus"
> 
> Sent: Wednesday, January 20, 2016 7:35:49 PM
> Subject: [Gluster-devel] Netbsd regressions are failing because of
> connection problems?
> 
> /origin/*
> ERROR: Error cloning remote repo 'origin'
> hudson.plugins.git.GitException: Command "git -c core.askpass=true fetch
> --tags --progress git://review.gluster.org/glusterfs.git
> +refs/heads/*:refs/remotes/origin/*" returned status code 128:
> stdout:
> stderr: fatal: unable to connect to review.gluster.org:
> review.gluster.org[0: 184.107.76.10]: errno=Connection refused
> 


There is some problem with review.gluster.org now. git clone/pull fails for me 
consistently. Most jenkins jobs are aborting due to git clone failures.

Misc debugged the problem a few hours back but we could not make much progress 
in isolating the problem. My key based access to r.g.o is not working and that 
is yet another issue which needs to be debugged!

-Vijay

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] 3.8 Plan changes - proposal

2016-01-20 Thread Bishoy Mikhael
So what about IPv6?!

Bishoy

On Tuesday, January 19, 2016, Vijay Bellur  wrote:

> On 01/11/2016 04:22 PM, Vijay Bellur wrote:
>
>> Hi All,
>>
>> We discussed the following proposal for 3.8 in the maintainers mailing
>> list and there was general consensus about the changes being a step in
>> the right direction. Would like to hear your thoughts about the same.
>>
>> Changes to 3.8 Plan:
>> 
>>
>> 1. Include 4.0 features such as NSR, dht2, glusterd2.0 & eventing as
>> experimental features in 3.8. 4.0 features are shaping up reasonably
>> well and it would be nice to have them packaged in 3.8 so that we can
>> get more meaningful feedback early on. As 4.0 features mature, we can
>> push them out through subsequent 3.8.x releases to derive iterative
>> feedback.
>>
>> 2. Ensure that most of our components have tests in distaf by 3.8. I
>> would like us to have more deterministic pre-release testing for 3.8 and
>> having tests in distaf should help us in accomplishing that goal.
>>
>> 3. Add "forward compatibility" section to all feature pages proposed for
>> 3.8 so that we carefully review the impact of a feature on all upcoming
>> Gluster.next features.
>>
>> 4. Have Niels de Vos as the maintainer for 3.8 with immediate effect.
>> This is a change from the past where we have had release maintainers
>> after a .0 release is in place. I think Niels' diligence as a release
>> manager will help us in having a more polished .0 release.
>>
>> 5. Move out 3.8 by 2-3 months (end of May or early June 2016) to
>> accomplish these changes.
>>
>> Appreciate your feedback about this proposal!
>>
>>
> Since we have not come across any objections to this proposal, let us go
> ahead and turn this into our plan of action.
>
> Look forward to an exciting first release of Gluster.next!
>
> Cheers,
> Vijay
>
> ___
> Gluster-users mailing list
> gluster-us...@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Few details needed about *any* recent or upcoming feature

2016-01-20 Thread Shyam

Hi Niels,

Here is something for DHT2:

DHT2:

* Why DHT2:
To address consistency and correctness of operations, and complexity and 
scale requirements in the gluster IO path, while keeping performance 
characteristics unchanged.


The issue with the current DHT is that a directory is present in all 
subvolumes of DHT; this leads to a few consistency issues when dealing 
with entry operations on directories or files. It also leads to scaling 
limitations when dealing with larger subvolume counts. Resolution of the 
consistency and correctness issues is becoming more complex and also 
involves some fine-grained locking across the network, which would 
increase the complexity of the solution and also potentially decrease 
the performance of the file system.


Refer:
http://www.gluster.org/community/documentation/images/f/fd/Summit-dht2.odp

* Outline of approach:
To resolve some of the above inadequacies in the current DHT design, 
DHT2 aims to simplify the on-disk data and retain a directory in a 
single subvolume, and hence address the various pitfalls that are present 
in the current distribution model.


It also addresses some of the other problems in DHT w.r.t layouts stored 
per directory and simplifies the same in order to retain consistency and 
correctness properties.


In the move to retain a directory on just one subvolume of DHT2, 
separation of namespace/inode and data becomes essential to make the 
distribution of data even. As a result, DHT2 introduces metadata 
sub-volumes and data sub-volumes into gluster.


* Impacts:
- DHT2 changes the on-disk format for gluster and hence does not provide 
an upgrade path from older DHT-based volumes. It is also not intended to 
convert older gluster volumes to newer gluster volumes in place.


- The on-disk format changes also mean changes to some other xlators, 
like quota and possibly bit-rot and changelog. The impact on these xlators 
is being analyzed.


Shyam

On 01/20/2016 05:41 AM, Niels de Vos wrote:

Hi all,

on Saturday the 30th of January I am scheduled to give a presentation
titled "Gluster roadmap, recent improvements and upcoming features":

   https://fosdem.org/2016/schedule/event/gluster_roadmap/

I would like to ask from all feature owners/developers to reply to this
email with a short description and a few keywords about their features.
My plan is to have at most one slide for each feature, so keep it short.

Thanks,
Niels



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Netbsd regressions are failing because of connection problems?

2016-01-20 Thread Pranith Kumar Karampuri

/origin/*
ERROR: Error cloning remote repo 'origin'
hudson.plugins.git.GitException: Command "git -c core.askpass=true fetch 
--tags --progress git://review.gluster.org/glusterfs.git 
+refs/heads/*:refs/remotes/origin/*" returned status code 128:

stdout:
stderr: fatal: unable to connect to review.gluster.org:
review.gluster.org[0: 184.107.76.10]: errno=Connection refused


at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:1640)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandWithCredentials(CliGitAPIImpl.java:1388)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl.access$300(CliGitAPIImpl.java:62)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl$1.execute(CliGitAPIImpl.java:313)
at 
org.jenkinsci.plugins.gitclient.CliGitAPIImpl$2.execute(CliGitAPIImpl.java:505)
at 
org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1.call(RemoteGitImpl.java:152)
at 
org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler$1.call(RemoteGitImpl.java:145)

at hudson.remoting.UserRequest.perform(UserRequest.java:120)
at hudson.remoting.UserRequest.perform(UserRequest.java:48)

https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/13574/console

Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

2016-01-20 Thread Xavier Hernandez

I'm seeing a similar problem with 3.7.6.

This latest statedump contains a lot of gf_fuse_mt_invalidate_node_t 
objects in fuse. Looking at the code, I see they are used to send 
invalidations to kernel fuse; however, this is done in a separate thread 
that writes a log message when it exits. On the system where I'm seeing 
the memory leak, I can see that message in the log files:


[2016-01-18 23:04:55.384873] I [fuse-bridge.c:3875:notify_kernel_loop] 
0-glusterfs-fuse: kernel notifier loop terminated


But the volume is still working at this moment, so any future inode 
invalidations will leak memory, because it was this thread that should 
have released it.


Can you check if you also see this message in the mount log?

It seems that this thread terminates if the write returns any error 
other than ENOENT. I'm not sure if there could be any other error 
that can cause this.
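(To check, something along these lines should work -- the log file name
follows the usual GlusterFS client convention of the mount path with
slashes replaced by dashes, and may differ on your system:)

    grep 'notify_kernel_loop' /var/log/glusterfs/mnt-volume.log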


Xavi

On 20/01/16 00:13, Oleksandr Natalenko wrote:

Here are more RAM usage stats and a statedump of a GlusterFS mount approaching
yet another OOM:

===
root 32495  1.4 88.3 4943868 1697316 ? Ssl  Jan13 129:18 /usr/sbin/
glusterfs --volfile-server=server.example.com --volfile-id=volume /mnt/volume
===

https://gist.github.com/86198201c79e927b46bd

1.6G of RAM just for an almost idle mount (we occasionally store Asterisk
recordings there). Triple OOM over 69 days of uptime.

Any thoughts?

On Wednesday, 13 January 2016 at 16:26:59 EET, Soumya Koduri wrote:

kill -USR1



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] 3.7.7 update

2016-01-20 Thread Avra Sengupta
Adding http://review.gluster.org/#/c/13119/ to the list. Hopefully it 
will go in today.


On 01/20/2016 01:31 PM, Venky Shankar wrote:



Pranith Kumar Karampuri wrote:

https://public.pad.fsfe.org/p/glusterfs-3.7.7 is the final list of
patches I am waiting for before making 3.7.7 release.

Please let me know if I need to wait for any other patches. It would be
great if we make the tag tomorrow.


Backport of http://review.gluster.org/#/c/13120/

But that needs http://review.gluster.org/#/c/13041/ to be merged.



Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel



Venky
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] 3.7.7 update

2016-01-20 Thread Venky Shankar



Pranith Kumar Karampuri wrote:

https://public.pad.fsfe.org/p/glusterfs-3.7.7 is the final list of
patches I am waiting for before making 3.7.7 release.

Please let me know if I need to wait for any other patches. It would be
great if we make the tag tomorrow.


Backport of http://review.gluster.org/#/c/13120/

But that needs http://review.gluster.org/#/c/13041/ to be merged.



Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel



Venky
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] Memory leak in GlusterFS FUSE client

2016-01-20 Thread Oleksandr Natalenko
Yes, there are a couple of messages like this in my logs too (I guess one 
message per remount):

===
[2016-01-18 23:42:08.742447] I [fuse-bridge.c:3875:notify_kernel_loop] 0-
glusterfs-fuse: kernel notifier loop terminated
===

On Wednesday, 20 January 2016 at 09:51:23 EET, Xavier Hernandez wrote:
> I'm seeing a similar problem with 3.7.6.
> 
> This latest statedump contains a lot of gf_fuse_mt_invalidate_node_t
> objects in fuse. Looking at the code I see they are used to send
> invalidations to kernel fuse, however this is done in a separate thread
> that writes a log message when it exits. On the system I'm seeing the
> memory leak, I can see that message in the log files:
> 
> [2016-01-18 23:04:55.384873] I [fuse-bridge.c:3875:notify_kernel_loop]
> 0-glusterfs-fuse: kernel notifier loop terminated
> 
> But the volume is still working at this moment, so any future inode
> invalidations will leak memory because it was this thread that should
> release it.
> 
> Can you check if you also see this message in the mount log ?
> 
> It seems that this thread terminates if write returns any error
> different than ENOENT. I'm not sure if there could be any other error
> that can cause this.
> 
> Xavi
> 
> On 20/01/16 00:13, Oleksandr Natalenko wrote:
> > Here is another RAM usage stats and statedump of GlusterFS mount
> > approaching to just another OOM:
> > 
> > ===
> > root 32495  1.4 88.3 4943868 1697316 ? Ssl  Jan13 129:18
> > /usr/sbin/
> > glusterfs --volfile-server=server.example.com --volfile-id=volume
> > /mnt/volume ===
> > 
> > https://gist.github.com/86198201c79e927b46bd
> > 
> > 1.6G of RAM just for almost idle mount (we occasionally store Asterisk
> > recordings there). Triple OOM for 69 days of uptime.
> > 
> > Any thoughts?
> > 
> > On Wednesday, 13 January 2016 at 16:26:59 EET, Soumya Koduri wrote:
> >> kill -USR1
> > 
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Few details needed about *any* recent or upcoming feature

2016-01-20 Thread Atin Mukherjee
1. Why GlusterD 2.0

- Gluster has come a long way as the POSIX-compliant distributed
filesystem for small to medium-sized clusters (10s-100s of nodes).
Gluster.Next is a collection of improvements to push Gluster's
capabilities to cloud-scale (read 1000s of nodes). GlusterD 2.0, the
next version of our native Gluster management software, aims to offer
devops-friendly interfaces to build, configure and deploy a
'thousand-node' Gluster cloud.

2. A high level description

- The high-level goal is to replace the current order-n-squared
heartbeat/membership protocol with a much smaller "monitor cluster"
based on Paxos or Raft. Replication of the store can also be built using a
consistent distributed store like etcd/consul, where only a subset of
peers maintains the store. The transaction framework will be built around
this store.

The design should be modular, with pluggable interfaces such that every
feature which consumes GlusterD becomes a separate plugin and the feature
owners own it. This will bring down a lot of the maintainability issues
seen over the recent past. GlusterD 2.0 will also focus on extending
GlusterD to support new features as well as external integration.

We believe the following are important when we design a scalable
enterprise-grade system. At scale, node failures and network failures are
more common, and the following are a few things we would continuously
work on:
* Query APIs, REST for every new feature/capability
* Useful and searchable logs
* Better documentation (both user and development)


I'd request other 4.0 feature owners to share their feature details with
Niels.

Thanks,
Atin

On 01/20/2016 04:11 PM, Niels de Vos wrote:
> Hi all,
> 
> on Saturday the 30th of January I am scheduled to give a presentation
> titled "Gluster roadmap, recent improvements and upcoming features":
> 
>   https://fosdem.org/2016/schedule/event/gluster_roadmap/
> 
> I would like to ask from all feature owners/developers to reply to this
> email with a short description and a few keywords about their features.
> My plan is to have at most one slide for each feature, so keep it short.
> 
> Thanks,
> Niels
> 
> 
> 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Few details needed about *any* recent or upcoming feature

2016-01-20 Thread Niels de Vos
Hi all,

on Saturday the 30th of January I am scheduled to give a presentation
titled "Gluster roadmap, recent improvements and upcoming features":

  https://fosdem.org/2016/schedule/event/gluster_roadmap/

I would like to ask from all feature owners/developers to reply to this
email with a short description and a few keywords about their features.
My plan is to have at most one slide for each feature, so keep it short.

Thanks,
Niels


signature.asc
Description: PGP signature
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Few details needed about *any* recent or upcoming feature

2016-01-20 Thread Pranith Kumar Karampuri

http://www.gluster.org/pipermail/gluster-devel/2015-September/046773.html

Pranith

On 01/20/2016 04:11 PM, Niels de Vos wrote:

Hi all,

on Saturday the 30th of January I am scheduled to give a presentation
titled "Gluster roadmap, recent improvements and upcoming features":

   https://fosdem.org/2016/schedule/event/gluster_roadmap/

I would like to ask from all feature owners/developers to reply to this
email with a short description and a few keywords about their features.
My plan is to have at most one slide for each feature, so keep it short.

Thanks,
Niels


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Few details needed about *any* recent or upcoming feature

2016-01-20 Thread Krutika Dhananjay
sharding: 
http://blog.gluster.org/2015/12/introducing-shard-translator/ 
http://blog.gluster.org/2015/12/sharding-what-next-2/ 

-Krutika 

- Original Message -

> From: "Niels de Vos" 
> To: gluster-devel@gluster.org
> Sent: Wednesday, January 20, 2016 4:11:42 PM
> Subject: [Gluster-devel] Few details needed about *any* recent or upcoming
> feature

> Hi all,

> on Saturday the 30th of January I am scheduled to give a presentation
> titled "Gluster roadmap, recent improvements and upcoming features":

> https://fosdem.org/2016/schedule/event/gluster_roadmap/

> I would like to ask from all feature owners/developers to reply to this
> email with a short description and a few keywords about their features.
> My plan is to have at most one slide for each feature, so keep it short.

> Thanks,
> Niels

> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Gluster AFR volume write performance has been seriously affected by GLUSTERFS_WRITE_IS_APPEND in afr_writev

2016-01-20 Thread Pranith Kumar Karampuri

Sorry for the delay in response.

On 01/15/2016 02:34 PM, li.ping...@zte.com.cn wrote:
The GLUSTERFS_WRITE_IS_APPEND setting in the afr_writev function at the 
glusterfs client end makes posix_writev at the server end handle IO write 
fops serially instead of in parallel.


i.e. multiple io-worker threads carrying out IO write fops are 
blocked in posix_writev and execute the final write fop (pwrite/pwritev) in 
the __posix_writev function ONE AFTER ANOTHER.


For example:

thread1: iot_worker -> ...  -> posix_writev()   |
thread2: iot_worker -> ...  -> posix_writev()   |
thread3: iot_worker -> ...  -> posix_writev()   -> __posix_writev()
thread4: iot_worker -> ...  -> posix_writev()   |

There are 4 iot_worker threads doing 128KB IO write fops as above, 
but only one can execute the __posix_writev function and the others have 
to wait.


However, if the AFR volume is configured with storage.linux-aio 
(which is off by default), the iot_worker will use posix_aio_writev 
instead of posix_writev to write data.
The posix_aio_writev function won't be affected by 
GLUSTERFS_WRITE_IS_APPEND, and the AFR volume write performance goes up.

I think this is a bug :-(.


So, my question is whether an AFR volume could work fine with the 
storage.linux-aio configuration, which bypasses the 
GLUSTERFS_WRITE_IS_APPEND setting in afr_writev,

and why glusterfs keeps posix_aio_writev different from posix_writev?

Any replies to clear up my confusion would be appreciated, and thanks in 
advance.

What is the workload you have? Workloads with multiple writers on the same file?

Pranith




ZTE Information Security Notice: The information contained in this mail (and 
any attachment transmitted herewith) is privileged and confidential and is 
intended for the exclusive use of the addressee(s).  If you are not an intended 
recipient, any disclosure, reproduction, distribution or other dissemination or 
use of the information contained is strictly prohibited.  If you have received 
this mail in error, please delete it and notify us immediately.




___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] [CANCELLED] Weekly meeting on #gluster-meeting

2016-01-20 Thread Venky Shankar

Due to a low number of participants and no one to chair.

Thanks.

Venky
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Few details needed about *any* recent or upcoming feature

2016-01-20 Thread Jeff Darcy
> on Saturday the 30th of January I am scheduled to give a presentation
> titled "Gluster roadmap, recent improvements and upcoming features":
> 
>   https://fosdem.org/2016/schedule/event/gluster_roadmap/
> 
> I would like to ask from all feature owners/developers to reply to this
> email with a short description and a few keywords about their features.
> My plan is to have at most one slide for each feature, so keep it short.

=== NSR

* journal- and server-based (vs. client-based AFR)

* better throughput for many workloads

* faster, more precise repair

* SSD-friendly

* (some day) internal snapshot capability

Some explanation, so you're not just reading the slides or in case
you're asked.

* On throughput, NSR does not split each client's bandwidth among N
servers, and generates a nice sequential I/O pattern on each server
(into the journals).  These effects tend to outweigh any theoretical
increase in latency due to the extra server-to-server "hop" - as is
clearly demonstrated by other systems already using this approach.

* WTF does "SSD-friendly" mean?  It means that NSR can trivially be
configured to put journals on a separate device from the main store.
Since we do full data journaling, that means we can serve newly written
data from that separate device, which can be of a faster type.  This
gives us a simple form of tiering, independently of that implemented in
DHT.  However, unlike Ceph, we do not *require* the journal to be on a
separate device.

* Similarly, because journals are time-based and separate from the
main store, simply skipping the most recent journal segments on reads
gives us a kind of snapshot.  This is a feature of the *design* that we
might exploit some day, but certainly not of the 4.0 *implementation*.
The nice thing about it is that it's completely independent of the
underlying local filesystem or volume manager, so (unlike our current
LVM-biased approach) it can work on any platform.
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Few details needed about *any* recent or upcoming feature

2016-01-20 Thread Ravishankar N

On 01/20/2016 04:11 PM, Niels de Vos wrote:

Hi all,

on Saturday the 30th of January I am scheduled to give a presentation
titled "Gluster roadmap, recent improvements and upcoming features":

   https://fosdem.org/2016/schedule/event/gluster_roadmap/

I would like to ask from all feature owners/developers to reply to this
email with a short description and a few keywords about their features.
My plan is to have at most one slide for each feature, so keep it short.

Thanks,
Niels



AFR arbiter volumes:
--
An arbiter configuration is a special type of 3-way replica volume in AFR 
where the 3rd brick holds only the metadata of the files. It consumes 
less space than a full 3-way replica, provides the same consistency, and 
prevents split-brain of files. More about the usage @ 
https://github.com/gluster/glusterfs-specs/blob/master/done/Features/afr-arbiter-volumes.md
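(For context, an arbiter volume is created by marking the third brick of a
replica-3 set as the arbiter; the hostnames and paths below are
placeholders:)

    gluster volume create testvol replica 3 arbiter 1 \
        server1:/bricks/b1 server2:/bricks/b2 server3:/bricks/arbiter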


Recent enhancements for 3.7.7 include:
- Indicating which bricks are arbiters in the output of `gluster volume 
info` (http://review.gluster.org/#/c/12747/)
- Skipping data self-heal for arbiter bricks, since they are zero-byte 
files containing only metadata (http://review.gluster.org/12778)
- An important bug fix which prevents arbiter bricks from also storing 
the file contents (data) under certain conditions 
(http://review.gluster.org/12479)


Thanks,
Ravi



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel



___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel