[Gluster-devel] ./tests/encryption/crypt.t fails regression with core

2017-06-21 Thread Kotresh Hiremath Ravishankar
Hi

./tests/encryption/crypt.t fails regression on
https://build.gluster.org/job/centos6-regression/5112/consoleFull
with a core. It doesn't seem to be related to the patch. Can somebody take
a look at it? Following is the backtrace.

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7effbe9ef92b in offset_at_tail (conf=0xc0, object=0x7effb000ac28)
at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/atom.c:96
96
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/atom.c:
No such file or directory.
[Current thread is 1 (LWP 1082)]
(gdb) bt
#0  0x7effbe9ef92b in offset_at_tail (conf=0xc0, object=0x7effb000ac28)
at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/atom.c:96
#1  0x7effbe9ef9d5 in offset_at_data_tail (frame=0x7effa4001960,
object=0x7effb000ac28) at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/atom.c:110
#2  0x7effbe9f0729 in rmw_partial_block (frame=0x7effa4001960,
cookie=0x7effb4010050, this=0x7effb800b870, op_ret=0, op_errno=2, vec=0x0,
count=0, stbuf=0x7effb402da18,
iobref=0x7effb804a0d0, atom=0x7effbec106a0 ) at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/atom.c:523
#3  0x7effbe9f1339 in rmw_data_tail (frame=0x7effa4001960,
cookie=0x7effb4010050, this=0x7effb800b870, op_ret=0, op_errno=2, vec=0x0,
count=0, stbuf=0x7effb402da18,
iobref=0x7effb804a0d0, xdata=0x0) at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/atom.c:716
#4  0x7effbea03684 in __crypt_readv_done (frame=0x7effb4010050,
cookie=0x0, this=0x7effb800b870, op_ret=0, op_errno=0, xdata=0x0)
at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/crypt.c:3460
#5  0x7effbea0375f in crypt_readv_done (frame=0x7effb4010050,
this=0x7effb800b870) at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/crypt.c:3487
#6  0x7effbea03b25 in put_one_call_readv (frame=0x7effb4010050,
this=0x7effb800b870) at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/crypt.c:3514
#7  0x7effbe9f286e in crypt_readv_cbk (frame=0x7effb4010050,
cookie=0x7effb4010160, this=0x7effb800b870, op_ret=0, op_errno=2,
vec=0x7effbfb33880, count=1, stbuf=0x7effbfb33810,
iobref=0x7effb804a0d0, xdata=0x0) at
/home/jenkins/root/workspace/centos6-regression/xlators/encryption/crypt/src/crypt.c:371
#8  0x7effbec9cb4b in dht_readv_cbk (frame=0x7effb4010160,
cookie=0x7effb400ff40, this=0x7effb800a200, op_ret=0, op_errno=2,
vector=0x7effbfb33880, count=1, stbuf=0x7effbfb33810,
iobref=0x7effb804a0d0, xdata=0x0) at
/home/jenkins/root/workspace/centos6-regression/xlators/cluster/dht/src/dht-inode-read.c:479
#9  0x7effbeefff83 in client3_3_readv_cbk (req=0x7effb40048b0,
iov=0x7effb40048f0, count=2, myframe=0x7effb400ff40)
at
/home/jenkins/root/workspace/centos6-regression/xlators/protocol/client/src/client-rpc-fops.c:2997
#10 0x7effcc7b681e in rpc_clnt_handle_reply (clnt=0x7effb803eb70,
pollin=0x7effb807b350) at
/home/jenkins/root/workspace/centos6-regression/rpc/rpc-lib/src/rpc-clnt.c:793
#11 0x7effcc7b6de8 in rpc_clnt_notify (trans=0x7effb803ed10,
mydata=0x7effb803eba0, event=RPC_TRANSPORT_MSG_RECEIVED,
data=0x7effb807b350)
at
/home/jenkins/root/workspace/centos6-regression/rpc/rpc-lib/src/rpc-clnt.c:986
#12 0x7effcc7b2e0c in rpc_transport_notify (this=0x7effb803ed10,
event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7effb807b350)
at
/home/jenkins/root/workspace/centos6-regression/rpc/rpc-lib/src/rpc-transport.c:538
#13 0x7effc136458a in socket_event_poll_in (this=0x7effb803ed10,
notify_handled=_gf_true) at
/home/jenkins/root/workspace/centos6-regression/rpc/rpc-transport/socket/src/socket.c:2315
#14 0x7effc1364bd5 in socket_event_handler (fd=10, idx=2, gen=1,
data=0x7effb803ed10, poll_in=1, poll_out=0, poll_err=0)
at
/home/jenkins/root/workspace/centos6-regression/rpc/rpc-transport/socket/src/socket.c:2467
#15 0x7effcca6216e in event_dispatch_epoll_handler
(event_pool=0x2105fc0, event=0x7effbfb33e70) at
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/event-epoll.c:572
#16 0x7effcca62470 in event_dispatch_epoll_worker (data=0x215d950) at
/home/jenkins/root/workspace/centos6-regression/libglusterfs/src/event-epoll.c:648
#17 0x7effcbcc9aa1 in start_thread () from /lib64/libpthread.so.0
#18 0x7effcb631bcd in clone () from /lib64/libc.so.6
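
A hint for whoever picks this up: conf=0xc0 in frame #0 looks far too small to be
a valid heap address, which often means a member offset was taken from a NULL base
pointer somewhere up the call chain. A purely illustrative sketch of that pattern
(the struct below is hypothetical and is NOT the real crypt xlator layout):

#include <stddef.h>
#include <stdio.h>

/* Hypothetical struct, only to show how a bogus "pointer" like 0xc0 can end
 * up in a core's backtrace; it is not the actual crypt private data. */
struct fake_private {
    char pad[0xc0];   /* padding so the next member lands at offset 0xc0 */
    void *conf;
};

int main(void)
{
    /* If code computes &base->conf while base is NULL, the resulting
     * "pointer" equals the member offset (0xc0 here), and the crash only
     * happens when it is finally dereferenced, e.g. in offset_at_tail(). */
    printf("offsetof(struct fake_private, conf) = %#zx\n",
           offsetof(struct fake_private, conf));
    return 0;
}

If that pattern holds here, the interesting question is which of frames #1-#4
ended up deriving conf from a NULL local/private pointer.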


Thanks,
Kotresh H R
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Community Meeting minutes, 2017-06-21

2017-06-21 Thread Kaleb S. KEITHLEY
===
#gluster-meeting: Gluster Community Meeting
===


Meeting started by kkeithley at 15:13:53 UTC. The full logs are
available at
https://meetbot.fedoraproject.org/gluster-meeting/2017-06-21/gluster_community_meeting.2017-06-21-15.13.log.html
.



Meeting summary
---
* roll call  (kkeithley, 15:14:12)

* AIs from last meeting  (kkeithley, 15:19:21)

* related projects  (kkeithley, 15:33:56)
  * ACTION: JoeJulian to invite Harsha to next community meeting to
discuss Minio  (kkeithley, 15:50:21)
  * https://review.openstack.org/#/q/status:open+project:openstack/swift3,n,z
    (kkeithley, 15:50:49)
  * there's definitely versioning work going on, a bunch of patches that
    need reviews...  (kkeithley, 15:50:57)
  * The infra for simplified reverts is done btw.  (kkeithley, 15:51:30)

* open floor  (kkeithley, 15:54:32)

Meeting ended at 16:07:14 UTC.




Action Items

* JoeJulian to invite Harsha to next community meeting to discuss Minio




Action Items, by person
---
* JoeJulian
  * JoeJulian to invite Harsha to next community meeting to discuss
Minio
* **UNASSIGNED**
  * (none)




People Present (lines said)
---
* kkeithley (54)
* ndevos (40)
* nigelb (35)
* JoeJulian (10)
* tdasilva (9)
* shyam (7)
* zodbot (3)
* jstrunk (1)




Generated by `MeetBot`_ 0.1.4

.. _`MeetBot`: http://wiki.debian.org/MeetBot



-- 

Kaleb
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-Maintainers] Release 3.11.1: Scheduled for 20th of June

2017-06-21 Thread Shyam

On 06/21/2017 11:37 AM, Pranith Kumar Karampuri wrote:



On Tue, Jun 20, 2017 at 7:37 PM, Shyam mailto:srang...@redhat.com>> wrote:

Hi,

Release tagging has been postponed by a day to accommodate a fix for
a regression that has been introduced between 3.11.0 and 3.11.1 (see
[1] for details).

As a result 3.11.1 will be tagged on the 21st June as of now
(further delays will be notified to the lists appropriately).


The required patches have landed upstream and are undergoing
review. Could we do the tagging tomorrow? We don't want to rush the
patches; we want to make sure we don't introduce any new bugs at this time.


Agreed. Considering the situation, we will be tagging the release 
tomorrow (June 22nd, 2017).






Thanks,
Shyam

[1] Bug awaiting fix:
https://bugzilla.redhat.com/show_bug.cgi?id=1463250


"Releases are made better together"

On 06/06/2017 09:24 AM, Shyam wrote:

Hi,

It's time to prepare the 3.11.1 release, which falls on the 20th of
each month [4], and hence would be June-20th-2017 this time around.

This mail is to call out the following,

1) Are there any pending *blocker* bugs that need to be tracked for
3.11.1? If so mark them against the provided tracker [1] as blockers
for the release, or at the very least post them as a response to this
mail

2) Pending reviews in the 3.11 dashboard will be part of the release,
*iff* they pass regressions and have the review votes, so use the
dashboard [2] to check on the status of your patches to 3.11 and get
these going

3) Empty release notes are posted here [3], if there are any specific
call outs for 3.11 beyond bugs, please update the review, or leave a
comment in the review, for us to pick it up

Thanks,
Shyam/Kaushal

[1] Release bug tracker:
https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.11.1


[2] 3.11 review dashboard:

https://review.gluster.org/#/projects/glusterfs,dashboards/dashboard:3-11-dashboard




[3] Release notes WIP: https://review.gluster.org/17480


[4] Release calendar:
https://www.gluster.org/community/release-schedule/

___
Gluster-devel mailing list
Gluster-devel@gluster.org 
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
maintainers mailing list
maintain...@gluster.org 
http://lists.gluster.org/mailman/listinfo/maintainers





--
Pranith

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-Maintainers] Release 3.11.1: Scheduled for 20th of June

2017-06-21 Thread Pranith Kumar Karampuri
On Tue, Jun 20, 2017 at 7:37 PM, Shyam  wrote:

> Hi,
>
> Release tagging has been postponed by a day to accommodate a fix for a
> regression that has been introduced between 3.11.0 and 3.11.1 (see [1] for
> details).
>
> As a result 3.11.1 will be tagged on the 21st June as of now (further
> delays will be notified to the lists appropriately).
>

The required patches have landed upstream and are undergoing review.
Could we do the tagging tomorrow? We don't want to rush the patches; we want
to make sure we don't introduce any new bugs at this time.


>
> Thanks,
> Shyam
>
> [1] Bug awaiting fix: https://bugzilla.redhat.com/show_bug.cgi?id=1463250
>
> "Releases are made better together"
>
> On 06/06/2017 09:24 AM, Shyam wrote:
>
>> Hi,
>>
>> It's time to prepare the 3.11.1 release, which falls on the 20th of
>> each month [4], and hence would be June-20th-2017 this time around.
>>
>> This mail is to call out the following,
>>
>> 1) Are there any pending *blocker* bugs that need to be tracked for
>> 3.11.1? If so mark them against the provided tracker [1] as blockers
>> for the release, or at the very least post them as a response to this
>> mail
>>
>> 2) Pending reviews in the 3.11 dashboard will be part of the release,
>> *iff* they pass regressions and have the review votes, so use the
>> dashboard [2] to check on the status of your patches to 3.11 and get
>> these going
>>
>> 3) Empty release notes are posted here [3], if there are any specific
>> call outs for 3.11 beyond bugs, please update the review, or leave a
>> comment in the review, for us to pick it up
>>
>> Thanks,
>> Shyam/Kaushal
>>
>> [1] Release bug tracker:
>> https://bugzilla.redhat.com/show_bug.cgi?id=glusterfs-3.11.1
>>
>> [2] 3.11 review dashboard:
>> https://review.gluster.org/#/projects/glusterfs,dashboards/d
>> ashboard:3-11-dashboard
>>
>>
>> [3] Release notes WIP: https://review.gluster.org/17480
>>
>> [4] Release calendar: https://www.gluster.org/community/release-schedule/
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-devel
>>
> ___
> maintainers mailing list
> maintain...@gluster.org
> http://lists.gluster.org/mailman/listinfo/maintainers
>



-- 
Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Regression Voting Changes

2017-06-21 Thread Nigel Babu
On Wed, Jun 21, 2017 at 06:05:32PM +0530, Atin Mukherjee wrote:
> On Tue, Jun 20, 2017 at 9:17 AM, Nigel Babu  wrote:
>
> > Hello folks,
> >
> > Amar has proposed[1] these changes in the past and I'd like to announce us
> > going
> > live with them as we've not received any strong feedback against it.
> >
> > ## Centos Regression
> > * On master, we only run tests/basic as pre-merge testing.
> >
>
> I should have read this and Amar's email thread more carefully, but
> unfortunately I missed the above point in both cases, so apologies. I think
> running only tests/basic on master is not *sufficient*, given that our goal
> is to have more test coverage coming from each patch. I expect most of the
> patches, if not all, will have tests added, and if we don't run the full
> regression suite, this effectively means we don't test the patch on master.
> Also, I'm not sure how responsive we are to the regression-test-burn-in
> failure reports, so if things go bad and we don't react immediately, it
> would be difficult to get the master branch back to a stable state. I'd
> suggest (and request) that we run the full regression test suite on CentOS.

These are valid concerns. I don't have an easy solution to any of them, so I've
reverted the CentOS regression changes entirely.

> > * On release branches, we will run the entire suite of tests.
> > * Our regular regression-test-burn-in and regression-test-with-multiplex
> > will
> >   continue to run the full suite of tests as they currently do.
> >
> > ## NetBSD Regression
> > * We will not run a netbsd7-regression as required pre-merge test anymore.
> >   However, you should be able to trigger it with "recheck netbsd".
> > * A green NetBSD will no longer be required for merging patches, however
> > if you
> >   have a -1 vote, it will remain a blocker. This is so that reviewers can
> >   request a full NetBSD run, especially on release branches.
> > * We will do a periodic NetBSD regression run on all currently maintained
> >   branches (3.8, 3.10, and 3.11 at the moment) and master.
> >
> > ## Additional Changes
> > * As full regression runs per patch is run on release branches only (other
> > than
> >   the nightly on master), any failures need proper attention and possible
> > RCA.
> >   A re-trigger in the hopes of getting a green is no longer acceptable for
> >   release branches.
> > * fstat will soon track the regression-test-burn-in and
> >   regression-test-with-multiplex.
> > * As soon as we have the new jobs up, we'll add them to fstat so we can
> > track
> >   failure patterns.
> >
> > The CentOS changes are already in production. The NetBSD changes will land
> > in
> > production today.
> >
> > [1]: http://lists.gluster.org/pipermail/gluster-devel/2017-May/052868.html

--
nigelb
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Coverity covscan for 2017-06-21-b2522297 (master branch)

2017-06-21 Thread staticanalysis
GlusterFS Coverity covscan results are available from
http://download.gluster.org/pub/gluster/glusterfs/static-analysis/master/glusterfs-coverity/2017-06-21-b2522297
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Regression Voting Changes

2017-06-21 Thread Atin Mukherjee
On Tue, Jun 20, 2017 at 9:17 AM, Nigel Babu  wrote:

> Hello folks,
>
> Amar has proposed[1] these changes in the past and I'd like to announce us
> going
> live with them as we've not received any strong feedback against it.
>
> ## Centos Regression
> * On master, we only run tests/basic as pre-merge testing.
>

I should have read this and Amar's email thread more carefully, but
unfortunately I missed the above point in both cases, so apologies. I think
running only tests/basic on master is not *sufficient*, given that our goal
is to have more test coverage coming from each patch. I expect most of the
patches, if not all, will have tests added, and if we don't run the full
regression suite, this effectively means we don't test the patch on master.
Also, I'm not sure how responsive we are to the regression-test-burn-in
failure reports, so if things go bad and we don't react immediately, it
would be difficult to get the master branch back to a stable state. I'd
suggest (and request) that we run the full regression test suite on CentOS.



> * On release branches, we will run the entire suite of tests.
> * Our regular regression-test-burn-in and regression-test-with-multiplex
> will
>   continue to run the full suite of tests as they currently do.
>
> ## NetBSD Regression
> * We will not run a netbsd7-regression as required pre-merge test anymore.
>   However, you should be able to trigger it with "recheck netbsd".
> * A green NetBSD will no longer be required for merging patches, however
> if you
>   have a -1 vote, it will remain a blocker. This is so that reviewers can
>   request a full NetBSD run, especially on release branches.
> * We will do a periodic NetBSD regression run on all currently maintained
>   branches (3.8, 3.10, and 3.11 at the moment) and master.
>
> ## Additional Changes
> * As full regression runs per patch is run on release branches only (other
> than
>   the nightly on master), any failures need proper attention and possible
> RCA.
>   A re-trigger in the hopes of getting a green is no longer acceptable for
>   release branches.
> * fstat will soon track the regression-test-burn-in and
>   regression-test-with-multiplex.
> * As soon as we have the new jobs up, we'll add them to fstat so we can
> track
>   failure patterns.
>
> The CentOS changes are already in production. The NetBSD changes will land
> in
> production today.
>
> [1]: http://lists.gluster.org/pipermail/gluster-devel/2017-May/052868.html
>
> --
> nigelb
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] need reviews

2017-06-21 Thread Atin Mukherjee
On Wed, Jun 21, 2017 at 4:18 PM, Amar Tumballi  wrote:

>
>
> On Mon, May 29, 2017 at 1:11 PM, Hari Gowtham  wrote:
>
>> Hi,
>>
>> I would like to get reviews on the following patches.
>>
>> https://review.gluster.org/#/c/15740/5
>> https://review.gluster.org/#/c/15503/
>> https://review.gluster.org/#/c/17137/
>> https://review.gluster.org/#/c/17328/
>>
>>
> I see that one of the patches above is merged, and the other 3 are more of
> glusterd/cli changes. I propose that we take these in for the 'master' branch
> for now (as in, if there are no -1s from reviewers), as each of these is an
> improvement over the previously available options.
>

Although these are cli/glusterd-specific changes, they are related to the
tiering feature, and hence I was waiting for one of the tiering devs' score
before the GD maintainer(s) take a final look. That has been the general
agreement so far.


>
> As we have around 25 days or so before the release 3.4 cut-off, we can handle
> any issues coming up because of it.
>
> Others, any thoughts? If there are no major concerns, can we proceed on these
> quickly?
>
> Regards,
> Amar
>
>
>> Regards,
>> Hari Gowtham.
>>
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://lists.gluster.org/mailman/listinfo/gluster-devel
>>
>
>
>
> --
> Amar Tumballi (amarts)
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] need reviews

2017-06-21 Thread Amar Tumballi
On Mon, May 29, 2017 at 1:11 PM, Hari Gowtham  wrote:

> Hi,
>
> I would like to get reviews on the following patches.
>
> https://review.gluster.org/#/c/15740/5
> https://review.gluster.org/#/c/15503/
> https://review.gluster.org/#/c/17137/
> https://review.gluster.org/#/c/17328/
>
>
I see that one of the patches above is merged, and the other 3 are more of
glusterd/cli changes. I propose that we take these in for the 'master' branch
for now (as in, if there are no -1s from reviewers), as each of these is an
improvement over the previously available options.

As we have around 25 days or so before the release 3.4 cut-off, we can handle
any issues coming up because of it.

Others, any thoughts? If there are no major concerns, can we proceed on these
quickly?

Regards,
Amar


> Regards,
> Hari Gowtham.
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Amar Tumballi (amarts)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] geo-rep regression because of node-uuid change

2017-06-21 Thread Karthik Subrahmanya
On Wed, Jun 21, 2017 at 1:56 PM, Xavier Hernandez 
wrote:

> That's ok. I'm currently unable to write a patch for this on ec.

Sunil is working on this patch.

~Karthik

> If no one can do it, I can try to do it in 6 - 7 hours...
>
> Xavi
>
>
> On Wednesday, June 21, 2017 09:48 CEST, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>
>
>
> On Wed, Jun 21, 2017 at 1:00 PM, Xavier Hernandez 
> wrote:
>>
>> I'm ok with reverting node-uuid content to the previous format and create
>> a new xattr for the new format. Currently, only rebalance will use it.
>>
>> Only thing to consider is what can happen if we have a half upgraded
>> cluster where some clients have this change and some not. Can rebalance
>> work in this situation ? if so, could there be any issue ?
>
>
> I think there shouldn't be any problem, because this is in-memory xattr so
> layers below afr/ec will only see node-uuid xattr.
> This also gives us a chance to do whatever we want to do in future with
> this xattr without any problems about backward compatibility.
>
> You can check https://review.gluster.org/#/c/17576/3/xlators/cluster/afr/
> src/afr-inode-read.c@1507 for how karthik implemented this in AFR (this
> got merged accidentally yesterday, but looks like this is what we are
> settling on)
>
>
>>
>> Xavi
>>
>>
>> On Wednesday, June 21, 2017 06:56 CEST, Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>>
>>
>>
>> On Wed, Jun 21, 2017 at 10:07 AM, Nithya Balachandran <
>> nbala...@redhat.com> wrote:
>>>
>>>
>>> On 20 June 2017 at 20:38, Aravinda  wrote:

 On 06/20/2017 06:02 PM, Pranith Kumar Karampuri wrote:

 Xavi, Aravinda and I had a discussion on #gluster-dev and we agreed to
 go with the format Aravinda suggested for now and in future we wanted some
 more changes for dht to detect which subvolume went down came back up, at
 that time we will revisit the solution suggested by Xavi.

 Susanth is doing the dht changes
 Aravinda is doing geo-rep changes

 Done. Geo-rep patch sent for review https://review.gluster.org/17582


>>>
>>> The proposed changes to the node-uuid behaviour (while good) are going
>>> to break tiering . Tiering changes will take a little more time to be coded
>>> and tested.
>>>
>>> As this is a regression for 3.11 and a blocker for 3.11.1, I suggest we
>>> go back to the original node-uuid behaviour for now so as to unblock the
>>> release and target the proposed changes for the next 3.11 releases.
>>>
>>
>> Let me see if I understand the changes correctly. We are restoring the
>> behavior of node-uuid xattr and adding a new xattr for parallel rebalance
>> for both afr and ec, correct? Otherwise that is one more regression. If
>> yes, we will also wait for Xavi's inputs. Jeff accidentally merged the afr
>> patch yesterday which does these changes. If everyone is in agreement, we
>> will leave it as is and add similar changes in ec as well. If we are not in
>> agreement, then we will let the discussion progress :-)
>>
>>
>>>
>>>
>>> Regards,
>>> Nithya
>>>
 --
 Aravinda



 Thanks to all of you guys for the discussions!

 On Tue, Jun 20, 2017 at 5:05 PM, Xavier Hernandez <
 xhernan...@datalab.es> wrote:
>
> Hi Aravinda,
>
> On 20/06/17 12:42, Aravinda wrote:
>>
>> I think the following format can be easily adopted by all components
>>
>> UUIDs of a subvolume are separated by space and subvolumes are
>> separated by comma
>>
>> For example, node1 and node2 are replica with U1 and U2 UUIDs
>> respectively and
>> node3 and node4 are replica with U3 and U4 UUIDs respectively
>>
>> node-uuid can return "U1 U2,U3 U4"
>
>
> While this is ok for current implementation, I think this can be
> insufficient if there are more layers of xlators that require to indicate
> some sort of grouping. Some representation that can represent hierarchy
> would be better. For example: "(U1 U2) (U3 U4)" (we can use spaces or 
> comma
> as a separator).
>
>>
>>
>> Geo-rep can split by "," and then split by space and take first UUID
>> DHT can split the value by space or comma and get unique UUIDs list
>
>
> This doesn't solve the problem I described in the previous email. Some
> more logic will need to be added to avoid more than one node from each
> replica-set to be active. If we have some explicit hierarchy information 
> in
> the node-uuid value, more decisions can be taken.
>
> An initial proposal I made was this:
>
> DHT[2](AFR[2,0](NODE(U1), NODE(U2)), AFR[2,0](NODE(U1), NODE(U2)))
>
> This is harder to parse, but gives a lot of information: DHT with 2
> subvolumes, each subvolume is an AFR with replica 2 and no arbiters. It's
> also easily extensible with any new xlator that changes the layout.
>
> However maybe this is not the moment to do this,
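
For readers following the format discussion quoted above: here is a minimal,
stand-alone C sketch of how the proposed "U1 U2,U3 U4" node-uuid value could be
parsed (comma between replica sets, space between the UUIDs of a set), along
the lines Aravinda describes. This is illustrative only and is not the actual
geo-rep or DHT code:

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Example value from the thread: two replica sets, two bricks each. */
    char value[] = "U1 U2,U3 U4";
    char *set_save = NULL;

    for (char *set = strtok_r(value, ",", &set_save); set != NULL;
         set = strtok_r(NULL, ",", &set_save)) {
        char *uuid_save = NULL;
        char *uuid = strtok_r(set, " ", &uuid_save);

        /* Geo-rep style check: only the first UUID of each replica set
         * would mark a worker as active. */
        printf("first UUID of replica set: %s\n", uuid);

        /* DHT style walk: collect every UUID of the set. */
        while ((uuid = strtok_r(NULL, " ", &uuid_save)) != NULL)
            printf("  additional UUID: %s\n", uuid);
    }
    return 0;
}

Xavi's hierarchical alternative, DHT[2](AFR[2,0](NODE(U1), NODE(U2)),
AFR[2,0](NODE(U3), NODE(U4))), would need a small recursive parser instead; as
he notes above, it is harder to parse and could live in a separately named
xattr later.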

Re: [Gluster-devel] geo-rep regression because of node-uuid change

2017-06-21 Thread Xavier Hernandez

That's ok. I'm currently unable to write a patch for this on ec. If no one can 
do it, I can try to do it in 6 - 7 hours...

Xavi


Re: [Gluster-devel] geo-rep regression because of node-uuid change

2017-06-21 Thread Pranith Kumar Karampuri
On Wed, Jun 21, 2017 at 1:00 PM, Xavier Hernandez 
wrote:

> I'm ok with reverting node-uuid content to the previous format and create
> a new xattr for the new format. Currently, only rebalance will use it.
>
> Only thing to consider is what can happen if we have a half upgraded
> cluster where some clients have this change and some not. Can rebalance
> work in this situation ? if so, could there be any issue ?
>

I think there shouldn't be any problem, because this is in-memory xattr so
layers below afr/ec will only see node-uuid xattr.
This also gives us a chance to do whatever we want to do in future with
this xattr without any problems about backward compatibility.

You can check
https://review.gluster.org/#/c/17576/3/xlators/cluster/afr/src/afr-inode-read.c@1507
for how karthik implemented this in AFR (this got merged accidentally
yesterday, but looks like this is what we are settling on)
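
To make the "in-memory xattr" point above concrete, here is a rough,
hypothetical sketch (not the actual patch at review 17576) of how a cluster
xlator can serve such a virtual xattr by filling the getxattr reply dict, so
nothing new is stored on disk and lower layers keep seeing only the plain
node-uuid value. The key name and helper below are illustrative assumptions:

/* Hypothetical fragment; relies on the usual libglusterfs dict helpers
 * (dict_set_dynstr, gf_strdup, GF_FREE) and assumes the in-tree include
 * layout. */
#include "dict.h"
#include "mem-pool.h"

static int
fill_virtual_node_uuid(dict_t *xattr, const char *aggregated_uuids)
{
    /* aggregated_uuids would be something like "U1 U2,U3 U4", built by
     * afr/ec from its children's replies. */
    char *value = gf_strdup(aggregated_uuids);
    if (!value)
        return -1;

    /* Illustrative key; the real key and format are whatever the AFR and
     * EC patches finally settle on. On success the dict owns the string. */
    if (dict_set_dynstr(xattr, "trusted.glusterfs.node-uuid", value) < 0) {
        GF_FREE(value);
        return -1;
    }
    return 0;
}

The point of serving the value this way is exactly what is described above:
it never hits disk, so the on-disk format stays unchanged and the in-memory
format can evolve later without backward-compatibility concerns.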


>
> Xavi
>
>
> On Wednesday, June 21, 2017 06:56 CEST, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>
>
>
> On Wed, Jun 21, 2017 at 10:07 AM, Nithya Balachandran  > wrote:
>>
>>
>> On 20 June 2017 at 20:38, Aravinda  wrote:
>>>
>>> On 06/20/2017 06:02 PM, Pranith Kumar Karampuri wrote:
>>>
>>> Xavi, Aravinda and I had a discussion on #gluster-dev and we agreed to
>>> go with the format Aravinda suggested for now and in future we wanted some
>>> more changes for dht to detect which subvolume went down came back up, at
>>> that time we will revisit the solution suggested by Xavi.
>>>
>>> Susanth is doing the dht changes
>>> Aravinda is doing geo-rep changes
>>>
>>> Done. Geo-rep patch sent for review https://review.gluster.org/17582
>>>
>>>
>>
>> The proposed changes to the node-uuid behaviour (while good) are going to
>> break tiering . Tiering changes will take a little more time to be coded
>> and tested.
>>
>> As this is a regression for 3.11 and a blocker for 3.11.1, I suggest we
>> go back to the original node-uuid behaviour for now so as to unblock the
>> release and target the proposed changes for the next 3.11 releases.
>>
>
> Let me see if I understand the changes correctly. We are restoring the
> behavior of node-uuid xattr and adding a new xattr for parallel rebalance
> for both afr and ec, correct? Otherwise that is one more regression. If
> yes, we will also wait for Xavi's inputs. Jeff accidentally merged the afr
> patch yesterday which does these changes. If everyone is in agreement, we
> will leave it as is and add similar changes in ec as well. If we are not in
> agreement, then we will let the discussion progress :-)
>
>
>>
>>
>> Regards,
>> Nithya
>>
>>> --
>>> Aravinda
>>>
>>>
>>>
>>> Thanks to all of you guys for the discussions!
>>>
>>> On Tue, Jun 20, 2017 at 5:05 PM, Xavier Hernandez >> > wrote:

 Hi Aravinda,

 On 20/06/17 12:42, Aravinda wrote:
>
> I think the following format can be easily adopted by all components
>
> UUIDs of a subvolume are separated by space and subvolumes are
> separated by comma
>
> For example, node1 and node2 are replica with U1 and U2 UUIDs
> respectively and
> node3 and node4 are replica with U3 and U4 UUIDs respectively
>
> node-uuid can return "U1 U2,U3 U4"


 While this is ok for current implementation, I think this can be
 insufficient if there are more layers of xlators that require to indicate
 some sort of grouping. Some representation that can represent hierarchy
 would be better. For example: "(U1 U2) (U3 U4)" (we can use spaces or comma
 as a separator).

>
>
> Geo-rep can split by "," and then split by space and take first UUID
> DHT can split the value by space or comma and get unique UUIDs list


 This doesn't solve the problem I described in the previous email. Some
 more logic will need to be added to avoid more than one node from each
 replica-set to be active. If we have some explicit hierarchy information in
 the node-uuid value, more decisions can be taken.

 An initial proposal I made was this:

 DHT[2](AFR[2,0](NODE(U1), NODE(U2)), AFR[2,0](NODE(U1), NODE(U2)))

 This is harder to parse, but gives a lot of information: DHT with 2
 subvolumes, each subvolume is an AFR with replica 2 and no arbiters. It's
 also easily extensible with any new xlator that changes the layout.

 However maybe this is not the moment to do this, and probably we could
 implement this in a new xattr with a better name.

 Xavi

>
>
> Another question is about the behavior when a node is down, existing
> node-uuid xattr will not return that UUID if a node is down. What is
> the
> behavior with the proposed xattr?
>
> Let me know your thoughts.
>
> regards
> Aravinda VK
>
> On 06/20/2017 03:06 PM, Aravinda wrote:
>>
>> Hi Xavi,
>>
>> On 06/20/2017 02:51 PM, Xavier Hernandez wrote:
>>>
>>> Hi Aravi

Re: [Gluster-devel] geo-rep regression because of node-uuid change

2017-06-21 Thread Xavier Hernandez

I'm ok with reverting node-uuid content to the previous format and create a new 
xattr for the new format. Currently, only rebalance will use it.

Only thing to consider is what can happen if we have a half upgraded cluster 
where some clients have this change and some not. Can rebalance work in this 
situation ? if so, could there be any issue ?

Xavi

On Wednesday, June 21, 2017 06:56 CEST, Pranith Kumar Karampuri wrote:

> On Wed, Jun 21, 2017 at 10:07 AM, Nithya Balachandran wrote:
>> On 20 June 2017 at 20:38, Aravinda wrote:
>>> On 06/20/2017 06:02 PM, Pranith Kumar Karampuri wrote:
>>>> Xavi, Aravinda and I had a discussion on #gluster-dev and we agreed to
>>>> go with the format Aravinda suggested for now and in future we wanted
>>>> some more changes for dht to detect which subvolume went down came back
>>>> up, at that time we will revisit the solution suggested by Xavi.
>>>>
>>>> Susanth is doing the dht changes
>>>> Aravinda is doing geo-rep changes
>>>
>>> Done. Geo-rep patch sent for review https://review.gluster.org/17582
>>
>> The proposed changes to the node-uuid behaviour (while good) are going to
>> break tiering. Tiering changes will take a little more time to be coded
>> and tested.
>>
>> As this is a regression for 3.11 and a blocker for 3.11.1, I suggest we
>> go back to the original node-uuid behaviour for now so as to unblock the
>> release and target the proposed changes for the next 3.11 releases.
>
> Let me see if I understand the changes correctly. We are restoring the
> behavior of node-uuid xattr and adding a new xattr for parallel rebalance
> for both afr and ec, correct? Otherwise that is one more regression. If
> yes, we will also wait for Xavi's inputs. Jeff accidentally merged the afr
> patch yesterday which does these changes. If everyone is in agreement, we
> will leave it as is and add similar changes in ec as well. If we are not
> in agreement, then we will let the discussion progress :-)
>
>> Regards,
>> Nithya
>>
>>> --
>>> Aravinda
>>>
>>>> Thanks to all of you guys for the discussions!
>>>>
>>>> On Tue, Jun 20, 2017 at 5:05 PM, Xavier Hernandez wrote:
>>>>> Hi Aravinda,
>>>>>
>>>>> On 20/06/17 12:42, Aravinda wrote:
>>>>>> I think following format can be easily adopted by all components
>>>>>>
>>>>>> UUIDs of a subvolume are seperated by space and subvolumes are
>>>>>> separated by comma
>>>>>>
>>>>>> For example, node1 and node2 are replica with U1 and U2 UUIDs
>>>>>> respectively and
>>>>>> node3 and node4 are replica with U3 and U4 UUIDs respectively
>>>>>>
>>>>>> node-uuid can return "U1 U2,U3 U4"
>>>>>
>>>>> While this is ok for current implementation, I think this can be
>>>>> insufficient if there are more layers of xlators that require to
>>>>> indicate some sort of grouping. Some representation that can represent
>>>>> hierarchy would be better. For example: "(U1 U2) (U3 U4)" (we can use
>>>>> spaces or comma as a separator).
>>>>>
>>>>>> Geo-rep can split by "," and then split by space and take first UUID
>>>>>> DHT can split the value by space or comma and get unique UUIDs list
>>>>>
>>>>> This doesn't solve the problem I described in the previous email. Some
>>>>> more logic will need to be added to avoid more than one node from each
>>>>> replica-set to be active. If we have some explicit hierarchy
>>>>> information in the node-uuid value, more decisions can be taken.
>>>>>
>>>>> An initial proposal I made was this:
>>>>>
>>>>> DHT[2](AFR[2,0](NODE(U1), NODE(U2)), AFR[2,0](NODE(U1), NODE(U2)))
>>>>>
>>>>> This is harder to parse, but gives a lot of information: DHT with 2
>>>>> subvolumes, each subvolume is an AFR with replica 2 and no arbiters.
>>>>> It's also easily extensible with any new xlator that changes the
>>>>> layout.
>>>>>
>>>>> However maybe this is not the moment to do this, and probably we could
>>>>> implement this in a new xattr with a better name.
>>>>>
>>>>> Xavi
>>>>>
>>>>>> Another question is about the behavior when a node is down, existing
>>>>>> node-uuid xattr will not return that UUID if a node is down. What is
>>>>>> the behavior with the proposed xattr?
>>>>>>
>>>>>> Let me know your thoughts.
>>>>>>
>>>>>> regards
>>>>>> Aravinda VK
>>>>>>
>>>>>> On 06/20/2017 03:06 PM, Aravinda wrote:
>>>>>>> Hi Xavi,
>>>>>>>
>>>>>>> On 06/20/2017 02:51 PM, Xavier Hernandez wrote:
>>>>>>>> Hi Aravinda,
>>>>>>>>
>>>>>>>> On 20/06/17 11:05, Pranith Kumar Karampuri wrote:
>>>>>>>>> Adding more people to get a consensus about this.
>>>>>>>>>
>>>>>>>>> On Tue, Jun 20, 2017 at 1:49 PM, Aravinda <avish...@redhat.com> wrote:
>>>>>>>>>> regards
>>>>>>>>>> Aravinda VK
>>>>>>>>>>
>>>>>>>>>> On 06/20/2017 01:26 PM, Xavier Hernandez wrote:
>>>>>>>>>>> Hi Pranith,
>>>>>>>>>>>
>>>>>>>>>>> adding gluster-devel, Kotresh and Aravinda,
>>>>>>>>>>>
>>>>>>>>>>> On 20/06/17 09:45, Pranith Kumar Karampuri wrote:
>>>>>>>>>>>> On Tue, Jun 20, 2017 at 1:12 PM, Xavier Hernandez
>>>>>>>>>>>> <xhernan...@datalab.es> wrote:
>>>>>>>>>>>>> On 20/06/17 09:31, Pranith Kumar Karampuri wrote:
>>>>>>>>>>>>>> The way geo-replication works is:
>>>>>>>>>>>>>> On each machine, it does getxattr of node-uuid and check if its
>>>>>>>>>>>>>> own uuid is present in the list. If it is present then it will
>>>>>>>>>>>>>> consider it active otherwise it will be considered passive.
>>>>>>>>>>>>>> With this change we are giving