Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-15 Thread Nithya Balachandran
On 15 September 2016 at 17:21, Raghavendra Gowdappa 
wrote:

>
>
> - Original Message -
> > From: "Xavier Hernandez" 
> > To: "Raghavendra G" , "Nithya Balachandran" <
> nbala...@redhat.com>
> > Cc: "Gluster Devel" , "Mohit Agrawal" <
> moagr...@redhat.com>
> > Sent: Thursday, September 15, 2016 4:54:25 PM
> > Subject: Re: [Gluster-devel] Query regards to heal xattr heal in dht
> >
> >
> >
> > On 15/09/16 11:31, Raghavendra G wrote:
> > >
> > >
> > > On Thu, Sep 15, 2016 at 12:02 PM, Nithya Balachandran
> > > > wrote:
> > >
> > >
> > >
> > > On 8 September 2016 at 12:02, Mohit Agrawal  > > > wrote:
> > >
> > > Hi All,
> > >
> > >I have one another solution to heal user xattr but before
> > > implement it i would like to discuss with you.
> > >
> > >Can i call function (dht_dir_xattr_heal internally it is
> > > calling syncop_setxattr) to heal xattr in dht_getxattr_cbk in
> last
> > >after make sure we have a valid xattr.
> > >In function(dht_dir_xattr_heal) it will copy blindly all
> user
> > > xattr on all subvolume or i can compare subvol xattr with valid
> > > xattr if there is any mismatch then i will call syncop_setxattr
> > > otherwise no need to call. syncop_setxattr.
> > >
> > >
> > >
> > > This can be problematic if a particular xattr is being removed - it
> > > might still exist on some subvols. IIUC, the heal would go and
> reset
> > > it again?
> > >
> > > One option is to use the hash subvol for the dir as the source - so
> > > perform xattr op on hashed subvol first and on the others only if
> it
> > > succeeds on the hashed. This does have the problem of being unable
> > > to set xattrs if the hashed subvol is unavailable. This might not
> be
> > > such a big deal in case of distributed replicate or distribute
> > > disperse volumes but will affect pure distribute. However, this way
> > > we can at least be reasonably certain of the correctness (leaving
> > > rebalance out of the picture).
> > >
> > >
> > > * What is the behavior of getxattr when hashed subvol is down? Should
> we
> > > succeed with values from non-hashed subvols or should we fail getxattr?
> > > With hashed-subvol as source of truth, its difficult to determine
> > > correctness of xattrs and their values when it is down.
> > >
> > > * setxattr is an inode operation (as opposed to entry operation). So,
> we
> > > cannot calculate hashed-subvol as in (get)(set)xattr, parent layout and
> > > "basename" is not available. This forces us to store hashed subvol in
> > > inode-ctx. Now, when the hashed-subvol changes we need to update these
> > > inode-ctxs too.
> > >
> > > What do you think about a Quorum based solution to this problem?
> > >
> > > 1. setxattr succeeds only if it is successful on at least (n/2 + 1)
> > > number of subvols.
> > > 2. getxattr succeeds only if it is successful and values match on at
> > > least (n/2 + 1) number of subvols.
> > >
> > > The flip-side of this solution is we are increasing the probability of
> > > failure of (get)(set)xattr operations as opposed to the hashed-subvol
> as
> > > source of truth solution. Or are we - how do we compare probability of
> > > hashed-subvol going down with probability of (n/2 + 1) nodes going down
> > > simultaneously? Is it 1/n vs (1/n*1/n*... (n/2+1 times)?. Is 1/n
> correct
> > > probability for _a specific subvol (hashed-subvol)_ going down (as
> > > opposed to _any one subvol_ going down)?
> >
> > If we suppose p to be the probability of failure of a subvolume in a
> > period of time (a year for example), all subvolumes have the same
> > probability, and we have N subvolumes, then:
> >
> > Probability of failure of hashed-subvol: p
> > Probability of failure of N/2 + 1 or more subvols: sum for k = N/2+1 to N of C(N,k) * p^k * (1-p)^(N-k)
>
> Thanks Xavi. That was quick :).
>
> >
> > Note that this probability says how much probable is that N/2 + 1
> > subvols or more fail in the specified period of time, but not
> > necessarily simultaneously. If we suppose that subvolumes are recovered
> > as fast as possible, the real probability of simultaneous failure will
> > be much smaller.
> >
> > In worst case (not recovering the failed subvolumes in the given period
> > of time), if p < 0.5 or N = 2 (and p != 1), then it's always better to
> > check N/2 + 1 subvolumes. Otherwise, it's better to check the
> hashed-subvol.
> >
> > I think that p should always be much smaller than 0.5 for small periods
> > of time where subvolume recovery could no be completed before other
> > failures, so checking half plus one subvols should always be the best
> > option in terms of probability. Performance can suffer though if some
> > kind of synchronization is needed.
>
> For this 

Re: [Gluster-devel] Gluster Developer Summit 2016 Talk Schedule

2016-09-15 Thread Soumya Koduri



On 09/16/2016 03:48 AM, Amye Scavarda wrote:


On Thu, Sep 15, 2016 at 8:26 AM, Pranith Kumar Karampuri
> wrote:



On Thu, Sep 15, 2016 at 2:37 PM, Soumya Koduri > wrote:

Hi Amye,

Is there any plan to record these talks?


I had the same question.


There is no planned recording for this; however, what we've done before
is ask people to record one of their practice runs through BlueJeans or
Hangouts.

We'll post those recordings through the Gluster Community channels.


Great. Thanks

-Soumya


- amye


Thanks,
Soumya

On 09/15/2016 03:09 AM, Amye Scavarda wrote:

Thanks to all that submitted talks, and thanks to the
program committee
who helped select this year's content.

This will be posted on the main Summit page as
well: gluster.org/events/summit2016


October 6
9:00am - 9:25am  Opening Session
9:30am - 9:55am  DHT: current design, (dis)advantages, challenges - A perspective - Raghavendra Gowdappa
10:00am - 10:25am  DHT2 - O Brother, Where Art Thou? - Shyam Ranganathan
10:30am - 10:55am  Performance bottlenecks for metadata workload in Gluster - Poornima Gurusiddaiah, Rajesh Joseph
11:00am - 11:25am  The life of a consultant listed on gluster.org - Ivan Rossi
11:30am - 11:55am  Architecture of the High Availability Solution for Ganesha and Samba - Kaleb Keithley
12:00pm - 1:00pm  Lunch
1:00pm - 1:25pm  Challenges with Gluster and Persistent Memory - Dan Lambright
1:25pm - 1:55pm  Throttling in gluster - Ravishankar Narayanankutty
2:00pm - 2:25pm  Gluster: The Ugly Parts - Jeff Darcy
2:30pm - 2:55pm  Deterministic Releases and How to Get There - Nigel Babu
3:00pm - 3:25pm  Break
3:30pm - 4:00pm  Birds of a Feather Sessions
4:00pm - 4:55pm  Birds of a Feather Sessions
Evening Reception to be announced


October 7
9:00am - 9:25am  GFProxy: Scaling the GlusterFS FUSE Client - Shreyas Siravara
9:30am - 9:55am  Sharding in GlusterFS - Past, Present and Future - Krutika Dhananjay
10:00am - 10:25am  Object Storage with Gluster - Prashanth Pai
10:30am - 10:55am  Containers and Persistent Storage for Containers - Humble Chirammal, Luis Pabon
11:00am - 11:25am  Gluster as Block Store in Containers - Prasanna Kalever
11:30am - 11:55am  An Update on GlusterD-2.0 - Kaushal Madappa
12:00pm - 1:00pm  Lunch
1:00pm - 1:25pm  Integration of GlusterFS into Commvault data platform - Ankireddypalle Reddy
1:30pm - 1:55pm  Bootstrapping Challenge
2:00pm - 2:25pm  Practical Glusto Example - Jonathan Holloway
2:30pm - 2:55pm  State of Gluster Performance - Manoj Pillai
3:00pm - 3:25pm  Server side replication - Avra Sengupta
3:30pm - 4:00pm  Birds of a Feather Sessions
4:00pm - 4:55pm  Birds of a Feather Sessions
5:00pm - 5:30pm  Closing

--
Amye Scavarda | a...@redhat.com | Gluster
Community Lead


___
Gluster-devel mailing list
Gluster-devel@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org 
http://www.gluster.org/mailman/listinfo/gluster-devel





--
Pranith




--
Amye Scavarda | a...@redhat.com  | Gluster
Community Lead

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Gluster Developer Summit 2016 Talk Schedule

2016-09-15 Thread Amye Scavarda
On Thu, Sep 15, 2016 at 8:26 AM, Pranith Kumar Karampuri <
pkara...@redhat.com> wrote:

>
>
> On Thu, Sep 15, 2016 at 2:37 PM, Soumya Koduri  wrote:
>
>> Hi Amye,
>>
>> Is there any plan to record these talks?
>>
>
> I had same question.
>

There is no planned recording for this; however, what we've done before is
ask people to record one of their practice runs through BlueJeans or
Hangouts.

We'll post those recordings through the Gluster Community channels.
- amye


>> Thanks,
>> Soumya
>>
>> On 09/15/2016 03:09 AM, Amye Scavarda wrote:
>>
>>> Thanks to all that submitted talks, and thanks to the program committee
>>> who helped select this year's content.
>>>
>>> This will be posted on the main Summit page as
>>> well: gluster.org/events/summit2016
>>>
>>> October 6
>>> 9:00am - 9:25amOpening Session
>>> 9:30 - 9:55amDHT: current design, (dis)advantages, challenges - A
>>> perspective- Raghavendra Gowdappa
>>> 10:00am - 10:25am  DHT2 - O Brother, Where Art Thou? - Shyam Ranganathan
>>> 10:30am - 10:55am Performance bottlenecks for metadata workload in
>>> Gluster - Poornima Gurusiddaiah ,  Rajesh Joseph
>>> 11:00am - 11:25am The life of a consultant listed on gluster.org
>>>  - Ivan Rossi
>>> 11:30am - 11:55am Architecture of the High Availability Solution for
>>> Ganesha and Samba - Kaleb Keithley
>>> 12:00 - 1:00pmLunch
>>> 1:00pm - 1:25pmChallenges with Gluster and Persistent Memory - Dan
>>> Lambright
>>> 1:25pm - 1:55pmThrottling in gluster  - Ravishankar Narayanankutty
>>> 2:00pm  - 2:25pmGluster: The Ugly Parts - Jeff Darcy
>>> 2:30pm  - 2:55pmDeterministic Releases and How to Get There - Nigel Babu
>>> 3:00pm - 3:25pmBreak
>>> 3:30pm - 4:00pmBirds of a Feather Sessions
>>> 4:00pm - 4:55pmBirds of a Feather Sessions
>>> Evening Reception to be announced
>>>
>>>
>>> October 7
>>> 9:00am - 9:25amGFProxy: Scaling the GlusterFS FUSE Client - Shreyas
>>> Siravara
>>> 9:30 - 9:55amSharding in GlusterFS - Past, Present and Future - Krutika
>>> Dhananjay
>>> 10:00am - 10:25amObject Storage with Gluster - Prashanth Pai
>>> 10:30am - 10:55am Containers and Perisstent Storage for Containers. -
>>> Humble Chirammal, Luis Pabon
>>> 11:00am - 11:25am Gluster as Block Store in Containers  - Prasanna
>>> Kalever
>>> 11:30am - 11:55amAn Update on GlusterD-2.0 - Kaushal Madappa
>>> 12:00 - 1:00pmLunch
>>> 1:00pm - 1:25pmIntegration of GlusterFS in to Commvault data platform  -
>>> Ankireddypalle Reddy
>>> 1:30-1:55pmBootstrapping Challenge
>>> 2:00pm  - 2:25pmPractical Glusto Example - Jonathan Holloway
>>> 2:30pm  - 2:55pmState of Gluster Performance - Manoj Pillai
>>> 3:00pm - 3:25pmServer side replication - Avra Sengupta
>>> 3:30pm - 4:00pmBirds of a Feather Sessions
>>> 4:00pm - 4:55pmBirds of a Feather Sessions
>>> 5:00pm - 5:30pm Closing
>>>
>>> --
>>> Amye Scavarda | a...@redhat.com  | Gluster
>>> Community Lead
>>>
>>>
>>> ___
>>> Gluster-devel mailing list
>>> Gluster-devel@gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>>
>>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>
>
>
>
> --
> Pranith
>



-- 
Amye Scavarda | a...@redhat.com | Gluster Community Lead
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Gluster Developer Summit 2016 Talk Schedule

2016-09-15 Thread Pranith Kumar Karampuri
On Thu, Sep 15, 2016 at 2:37 PM, Soumya Koduri  wrote:

> Hi Amye,
>
> Is there any plan to record these talks?
>

I had the same question.


>
> Thanks,
> Soumya
>
> On 09/15/2016 03:09 AM, Amye Scavarda wrote:
>
>> Thanks to all that submitted talks, and thanks to the program committee
>> who helped select this year's content.
>>
>> This will be posted on the main Summit page as
>> well: gluster.org/events/summit2016
>>
>> October 6
>> 9:00am - 9:25amOpening Session
>> 9:30 - 9:55amDHT: current design, (dis)advantages, challenges - A
>> perspective- Raghavendra Gowdappa
>> 10:00am - 10:25am  DHT2 - O Brother, Where Art Thou? - Shyam Ranganathan
>> 10:30am - 10:55am Performance bottlenecks for metadata workload in
>> Gluster - Poornima Gurusiddaiah ,  Rajesh Joseph
>> 11:00am - 11:25am The life of a consultant listed on gluster.org
>>  - Ivan Rossi
>> 11:30am - 11:55am Architecture of the High Availability Solution for
>> Ganesha and Samba - Kaleb Keithley
>> 12:00 - 1:00pmLunch
>> 1:00pm - 1:25pmChallenges with Gluster and Persistent Memory - Dan
>> Lambright
>> 1:25pm - 1:55pmThrottling in gluster  - Ravishankar Narayanankutty
>> 2:00pm  - 2:25pmGluster: The Ugly Parts - Jeff Darcy
>> 2:30pm  - 2:55pmDeterministic Releases and How to Get There - Nigel Babu
>> 3:00pm - 3:25pmBreak
>> 3:30pm - 4:00pmBirds of a Feather Sessions
>> 4:00pm - 4:55pmBirds of a Feather Sessions
>> Evening Reception to be announced
>>
>>
>> October 7
>> 9:00am - 9:25amGFProxy: Scaling the GlusterFS FUSE Client - Shreyas
>> Siravara
>> 9:30 - 9:55amSharding in GlusterFS - Past, Present and Future - Krutika
>> Dhananjay
>> 10:00am - 10:25amObject Storage with Gluster - Prashanth Pai
>> 10:30am - 10:55am Containers and Perisstent Storage for Containers. -
>> Humble Chirammal, Luis Pabon
>> 11:00am - 11:25am Gluster as Block Store in Containers  - Prasanna Kalever
>> 11:30am - 11:55amAn Update on GlusterD-2.0 - Kaushal Madappa
>> 12:00 - 1:00pmLunch
>> 1:00pm - 1:25pmIntegration of GlusterFS in to Commvault data platform  -
>> Ankireddypalle Reddy
>> 1:30-1:55pmBootstrapping Challenge
>> 2:00pm  - 2:25pmPractical Glusto Example - Jonathan Holloway
>> 2:30pm  - 2:55pmState of Gluster Performance - Manoj Pillai
>> 3:00pm - 3:25pmServer side replication - Avra Sengupta
>> 3:30pm - 4:00pmBirds of a Feather Sessions
>> 4:00pm - 4:55pmBirds of a Feather Sessions
>> 5:00pm - 5:30pm Closing
>>
>> --
>> Amye Scavarda | a...@redhat.com  | Gluster
>> Community Lead
>>
>>
>> ___
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
>>
>> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>



-- 
Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Running strfmt as smoke test

2016-09-15 Thread Nigel Babu
On Thu, Sep 01, 2016 at 12:06:03PM +0530, Nigel Babu wrote:
> On Wed, Aug 31, 2016 at 05:40:48PM +0530, Nigel Babu wrote:
> > Hello,
> >
> > Kaleb has pointed out that the number of failures here are creeping up.
> > I've suggested we run this as a smoke test. It's going to fail all the time
> > at first, so I propose it be a non-voting test for now. Once we get master
> > in a good shape, we can turn on voting for this job.
> >
> > Does that sound like a reasonable idea? I'll probably only run it on master
> > for now and any future branches (excluding i.e. after 3.9).
> >
>
> This job is now ready and working[1]. It doesn't vote on smoke yet. It passes
> on master, but not on older branches. It's a good idea to fix them soon so we
> can operate more smoothly.
>
> [1]: https://build.gluster.org/job/strfmt_errors/
>

We've had this job running for 2 weeks returning consistent green results.
Today, I'll turn on voting so we catch any future errors across any branch.
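The thread doesn't list the specific warnings the strfmt job flags; purely as an
illustration (hypothetical code, not taken from the Gluster sources), a typical
format-string mismatch of the kind such a check would catch, and its fix, looks
like this:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t bytes = 1ULL << 40;
    size_t count = 42;

    /* Wrong: %d expects an int, so these arguments trigger -Wformat errors
     * on platforms where uint64_t and size_t are not int:
     *     printf("healed %d bytes in %d files\n", bytes, count);
     */

    /* Right: use the matching conversion specifiers. */
    printf("healed %" PRIu64 " bytes in %zu files\n", bytes, count);
    return 0;
}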

--
nigelb
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-15 Thread Raghavendra Gowdappa


- Original Message -
> From: "Xavier Hernandez" 
> To: "Raghavendra G" , "Nithya Balachandran" 
> 
> Cc: "Gluster Devel" , "Mohit Agrawal" 
> 
> Sent: Thursday, September 15, 2016 4:54:25 PM
> Subject: Re: [Gluster-devel] Query regards to heal xattr heal in dht
> 
> 
> 
> On 15/09/16 11:31, Raghavendra G wrote:
> >
> >
> > On Thu, Sep 15, 2016 at 12:02 PM, Nithya Balachandran
> > > wrote:
> >
> >
> >
> > On 8 September 2016 at 12:02, Mohit Agrawal  > > wrote:
> >
> > Hi All,
> >
> >I have one another solution to heal user xattr but before
> > implement it i would like to discuss with you.
> >
> >Can i call function (dht_dir_xattr_heal internally it is
> > calling syncop_setxattr) to heal xattr in dht_getxattr_cbk in last
> >after make sure we have a valid xattr.
> >In function(dht_dir_xattr_heal) it will copy blindly all user
> > xattr on all subvolume or i can compare subvol xattr with valid
> > xattr if there is any mismatch then i will call syncop_setxattr
> > otherwise no need to call. syncop_setxattr.
> >
> >
> >
> > This can be problematic if a particular xattr is being removed - it
> > might still exist on some subvols. IIUC, the heal would go and reset
> > it again?
> >
> > One option is to use the hash subvol for the dir as the source - so
> > perform xattr op on hashed subvol first and on the others only if it
> > succeeds on the hashed. This does have the problem of being unable
> > to set xattrs if the hashed subvol is unavailable. This might not be
> > such a big deal in case of distributed replicate or distribute
> > disperse volumes but will affect pure distribute. However, this way
> > we can at least be reasonably certain of the correctness (leaving
> > rebalance out of the picture).
> >
> >
> > * What is the behavior of getxattr when hashed subvol is down? Should we
> > succeed with values from non-hashed subvols or should we fail getxattr?
> > With hashed-subvol as source of truth, its difficult to determine
> > correctness of xattrs and their values when it is down.
> >
> > * setxattr is an inode operation (as opposed to entry operation). So, we
> > cannot calculate hashed-subvol as in (get)(set)xattr, parent layout and
> > "basename" is not available. This forces us to store hashed subvol in
> > inode-ctx. Now, when the hashed-subvol changes we need to update these
> > inode-ctxs too.
> >
> > What do you think about a Quorum based solution to this problem?
> >
> > 1. setxattr succeeds only if it is successful on at least (n/2 + 1)
> > number of subvols.
> > 2. getxattr succeeds only if it is successful and values match on at
> > least (n/2 + 1) number of subvols.
> >
> > The flip-side of this solution is we are increasing the probability of
> > failure of (get)(set)xattr operations as opposed to the hashed-subvol as
> > source of truth solution. Or are we - how do we compare probability of
> > hashed-subvol going down with probability of (n/2 + 1) nodes going down
> > simultaneously? Is it 1/n vs (1/n*1/n*... (n/2+1 times)?. Is 1/n correct
> > probability for _a specific subvol (hashed-subvol)_ going down (as
> > opposed to _any one subvol_ going down)?
> 
> If we suppose p to be the probability of failure of a subvolume in a
> period of time (a year for example), all subvolumes have the same
> probability, and we have N subvolumes, then:
> 
> Probability of failure of hashed-subvol: p
> Probability of failure of N/2 + 1 or more subvols: sum for k = N/2+1 to N of C(N,k) * p^k * (1-p)^(N-k)

Thanks Xavi. That was quick :).

> 
> Note that this probability says how much probable is that N/2 + 1
> subvols or more fail in the specified period of time, but not
> necessarily simultaneously. If we suppose that subvolumes are recovered
> as fast as possible, the real probability of simultaneous failure will
> be much smaller.
> 
> In worst case (not recovering the failed subvolumes in the given period
> of time), if p < 0.5 or N = 2 (and p != 1), then it's always better to
> check N/2 + 1 subvolumes. Otherwise, it's better to check the hashed-subvol.
> 
> I think that p should always be much smaller than 0.5 for small periods
> of time where subvolume recovery could no be completed before other
> failures, so checking half plus one subvols should always be the best
> option in terms of probability. Performance can suffer though if some
> kind of synchronization is needed.

For this problem, no synchronization is needed. We need to wind the 
(get)(set)xattr call to all subvols though. What I didn't think through is 
rollback/rollforward during setxattr if the op fails on more than quorum 
subvols. One problem with the rollback approach is that we may never get a chance 
to rollback 

Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-15 Thread Xavier Hernandez



On 15/09/16 11:31, Raghavendra G wrote:



On Thu, Sep 15, 2016 at 12:02 PM, Nithya Balachandran
> wrote:



On 8 September 2016 at 12:02, Mohit Agrawal > wrote:

Hi All,

   I have another solution to heal user xattrs, but before
implementing it I would like to discuss it with you.

   Can I call the function dht_dir_xattr_heal (internally it calls
syncop_setxattr) to heal xattrs at the end of dht_getxattr_cbk,
   after making sure we have a valid xattr?
   In dht_dir_xattr_heal it will either blindly copy all user
xattrs onto all subvolumes, or I can compare each subvol's xattrs with the valid
xattrs and call syncop_setxattr only if there is a mismatch;
otherwise there is no need to call syncop_setxattr.



This can be problematic if a particular xattr is being removed - it
might still exist on some subvols. IIUC, the heal would go and reset
it again?

One option is to use the hash subvol for the dir as the source - so
perform xattr op on hashed subvol first and on the others only if it
succeeds on the hashed. This does have the problem of being unable
to set xattrs if the hashed subvol is unavailable. This might not be
such a big deal in case of distributed replicate or distribute
disperse volumes but will affect pure distribute. However, this way
we can at least be reasonably certain of the correctness (leaving
rebalance out of the picture).


* What is the behavior of getxattr when hashed subvol is down? Should we
succeed with values from non-hashed subvols or should we fail getxattr?
With hashed-subvol as source of truth, its difficult to determine
correctness of xattrs and their values when it is down.

* setxattr is an inode operation (as opposed to entry operation). So, we
cannot calculate hashed-subvol as in (get)(set)xattr, parent layout and
"basename" is not available. This forces us to store hashed subvol in
inode-ctx. Now, when the hashed-subvol changes we need to update these
inode-ctxs too.

What do you think about a Quorum based solution to this problem?

1. setxattr succeeds only if it is successful on at least (n/2 + 1)
number of subvols.
2. getxattr succeeds only if it is successful and values match on at
least (n/2 + 1) number of subvols.

The flip-side of this solution is we are increasing the probability of
failure of (get)(set)xattr operations as opposed to the hashed-subvol as
source of truth solution. Or are we - how do we compare probability of
hashed-subvol going down with probability of (n/2 + 1) nodes going down
simultaneously? Is it 1/n vs (1/n*1/n*... (n/2+1 times)?. Is 1/n correct
probability for _a specific subvol (hashed-subvol)_ going down (as
opposed to _any one subvol_ going down)?


If we suppose p to be the probability of failure of a subvolume in a 
period of time (a year for example), all subvolumes have the same 
probability, and we have N subvolumes, then:


Probability of failure of hashed-subvol: p
Probability of failure of N/2 + 1 or more subvols: sum for k = N/2+1 to N of C(N,k) * p^k * (1-p)^(N-k)

Note that this probability says how probable it is that N/2 + 1 
subvols or more fail in the specified period of time, but not 
necessarily simultaneously. If we suppose that subvolumes are recovered 
as fast as possible, the real probability of simultaneous failure will 
be much smaller.


In the worst case (not recovering the failed subvolumes in the given period 
of time), if p < 0.5 or N = 2 (and p != 1), then it's always better to 
check N/2 + 1 subvolumes. Otherwise, it's better to check the hashed-subvol.


I think that p should always be much smaller than 0.5 for small periods 
of time where subvolume recovery could not be completed before other 
failures, so checking half plus one subvols should always be the best 
option in terms of probability. Performance can suffer, though, if some 
kind of synchronization is needed.
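As a stand-alone illustration of that comparison (not GlusterFS code; the subvol
count and the p values below are made up), the binomial tail gives the probability
that N/2 + 1 or more of N independent subvols fail when each fails with
probability p:

#include <math.h>
#include <stdio.h>

/* C(n, k) computed as a running product to avoid factorials. */
static double binom(int n, int k)
{
    double c = 1.0;
    for (int i = 1; i <= k; i++)
        c *= (double)(n - k + i) / i;
    return c;
}

/* P(at least n/2 + 1 of n subvols fail) =
 *   sum over k = n/2+1 .. n of C(n, k) * p^k * (1 - p)^(n - k) */
static double p_majority_fail(int n, double p)
{
    double total = 0.0;
    for (int k = n / 2 + 1; k <= n; k++)
        total += binom(n, k) * pow(p, k) * pow(1.0 - p, n - k);
    return total;
}

int main(void)
{
    int n = 6; /* example subvol count */
    double probs[] = { 0.05, 0.30, 0.55 };
    for (int i = 0; i < 3; i++)
        printf("p=%.2f  hashed-subvol=%.4f  majority=%.6f\n",
               probs[i], probs[i], p_majority_fail(n, probs[i]));
    return 0;
}

Compiled with -lm, this shows the majority-failure probability coming out far
smaller than p while p is well below 0.5, which matches the conclusion above.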


Xavi







   Let me know if this approach is suitable.



Regards
Mohit Agrawal

On Wed, Sep 7, 2016 at 10:27 PM, Pranith Kumar Karampuri
> wrote:



On Wed, Sep 7, 2016 at 9:46 PM, Mohit Agrawal
> wrote:

Hi Pranith,


In current approach i am getting list of xattr from
first up volume and update the user attributes from that
xattr to
all other volumes.

I have assumed first up subvol is source and rest of
them are sink as we are doing same in dht_dir_attr_heal.


I think first up subvol is different for different mounts as
per my understanding, I could be wrong.



Regards
Mohit Agrawal

On Wed, Sep 7, 2016 at 9:34 PM, Pranith Kumar Karampuri
  

[Gluster-devel] Running glusterd with valgrind

2016-09-15 Thread Avra Sengupta

Hi,

I was trying to run valgrind with glusterd using the following command:

valgrind --leak-check=full --log-file=/tmp/glusterd.log glusterd

This command used to work before, rather seamlessly, but now glusterd 
crashes with the following backtrace:


/usr/local/lib/libglusterfs.so.0(_gf_msg_backtrace_nomem+0x9c)[0x4c3772c]
/usr/local/lib/libglusterfs.so.0(gf_print_trace+0x314)[0x4c426f4]
/lib64/libc.so.6[0x3344a329a0]
/lib64/libc.so.6(gsignal+0x35)[0x3344a32925]
/lib64/libc.so.6(abort+0x175)[0x3344a34105]
/lib64/libc.so.6[0x3344a2ba4e]
/lib64/libc.so.6(__assert_perror_fail+0x0)[0x3344a2bb10]
/usr/lib64/liburcu-bp.so.1(rcu_bp_register+0x288)[0xf7974a8]
/usr/lib64/liburcu-bp.so.1(rcu_read_lock_bp+0x4e)[0xf79751e]
/usr/local/lib/glusterfs/3.10dev/xlator/mgmt/glusterd.so(+0x10e07e)[0xf52e07e]
/usr/local/lib/glusterfs/3.10dev/xlator/mgmt/glusterd.so(+0x619cb)[0xf4819cb]
/usr/local/lib/glusterfs/3.10dev/xlator/mgmt/glusterd.so(+0x62394)[0xf482394]
/usr/local/lib/libglusterfs.so.0(synctask_wrap+0x10)[0x4c70a50]
/lib64/libc.so.6[0x3344a43bf0]

Is this a known issue? Is there another way to run gluster with valgrind?

Regards,
Avra
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-15 Thread Raghavendra G
On Thu, Sep 15, 2016 at 12:02 PM, Nithya Balachandran 
wrote:

>
>
> On 8 September 2016 at 12:02, Mohit Agrawal  wrote:
>
>> Hi All,
>>
>>I have one another solution to heal user xattr but before implement it
>> i would like to discuss with you.
>>
>>Can i call function (dht_dir_xattr_heal internally it is calling
>> syncop_setxattr) to heal xattr in dht_getxattr_cbk in last
>>after make sure we have a valid xattr.
>>In function(dht_dir_xattr_heal) it will copy blindly all user xattr on
>> all subvolume or i can compare subvol xattr with valid xattr if there is
>> any mismatch then i will call syncop_setxattr otherwise no need to call.
>> syncop_setxattr.
>>
>
>
> This can be problematic if a particular xattr is being removed - it might
> still exist on some subvols. IIUC, the heal would go and reset it again?
>
> One option is to use the hash subvol for the dir as the source - so
> perform xattr op on hashed subvol first and on the others only if it
> succeeds on the hashed. This does have the problem of being unable to set
> xattrs if the hashed subvol is unavailable. This might not be such a big
> deal in case of distributed replicate or distribute disperse volumes but
> will affect pure distribute. However, this way we can at least be
> reasonably certain of the correctness (leaving rebalance out of the
> picture).
>

* What is the behavior of getxattr when hashed subvol is down? Should we
succeed with values from non-hashed subvols or should we fail getxattr?
With hashed-subvol as the source of truth, it's difficult to determine the
correctness of xattrs and their values when it is down.

* setxattr is an inode operation (as opposed to entry operation). So, we
cannot calculate hashed-subvol as in (get)(set)xattr, parent layout and
"basename" is not available. This forces us to store hashed subvol in
inode-ctx. Now, when the hashed-subvol changes we need to update these
inode-ctxs too.

What do you think about a Quorum based solution to this problem?

1. setxattr succeeds only if it is successful on at least (n/2 + 1) number
of subvols.
2. getxattr succeeds only if it is successful and values match on at least
(n/2 + 1) number of subvols.

The flip-side of this solution is that we are increasing the probability of
failure of (get)(set)xattr operations compared with the hashed-subvol-as-
source-of-truth solution. Or are we? How do we compare the probability of the
hashed-subvol going down with the probability of (n/2 + 1) nodes going down
simultaneously? Is it 1/n vs (1/n * 1/n * ... (n/2 + 1 times))? Is 1/n the correct
probability for _a specific subvol (the hashed-subvol)_ going down (as opposed
to _any one subvol_ going down)?
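A minimal stand-alone sketch of that quorum rule (the result type and the value
comparison below are illustrative stand-ins, not the actual DHT structures):

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    bool ok;            /* did the fop succeed on this subvol? */
    const char *value;  /* value returned by getxattr, NULL if none */
} subvol_result_t;

/* setxattr: succeed only if it worked on at least (n/2 + 1) subvols. */
static bool setxattr_meets_quorum(const subvol_result_t *res, int n)
{
    int good = 0;
    for (int i = 0; i < n; i++)
        if (res[i].ok)
            good++;
    return good >= n / 2 + 1;
}

/* getxattr: succeed only if at least (n/2 + 1) subvols agree on the value. */
static bool getxattr_meets_quorum(const subvol_result_t *res, int n,
                                  const char **agreed)
{
    for (int i = 0; i < n; i++) {
        if (!res[i].ok || !res[i].value)
            continue;
        int matches = 0;
        for (int j = 0; j < n; j++)
            if (res[j].ok && res[j].value &&
                strcmp(res[i].value, res[j].value) == 0)
                matches++;
        if (matches >= n / 2 + 1) {
            *agreed = res[i].value;
            return true;
        }
    }
    return false;
}

int main(void)
{
    subvol_result_t res[3] = { { true, "v1" }, { true, "v1" }, { false, NULL } };
    const char *val = NULL;

    printf("setxattr quorum met: %d\n", setxattr_meets_quorum(res, 3));
    printf("getxattr quorum met: %d (value=%s)\n",
           getxattr_meets_quorum(res, 3, &val), val ? val : "-");
    return 0;
}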



>
>
>>
>>Let me know if this approach is suitable.
>>
>>
>>
>> Regards
>> Mohit Agrawal
>>
>> On Wed, Sep 7, 2016 at 10:27 PM, Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>>>
>>>
>>> On Wed, Sep 7, 2016 at 9:46 PM, Mohit Agrawal 
>>> wrote:
>>>
 Hi Pranith,


 In current approach i am getting list of xattr from first up volume and
 update the user attributes from that xattr to
 all other volumes.

 I have assumed first up subvol is source and rest of them are sink as
 we are doing same in dht_dir_attr_heal.

>>>
>>> I think first up subvol is different for different mounts as per my
>>> understanding, I could be wrong.
>>>
>>>

 Regards
 Mohit Agrawal

 On Wed, Sep 7, 2016 at 9:34 PM, Pranith Kumar Karampuri <
 pkara...@redhat.com> wrote:

> hi Mohit,
>How does dht find which subvolume has the correct list of
> xattrs? i.e. how does it determine which subvolume is source and which is
> sink?
>
> On Wed, Sep 7, 2016 at 2:35 PM, Mohit Agrawal 
> wrote:
>
>> Hi,
>>
>>   I am trying to find out solution of one problem in dht specific to
>> user xattr healing.
>>   I tried to correct it in a same way as we are doing for healing dir
>> attribute but i feel it is not best solution.
>>
>>   To find a right way to heal xattr i want to discuss with you if
>> anyone does have better solution to correct it.
>>
>>   Problem:
>>In a distributed volume environment custom extended attribute
>> value for a directory does not display correct value after stop/start the
>> brick. If any extended attribute value is set for a directory after stop
>> the brick the attribute value is not updated on brick after start the 
>> brick.
>>
>>   Current approach:
>> 1) function set_user_xattr to store user extended attribute in
>> dictionary
>> 2) function dht_dir_xattr_heal call syncop_setxattr to update the
>> attribute on all volume
>> 3) Call the function (dht_dir_xattr_heal) for every directory
>> lookup in dht_lookup_revalidate_cbk
>>
>>   Pseudocode for function dht_dir_xattr_heal is like below

Re: [Gluster-devel] Gluster Developer Summit 2016 Talk Schedule

2016-09-15 Thread Soumya Koduri

Hi Amye,

Is there any plan to record these talks?

Thanks,
Soumya

On 09/15/2016 03:09 AM, Amye Scavarda wrote:

Thanks to all that submitted talks, and thanks to the program committee
who helped select this year's content.

This will be posted on the main Summit page as
well: gluster.org/events/summit2016 

October 6
9:00am - 9:25am  Opening Session
9:30am - 9:55am  DHT: current design, (dis)advantages, challenges - A perspective - Raghavendra Gowdappa
10:00am - 10:25am  DHT2 - O Brother, Where Art Thou? - Shyam Ranganathan
10:30am - 10:55am  Performance bottlenecks for metadata workload in Gluster - Poornima Gurusiddaiah, Rajesh Joseph
11:00am - 11:25am  The life of a consultant listed on gluster.org - Ivan Rossi
11:30am - 11:55am  Architecture of the High Availability Solution for Ganesha and Samba - Kaleb Keithley
12:00pm - 1:00pm  Lunch
1:00pm - 1:25pm  Challenges with Gluster and Persistent Memory - Dan Lambright
1:25pm - 1:55pm  Throttling in gluster - Ravishankar Narayanankutty
2:00pm - 2:25pm  Gluster: The Ugly Parts - Jeff Darcy
2:30pm - 2:55pm  Deterministic Releases and How to Get There - Nigel Babu
3:00pm - 3:25pm  Break
3:30pm - 4:00pm  Birds of a Feather Sessions
4:00pm - 4:55pm  Birds of a Feather Sessions
Evening Reception to be announced


October 7
9:00am - 9:25am  GFProxy: Scaling the GlusterFS FUSE Client - Shreyas Siravara
9:30am - 9:55am  Sharding in GlusterFS - Past, Present and Future - Krutika Dhananjay
10:00am - 10:25am  Object Storage with Gluster - Prashanth Pai
10:30am - 10:55am  Containers and Persistent Storage for Containers - Humble Chirammal, Luis Pabon
11:00am - 11:25am  Gluster as Block Store in Containers - Prasanna Kalever
11:30am - 11:55am  An Update on GlusterD-2.0 - Kaushal Madappa
12:00pm - 1:00pm  Lunch
1:00pm - 1:25pm  Integration of GlusterFS into Commvault data platform - Ankireddypalle Reddy
1:30pm - 1:55pm  Bootstrapping Challenge
2:00pm - 2:25pm  Practical Glusto Example - Jonathan Holloway
2:30pm - 2:55pm  State of Gluster Performance - Manoj Pillai
3:00pm - 3:25pm  Server side replication - Avra Sengupta
3:30pm - 4:00pm  Birds of a Feather Sessions
4:00pm - 4:55pm  Birds of a Feather Sessions
5:00pm - 5:30pm  Closing

--
Amye Scavarda | a...@redhat.com  | Gluster
Community Lead


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Review request: tier as a service.

2016-09-15 Thread Niels de Vos
On Thu, Sep 15, 2016 at 02:50:09AM -0400, Hari Gowtham wrote:
> Hi,
> 
> I would be happy to get reviews for this patch
> http://review.gluster.org/#/c/13365/
> 
> more details can be found here about the changes:
> https://docs.google.com/document/d/1_iyjiwTLnBJlCiUgjAWnpnPD801h5LNxLhHmN7zmk1o/edit?usp=sharing

Please send this as a document for the glusterfs-specs repository (uses
Gerrit just like the glusterfs sources). See the README.md on
https://github.com/gluster/glusterfs-specs/blob/master/README.md for
some more details.

Thanks,
Niels


signature.asc
Description: PGP signature
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Review request: tier as a service.

2016-09-15 Thread Hari Gowtham
Hi,

I would be happy to get reviews for this patch
http://review.gluster.org/#/c/13365/

more details can be found here about the changes:
https://docs.google.com/document/d/1_iyjiwTLnBJlCiUgjAWnpnPD801h5LNxLhHmN7zmk1o/edit?usp=sharing


-- 
Regards, 
Hari. 

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-15 Thread Pranith Kumar Karampuri
On Thu, Sep 15, 2016 at 12:02 PM, Nithya Balachandran 
wrote:

>
>
> On 8 September 2016 at 12:02, Mohit Agrawal  wrote:
>
>> Hi All,
>>
>>I have one another solution to heal user xattr but before implement it
>> i would like to discuss with you.
>>
>>Can i call function (dht_dir_xattr_heal internally it is calling
>> syncop_setxattr) to heal xattr in dht_getxattr_cbk in last
>>after make sure we have a valid xattr.
>>In function(dht_dir_xattr_heal) it will copy blindly all user xattr on
>> all subvolume or i can compare subvol xattr with valid xattr if there is
>> any mismatch then i will call syncop_setxattr otherwise no need to call.
>> syncop_setxattr.
>>
>
>
> This can be problematic if a particular xattr is being removed - it might
> still exist on some subvols. IIUC, the heal would go and reset it again?
>
> One option is to use the hash subvol for the dir as the source - so
> perform xattr op on hashed subvol first and on the others only if it
> succeeds on the hashed. This does have the problem of being unable to set
> xattrs if the hashed subvol is unavailable. This might not be such a big
> deal in case of distributed replicate or distribute disperse volumes but
> will affect pure distribute. However, this way we can at least be
> reasonably certain of the correctness (leaving rebalance out of the
> picture).
>

Yes, this seems fine.


>
>
>
>>
>>Let me know if this approach is suitable.
>>
>>
>>
>> Regards
>> Mohit Agrawal
>>
>> On Wed, Sep 7, 2016 at 10:27 PM, Pranith Kumar Karampuri <
>> pkara...@redhat.com> wrote:
>>
>>>
>>>
>>> On Wed, Sep 7, 2016 at 9:46 PM, Mohit Agrawal 
>>> wrote:
>>>
 Hi Pranith,


 In current approach i am getting list of xattr from first up volume and
 update the user attributes from that xattr to
 all other volumes.

 I have assumed first up subvol is source and rest of them are sink as
 we are doing same in dht_dir_attr_heal.

>>>
>>> I think first up subvol is different for different mounts as per my
>>> understanding, I could be wrong.
>>>
>>>

 Regards
 Mohit Agrawal

 On Wed, Sep 7, 2016 at 9:34 PM, Pranith Kumar Karampuri <
 pkara...@redhat.com> wrote:

> hi Mohit,
>How does dht find which subvolume has the correct list of
> xattrs? i.e. how does it determine which subvolume is source and which is
> sink?
>
> On Wed, Sep 7, 2016 at 2:35 PM, Mohit Agrawal 
> wrote:
>
>> Hi,
>>
>>   I am trying to find out solution of one problem in dht specific to
>> user xattr healing.
>>   I tried to correct it in a same way as we are doing for healing dir
>> attribute but i feel it is not best solution.
>>
>>   To find a right way to heal xattr i want to discuss with you if
>> anyone does have better solution to correct it.
>>
>>   Problem:
>>In a distributed volume environment custom extended attribute
>> value for a directory does not display correct value after stop/start the
>> brick. If any extended attribute value is set for a directory after stop
>> the brick the attribute value is not updated on brick after start the 
>> brick.
>>
>>   Current approach:
>> 1) function set_user_xattr to store user extended attribute in
>> dictionary
>> 2) function dht_dir_xattr_heal call syncop_setxattr to update the
>> attribute on all volume
>> 3) Call the function (dht_dir_xattr_heal) for every directory
>> lookup in dht_lookup_revalidate_cbk
>>
>>   Psuedocode for function dht_dir_xatt_heal is like below
>>
>>1) First it will fetch atttributes from first up volume and store
>> into xattr.
>>2) Run loop on all subvolume and fetch existing attributes from
>> every volume
>>3) Replace user attributes from current attributes with xattr user
>> attributes
>>4) Set latest extended attributes(current + old user attributes)
>> inot subvol.
>>
>>
>>In this current approach problem is
>>
>>1) it will call heal function(dht_dir_xattr_heal) for every
>> directory lookup without comparing xattr.
>> 2) The function internally call syncop xattr for every subvolume
>> that would be a expensive operation.
>>
>>I have one another way like below to correct it but again in this
>> one it does have dependency on time (not sure time is synch on all bricks
>> or not)
>>
>>1) At the time of set extended attribute(setxattr) change time in
>> metadata at server side
>>2) Compare change time before call healing function in
>> dht_revalidate_cbk
>>
>> Please share your input on this.
>> Appreciate your input.
>>
>> Regards
>> Mohit Agrawal
>>
>> ___
>> 

Re: [Gluster-devel] Query regards to heal xattr heal in dht

2016-09-15 Thread Nithya Balachandran
On 8 September 2016 at 12:02, Mohit Agrawal  wrote:

> Hi All,
>
>I have another solution to heal user xattrs, but before implementing it
> I would like to discuss it with you.
>
>Can I call the function dht_dir_xattr_heal (internally it calls
> syncop_setxattr) to heal xattrs at the end of dht_getxattr_cbk,
>after making sure we have a valid xattr?
>In dht_dir_xattr_heal it will either blindly copy all user xattrs onto
> all subvolumes, or I can compare each subvol's xattrs with the valid xattrs and
> call syncop_setxattr only if there is a mismatch; otherwise there is no need
> to call syncop_setxattr.
>


This can be problematic if a particular xattr is being removed - it might
still exist on some subvols. IIUC, the heal would go and reset it again?

One option is to use the hash subvol for the dir as the source - so perform
xattr op on hashed subvol first and on the others only if it succeeds on
the hashed. This does have the problem of being unable to set xattrs if the
hashed subvol is unavailable. This might not be such a big deal in case of
distributed replicate or distribute disperse volumes but will affect pure
distribute. However, this way we can at least be reasonably certain of the
correctness (leaving rebalance out of the picture).
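A rough sketch of that ordering (the subvol type and do_setxattr() below are
stand-ins for illustration, not the real DHT interfaces):

#include <stdbool.h>
#include <stdio.h>

typedef struct {
    const char *name;
    bool up;
} subvol_t;

/* Stand-in for the per-subvol setxattr call. */
static bool do_setxattr(const subvol_t *sv, const char *key, const char *val)
{
    if (!sv->up)
        return false;
    printf("setxattr %s=%s on %s\n", key, val, sv->name);
    return true;
}

/* Try the hashed subvol (the source of truth) first; fan out to the other
 * subvols only if that succeeds. If the hashed subvol is unreachable the
 * whole operation fails, as discussed for the pure-distribute case. */
static bool setxattr_hashed_first(subvol_t *subvols, int n, int hashed,
                                  const char *key, const char *val)
{
    if (!do_setxattr(&subvols[hashed], key, val))
        return false;
    for (int i = 0; i < n; i++)
        if (i != hashed)
            do_setxattr(&subvols[i], key, val); /* per-subvol failures on the
                                                   remaining copies are ignored
                                                   in this sketch */
    return true;
}

int main(void)
{
    subvol_t s[3] = { { "subvol-0", true }, { "subvol-1", true },
                      { "subvol-2", false } };
    bool ok = setxattr_hashed_first(s, 3, 1, "user.foo", "bar");
    printf("overall: %s\n", ok ? "ok" : "failed");
    return 0;
}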



>
>Let me know if this approach is suitable.
>
>
>
> Regards
> Mohit Agrawal
>
> On Wed, Sep 7, 2016 at 10:27 PM, Pranith Kumar Karampuri <
> pkara...@redhat.com> wrote:
>
>>
>>
>> On Wed, Sep 7, 2016 at 9:46 PM, Mohit Agrawal 
>> wrote:
>>
>>> Hi Pranith,
>>>
>>>
>>> In current approach i am getting list of xattr from first up volume and
>>> update the user attributes from that xattr to
>>> all other volumes.
>>>
>>> I have assumed first up subvol is source and rest of them are sink as we
>>> are doing same in dht_dir_attr_heal.
>>>
>>
>> I think first up subvol is different for different mounts as per my
>> understanding, I could be wrong.
>>
>>
>>>
>>> Regards
>>> Mohit Agrawal
>>>
>>> On Wed, Sep 7, 2016 at 9:34 PM, Pranith Kumar Karampuri <
>>> pkara...@redhat.com> wrote:
>>>
 hi Mohit,
How does dht find which subvolume has the correct list of
 xattrs? i.e. how does it determine which subvolume is source and which is
 sink?

 On Wed, Sep 7, 2016 at 2:35 PM, Mohit Agrawal 
 wrote:

> Hi,
>
>   I am trying to find out solution of one problem in dht specific to
> user xattr healing.
>   I tried to correct it in a same way as we are doing for healing dir
> attribute but i feel it is not best solution.
>
>   To find a right way to heal xattr i want to discuss with you if
> anyone does have better solution to correct it.
>
>   Problem:
>In a distributed volume environment custom extended attribute value
> for a directory does not display correct value after stop/start the brick.
> If any extended attribute value is set for a directory after stop the 
> brick
> the attribute value is not updated on brick after start the brick.
>
>   Current approach:
> 1) function set_user_xattr to store user extended attribute in
> dictionary
> 2) function dht_dir_xattr_heal call syncop_setxattr to update the
> attribute on all volume
> 3) Call the function (dht_dir_xattr_heal) for every directory
> lookup in dht_lookup_revalidate_cbk
>
>   Pseudocode for the function dht_dir_xattr_heal is like below
>
>1) First it will fetch attributes from the first up volume and store them
> into xattr.
>2) Run a loop on all subvolumes and fetch the existing attributes from
> every volume.
>3) Replace the user attributes in the current attributes with the xattr user
> attributes.
>4) Set the latest extended attributes (current + old user attributes)
> onto the subvol.
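A compact stand-alone sketch of those four steps (the types and helpers below are
illustrative only; the real code works on dict_t xattr dictionaries and issues one
syncop_setxattr() per subvol):

#include <stdio.h>
#include <string.h>

#define MAX_XATTR 8

typedef struct { char key[64]; char val[64]; } xattr_t;
typedef struct { const char *name; int up; xattr_t x[MAX_XATTR]; int n; } subvol_t;

/* Step 3: copy every "user." key from src over dst (add or overwrite). */
static void merge_user_xattrs(subvol_t *dst, const xattr_t *src, int nsrc)
{
    for (int i = 0; i < nsrc; i++) {
        if (strncmp(src[i].key, "user.", 5) != 0)
            continue;
        int j;
        for (j = 0; j < dst->n; j++)
            if (strcmp(dst->x[j].key, src[i].key) == 0)
                break;
        if (j == dst->n && dst->n < MAX_XATTR)
            dst->n++;
        dst->x[j] = src[i];
    }
}

static void dht_dir_xattr_heal_sketch(subvol_t *subvols, int n)
{
    int src = -1;
    for (int i = 0; i < n; i++)            /* step 1: first up subvol is the source */
        if (subvols[i].up) { src = i; break; }
    if (src < 0)
        return;
    for (int i = 0; i < n; i++) {          /* step 2: walk the remaining subvols */
        if (i == src || !subvols[i].up)
            continue;
        merge_user_xattrs(&subvols[i], subvols[src].x, subvols[src].n); /* step 3 */
        /* step 4: write (existing + healed user) xattrs back to the subvol;
         * the real code does this with one syncop_setxattr() per subvol. */
        printf("heal %s: now holds %d xattrs\n", subvols[i].name, subvols[i].n);
    }
}

int main(void)
{
    subvol_t s[2] = {
        { "subvol-0", 1, { { "user.tag", "new" } }, 1 },
        { "subvol-1", 1, { { "user.tag", "old" }, { "trusted.gfid", "x" } }, 2 },
    };
    dht_dir_xattr_heal_sketch(s, 2);
    printf("subvol-1 user.tag = %s\n", s[1].x[0].val);
    return 0;
}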
>
>
>In this current approach problem is
>
>1) it will call heal function(dht_dir_xattr_heal) for every
> directory lookup without comparing xattr.
> 2) The function internally call syncop xattr for every subvolume
> that would be a expensive operation.
>
>I have one another way like below to correct it but again in this
> one it does have dependency on time (not sure time is synch on all bricks
> or not)
>
>1) At the time of set extended attribute(setxattr) change time in
> metadata at server side
>2) Compare change time before call healing function in
> dht_revalidate_cbk
>
> Please share your input on this.
> Appreciate your input.
>
> Regards
> Mohit Agrawal
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>



 --
 Pranith

>>>
>>>
>>
>>
>> --
>> Pranith
>>
>
>
> ___
> 

[Gluster-devel] Rebalance status

2016-09-15 Thread Nithya Balachandran
Hi,

While the code defines the following:

GF_DEFRAG_STATUS_LAYOUT_FIX_STARTED,
GF_DEFRAG_STATUS_LAYOUT_FIX_STOPPED,
GF_DEFRAG_STATUS_LAYOUT_FIX_COMPLETE,
GF_DEFRAG_STATUS_LAYOUT_FIX_FAILED


we don't seem to be using them anywhere. They sound like statuses specific
to the fix-layout command.
I would like to return these in case we are only doing a fix-layout instead
of a rebalance.
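As a rough illustration of the proposed change (only the four LAYOUT_FIX_* names
come from the code quoted above; the other two enum values and the helper are
hypothetical, made up for this sketch):

#include <stdbool.h>
#include <stdio.h>

typedef enum {
    DEFRAG_STATUS_STARTED,              /* hypothetical generic value */
    DEFRAG_STATUS_COMPLETE,             /* hypothetical generic value */
    GF_DEFRAG_STATUS_LAYOUT_FIX_STARTED,
    GF_DEFRAG_STATUS_LAYOUT_FIX_STOPPED,
    GF_DEFRAG_STATUS_LAYOUT_FIX_COMPLETE,
    GF_DEFRAG_STATUS_LAYOUT_FIX_FAILED,
} defrag_status_t;

/* Report the fix-layout-specific status when the run was started as a plain
 * fix-layout rather than a full rebalance. */
static defrag_status_t defrag_status_for(bool fix_layout_only, bool complete)
{
    if (fix_layout_only)
        return complete ? GF_DEFRAG_STATUS_LAYOUT_FIX_COMPLETE
                        : GF_DEFRAG_STATUS_LAYOUT_FIX_STARTED;
    return complete ? DEFRAG_STATUS_COMPLETE : DEFRAG_STATUS_STARTED;
}

int main(void)
{
    printf("fix-layout, complete: %d\n", defrag_status_for(true, true));
    printf("rebalance, running:   %d\n", defrag_status_for(false, false));
    return 0;
}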


Does anyone have any concerns with this proposed change?

Thanks,
Nithya
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel