Re: [Gluster-devel] Bitrot/Tiering : Bad files get migrated and hence corruption goes undetected.

2016-02-26 Thread Joseph Fernandes
Correct, we don't migrate the existing signature; the file starts its life
fresh in the new tier (i.e. it gets bit-rot version 1 on the new tier).
This is also the case with any special xattrs/attributes of the file.
We rely heavily on the DHT rebalance mechanism for migrations, which also
doesn't carry over special attributes/xattrs.


- Original Message -
From: "Niels de Vos" 
To: "Joseph Fernandes" 
Cc: "Gluster Devel" 
Sent: Friday, February 26, 2016 10:33:11 PM
Subject: Re: [Gluster-devel] Bitrot/Tiering : Bad files get migrated and hence 
corruption goes undetected.

On Fri, Feb 26, 2016 at 09:32:46AM -0500, Joseph Fernandes wrote:
> Hi All,
> 
> This is a discussion mail on the following issue, 
> 
> 1. Object is corrupted before it could be signed: In this case, the corrupted
>    object is signed and gets migrated upon I/O. There's no way to identify
>    corruption for this set of objects.
> 
> 2. Object is signed (but not scrubbed) and corruption happens thereafter:
>In this case, as of now, integrity checking is not done on the fly
>and the object would get migrated (and signed again in the hot tier).
> 
> 
> (1) is definitely not an issue specific to bitrot with tiering, but for (2) we
> can do something to avoid corrupted files from getting migrated. Before we
> migrate a file we could scrub it, but that's just a naive thought; any better
> suggestions?

Is there a reason the existing signature can not be migrated? Why does
it become invalid?

Niels
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Need help with bitrot

2016-02-26 Thread Ajil Abraham
Thanks Joseph for the precise set of information.  I will follow these.

- Ajil

On Fri, Feb 26, 2016 at 7:05 AM, Joseph Fernandes 
wrote:

> Hope this helps
>
> Courtesy : Raghvendra Talur (rta...@redhat.com)
>
> 1. Clone glusterfs repo to your laptop and get acquainted with dev
> workflow.
> https://gluster.readthedocs.org/en/latest/Developer-guide/Developers-Index/
>
> 2. If you find using your laptop as the test machine for Gluster as too
> scary, here is a vagrant based mechanism to get VMs setup on your laptop
> easily for Gluster testing.
> http://comments.gmane.org/gmane.comp.file-systems.gluster.devel/13494
>
> 3. Find my Gluster Introduction blog post here in the preview link:
>
>
>
> https://6227134958232800133_bafac39c28bee4f256bbbef7510c9bb9b44fca05.blogspot.com/b/post-preview?token=s6_4MVIBAAA.zY--3ij00CkDwnitBOwnFBowEvCsKZ0o4ToQ0KYk9Po4pKujPj9ugmn-fm-XUFdLQxU50FmnCxBBr_IkSzuSlA.l_XFe1UvIEAiqkFAZZPdqQ&postId=4168074834715190149&type=POST
>
> 4. Follow all the lessons in the Translator 101 series to build up your
> understanding of Gluster.
>
> http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-class-1-setting-the-stage/
>
> http://hekafs.org/index.php/2011/11/translator-101-lesson-2-init-fini-and-private-context
>
> http://hekafs.org/index.php/2011/11/translator-101-lesson-3-this-time-for-real
>
> http://hekafs.org/index.php/2011/11/translator-101-lesson-4-debugging-a-translator
>
>
> 5. Try to fix or understand any of the bugs in this list
>
> https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&classification=Community&f1=keywords&list_id=4424622&o1=substring&product=GlusterFS&query_format=advanced&v1=easyfix
>
>
> Regards,
> Joe
>
>
> - Original Message -
> From: "Ajil Abraham" 
> To: "FNU Raghavendra Manjunath" 
> Cc: "Gluster Devel" 
> Sent: Thursday, February 25, 2016 8:58:35 PM
> Subject: Re: [Gluster-devel] Need help with bitrot
>
> Thanks FNU Raghavendra. Does the signing happen only when the file data
> changes, or also when an extended attribute changes?
>
> I am also trying to understand the Gluster internal data structures. Are
> there any materials for the same? Similarly for the translators: how they
> are stacked on the client & server side, and how control flows between them.
> Can somebody please help?
>
> - Ajil
>
> On Thu, Feb 25, 2016 at 7:27 AM, FNU Raghavendra Manjunath <
> rab...@redhat.com > wrote:
>
>
>
> Hi Ajil,
>
> The expiry policy tells the signer (the bit-rot daemon) to wait for a specific
> period of time before signing an object.
>
> Whenever an object is modified, a notification is sent to the signer by the
> brick process (the bit-rot-stub xlator sitting in the I/O path) upon getting a
> release (i.e. when all the fds of that object are closed). The expiry
> policy tells the signer to wait for some time (by default 120 seconds)
> before signing that object. This is done because, if the signer starts
> signing an object (i.e. reads the object + calculates the checksum + stores
> the checksum) and the object gets modified again, a new notification has to
> be sent and the signer has to sign the object again by recalculating the
> checksum. Whereas if the signer waits for some time and receives a new
> notification on the same object while it is waiting, it can avoid signing
> for the first notification.
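
To make the debounce behaviour concrete, here is a minimal, self-contained
sketch of the idea (hypothetical names, not the actual bit-rot daemon code):
every release notification restarts a per-object expiry window, and the signer
signs only once the object has stayed quiet for the whole window.

/* Illustrative sketch only -- hypothetical types and names, not the actual
 * bit-rot daemon code. Each release notification restarts a per-object
 * "expiry" window; the signer signs only after the object has stayed quiet
 * for the whole window (120 seconds by default in the real feature). */
#include <stdio.h>
#include <time.h>

#define EXPIRY_SECS 120

struct object_state {
    const char *path;
    time_t      last_release;   /* time of the latest release notification */
    int         needs_signing;
};

/* Brick side (bit-rot-stub) would call this when all fds on the object close. */
static void on_release(struct object_state *o)
{
    o->last_release  = time(NULL);   /* restart the expiry window */
    o->needs_signing = 1;
}

/* Signer side: called periodically; signs only once the window has expired. */
static void maybe_sign(struct object_state *o)
{
    if (!o->needs_signing)
        return;
    if (time(NULL) - o->last_release < EXPIRY_SECS)
        return;   /* a newer notification may still arrive; keep waiting */

    /* the real signer would read the object, compute the checksum and
     * store it as the signature */
    printf("signing %s\n", o->path);
    o->needs_signing = 0;
}

int main(void)
{
    struct object_state o = { "/bricks/b1/file.txt", 0, 0 };
    on_release(&o);   /* object modified and closed */
    on_release(&o);   /* modified again: window restarts, earlier sign skipped */
    maybe_sign(&o);   /* no-op until 120s of quiet have passed */
    return 0;
}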
>
> Venky, do you want to add anything more?
>
> Regards,
> Raghavendra
>
>
>
> On Wed, Feb 24, 2016 at 12:28 AM, Ajil Abraham < ajil95.abra...@gmail.com
> > wrote:
>
>
>
> Hi,
>
> I am a student interested in GlusterFS. Trying to understand the design of
> GlusterFS. Came across the Bitrot design document in Google. There is a
> mention of expiry policy used to sign the files. I did not clearly
> understand what the expiry policy is. Can somebody please help?
>
> -Ajil
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
>
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Query on healing process

2016-02-26 Thread ABHISHEK PALIWAL
Hi Ravi,

Thanks for the response.

We are using GlusterFS 3.7.8.

Here is the use case:

We have a logging file which saves logs of the events for every board of a
node, and these files are kept in sync using GlusterFS. The system runs in
replica 2 mode, meaning that when one brick in a replicated volume goes
offline, the glusterd daemons on the other nodes keep track of all the files
that are not replicated to the offline brick. When the offline brick becomes
available again, the cluster initiates a healing process, replicating the
updated files to that brick. But in our case, we see that the log file of one
board is not in sync and its contents are corrupted, i.e. the copies differ.

Even the output of "gluster volume heal c_glusterfs info" shows that there
are no pending heals.

Also, the logging file is of fixed size and new entries wrap around,
overwriting the old entries.

This way we have seen that after a few restarts the contents of the same
file on the two bricks are different, but the volume heal info shows zero
entries.

Solution:

When we put a delay of more than 5 minutes before the healing, everything
works fine.

Regards,
Abhishek

On Fri, Feb 26, 2016 at 6:35 AM, Ravishankar N 
wrote:

> On 02/25/2016 06:01 PM, ABHISHEK PALIWAL wrote:
>
> Hi,
>
> Here, I have one query regarding the time taken by the healing process.
> In our current two-node setup, when we rebooted one node the self-healing
> process started within less than a 5-minute interval on the board, which
> resulted in the corruption of some files' data.
>
>
> Heal should start immediately after the brick process comes up. What
> version of gluster are you using? What do you mean by corruption of data?
> Also, how did you observe that the heal started after 5 minutes?
> -Ravi
>
>
> And to resolve it I have search on google and found the following link:
> https://support.rackspace.com/how-to/glusterfs-troubleshooting/
>
> It mentions that the healing process can take up to 10 minutes to start.
>
> Here is the statement from the link:
>
> "Healing replicated volumes
>
> When any brick in a replicated volume goes offline, the glusterd daemons
> on the remaining nodes keep track of all the files that are not replicated
> to the offline brick. When the offline brick becomes available again, the
> cluster initiates a healing process, replicating the updated files to that
> brick. *The start of this process can take up to 10 minutes, based on
> observation.*"
>
> After allowing more than 5 minutes, the file corruption problem has been
> resolved.
>
> So, my question here is: is there any way we can reduce the time the
> healing process takes to start?
>
>
> Regards,
> Abhishek Paliwal
>
>
>
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
>
>
>


-- 




Regards
Abhishek Paliwal
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Need help with bitrot

2016-02-26 Thread Ajil Abraham
Thank you sir.  I will read these documents.

- Ajil

On Thu, Feb 25, 2016 at 9:05 PM, FNU Raghavendra Manjunath <
rab...@redhat.com> wrote:

> As of now, signing happens only upon data modification. Metadata changes
> and xattr changes do not trigger signing.
>
> For more information about Gluster and its internals, you can look here:
> https://gluster.readthedocs.org/en/latest/
>
> Regards,
> Raghavendra
>
> On Thu, Feb 25, 2016 at 10:28 AM, Ajil Abraham 
> wrote:
>
>> Thanks FNU Raghavendra.  Does the signing happen only when the file data
>> changes, or also when an extended attribute changes?
>>
>> I am also trying to understand the Gluster internal data structures. Are
>> there any materials for the same? Similarly for the translators: how they
>> are stacked on the client & server side, and how control flows between them.
>> Can somebody please help?
>>
>> - Ajil
>>
>>
>> On Thu, Feb 25, 2016 at 7:27 AM, FNU Raghavendra Manjunath <
>> rab...@redhat.com> wrote:
>>
>>> Hi Ajil,
>>>
>>> The expiry policy tells the signer (the bit-rot daemon) to wait for a specific
>>> period of time before signing an object.
>>>
>>> Whenever an object is modified, a notification is sent to the signer by the
>>> brick process (the bit-rot-stub xlator sitting in the I/O path) upon getting a
>>> release (i.e. when all the fds of that object are closed). The expiry
>>> policy tells the signer to wait for some time (by default 120 seconds)
>>> before signing that object. This is done because, if the signer starts
>>> signing an object (i.e. reads the object + calculates the checksum + stores
>>> the checksum) and the object gets modified again, a new notification has to
>>> be sent and the signer has to sign the object again by recalculating the
>>> checksum. Whereas if the signer waits for some time and receives a new
>>> notification on the same object while it is waiting, it can avoid signing
>>> for the first notification.
>>>
>>> Venky, do you want to add anything more?
>>>
>>> Regards,
>>> Raghavendra
>>>
>>>
>>>
>>> On Wed, Feb 24, 2016 at 12:28 AM, Ajil Abraham >> > wrote:
>>>
 Hi,

 I am a student interested in GlusterFS.  Trying to understand the
 design of GlusterFS. Came across the Bitrot design document in Google.
 There is a mention of expiry policy used to sign the files. I did not
 clearly understand what the expiry policy is.  Can somebody please help?

 -Ajil

 ___
 Gluster-devel mailing list
 Gluster-devel@gluster.org
 http://www.gluster.org/mailman/listinfo/gluster-devel

>>>
>>>
>>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Need help with bitrot

2016-02-26 Thread Raghavendra Talur
On Fri, Feb 26, 2016 at 7:05 AM, Joseph Fernandes 
wrote:

> Hope this helps
>
> Courtesy : Raghvendra Talur (rta...@redhat.com)
>
> 1. Clone glusterfs repo to your laptop and get acquainted with dev
> workflow.
> https://gluster.readthedocs.org/en/latest/Developer-guide/Developers-Index/
>
> 2. If you find using your laptop as the test machine for Gluster as too
> scary, here is a vagrant based mechanism to get VMs setup on your laptop
> easily for Gluster testing.
> http://comments.gmane.org/gmane.comp.file-systems.gluster.devel/13494
>
> 3. Find my Gluster Introduction blog post here in the preview link:
>
>
>
> https://6227134958232800133_bafac39c28bee4f256bbbef7510c9bb9b44fca05.blogspot.com/b/post-preview?token=s6_4MVIBAAA.zY--3ij00CkDwnitBOwnFBowEvCsKZ0o4ToQ0KYk9Po4pKujPj9ugmn-fm-XUFdLQxU50FmnCxBBr_IkSzuSlA.l_XFe1UvIEAiqkFAZZPdqQ&postId=4168074834715190149&type=POST


This is the public link
http://blog.raghavendratalur.in/2016/02/gluster-developer-guide-part-1.html


>
>
> 4. Follow all the lessons in the Translator 101 series to build up your
> understanding of Gluster.
>
> http://pl.atyp.us/hekafs.org/index.php/2011/11/translator-101-class-1-setting-the-stage/
>
> http://hekafs.org/index.php/2011/11/translator-101-lesson-2-init-fini-and-private-context
>
> http://hekafs.org/index.php/2011/11/translator-101-lesson-3-this-time-for-real
>
> http://hekafs.org/index.php/2011/11/translator-101-lesson-4-debugging-a-translator
>
>
> 5. Try to fix or understand any of the bugs in this list
>
> https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&classification=Community&f1=keywords&list_id=4424622&o1=substring&product=GlusterFS&query_format=advanced&v1=easyfix
>
>
> Regards,
> Joe
>
>
> - Original Message -
> From: "Ajil Abraham" 
> To: "FNU Raghavendra Manjunath" 
> Cc: "Gluster Devel" 
> Sent: Thursday, February 25, 2016 8:58:35 PM
> Subject: Re: [Gluster-devel] Need help with bitrot
>
> Thanks FNU Raghavendra. Does the signing happen only when the file data
> changes, or also when an extended attribute changes?
>
> I am also trying to understand the Gluster internal data structures. Are
> there any materials for the same? Similarly for the translators: how they
> are stacked on the client & server side, and how control flows between them.
> Can somebody please help?
>
> - Ajil
>
> On Thu, Feb 25, 2016 at 7:27 AM, FNU Raghavendra Manjunath <
> rab...@redhat.com > wrote:
>
>
>
> Hi Ajil,
>
> The expiry policy tells the signer (the bit-rot daemon) to wait for a specific
> period of time before signing an object.
>
> Whenever an object is modified, a notification is sent to the signer by the
> brick process (the bit-rot-stub xlator sitting in the I/O path) upon getting a
> release (i.e. when all the fds of that object are closed). The expiry
> policy tells the signer to wait for some time (by default 120 seconds)
> before signing that object. This is done because, if the signer starts
> signing an object (i.e. reads the object + calculates the checksum + stores
> the checksum) and the object gets modified again, a new notification has to
> be sent and the signer has to sign the object again by recalculating the
> checksum. Whereas if the signer waits for some time and receives a new
> notification on the same object while it is waiting, it can avoid signing
> for the first notification.
>
> Venky, do you want to add anything more?
>
> Regards,
> Raghavendra
>
>
>
> On Wed, Feb 24, 2016 at 12:28 AM, Ajil Abraham < ajil95.abra...@gmail.com
> > wrote:
>
>
>
> Hi,
>
> I am a student interested in GlusterFS. Trying to understand the design of
> GlusterFS. Came across the Bitrot design document in Google. There is a
> mention of expiry policy used to sign the files. I did not clearly
> understand what the expiry policy is. Can somebody please help?
>
> -Ajil
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
>
>
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Bitrot/Tiering : Bad files get migrated and hence corruption goes undetected.

2016-02-26 Thread Niels de Vos
On Fri, Feb 26, 2016 at 09:32:46AM -0500, Joseph Fernandes wrote:
> Hi All,
> 
> This is a discussion mail on the following issue, 
> 
> 1. Object is corrupted before it could be signed: In this case, the corrupted
>    object is signed and gets migrated upon I/O. There's no way to identify
>    corruption for this set of objects.
> 
> 2. Object is signed (but not scrubbed) and corruption happens thereafter:
>In this case, as of now, integrity checking is not done on the fly
>and the object would get migrated (and signed again in the hot tier).
> 
> 
> (1) is definitely not an issue specific to bitrot with tiering, but for (2) we
> can do something to avoid corrupted files from getting migrated. Before we
> migrate a file we could scrub it, but that's just a naive thought; any better
> suggestions?

Is there a reason the existing signature can not be migrated? Why does
it become invalid?

Niels


signature.asc
Description: PGP signature
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] Bitrot/Tiering : Bad files get migrated and hence corruption goes undetected.

2016-02-26 Thread Joseph Fernandes
Hi All,

This is a discussion mail on the following issue, 

1. Object is corrupted before it could be signed: In this case, the corrupted
   object is signed and gets migrated upon I/O. There's no way to identify
   corruption for this set of objects.

2. Object is signed (but not scrubbed) and corruption happens thereafter:
   In this case, as of now, integrity checking is not done on the fly
   and the object would get migrated (and signed again in the hot tier).


(1) is definitely not an issue specific to bitrot with tiering, but for (2) we can do
something to avoid corrupted files from getting migrated. Before we migrate a file
we could scrub it, but that's just a naive thought; any better suggestions?
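
One rough way to picture the "scrub before migrate" check (illustrative only;
the xattr key and the toy checksum below are placeholders, not the real bit-rot
signature format or hash): recompute the object's checksum and compare it
against the stored signature before queueing the file for migration.

/* Sketch of a pre-migration verification step. Assumed names throughout. */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Toy checksum (FNV-1a over the file contents), standing in for the real hash. */
static int compute_checksum(const char *path, char *out, size_t outlen)
{
    unsigned long long h = 1469598103934665603ULL;
    FILE *f = fopen(path, "rb");
    int c;

    if (!f)
        return -1;
    while ((c = fgetc(f)) != EOF)
        h = (h ^ (unsigned char)c) * 1099511628211ULL;
    fclose(f);
    snprintf(out, outlen, "%016llx", h);
    return 0;
}

/* Return 1 if the object may be migrated, 0 if it should be held back. */
static int ok_to_migrate(const char *path)
{
    char stored[256]  = {0};
    char current[256] = {0};

    /* The xattr key is an assumption made for this sketch. */
    ssize_t n = getxattr(path, "trusted.bit-rot.signature",
                         stored, sizeof(stored) - 1);
    if (n <= 0)
        return 1;   /* unsigned object: case (1) above, nothing to verify */

    if (compute_checksum(path, current, sizeof(current)) != 0)
        return 0;   /* cannot verify; be conservative and skip migration */

    /* A mismatch means the data changed after signing: treat as suspect. */
    return strcmp(stored, current) == 0;
}

int main(int argc, char **argv)
{
    if (argc > 1)
        printf("%s: %s\n", argv[1],
               ok_to_migrate(argv[1]) ? "ok to migrate" : "hold back");
    return 0;
}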

Regards,
Joe
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-infra] r.g.o not responding or too slow

2016-02-26 Thread Atin Mukherjee


On 02/26/2016 06:10 PM, Niels de Vos wrote:
> On Fri, Feb 26, 2016 at 05:40:00PM +0530, Atin Mukherjee wrote:
>> As $subj
> 
> Please check the infra list archive before sending reports like this :)
I'd say whoever sent that earlier report should have copied gluster-devel
on the mail too :)
> 
> http://news.gmane.org/gmane.comp.file-systems.gluster.infra already has
> an email about it. Not sure if any of the admins with access to the
> Gerrit server are available, none of them seem to be on irc in
> #gluster-dev.
> 
> Niels
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-infra] r.g.o not responding or too slow

2016-02-26 Thread Niels de Vos
On Fri, Feb 26, 2016 at 05:40:00PM +0530, Atin Mukherjee wrote:
> As $subj

Please check the infra list archive before sending reports like this :)

http://news.gmane.org/gmane.comp.file-systems.gluster.infra already has
an email about it. Not sure if any of the admins with access to the
Gerrit server are available, none of them seem to be on irc in
#gluster-dev.

Niels


signature.asc
Description: PGP signature
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

[Gluster-devel] r.g.o not responding or too slow

2016-02-26 Thread Atin Mukherjee
As $subj
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] WORM/Retention Feature - 26/02/2016

2016-02-26 Thread Karthik Subrahmanya
Hi all,

The current status of the project is:

-It works as a file-level WORM
-Handles the setattr call if all the write bits are removed
-Sets an xattr storing the WORM/Retention state along with the retention period
-The atime of the file will point to the time till which the file is retained
-When a write/unlink/rename/truncate request comes for a WORM/Retained file,
 it returns an EROFS error
-Whenever a fop request comes for a file, it will do a lookup
-The lookup will do the state transition if the retention period has expired
-It will reset the state from WORM/Retained to WORM
-The atime of the file will also revert back to the actual atime
-The file will still be read-only and will block write, truncate, and
 rename requests
-The unlink call will succeed for a WORM file
-You can transition back to the WORM/Retained state by doing setattr again
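
As a rough illustration of the state transition described above (hypothetical
names, not the actual worm xlator code), the lookup-time check boils down to
something like this:

/* Rough sketch of the lookup-time state transition -- assumed names only. */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

typedef enum { WORM_RETAINED, WORM_PLAIN } worm_state_t;

struct worm_file {
    worm_state_t state;
    time_t       retain_until;   /* mirrored in the file's atime per the design */
};

/* Run on every lookup: fall back to plain WORM once retention has expired. */
static void worm_lookup_transition(struct worm_file *f)
{
    if (f->state == WORM_RETAINED && time(NULL) > f->retain_until)
        f->state = WORM_PLAIN;   /* atime would be reverted to the real atime */
}

/* Decide a fop: 0 = allowed, -EROFS = blocked, mirroring the rules above. */
static int worm_check_fop(struct worm_file *f, const char *fop)
{
    worm_lookup_transition(f);

    if (strcmp(fop, "read") == 0)
        return 0;                                   /* always readable */
    if (strcmp(fop, "unlink") == 0)
        return f->state == WORM_PLAIN ? 0 : -EROFS;
    return -EROFS;   /* write, truncate, rename blocked in both states */
}

int main(void)
{
    struct worm_file f = { WORM_RETAINED, time(NULL) + 3600 };
    printf("unlink -> %d (blocked while retained)\n", worm_check_fop(&f, "unlink"));
    printf("write  -> %d (blocked)\n", worm_check_fop(&f, "write"));
    return 0;
}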


Plans for next week:

-As per Niels' suggestion, preparing a specs document
-Fixing the bugs in the program
-Working on handling the ctime change

You can find the feature page at:
http://www.gluster.org/community/documentation/index.php/Features/gluster_compliance_archive
Patch: http://review.gluster.org/#/c/13429/

Your valuable suggestions, wish lists, and reviews are most welcome.

Regards,
Karthik Subrahmanya
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Throttling xlator on the bricks

2016-02-26 Thread Ravishankar N

Hey Shreyas,
I'll be starting on the TBF-based implementation next week, as this
needs to be completed by 3.8. If you can send your patch, I'll see if we
can leverage it too.

Thanks,
Ravi
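
For context, a token-bucket filter (TBF) of the kind referred to here works
roughly as in the sketch below; this is the textbook mechanism, not the planned
throttling xlator implementation, and the numbers are made up.

/* Generic token-bucket sketch: requests spend tokens, tokens refill at a
 * fixed rate, and anything over budget is deferred or deprioritized. */
#include <stdio.h>
#include <time.h>

struct tbf {
    double tokens;      /* currently available tokens */
    double capacity;    /* maximum burst size */
    double rate;        /* tokens added per second */
    double last;        /* timestamp of the last refill */
};

static double now_secs(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

/* Admit a request costing `cost` tokens, or report that it must wait/queue. */
static int tbf_admit(struct tbf *b, double cost)
{
    double t = now_secs();

    b->tokens += (t - b->last) * b->rate;   /* refill since last check */
    if (b->tokens > b->capacity)
        b->tokens = b->capacity;
    b->last = t;

    if (b->tokens < cost)
        return 0;                           /* over budget: defer / deprioritize */
    b->tokens -= cost;
    return 1;
}

int main(void)
{
    struct tbf b = { 100.0, 100.0, 100.0, now_secs() };  /* ~100 fops/sec budget */
    int admitted = 0;

    for (int i = 0; i < 300; i++)           /* a burst of 300 fops at once */
        admitted += tbf_admit(&b, 1.0);
    printf("admitted %d of 300 fops immediately\n", admitted);
    return 0;
}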


On 02/13/2016 09:06 AM, Pranith Kumar Karampuri wrote:



On 02/13/2016 12:13 AM, Richard Wareing wrote:

Hey Ravi,

I'll ping Shreyas about this today.  There's also a patch we'll need
for multi-threaded SHD to fix the least-pri queuing.  The PID of the
process wasn't tagged correctly via the call frame in my original
patch.  The patch below fixes this (for 3.6.3). I didn't see
multi-threaded self-heal on github/master yet, so let me know what
branch you need this patch on and I can come up with a clean patch.


Hi Richard,
 I reviewed the patch and found that the same needs to be done
even for ec. So I am thinking of splitting it out into two different
patches: one patch in syncop-utils which builds the parallelization
functionality, and another patch which uses it in afr and ec. Do you mind
if I give it a go? I can complete it by the end of Wednesday.


Pranith


Richard


=


diff --git a/xlators/cluster/afr/src/afr-self-heald.c b/xlators/cluster/afr/src/afr-self-heald.c
index 028010d..b0f6248 100644
--- a/xlators/cluster/afr/src/afr-self-heald.c
+++ b/xlators/cluster/afr/src/afr-self-heald.c
@@ -532,6 +532,9 @@ afr_mt_process_entries_done (int ret, call_frame_t *sync_frame,
                 pthread_cond_signal (&mt_data->task_done);
         }
         pthread_mutex_unlock (&mt_data->lock);
+
+        if (task_ctx->frame)
+                AFR_STACK_DESTROY (task_ctx->frame);
         GF_FREE (task_ctx);
         return 0;
 }
@@ -787,6 +790,7 @@ _afr_mt_create_process_entries_task (xlator_t *this,
         int                                ret = -1;
         afr_mt_process_entries_task_ctx_t *task_ctx;
         afr_mt_data_t                     *mt_data;
+        call_frame_t                      *frame = NULL;

         mt_data = &healer->mt_data;

@@ -799,6 +803,8 @@ _afr_mt_create_process_entries_task (xlator_t *this,
         if (!task_ctx)
                 goto err;

+        task_ctx->frame = afr_frame_create (this);
+
         INIT_LIST_HEAD (&task_ctx->list);
         task_ctx->readdir_xl = this;
         task_ctx->healer = healer;
@@ -812,7 +818,7 @@ _afr_mt_create_process_entries_task (xlator_t *this,
         // This returns immediately, and afr_mt_process_entries_done will
         // be called when the task is completed e.g. our queue is empty
         ret = synctask_new (this->ctx->env, afr_mt_process_entries_task,
-                            afr_mt_process_entries_done, NULL,
+                            afr_mt_process_entries_done, task_ctx->frame,
                             (void *)task_ctx);

         if (!ret) {
diff --git a/xlators/cluster/afr/src/afr-self-heald.h b/xlators/cluster/afr/src/afr-self-heald.h
index 817e712..1588fc8 100644
--- a/xlators/cluster/afr/src/afr-self-heald.h
+++ b/xlators/cluster/afr/src/afr-self-heald.h
@@ -74,6 +74,7 @@ typedef struct afr_mt_process_entries_task_ctx_ {
         subvol_healer_t *healer;
         xlator_t        *readdir_xl;
         inode_t         *idx_inode;  /* inode ref for xattrop dir */
+        call_frame_t    *frame;
         unsigned int     entries_healed;
         unsigned int     entries_processed;
         unsigned int     already_healed;


Richard

From: Ravishankar N [ravishan...@redhat.com]
Sent: Sunday, February 07, 2016 11:15 PM
To: Shreyas Siravara
Cc: Richard Wareing; Vijay Bellur; Gluster Devel
Subject: Re: [Gluster-devel] Throttling xlator on the bricks

Hello,

On 01/29/2016 06:51 AM, Shreyas Siravara wrote:

So the way our throttling works is (intentionally) very simplistic.

(1) When someone mounts an NFS share, we tag the frame with a 32 bit 
hash of the export name they were authorized to mount.
(2) io-stats keeps track of the "current rate" of fops we're seeing 
for that particular mount, using a sampling of fops and a moving 
average over a short period of time.
(3) Based on whether the share violated its allowed rate (which is 
defined in a config file), we tag the FOP as "least-pri". Of course 
this makes the assumption that all NFS endpoints are receiving 
roughly the same # of FOPs. The rate defined in the config file is a 
*per* NFS endpoint number. So if your cluster has 10 NFS endpoints, 
and you've pre-computed that it can do roughly 1000 FOPs per second, 
the rate in the config file would be 100.
(4) IO-Threads then shoves the FOP into the least-pri queue, rather 
than its default. The value is honored all the way down to the bricks.
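
A compact sketch of steps (1)-(3) above (the names and the smoothing constant
are assumptions made for illustration, not the actual io-stats patch):

/* Illustrative rate check: hash-tag the mount, track a moving average of
 * observed fops/sec, and mark traffic least-pri once it exceeds the
 * per-endpoint budget from the config file. */
#include <stdint.h>
#include <stdio.h>

/* (1) 32-bit FNV-1a hash of the export name, used to tag the call frame. */
static uint32_t export_hash(const char *export)
{
    uint32_t h = 2166136261u;
    for (; *export; export++)
        h = (h ^ (uint8_t)*export) * 16777619u;
    return h;
}

struct mount_stats {
    uint32_t tag;          /* hash of the export name */
    double   rate;         /* (2) moving average of observed fops/sec */
    double   allowed_rate; /* (3) per-NFS-endpoint limit from the config file */
};

/* (2) Fold one sampled fops/sec measurement into the moving average. */
static void record_sample(struct mount_stats *m, double fops_per_sec)
{
    const double alpha = 0.2;   /* smoothing factor (assumed) */
    m->rate = alpha * fops_per_sec + (1.0 - alpha) * m->rate;
}

/* (3)+(4) Return 1 if the next fop should be queued as least-pri. */
static int is_least_pri(const struct mount_stats *m)
{
    return m->rate > m->allowed_rate;
}

int main(void)
{
    /* e.g. cluster does ~1000 fops/sec across 10 NFS endpoints -> 100 each */
    struct mount_stats m = { export_hash("/vol/exports/build"), 0.0, 100.0 };

    for (int i = 0; i < 5; i++)
        record_sample(&m, 250.0);   /* sustained load over the 100 fops/sec budget */
    printf("rate=%.1f least-pri=%d\n", m.rate, is_least_pri(&m));
    return 0;
}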


The code is actually complete, and I'll put it up for review after 
we iron out a few minor issues.

Did you get a chance to send the patch? Just wanted to run some tests
and see if this is all we need at the moment to regulate shd traffic,
especially with Richa