Re: [Gluster-devel] Plans for Gluster 3.8

2015-08-12 Thread Kaushal M
Hi Csaba,

These are the updates regarding the requirements, after our meeting
last week. The specific updates on the requirements are inline.

In general, we feel that the requirements for selective read-only mode
and immediate disconnection of clients on access revocation are doable
for GlusterFS-3.8. The only problem right now is that we do not have
any volunteers for them.

> 1.Bug 829042 - [FEAT] selective read-only mode
>  https://bugzilla.redhat.com/show_bug.cgi?id=829042
>
>   absolutely necessary for not getting tarred & feathered in Tokyo ;)
>   either resurrect http://review.gluster.org/3526
>   and _find out integration with auth mechanism for special
>   mounts_, or come up with a completely different concept
>

With the availability of client_t, implementing this should become
easier. The server xlator would store the incoming connection's common
name or address in the client_t associated with the connection. The
read-only xlator could then make use of this information to
selectively enforce read-only access. The read-only xlator would need
to implement a new option for selective read-only mode, populated with
the common names and addresses of the clients that should get
read-only access.
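
As a rough sketch of what the admin interface could look like (the option
name and value format below are placeholders, not a settled design; only
the existing whole-volume read-only option is real today):

  # Existing behaviour: the whole volume becomes read-only for everyone
  gluster volume set myvol features.read-only on

  # Hypothetical selective variant: only the listed common names/addresses
  # get read-only access, everyone else keeps read-write access
  gluster volume set myvol features.read-only-clients 'backup.example.com,192.168.1.0/24'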

> 2.Bug 1245380 - [RFE] Render all mounts of a volume defunct upon access 
> revocation
>  https://bugzilla.redhat.com/show_bug.cgi?id=1245380
>
>   necessary to let us enable a watershed scalability
>   enhancement
>

Currently, when the auth.allow/auth.reject and auth.ssl-allow options are
changed, the server xlator does a reconfigure to reload its access
list. This only reloads the list and doesn't affect any existing
connections. To bring in this feature, the server xlator would need to
iterate through its xprt_list on a reconfigure and re-check every
connection for authorization. Connections that have lost authorization
would then be disconnected.
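
To make the intended behaviour concrete, a sketch of the admin-visible
effect on an example volume (auth.reject exists today; the immediate
disconnect is the part this feature would add):

  # Revoke access for a client address on a running volume
  gluster volume set myvol auth.reject 192.168.1.100

  # Today: only new connection attempts from 192.168.1.100 are refused;
  # an already-mounted client keeps working until it reconnects.
  # With this feature: on reconfigure, the server xlator would walk its
  # xprt_list, re-check each connection against the new access list, and
  # disconnect those that no longer pass.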

> 3.Bug 1226776 – [RFE] volume capability query
>  https://bugzilla.redhat.com/show_bug.cgi?id=1226776
>
>   eventually we'll be choking in spaghetti if we don't get
>   this feature. The ugly version checks we need to do against
>   GlusterFS as in
>
>   
> https://review.openstack.org/gitweb?p=openstack/manila.git;a=commitdiff;h=29456c#patch3
>
>   will proliferate and eat the guts of the code out of its
>   living body if this is not addressed.
>

This requires some more thought to figure out the correct solution.
One possible way to get the capabilities of the cluster would be to
look at the cluster's running op-version. This can be obtained using
`gluster volume get all cluster.op-version` (the volume get command is
available in glusterfs-3.6 and above). But this doesn't provide much
improvement over the existing checks being done in the driver.
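
For reference, a quick sketch of how such a check looks from the CLI
today (the output shown is approximate):

  # Cluster-wide operating version, usable as a coarse capability indicator
  gluster volume get all cluster.op-version
  # Option                   Value
  # ------                   -----
  # cluster.op-version       30603
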
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] semi-sync replication

2015-08-12 Thread Anoop Nair
Hi,

Do we have plans to support "semi-synchronous" type replication in the future?
By semi-sync I mean writing to one leg of the replica, securing the write on
faster stable storage (capacitor-backed SSD or NVRAM) and then acknowledging
the client. The write on the other replica leg may happen at a later point in
time.

Thanks
-Anoop
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] testcase ./tests/geo-rep/georep-basic-dr-rsync.t failure

2015-08-12 Thread Susant Palai
Hi,
   ./tests/geo-rep/georep-basic-dr-rsync.t fails on the regression machines as
well as on my local machine. Requesting the geo-rep team to look into it.

link: 
https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/9158/consoleFull

Regards,
Susant
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] REMINDER: Weekly Gluster Community meeting today at 12:00 UTC (~5 minutes from now)

2015-08-12 Thread Mohammed Rafi K C
Hi All,

In about 5 minutes from now we will have the regular weekly Gluster
Community meeting.

Meeting details:
- location: #gluster-meeting on Freenode IRC
- date: every Wednesday
- time: 12:00 UTC, 14:00 CEST, 17:30 IST
   (in your terminal, run: date -d "12:00 UTC")
- agenda: https://public.pad.fsfe.org/p/gluster-community-meetings

Currently the following items are listed:
* Roll Call
* Status of last week's action items
* Gluster 3.7
* Gluster 3.6
* Gluster 3.5
* Gluster 4.0
* Open Floor
   - bring your own topic!

The last topic has space for additions. If you have a suitable topic to
discuss, please add it to the agenda.

Thanks,
Rafi KC

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] testcase ./tests/geo-rep/georep-basic-dr-rsync.t failure

2015-08-12 Thread Kotresh Hiremath Ravishankar
Hi Susant,

It could be an issue with "PasswordLess SSH" not being set up on the NetBSD machines.

Emmanuel,

Could you please set up "PasswordLess SSH" on all the NetBSD regression machines?
Until then, the geo-rep test suite should be skipped.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Susant Palai" 
> To: "Gluster Devel" 
> Sent: Wednesday, August 12, 2015 3:56:31 PM
> Subject: [Gluster-devel] testcase ./tests/geo-rep/georep-basic-dr-rsync.t 
> failure
> 
> Hi,
>./tests/geo-rep/georep-basic-dr-rsync.t fails in regression machine as
>well as in my local machine also. Requesting geo-rep team to look in to
>it.
> 
> link:
> 
> https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/9158/consoleFull
> 
> Regards,
> Susant
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] semi-sync replication

2015-08-12 Thread Ravishankar N



On 08/12/2015 12:50 PM, Anoop Nair wrote:

Hi,

Do we have plans to support "semi-synchronous" type replication in the future? 
By semi-sync I mean writing to one leg the replica, securing the write on a faster stable 
storage (capacitor backed SSD or NVRAM) and then acknowledge the client. The write on 
other replica leg may happen at later point in time.
Not exactly in the way you describe, but there are plans to achieve 
"near-synchronous" replication wherein we wind the write to all replica 
legs, but acknowledge success as soon as we hear a success from one of 
the bricks (instead of waiting for responses from all bricks as we do 
today).


-Ravi
  


Thanks
-Anoop
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] semi-sync replication

2015-08-12 Thread Anoop Nair
Hmm, that's kind of risky. What if your good leg fails before the sync happens
to the secondary leg? A replay cache may serve as a lifeline in such a scenario.

Thanks
-Anoop

- Original Message -
From: "Ravishankar N" 
To: "Anoop Nair" , gluster-devel@gluster.org
Sent: Wednesday, August 12, 2015 5:46:04 PM
Subject: Re: [Gluster-devel] semi-sync replication



On 08/12/2015 12:50 PM, Anoop Nair wrote:
> Hi,
>
> Do we have plans to support "semi-synchronous" type replication in the 
> future? By semi-sync I mean writing to one leg the replica, securing the 
> write on a faster stable storage (capacitor backed SSD or NVRAM) and then 
> acknowledge the client. The write on other replica leg may happen at later 
> point in time.
Not exactly in the way you describe, but there are plans to achieve 
"near-synchronous" replication wherein we wind the write to all replica 
legs, but acknowledge success as soon as we hear a success from one of 
the bricks (instead of waiting for responses from all bricks as we do 
today).

-Ravi
>   
>
> Thanks
> -Anoop
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Patch merge request-3.7 branch: http://review.gluster.org/#/c/11858/

2015-08-12 Thread Ravishankar N
Could someone with merge rights take
http://review.gluster.org/#/c/11858/ in for the 3.7 branch? This
backport has a +2 from the maintainer and has passed regressions.


Thanks in advance :-)
Ravi
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Patch merge request-3.7 branch: http://review.gluster.org/#/c/11858/

2015-08-12 Thread Raghavendra Gowdappa


- Original Message -
> From: "Ravishankar N" 
> To: "Gluster Devel" 
> Sent: Wednesday, August 12, 2015 6:01:16 PM
> Subject: [Gluster-devel] Patch merge request-3.7 branch:  
> http://review.gluster.org/#/c/11858/
> 
> Could some one with merge rights take
> http://review.gluster.org/#/c/11858/ in for the 3.7 branch? This
> backport has +2 from the maintainer and has passed regressions.
> 

Done.

> Thanks in advance :-)
> Ravi
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] semi-sync replication

2015-08-12 Thread Ravishankar N



On 08/12/2015 05:56 PM, Anoop Nair wrote:

Hmm, that's kind of risky. What if you good leg fails before the sync happens 
to the secondary leg?
Oh, the writes would still need to happen as a part of the AFR
transaction; so if the writes (which are wound to all bricks
immediately, it's just that we don't wait for all responses before
unwinding to DHT) failed on some bricks, the self-heal would take care
of it.


Thanks,
Ravi

  Replay cache may serve as a lifeline in such a scenario.

Thanks
-Anoop

- Original Message -
From: "Ravishankar N" 
To: "Anoop Nair" , gluster-devel@gluster.org
Sent: Wednesday, August 12, 2015 5:46:04 PM
Subject: Re: [Gluster-devel] semi-sync replication



On 08/12/2015 12:50 PM, Anoop Nair wrote:

Hi,

Do we have plans to support "semi-synchronous" type replication in the future? 
By semi-sync I mean writing to one leg the replica, securing the write on a faster stable 
storage (capacitor backed SSD or NVRAM) and then acknowledge the client. The write on 
other replica leg may happen at later point in time.

Not exactly in the way you describe, but there are plans to achieve
"near-synchronous" replication wherein we wind the write to all replica
legs, but acknowledge success as soon as we hear a success from one of
the bricks (instead of waiting for responses from all bricks as we do
today).

-Ravi
   


Thanks
-Anoop
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] Gluster 3.6.4 tune2fs and inode size errors

2015-08-12 Thread Atin Mukherjee
Well, this looks like a bug in 3.7 as well. I've posted a fix [1]
to address it.

[1] http://review.gluster.org/11898

Could you please raise a bug for this?

~Atin

On 08/12/2015 01:32 PM, Davy Croonen wrote:
> Hi Atin
> 
> Thanks for your answer. The op-version was indeed an old one, 30501 to be 
> precise. I’ve updated the op-version to the one you suggested with the 
> command: gluster volume set all cluster.op-version 30603. From testing it 
> seems this issue is solved for the moment.
> 
> Considering the errors in the etc-glusterfs-glusterd.vol.log file I’m looking 
> forward to hear from you.
> 
> Thanks in advance.
> 
> KR
> Davy
> 
> On 11 Aug 2015, at 19:28, Atin Mukherjee 
> mailto:atin.mukherje...@gmail.com>> wrote:
> 
> 
> 
> -Atin
> Sent from one plus one
> On Aug 11, 2015 7:54 PM, "Davy Croonen" 
> mailto:davy.croo...@smartbit.be>> wrote:
>>
>> Hi all
>>
>> Our etc-glusterfs-glusterd.vol.log is filling up with entries as shown:
>>
>> [2015-08-11 11:40:33.807940] E 
>> [glusterd-utils.c:7410:glusterd_add_inode_size_to_dict] 0-management: 
>> tune2fs exited with non-zero exit status
>> [2015-08-11 11:40:33.807962] E 
>> [glusterd-utils.c:7436:glusterd_add_inode_size_to_dict] 0-management: failed 
>> to get inode size
> I will check this and get back to you.
>>
>> From the mailinglist archive I could understand this was a problem in 
>> gluster version 3.4 and should be fixed. We started out from version 3.5 and 
>> upgraded in the meantime to version 3.6.4 but the error in the errorlog 
>> still exists.
>>
>> We are also unable to execute the command
>>
>> $gluster volume status all inode
>>
>> as a result gluster hangs up with the message: “Another transaction is in 
>> progress. Please try again after sometime.” while executing the command
>>
>> $gluster volume status
> Have you bump up the op version to 30603? Otherwise glusterd will still have 
> cluster locking and then multiple commands can't run simultaneously.
>>
>> Are the error messages in the logs related to the hung up of gluster while 
>> executing the mentioned commands? And any ideas about how to fix this?
> The error messages are not because of this.
>>
>> Kind regards
>> Davy
>> ___
>> Gluster-users mailing list
>> gluster-us...@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-users
> 
> 
> 
> ___
> Gluster-users mailing list
> gluster-us...@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
> 

-- 
~Atin
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Minutes of today's Gluster Community meeting

2015-08-12 Thread Atin Mukherjee
Minutes:
http://meetbot.fedoraproject.org/gluster-meeting/2015-08-12/gluster-meeting.2015-08-12-11.59.html
Minutes (text):
http://meetbot.fedoraproject.org/gluster-meeting/2015-08-12/gluster-meeting.2015-08-12-11.59.txt
Log:
http://meetbot.fedoraproject.org/gluster-meeting/2015-08-12/gluster-meeting.2015-08-12-11.59.log.html


Meeting summary
---
* agenda is available @
  https://public.pad.fsfe.org/p/gluster-community-meetings  (atinm,
  12:00:38)
* action items from last week  (atinm, 12:04:47)
  * LINK:
http://www.gluster.org/pipermail/gluster-infra/2015-August/001338.html
(tigert, 12:05:27)
  * tigert has asked for feedback about integrating gluster calendar @
http://www.gluster.org/pipermail/gluster-infra/2015-August/001338.html
(atinm, 12:05:56)
  * ACTION: tigert to send a reminder mail to ask for feedback for
gluster calendar integration  (atinm, 12:08:06)
  * LINK:

https://github.com/gluster/glusterweb/blob/master/source/community/release-schedule.md
(hchiramm, 12:09:01)
  * release schedule details @

https://github.com/gluster/glusterweb/blob/master/source/community/release-schedule.md
(atinm, 12:09:51)
  * ACTION: raghu to fill in the contents for release schedule and ask
tigert to push the page in gluster.org  (atinm, 12:11:24)
  * ACTION: rtalur to setup a new project gerrit specs in
review.gluster.org  (atinm, 12:12:40)
  * LINK: http://review.gluster.org/#/admin/projects/glusterfs-specs I
have setup the project  (hchiramm, 12:13:18)
  * ACTION: msvbhat to  do 3.7.3 announcement on the gluster blog and
social media  (atinm, 12:16:29)
  * ACTION: pranithk to write up a post announcing EC's production
readiness  (atinm, 12:17:49)
  * ACTION: msvbhat/rtalur to send update mailing list with a DiSTAF
how-to and start discussion on enhancements to DiSTAF.  (atinm,
12:19:13)
  * ACTION: kshlm  to test the new jenkins slave in ci.gluster.org
(atinm, 12:21:01)
  * Shyam has announced about DHT2 @
http://www.gluster.org/pipermail/gluster-devel/2015-August/046369.html
(atinm, 12:22:00)

* GlusterFS 3.7  (atinm, 12:23:05)

* GlusterFS 3.6  (atinm, 12:25:20)
  * Gluster 3.6 is getting released in next week  (atinm, 12:26:51)
  * raghu seeks for more backports for next 3.6 release  (atinm,
12:28:02)

* GlusterFS 3.5  (atinm, 12:28:32)

* Glusterfs 3.8  (atinm, 12:29:38)
  * ACTION: poornimag to backport one of the libgfapi related changes
into 3.5  (atinm, 12:30:27)
  * More information about what we target in for 3.8 can be found @
http://www.gluster.org/pipermail/gluster-users/2015-July/022722.html
(atinm, 12:31:23)

* GlusterFS 4.0  (atinm, 12:32:07)
  * Gluster 4.0 Tracker :
https://bugzilla.redhat.com/showdependencytree.cgi?id=glusterfs-4.0
(atinm, 12:32:49)
  * GlusterD 2.0 high level plan for next 2-3 months is out now @
http://www.gluster.org/pipermail/gluster-devel/2015-August/046310.html
(atinm, 12:33:50)

* Open Floor  (atinm, 12:37:10)
  * Weekly reminder to announce Gluster attendance of events:
https://public.pad.fsfe.org/p/gluster-events  (atinm, 12:37:57)
  * ACTION: Instead of maintaining a public pad, integrate the gluster
event details in gluster.org, tigert / hchiramm to work on it
(atinm, 12:40:37)
  * REMINDER to put (even minor) interesting topics on
https://public.pad.fsfe.org/p/gluster-weekly-news  (atinm, 12:40:52)
  * Per release documentation. Users are landing on 3.2 documentation
from google it seems. How can we make this better?  (atinm, 12:42:21)
  * Spurious failures - status  (atinm, 12:44:01)
  * ACTION: kkeithley to drop a follow up mail about spurious failure
fixes  (atinm, 12:54:39)

Meeting ended at 12:55:06 UTC.




Action Items

* tigert to send a reminder mail to ask for feedback for gluster
  calendar integration
* raghu to fill in the contents for release schedule and ask tigert to
  push the page in gluster.org
* rtalur to setup a new project gerrit specs in review.gluster.org
* msvbhat to  do 3.7.3 announcement on the gluster blog and social media
* pranithk to write up a post announcing EC's production readiness
* msvbhat/rtalur to send update mailing list with a DiSTAF how-to and
  start discussion on enhancements to DiSTAF.
* kshlm  to test the new jenkins slave in ci.gluster.org
* poornimag to backport one of the libgfapi related changes into 3.5
* Instead of maintaining a public pad, integrate the gluster event
  details in gluster.org, tigert / hchiramm to work on it
* kkeithley to drop a follow up mail about spurious failure fixes




Action Items, by person
---
* hchiramm
  * Instead of maintaining a public pad, integrate the gluster event
details in gluster.org, tigert / hchiramm to work on it
* kkeithley
  * kkeithley to drop a follow up mail about spurious failure fixes
* msvbhat
  * msvbhat to  do 3.7.3 announcement on the gluster blog and social
media
  * msvbhat/rtalur to se

Re: [Gluster-devel] reviving spurious failures tracking

2015-08-12 Thread Kaleb S. KEITHLEY
On Wednesday 29 July 2015  Vijay Bellur wrote:
> On Wednesday 29 July 2015 03:40 PM, Pranith Kumar Karampuri wrote:
>> hi,
>>  I just updated
>> https://public.pad.fsfe.org/p/gluster-spurious-failures with the latest
>> spurious failures we saw in linux and NetBSD regressions. Could you guys
>> update with any more spurious regressions that you guys are observing
>> but not listed on the pad. Could you guys help in fixing these issues
>> fast as the number of failures is increasing quite a bit nowadays.
>>
>
> I think we have been very tolerant for failing tests and it is time to
> change this behavior. I propose that:
>
> - we block commits for components that have failing tests listed in the
> tracking etherpad.
>
>
> - once failing tests are addressed on a particular branch, normal patch
> merging can resume.
>
> - If there are tests that cannot be fixed easily in the near term, we
> move such tests to a different folder or drop such test units.

We still have a couple tests with frequent, spurious failures.

Let's please either get these fixed, or removed from the tree if they
are fundamentally broken.

Otherwise we'll have to invoke the nuclear option. ;-)


-- 

Kaleb
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] semi-sync replication

2015-08-12 Thread Aravinda
I think NSR is a good candidate here. It has leader election for
writing data, and that could be enhanced to give higher priority to SSD
bricks during leader election.


regards
Aravinda

On 08/12/2015 06:06 PM, Ravishankar N wrote:



On 08/12/2015 05:56 PM, Anoop Nair wrote:
Hmm, that's kind of risky. What if you good leg fails before the sync 
happens to the secondary leg?
Oh, the writes would still need to happen as a part of the AFR 
transaction; so if the writes (which are wound to all bricks 
immediately, its just that we don't wait for all responses before 
unwinding to DHT ) failed on some bricks, the self-heal would take 
care of it..


Thanks,
Ravi

  Replay cache may serve as a lifeline in such a scenario.

Thanks
-Anoop

- Original Message -
From: "Ravishankar N" 
To: "Anoop Nair" , gluster-devel@gluster.org
Sent: Wednesday, August 12, 2015 5:46:04 PM
Subject: Re: [Gluster-devel] semi-sync replication



On 08/12/2015 12:50 PM, Anoop Nair wrote:

Hi,

Do we have plans to support "semi-synchronous" type replication in 
the future? By semi-sync I mean writing to one leg the replica, 
securing the write on a faster stable storage (capacitor backed SSD 
or NVRAM) and then acknowledge the client. The write on other 
replica leg may happen at later point in time.

Not exactly in the way you describe, but there are plans to achieve
"near-synchronous" replication wherein we wind the write to all replica
legs, but acknowledge success as soon as we hear a success from one of
the bricks (instead of waiting for responses from all bricks as we do
today).

-Ravi


Thanks
-Anoop
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] testcase ./tests/geo-rep/georep-basic-dr-rsync.t failure

2015-08-12 Thread Kotresh Hiremath Ravishankar
Hi Emmanuel,

I checked the NetBSD regression machines and found that they were already
configured with "PasswordLess SSH" for root.

The issue was that geo-rep runs "gluster vol info" via ssh, and it can't find
gluster in the PATH over ssh.

I have fixed the above on the following four machines, which are up, by adding
"export PATH=$PATH:/build/install/sbin:/build/install/bin" to ~/.kshrc and
similarly for other shells, as I didn't know the default shell used by the
regression run:

nbslave75
nbslave79
nbslave7g
nbslave7j

This hopefully should fix the geo-rep regression runs on the NetBSD machines.
The same should be done for the other NetBSD slave machines when they are
brought up.
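
For reference, the exact line appended on each of those machines (the same
line should go into the rc file of whichever shell the regression user
actually runs):

  # Added to ~/.kshrc (and the equivalent files for other shells)
  export PATH=$PATH:/build/install/sbin:/build/install/bin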

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Kotresh Hiremath Ravishankar" 
> To: "Susant Palai" , "Emmanuel Dreyfus" 
> Cc: "Gluster Devel" 
> Sent: Wednesday, August 12, 2015 5:31:35 PM
> Subject: Re: [Gluster-devel] testcase ./tests/geo-rep/georep-basic-dr-rsync.t 
> failure
> 
> Hi Susanth,
> 
> It could be issue with "PasswordLess SSH" not being setup in NetBSD machines.
> 
> Emmanuel,
> 
> Could you please setup "PasswordLess SSH" in all NetBSD regression machines ?
> Otherwise, till then geo-rep testsuite should be skipped for now.
> 
> Thanks and Regards,
> Kotresh H R
> 
> - Original Message -
> > From: "Susant Palai" 
> > To: "Gluster Devel" 
> > Sent: Wednesday, August 12, 2015 3:56:31 PM
> > Subject: [Gluster-devel] testcase ./tests/geo-rep/georep-basic-dr-rsync.t
> > failure
> > 
> > Hi,
> >./tests/geo-rep/georep-basic-dr-rsync.t fails in regression machine as
> >well as in my local machine also. Requesting geo-rep team to look in to
> >it.
> > 
> > link:
> > 
> > https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/9158/consoleFull
> > 
> > Regards,
> > Susant
> > ___
> > Gluster-devel mailing list
> > Gluster-devel@gluster.org
> > http://www.gluster.org/mailman/listinfo/gluster-devel
> > 
> ___
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-devel
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


[Gluster-devel] Inconsistent behavior due to lack of lookup on entry followed by readdirp

2015-08-12 Thread Mohammed Rafi K C
Hi All,

We are facing some inconsistent behavior for fops like rename, unlink,
etc. due to the lack of a lookup on entries that were populated via
readdirp. More specifically, if inodes/gfids are populated via a readdirp
call and the nodeid is shared with the kernel, md-cache will cache the
entry based on its base-name. Subsequent named lookups are then served
from md-cache and wound back immediately. So there is a chance of an FOP
being triggered without a lookup ever having happened on the entry. DHT
does a lot of things during lookup, like creating link files and
populating inode_ctx, so in such a scenario it is a must to have at least
one lookup happen on an entry. Since readdirp prevents that lookup, it has
been very hard for fops to proceed without a first lookup on the entry. We
also suspect some problems due to the same with afr/ec self-healing. If we
remove readdirp from md-cache ([1], [2]), it causes an additional hop for
the first lookup on every entry. I'm mostly concerned with this one extra
network call and the performance degradation caused by it.

With all this in mind, the only advantage of readdirp is that it removes
one context switch between the kernel and userspace. Is it really worth
sacrificing this for consistency?

What do you think about removing the readdirp functionality?

Please provide your input/suggestion/ideas.

[1] : http://review.gluster.org/#/c/11892/

[2] : http://review.gluster.org/#/c/11894/
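
For anyone who wants to compare the two behaviours on an existing setup,
readdirp can already be disabled per mount through the FUSE client's
use-readdirp mount option (just a way to experiment, not the proposed fix;
volume and path names below are placeholders):

  # Mount a volume with readdirp disabled in the FUSE bridge
  mount -t glusterfs -o use-readdirp=no server1:/myvol /mnt/myvol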

Thanks in Advance
Rafi KC
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] [Gluster-users] Gluster 3.6.4 tune2fs and inode size errors

2015-08-12 Thread Atin Mukherjee
Davy,

I will check this with Kaleb and get back to you.

-Atin
Sent from one plus one
On Aug 12, 2015 7:22 PM, "Davy Croonen"  wrote:

> Atin
>
> No problem to raise a bug for this, but isn’t this already addressed here:
>
> Bug 670  - continuous
> log entries failed to get inode size
> https://bugzilla.redhat.com/show_bug.cgi?id=670#c2
>
> KR
> Davy
>
> On 12 Aug 2015, at 14:56, Atin Mukherjee  wrote:
>
> Well, this looks like a bug even in 3.7 as well. I've posted a fix [1]
> to address it.
>
> [1] http://review.gluster.org/11898
>
> Could you please raise a bug for this?
>
> ~Atin
>
> On 08/12/2015 01:32 PM, Davy Croonen wrote:
>
> Hi Atin
>
> Thanks for your answer. The op-version was indeed an old one, 30501 to be
> precise. I’ve updated the op-version to the one you suggested with the
> command: gluster volume set all cluster.op-version 30603. From testing it
> seems this issue is solved for the moment.
>
> Considering the errors in the etc-glusterfs-glusterd.vol.log file I’m
> looking forward to hear from you.
>
> Thanks in advance.
>
> KR
> Davy
>
> On 11 Aug 2015, at 19:28, Atin Mukherjee  mailto:atin.mukherje...@gmail.com >> wrote:
>
>
>
> -Atin
> Sent from one plus one
> On Aug 11, 2015 7:54 PM, "Davy Croonen"  mailto:davy.croo...@smartbit.be >> wrote:
>
>
> Hi all
>
> Our etc-glusterfs-glusterd.vol.log is filling up with entries as shown:
>
> [2015-08-11 11:40:33.807940] E
> [glusterd-utils.c:7410:glusterd_add_inode_size_to_dict] 0-management:
> tune2fs exited with non-zero exit status
> [2015-08-11 11:40:33.807962] E
> [glusterd-utils.c:7436:glusterd_add_inode_size_to_dict] 0-management:
> failed to get inode size
>
> I will check this and get back to you.
>
>
> From the mailinglist archive I could understand this was a problem in
> gluster version 3.4 and should be fixed. We started out from version 3.5
> and upgraded in the meantime to version 3.6.4 but the error in the errorlog
> still exists.
>
> We are also unable to execute the command
>
> $gluster volume status all inode
>
> as a result gluster hangs up with the message: “Another transaction is in
> progress. Please try again after sometime.” while executing the command
>
> $gluster volume status
>
> Have you bump up the op version to 30603? Otherwise glusterd will still
> have cluster locking and then multiple commands can't run simultaneously.
>
>
> Are the error messages in the logs related to the hung up of gluster while
> executing the mentioned commands? And any ideas about how to fix this?
>
> The error messages are not because of this.
>
>
> Kind regards
> Davy
> ___
> Gluster-users mailing list
> gluster-us...@gluster.org >
> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
>
>
> ___
> Gluster-users mailing list
> gluster-us...@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
>
> --
> ~Atin
>
>
>
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Inconsistent behavior due to lack of lookup on entry followed by readdirp

2015-08-12 Thread Krutika Dhananjay
I faced the same issue with the sharding translator. I fixed it by making its 
readdirp callback initialize individual entries' inode ctx, some of these being 
xattr values, which are filled in entry->dict by the posix translator. 
Here is the patch that got merged recently: http://review.gluster.org/11854 
Would that be as easy to do in DHT as well? 

As far as AFR is concerned, it indirectly forces LOOKUP on entries which are 
being retrieved for the first time through a READDIRP (and as a result do not 
have their inode ctx etc initialised yet) by setting entry->inode to NULL. See 
afr_readdir_transform_entries(). 
This is the default behavior which is being made optional as part of 
http://review.gluster.org/#/c/11846/ which is still under review (see BZ 
1250803, a performance bug :) ). 

-Krutika 

- Original Message -

> From: "Mohammed Rafi K C" 
> To: "Gluster Devel" 
> Cc: "Dan Lambright" , "Nithya Balachandran"
> , "Raghavendra Gowdappa" , "Ben
> Turner" , "Ben England" , "Manoj
> Pillai" , "Pranith Kumar Karampuri"
> , "Ravishankar Narayanankutty" ,
> kdhan...@redhat.com, xhernan...@datalab.es
> Sent: Wednesday, August 12, 2015 7:29:48 PM
> Subject: Inconsistent behavior due to lack of lookup on entry followed by
> readdirp

> Hi All,

> We are facing some inconsistent behavior for fops like rename, unlink
> etc due to lack of lookup followed by a readdirp, more specifically if
> inodes/gfid are populated via readdirp call and this nodeid is shared
> with kernal, md-cache will cache this based on base-name. Then
> subsequent named lookup will be served from md-cache and it winds-back
> immediately. So there is a chance to have an FOP triggered with out
> having a lookup on an entry. DHT does lot of things like creating link
> files and populate inode_ctx etc, during lookup. In such scenario it is
> must to have at least one lookup to be happened on an entry. Since
> readdirp preventing the lookup, it has been very hard for fops to
> proceed without a first lookup on the entry. We are also suspecting some
> problems due to same with afr/ec self healing also. So If we remove
> readdirp from md-cache ([1], [2]) it causes, an additional hop for first
> lookup for every entry. I'm mostly concerned with this one extra network
> call, and the performance degradation caused by the same.

> Now with this, the only advantage with readdirp is, it removes one
> context switch between kernal and userspace. Is it really worth to
> sacrifice this for consistency ?

> What do you think about removing readdirp functionality?

> Please provide your input/suggestion/ideas.

> [1] : http://review.gluster.org/#/c/11892/

> [2] : http://review.gluster.org/#/c/11894/

> Thanks in Advance
> Rafi KC
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] testcase ./tests/geo-rep/georep-basic-dr-rsync.t failure

2015-08-12 Thread Emmanuel Dreyfus
Kotresh Hiremath Ravishankar  wrote:

>  have fixed the above in following four machines which are up by adding
> "export PATH=$PATH:/build/install/sbin:/build/install/bin" in ~/.kshrc and
> similarly in other shells as I didn't know the default shell used by
> regression run

IMO you covered up a bug: geo-rep should cope with the directory layout
without it being hardcoded.

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Inconsistent behavior due to lack of lookup on entry followed by readdirp

2015-08-12 Thread Raghavendra Gowdappa


- Original Message -
> From: "Krutika Dhananjay" 
> To: "Mohammed Rafi K C" 
> Cc: "Gluster Devel" , "Dan Lambright" 
> , "Nithya Balachandran"
> , "Raghavendra Gowdappa" , "Ben 
> Turner" , "Ben
> England" , "Manoj Pillai" , "Pranith 
> Kumar Karampuri"
> , "Ravishankar Narayanankutty" , 
> xhernan...@datalab.es
> Sent: Wednesday, August 12, 2015 9:02:44 PM
> Subject: Re: Inconsistent behavior due to lack of lookup on entry followed by 
> readdirp
> 
> I faced the same issue with the sharding translator. I fixed it by making its
> readdirp callback initialize individual entries' inode ctx, some of these
> being xattr values, which are filled in entry->dict by the posix translator.
> Here is the patch that got merged recently: http://review.gluster.org/11854
> Would that be as easy to do in DHT as well?

The problem is not just filling out state in the inode. The bigger problem is
healing, which is supposed to bring a directory/file into a state consistent
with our design before a successful reply to lookup. The operations can
involve creating directories on missing subvols, setting the appropriate
layout, etc. Effectively, for readdirp to replace lookup, it would have to
call dht_lookup on each of the dentries it passes back to the application.

> 
> As far as AFR is concerned, it indirectly forces LOOKUP on entries which are
> being retrieved for the first time through a READDIRP (and as a result do
> not have their inode ctx etc initialised yet) by setting entry->inode to
> NULL. See afr_readdir_transform_entries().

Hmm. Then we have already "disabled" readdirp through code :). Without an inode
corresponding to the entry, readdirp effectively becomes readdir, stripping the
performance benefit of having readdirp act as a "batched" lookup (of all the
dentries).

> This is the default behavior which is being made optional as part of
> http://review.gluster.org/#/c/11846/ which is still under review (see BZ
> 1250803, a performance bug :) ).

If it is made optional, we will still see consistency issues whenever setting
entry->inode is enabled. Also, it seems to me that there is no point in having
an option in each individual xlator controlling this behaviour. Instead, we can
make each xlator behave in compliance with the global mount option
"--use-readdirp=yes/no". Is there any specific reason to have an option
controlling this behaviour in afr?

> 
> -Krutika
> 
> - Original Message -
> 
> > From: "Mohammed Rafi K C" 
> > To: "Gluster Devel" 
> > Cc: "Dan Lambright" , "Nithya Balachandran"
> > , "Raghavendra Gowdappa" , "Ben
> > Turner" , "Ben England" , "Manoj
> > Pillai" , "Pranith Kumar Karampuri"
> > , "Ravishankar Narayanankutty" ,
> > kdhan...@redhat.com, xhernan...@datalab.es
> > Sent: Wednesday, August 12, 2015 7:29:48 PM
> > Subject: Inconsistent behavior due to lack of lookup on entry followed by
> > readdirp
> 
> > Hi All,
> 
> > We are facing some inconsistent behavior for fops like rename, unlink
> > etc due to lack of lookup followed by a readdirp, more specifically if
> > inodes/gfid are populated via readdirp call and this nodeid is shared
> > with kernal, md-cache will cache this based on base-name. Then
> > subsequent named lookup will be served from md-cache and it winds-back
> > immediately. So there is a chance to have an FOP triggered with out
> > having a lookup on an entry. DHT does lot of things like creating link
> > files and populate inode_ctx etc, during lookup. In such scenario it is
> > must to have at least one lookup to be happened on an entry. Since
> > readdirp preventing the lookup, it has been very hard for fops to
> > proceed without a first lookup on the entry. We are also suspecting some
> > problems due to same with afr/ec self healing also. So If we remove
> > readdirp from md-cache ([1], [2]) it causes, an additional hop for first
> > lookup for every entry. I'm mostly concerned with this one extra network
> > call, and the performance degradation caused by the same.
> 
> > Now with this, the only advantage with readdirp is, it removes one
> > context switch between kernal and userspace. Is it really worth to
> > sacrifice this for consistency ?
> 
> > What do you think about removing readdirp functionality?
> 
> > Please provide your input/suggestion/ideas.
> 
> > [1] : http://review.gluster.org/#/c/11892/
> 
> > [2] : http://review.gluster.org/#/c/11894/
> 
> > Thanks in Advance
> > Rafi KC
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Inconsistent behavior due to lack of lookup on entry followed by readdirp

2015-08-12 Thread Krutika Dhananjay
- Original Message -

> From: "Raghavendra Gowdappa" 
> To: "Krutika Dhananjay" 
> Cc: "Mohammed Rafi K C" , "Gluster Devel"
> , "Dan Lambright" , "Nithya
> Balachandran" , "Ben Turner" , "Ben
> England" , "Manoj Pillai" ,
> "Pranith Kumar Karampuri" , "Ravishankar
> Narayanankutty" , xhernan...@datalab.es
> Sent: Thursday, August 13, 2015 9:06:37 AM
> Subject: Re: Inconsistent behavior due to lack of lookup on entry followed by
> readdirp

> - Original Message -
> > From: "Krutika Dhananjay" 
> > To: "Mohammed Rafi K C" 
> > Cc: "Gluster Devel" , "Dan Lambright"
> > , "Nithya Balachandran"
> > , "Raghavendra Gowdappa" , "Ben
> > Turner" , "Ben
> > England" , "Manoj Pillai" ,
> > "Pranith Kumar Karampuri"
> > , "Ravishankar Narayanankutty" ,
> > xhernan...@datalab.es
> > Sent: Wednesday, August 12, 2015 9:02:44 PM
> > Subject: Re: Inconsistent behavior due to lack of lookup on entry followed
> > by readdirp
> >
> > I faced the same issue with the sharding translator. I fixed it by making
> > its
> > readdirp callback initialize individual entries' inode ctx, some of these
> > being xattr values, which are filled in entry->dict by the posix
> > translator.
> > Here is the patch that got merged recently: http://review.gluster.org/11854
> > Would that be as easy to do in DHT as well?

> The problem is not just filling out state in the inode. The bigger problem is
> healing, which is supposed to maintain a directory/file to be in state
> consistent with our design before a successful reply to lookup. The
> operations can involve creating directories on missing subvols, setting
> appropriate layout, etc. Effectively for readdirp to replace lookup, it
> should be calling dht_lookup on each of the dentry it is passing back to
> application.

OK. 

> >
> > As far as AFR is concerned, it indirectly forces LOOKUP on entries which
> > are
> > being retrieved for the first time through a READDIRP (and as a result do
> > not have their inode ctx etc initialised yet) by setting entry->inode to
> > NULL. See afr_readdir_transform_entries().

> Hmm. Then we already have "disabled" readdirp through code :). Without an
> inode corresponding to entry, readdirp will be effectively readdir stripping
> any performance benefits by having readdirp as a "batched" lookup (of all
> the dentries).
No. Not every single READDIRP will be transformed into a READDIR by AFR. AFR 
resets the inode corresponding to an entry, before responding to its parent, 
_only_ under the following two conditions: 
1) if this entry in question is being retrieved by this client for the first 
time through a READDIRP. In other words, this client has not _yet_ performed a 
LOOKUP on it. 
2) if that sub-volume of AFR on which the parent directory is being READDIRP'd 
(remember AFR would only need to serve inode and directory reads from one of 
the replicas) does _not_ contain a good copy of the entry. 
In other words this entry needs to be healed on parent's read child. This is 
because we do not want the caching translators or the application itself to get 
incorrect entry attributes. 

This means that more often than not, AFR _would_ be leaving the inode 
corresponding to the entry as it is, and not setting it to NULL. 

> > This is the default behavior which is being made optional as part of
> > http://review.gluster.org/#/c/11846/ which is still under review (see BZ
> > 1250803, a performance bug :) ).

> If it is made optional, when we enable setting entry->inode we still see
> consistency issues. Also, it seems to me that there is no point in having
> each individual xlator option controlling this behaviour. Instead we can
> make each xlator behave in compliance to global mount option
> "--use-readdirp=yes/no". Is there any specific reason to have an option to
> control this behaviour in afr?
Agreed that there will be consistency issues. 
The reason to move away from letting 1) and 2) above be the default behavior is 
performance. :) And I guess it is also partly because AFR-v1 does not have 
these restrictions in READDIRP. But the patch is still under review. 

-Krutika 

> >
> > -Krutika
> >
> > - Original Message -
> >
> > > From: "Mohammed Rafi K C" 
> > > To: "Gluster Devel" 
> > > Cc: "Dan Lambright" , "Nithya Balachandran"
> > > , "Raghavendra Gowdappa" , "Ben
> > > Turner" , "Ben England" , "Manoj
> > > Pillai" , "Pranith Kumar Karampuri"
> > > , "Ravishankar Narayanankutty"
> > > ,
> > > kdhan...@redhat.com, xhernan...@datalab.es
> > > Sent: Wednesday, August 12, 2015 7:29:48 PM
> > > Subject: Inconsistent behavior due to lack of lookup on entry followed by
> > > readdirp
> >
> > > Hi All,
> >
> > > We are facing some inconsistent behavior for fops like rename, unlink
> > > etc due to lack of lookup followed by a readdirp, more specifically if
> > > inodes/gfid are populated via readdirp call and this nodeid is shared
> > > with kernal, md-cache will cache this based on base-name. Then
> > > subsequent named lookup will 

Re: [Gluster-devel] testcase ./tests/geo-rep/georep-basic-dr-rsync.t failure

2015-08-12 Thread Kotresh Hiremath Ravishankar
Hi Emmanuel,

Well, the failure is due to glusterd being unable to get the lock, failing with
'Another transaction is in progress'.
It's interesting that the same test case is passing on the Linux regression
machines and consistently failing on NetBSD.

We need a NetBSD machine off the ring to debug. Could you please provide one?


Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Emmanuel Dreyfus" 
> To: "Kotresh Hiremath Ravishankar" 
> Cc: "Gluster Devel" 
> Sent: Thursday, August 13, 2015 5:42:36 AM
> Subject: Re: [Gluster-devel] testcase ./tests/geo-rep/georep-basic-dr-rsync.t 
> failure
> 
> Kotresh Hiremath Ravishankar  wrote:
> 
> >  have fixed the above in following four machines which are up by adding
> > "export PATH=$PATH:/build/install/sbin:/build/install/bin" in ~/.kshrc and
> > similarly in other shells as I didn't know the default shell used by
> > regression run
> 
> IMO you covered a bug: georep should cope with directory layout not
> hardcoded.
> 
> --
> Emmanuel Dreyfus
> http://hcpnet.free.fr/pubz
> m...@netbsd.org
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel