Re: [Gluster-devel] Sharding - what next?

2015-12-09 Thread Krutika Dhananjay
- Original Message -

> From: "Lindsay Mathieson" 
> To: "Krutika Dhananjay" , "Gluster Devel"
> , "gluster-users" 
> Sent: Wednesday, December 9, 2015 6:48:40 PM
> Subject: Re: Sharding - what next?

> Hi Guys, sorry for the late reply, my attention tends to be somewhat sporadic
> due to work and the large number of rescue dogs/cats I care for :)

> On 3/12/2015 8:34 PM, Krutika Dhananjay wrote:

> > We would love to hear from you on what you think of the feature and where
> > it
> > could be improved.
> 
> > Specifically, the following are the questions we are seeking feedback on:
> 
> > a) your experience testing sharding with VM store use-case - any bugs you
> > ran
> > into, any performance issues, etc
> 

> Testing was initially somewhat stressful as I regularly encountered file
> corruption. However, I don't think that was due to bugs, but rather to
> incorrect settings for the VM use case. Once I got that sorted out it has
> been very stable. I have really stressed the failure modes we run into at
> work: nodes going down while heavy writes were happening, live migrations
> during heals, gluster software being killed while VMs were running on the
> host. So far it's held up without a hitch.

> To that end, one thing I think should be made more obvious is the settings
> required for VM Hosting:

> > quick-read=off
> 
> > read-ahead=off
> 
> > io-cache=off
> 
> > stat-prefetch=off
> 
> > eager-lock=enable
> 
> > remote-dio=enable
> 
> > quorum-type=auto
> 
> > server-quorum-type=server
> 

> They are quite crucial and very easy to miss in the online docs. And they are
> only recommended, with no mention that you will corrupt KVM VMs if you live
> migrate them between gluster nodes without them set. Also, the virt group is
> missing from the Debian packages.
Hi Lindsay, 
Thanks for the feedback. I will get in touch with Humble to find out what can 
be done about the docs. 
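
The settings listed above correspond to volume options that can be applied with `gluster volume set`. A minimal sketch follows, assuming the fully qualified option names below (the performance.*, cluster.* and network.* prefixes are not spelled out in the list above) and using the datastore1 volume name from the example later in this mail. It only builds the command lines; it does not run them.

```python
# VM-hosting settings from the list above, keyed by their assumed
# fully qualified option names.
VM_HOSTING_OPTIONS = {
    "performance.quick-read": "off",
    "performance.read-ahead": "off",
    "performance.io-cache": "off",
    "performance.stat-prefetch": "off",
    "cluster.eager-lock": "enable",
    "network.remote-dio": "enable",
    "cluster.quorum-type": "auto",
    "cluster.server-quorum-type": "server",
}


def volume_set_commands(volume, options=VM_HOSTING_OPTIONS):
    """Return one `gluster volume set` command line per option."""
    return [
        f"gluster volume set {volume} {key} {value}"
        for key, value in options.items()
    ]


if __name__ == "__main__":
    for line in volume_set_commands("datastore1"):
        print(line)
```

Printing the commands rather than executing them makes it easy to review them before applying anything to a live volume.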

> Setting them does seem to have slowed sequential writes by about 10% but I
> need to test that more.

> Something related - sharding is useful because it makes heals much more
> granular and hence faster. To that end it would be really useful if there
> was a heal info variant that gave an overview of the process - rather than
> listing the shards that are being healed, just an aggregate total, e.g.

> $ gluster volume heal datastore1 status
> volume datastore1
> - split brain: 0
> - Wounded:65
> - healing:4

> It gives one an easy feeling of progress - heals aren't happening faster, but
> it would feel that way :)
There is a 'heal-info summary' command that is under review, written by 
Mohammed Ashiq @ http://review.gluster.org/#/c/12154/3 which prints the number 
of files that are yet to be healed. 
It could perhaps be enhanced to print files in split-brain and also files which 
are possibly being healed. Note that these counts are printed per brick. 
It does not print a single list of counts with aggregated values. Would that be 
something you would consider useful? 
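
To illustrate the difference between per-brick counts and a single aggregated list, here is a minimal sketch. The input format (a mapping from brick name to per-state counts), the state names and the brick names are assumptions made for illustration; they are not the output format of the patch under review.

```python
def aggregate_heal_counts(per_brick):
    """Sum per-brick heal counts into one total per state.

    per_brick maps a brick name to a dict of {state: count}.
    """
    totals = {}
    for counts in per_brick.values():
        for state, n in counts.items():
            totals[state] = totals.get(state, 0) + n
    return totals


if __name__ == "__main__":
    example = {
        "node1:/bricks/datastore1": {"split_brain": 0, "pending": 30, "healing": 2},
        "node2:/bricks/datastore1": {"split_brain": 0, "pending": 35, "healing": 2},
    }
    for state, n in aggregate_heal_counts(example).items():
        print(f"- {state}: {n}")
```

Note that a plain sum counts a file once for every brick that lists it, so in a replicated volume the aggregate is an upper bound on the number of distinct files.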

> Also, it would be great if the heal info command could return faster,
> sometimes it takes over a minute.
Yeah, I think part of the problem could be the eager-lock feature, which is causing
the GlusterFS client process to not relinquish the network lock on the file
soon enough, causing the heal info utility to be blocked for a longer duration.
There is an enhancement Anuradha Talur is working on where heal-info would do
away with taking locks altogether. Once that is in place, heal-info should
return faster.

-Krutika 

> Thanks for the great work,

> Lindsay
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] libgfapi compound operations - multiple writes

2015-12-09 Thread Poornima Gurusiddaiah
Answers inline.

Regards,
Poornima

- Original Message -
> From: "Raghavendra Gowdappa" 
> To: "Poornima Gurusiddaiah" 
> Cc: "Gluster Devel" 
> Sent: Wednesday, December 9, 2015 9:00:55 PM
> Subject: libgfapi compound operations - multiple writes
> 
> forking off since it muddles the original conversation. I've some questions:
> 
> 1. Why do multiple writes need to be compounded together?
If the application splits a large write into fixed-size chunks (Samba uses 64KB),
compounding them would be an option.

> 2. If the reason is aggregation, can't we tune write-behind to do the same?
Yes, surely. Would io-cache be a better candidate? Write-behind mostly doesn't
aggregate.
Since this can also be done by compound fops, I just added it as good to have,
but not mandatory, as it can be achieved otherwise.

> 
> regards,
> Raghavendra.
> 


Re: [Gluster-devel] test throws core intermittently: tests/bugs/snapshot/bug-1140162-file-snapshot-features-encrypt-opts-validation.t

2015-12-09 Thread Gaurav Garg
Hi,

This issue has already been reported by the community, and it seems that there
is a problem during cleanup when features.encryption is enabled.

previous discussion on the same core: 

http://nongnu.13855.n7.nabble.com/Upstream-regression-crash-https-build-gluster-org-job-rackspace-regression-2GB-triggered-16191-consol-td206079.html

will look into this issue further.

Thanks,
Gaurav

- Original Message -
From: "Vijay Bellur" 
To: "Michael Adam" , gluster-devel@gluster.org, "Gaurav Garg" 

Sent: Thursday, December 10, 2015 9:12:08 AM
Subject: Re: [Gluster-devel] test throws core intermittently: 
tests/bugs/snapshot/bug-1140162-file-snapshot-features-encrypt-opts-validation.t

On 12/09/2015 07:33 PM, Michael Adam wrote:
> by
>
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/16674/consoleFull
>
>

Gaurav - can you please check this test? It caused the baseline 
regression to fail as well:

https://build.gluster.org/job/regression-test-burn-in/47/console

Regards,
Vijay



Re: [Gluster-devel] libgfapi compound operations - multiple writes

2015-12-09 Thread Raghavendra Gowdappa


- Original Message -
> From: "Jeff Darcy" 
> To: "Raghavendra Gowdappa" , "Poornima Gurusiddaiah" 
> 
> Cc: "Gluster Devel" 
> Sent: Wednesday, December 9, 2015 10:36:43 PM
> Subject: Re: [Gluster-devel] libgfapi compound operations - multiple writes
> 
> 
> 
> 
> On December 9, 2015 at 10:31:03 AM, Raghavendra Gowdappa
> (rgowd...@redhat.com) wrote:
> > forking off since it muddles the original conversation. I've some
> > questions:
> >  
> > 1. Why do multiple writes need to be compounded together?
> > 2. If the reason is aggregation, cant we tune write-behind to do the same?
> 
> I think compounding (as we’ve been discussing it) is only necessary when
> there’s a dependency between operations.  For example, if the first
> creates a value (e.g. file descriptor) used by the second, or if the
> second should not proceed unless the first (e.g. a lock) succeeded.  If
> multiple operations are completely independent of one another, as is the
> case for writes without fsync, then I think we should rely on
> write-behind or something similar instead.  Compounding is likely to be
> the wrong solution here for two reasons:
> 
>  * Correctness: if the writes are independent, there’s no reason why
>    failure of the first should cause the second not to be issued (as
>    would be the case with compounding).
> 
>  * Performance: compounding would keep the writes separate, whereas
>    write-behind can reduce overhead even more by coalescing them into a
>    single request.

Yes. I had similar thoughts while asking the question. Thanks for elaborating.

> 
> There is, however, one case where compounding would be the right answer:
> when there really is a dependency between the writes.  There’s no way to
> specify this through the POSIX/VFS interface (more’s the pity), but it’s
> easy to imagine GFAPI or internal use cases where a second write should
> not overtake or continue without the first - e.g.  a key/value store
> that writes new data followed by an index update pointing to that data.
> The strictly-sequential behavior of a compound operation might be just
> the right match for such cases.

We have one such use-case already, i.e., O_APPEND writes. In fact, write-behind
has enough logic to address dependencies such as conflicting writes, or reads and
stats on just-written regions. (Of course, we would lose performance gains, as
write-behind still winds calls across the network for dependent ops. But again,
if the write-behind cache is large enough, this latency is not seen by the
application.) So, I am wondering whether we can pass these dependency
requirements down the stack and let write-behind handle them.

@Poornima and others,

Did you have any such use-cases in mind when you proposed compounding?

regards,
Raghavendra

[Gluster-devel] intermittent failure - tests/bugs/glusterd/bug-1225716-brick-online-validation-remove-brick.t

2015-12-09 Thread Vijay Bellur

Hi Sakshi,

can you please take a look into

./tests/bugs/glusterd/bug-1225716-brick-online-validation-remove-brick.t ?

A non-related patch got affected by this test:

https://build.gluster.org/job/rackspace-regression-2GB-triggered/16686/consoleFull

Thanks,
Vijay



Re: [Gluster-devel] test throws core intermittently: tests/bugs/snapshot/bug-1140162-file-snapshot-features-encrypt-opts-validation.t

2015-12-09 Thread Vijay Bellur

On 12/09/2015 07:33 PM, Michael Adam wrote:

by

https://build.gluster.org/job/rackspace-regression-2GB-triggered/16674/consoleFull




Gaurav - can you please check this test? It caused the baseline 
regression to fail as well:


https://build.gluster.org/job/regression-test-burn-in/47/console

Regards,
Vijay



[Gluster-devel] everything builds and installs on FreeBSD via ports tarball

2015-12-09 Thread Rick Macklem
Hi,

Just to let you know, others are working on a "port" for FreeBSD and
everything builds/installs when you use it. (It uses gcc and some other
things I didn't use. I suspect my "make install" problem had to do with
using BSD make, but I don't know.)

Anyhow, it builds/installs and works once you:
cp /usr/local/etc/glusterfs/glusterd.vol.sample /usr/local/etc/glusterfs/glusterd.vol

Hopefully the "port" will be fixed for this and set up soon, which will
make it much easier for others to test on FreeBSD.

Just fyi, rick



[Gluster-devel] submitted one patch for marking several tests bad

2015-12-09 Thread Michael Adam
Since each of the patches that adds a test to the bad tests list
is prevented from passing regressions by the other failing tests,
I created a single patch that adds all the failing tests I have
seen recently:

http://review.gluster.org/#/c/12933/

If it is too much for your taste, I'll reduce... :-)

Cheers - Michael



[Gluster-devel] test throws core intermittently: tests/bugs/snapshot/bug-1140162-file-snapshot-features-encrypt-opts-validation.t

2015-12-09 Thread Michael Adam
by

https://build.gluster.org/job/rackspace-regression-2GB-triggered/16674/consoleFull



[Gluster-devel] intermittent failure: tests/bugs/tier/bug-1279376-rename-demoted-file.t

2015-12-09 Thread Michael Adam
Another one?

https://build.gluster.org/job/rackspace-regression-2GB-triggered/16675/console

Triggered by:

http://review.gluster.org/12930

Cheers - Michael



Re: [Gluster-devel] intermittent failure: tests/basic/afr/split-brain-healing.t

2015-12-09 Thread Michael Adam
On 2015-12-09 at 17:00 +0100, Michael Adam wrote:
> 
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/16652/consoleFull
> 
> triggered by
> 
> http://review.gluster.org/#/c/12826/

More of these happen.

E.g.:

https://build.gluster.org/job/rackspace-regression-2GB-triggered/16680/consoleFull

Created a bug

https://bugzilla.redhat.com/show_bug.cgi?id=1290245

and a patch to mark the test as bad:

http://review.gluster.org/#/c/12932/

Thanks - Michael




Re: [Gluster-devel] intermittent failure: tests/features/weighted-rebalance.t

2015-12-09 Thread Michael Adam
On 2015-12-09 at 19:59 +0100, Michael Adam wrote:
> https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/12530/consoleFull
> 
> http://review.gluster.org/#/c/12929/
> 
> Michael


Having eliminated arbiter-statfs.t (in the review request above),
this seems to be the next suspect.

https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/12538/consoleFull

Created a BZ:

https://bugzilla.redhat.com/show_bug.cgi?id=1290204

and a patch to mark it bad:

http://review.gluster.org/12931

Cheers - Michael




[Gluster-devel] intermittent failure: tests/features/weighted-rebalance.t

2015-12-09 Thread Michael Adam
https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/12530/consoleFull

http://review.gluster.org/#/c/12929/

Michael



Re: [Gluster-devel] compound fop design first cut

2015-12-09 Thread Pranith Kumar Karampuri



On 12/09/2015 08:11 PM, Shyam wrote:

On 12/09/2015 02:37 AM, Soumya Koduri wrote:



On 12/09/2015 11:44 AM, Pranith Kumar Karampuri wrote:



On 12/09/2015 06:37 AM, Vijay Bellur wrote:

On 12/08/2015 03:45 PM, Jeff Darcy wrote:




On December 8, 2015 at 12:53:04 PM, Ira Cooper (i...@redhat.com) 
wrote:

Raghavendra Gowdappa writes:
I propose that we define a "compound op" that contains ops.

Within each op, there are fields that can be "inherited" from the
previous op, via use of a sentinel value.

Sentinel is -1, for all of these examples.

So:

LOOKUP (1, "foo") (Sets the gfid value to be picked up by
compounding, 1
is the root directory, as a gfid, by convention.)
OPEN(-1, O_RDWR) (Uses the gfid value, sets the glfd compound 
value.)

WRITE(-1, "foo", 3) (Uses the glfd compound value.)
CLOSE(-1) (Uses the glfd compound value)
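
The sentinel scheme above can be sketched with a toy in-memory model. Everything here is illustrative: the ToyStore class, the gfid value 42, the fd numbering and the run_compound function are assumptions made for the sketch, not GlusterFS code. The only ideas taken from the proposal are the -1 sentinel, the root gfid of 1, and the two compound values (gfid and glfd) that ops either set or pick up.

```python
import os

SENTINEL = -1   # "inherit this argument from the compound context"
ROOT_GFID = 1   # root directory gfid, by the convention quoted above


class ToyStore:
    """In-memory stand-in for a brick; purely illustrative."""

    def __init__(self):
        self.names = {(ROOT_GFID, "foo"): 42}  # (parent gfid, name) -> gfid
        self.data = {42: b""}                  # gfid -> file contents
        self.open_fds = {}                     # fd -> gfid
        self.next_fd = 3


def run_compound(store, ops):
    """Run ops in order, resolving SENTINEL from the compound context."""
    ctx = {"gfid": None, "glfd": None}

    def pick(value, slot):
        return ctx[slot] if value == SENTINEL else value

    for op, *args in ops:
        if op == "LOOKUP":
            parent, name = args
            ctx["gfid"] = store.names[(parent, name)]
        elif op == "OPEN":
            gfid = pick(args[0], "gfid")       # open flags are ignored here
            fd = store.next_fd
            store.next_fd += 1
            store.open_fds[fd] = gfid
            ctx["glfd"] = fd
        elif op == "WRITE":
            fd = pick(args[0], "glfd")
            payload, length = args[1], args[2]
            store.data[store.open_fds[fd]] += payload[:length]
        elif op == "CLOSE":
            del store.open_fds[pick(args[0], "glfd")]
        else:
            raise ValueError(f"unknown op {op!r}")
    return ctx


if __name__ == "__main__":
    store = ToyStore()
    run_compound(store, [
        ("LOOKUP", ROOT_GFID, "foo"),
        ("OPEN", SENTINEL, os.O_RDWR),
        ("WRITE", SENTINEL, b"foo", 3),
        ("CLOSE", SENTINEL),
    ])
    print(store.data[42])
```

Keeping one slot per kind of value is what lets CLOSE pick up the glfd even though WRITE ran in between; a plain "result of the previous op" rule would not be enough for this sequence.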


So, basically, what the programming-language types would call futures
and promises.  It’s a good and well studied concept, which is necessary
to solve the second-order problem of how to specify an argument in
sub-operation N+1 that’s not known until sub-operation N completes.

To be honest, some of the highly general approaches suggested here scare
me too.  Wrapping up the arguments for one sub-operation in xdata for
another would get pretty hairy if we ever try to go beyond two
sub-operations and have to nest sub-operation #3’s args within
sub-operation #2’s xdata which is itself encoded within sub-operation
#1’s xdata.  There’s also not much clarity about how to handle errors in
that model.  Encoding N sub-operations’ arguments in a linear structure
as Shyam proposes seems a bit cleaner that way.  If I were to continue
down that route I’d suggest just having start_compound and end-compound
fops, plus an extra field (or by-convention xdata key) that either the
client-side or server-side translator could use to build whatever
structure it wants and schedule sub-operations however it wants.

However, I’d be even more comfortable with an even simpler approach that
avoids the need to solve what the database folks (who have dealt with
complex transactions for years) would tell us is a really hard problem.
Instead of designing for every case we can imagine, let’s design for the
cases that we know would be useful for improving performance.  Open plus
read/write plus close is an obvious one.  Raghavendra mentions
create+inodelk as well.  For each of those, we can easily define a
structure that contains the necessary fields, we don’t need a
client-side translator, and the server-side translator can take care of
“forwarding” results from one sub-operation to the next.  We could even
use GF_FOP_IPC to prototype this.  If we later find that the number of
“one-off” compound requests is growing too large, then at least we’ll
have some experience to guide our design of a more general alternative.
Right now, I think we’re trying to look further ahead than we can see
clearly.
Yes, agreed. This makes implementation on the client side simpler as well,
so it is welcome.

Just updating the solution.
1) New RPCs are going to be implemented.
2) The client stack will use these new fops.
3) On the server side, the server xlator implements these new fops: it
decodes the RPC request, then does resolve_resume, and then a
compound-op-receiver (a better name for this is welcome) sends one op
after the other and sends the compound fop response.


@Pranith, I assume you would expand on this at a later date (something
along the lines of what Soumya has done below), right?


I will talk to her tomorrow to know more about this. Not saying this is 
what I will be implementing (There doesn't seem to be any consensus 
yet). But I would love to know how it is implemented.


Pranith




List of compound fops identified so far:
Swift/S3:
PUT: creat(), write()s, setxattr(), fsync(), close(), rename()

Dht:
mkdir + inodelk

Afr:
xattrop+writev, xattrop+unlock to begin with.

Could everyone who needs compound fops add to this list?

I see that Niels is back on 14th. Does anyone else know the list of
compound fops he has in mind?


 From the discussions we had with Niels regarding the kerberos support
on GlusterFS, I think below are the set of compound fops which are
required.

set_uid +
set_gid +
set_lkowner (or kerberos principal name) +
actual_fop

Also gfapi does lookup (first time/to refresh inode) before performing
actual fops most of the times. It may really help if we can club such
fops -


@Soumya +5 (just a random number :) )

This came to my mind as well, and is a good candidate for compounding.



LOOKUP + FOP (OPEN etc)

Coming to the design proposed, I agree with Shyam, Ira and Jeff's
thoughts. Defining different compound fops for each specific set of
operations and wrapping up those arguments in xdata seem rather complex
and difficult to maintain going further. Having worked with NFS,
may I suggest that we follow (or stay along similar lines to) the approach
being taken by NFS protocol to define and implem

Re: [Gluster-devel] Netbsd failures on ./tests/basic/afr/arbiter-statfs.t

2015-12-09 Thread Michael Adam
On 2015-12-09 at 10:17 -0500, Vijay Bellur wrote:
> On 08/24/2015 07:01 AM, Susant Palai wrote:
> >Ravi,
> >  The test case ./tests/basic/afr/arbiter-statfs.t is failing frequently on the
> > NetBSD machine. Requesting you to take a look.
> >
> 
> tests/basic/afr/arbiter-statfs.t seems to be affecting most NetBSD runs now.
> Ravi - can you please take a look?
> 
> Sample test run that got affected by this test unit:
> 
> https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/12516/consoleFull

This seems to prevent any NetBSD regression run from succeeding
currently. Have seen it many times since your mail.

I have created a bug:

https://bugzilla.redhat.com/show_bug.cgi?id=1290125

and a patch to add the test to bad tests for now:

http://review.gluster.org/12929

Michael



Re: [Gluster-devel] libgfapi compound operations - multiple writes

2015-12-09 Thread Jeff Darcy



On December 9, 2015 at 10:31:03 AM, Raghavendra Gowdappa (rgowd...@redhat.com) 
wrote:
> forking off since it muddles the original conversation. I've some questions:
>  
> 1. Why do multiple writes need to be compounded together?
> 2. If the reason is aggregation, cant we tune write-behind to do the same?

I think compounding (as we’ve been discussing it) is only necessary when
there’s a dependency between operations.  For example, if the first
creates a value (e.g. file descriptor) used by the second, or if the
second should not proceed unless the first (e.g. a lock) succeeded.  If
multiple operations are completely independent of one another, as is the
case for writes without fsync, then I think we should rely on
write-behind or something similar instead.  Compounding is likely to be
the wrong solution here for two reasons:

 * Correctness: if the writes are independent, there’s no reason why
   failure of the first should cause the second not to be issued (as
   would be the case with compounding).

 * Performance: compounding would keep the writes separate, whereas
   write-behind can reduce overhead even more by coalescing them into a
   single request.
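
The performance point can be illustrated with a toy model of coalescing. This is a minimal sketch, not write-behind's actual algorithm: it merges only writes that are exactly adjacent, in issue order, and ignores overlaps, window sizes and flush rules. The function name and the (offset, data) representation are assumptions made for illustration.

```python
def coalesce_writes(writes):
    """Merge exactly adjacent writes into single larger requests.

    writes is a list of (offset, data) tuples in issue order.
    """
    merged = []
    for offset, data in writes:
        if merged:
            last_offset, last_data = merged[-1]
            # Adjacent: this write starts where the previous one ended.
            if last_offset + len(last_data) == offset:
                merged[-1] = (last_offset, last_data + data)
                continue
        merged.append((offset, data))
    return merged


if __name__ == "__main__":
    chunk = b"x" * 65536  # 64KB chunks, as in the Samba case
    requests = coalesce_writes([(i * 65536, chunk) for i in range(4)])
    print(len(requests), "request(s) of", len(requests[0][1]), "bytes")
```

A compound operation carrying the same four writes would still execute four separate writes on the server; here they become one.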

There is, however, one case where compounding would be the right answer:
when there really is a dependency between the writes.  There’s no way to
specify this through the POSIX/VFS interface (more’s the pity), but it’s
easy to imagine GFAPI or internal use cases where a second write should
not overtake or continue without the first - e.g.  a key/value store
that writes new data followed by an index update pointing to that data.
The strictly-sequential behavior of a compound operation might be just
the right match for such cases.

Re: [Gluster-devel] compound fop design first cut

2015-12-09 Thread Pranith Kumar Karampuri



On 12/09/2015 08:08 PM, Shyam wrote:

On 12/09/2015 12:52 AM, Pranith Kumar Karampuri wrote:



On 12/09/2015 10:39 AM, Prashanth Pai wrote:
However, I’d be even more comfortable with an even simpler approach that
avoids the need to solve what the database folks (who have dealt with
complex transactions for years) would tell us is a really hard problem.
Instead of designing for every case we can imagine, let’s design for the
cases that we know would be useful for improving performance.  Open plus
read/write plus close is an obvious one.  Raghavendra mentions
create+inodelk as well.

 From object interface (Swift/S3) perspective, this is the fop order
and flow for object operations:

GET: open(), fstat(), fgetxattr()s, read()s, close()

Krutika implemented fstat+fgetxattr(http://review.gluster.org/10180). In
posix there is an implementation of GF_CONTENT_KEY which is used to read
a file in lookup by quick-read. This needs to be exposed for fds as well
I think. So you can do all this using fstat on anon-fd.

HEAD: stat(), getxattr()s

Krutika already implemented this for sharding
http://review.gluster.org/10158. You can do this using stat fop.


I believe we need to fork this part of the conversation, i.e the stat 
+ xattr information clubbing.


My view on a stat for gluster is, POSIX stat + gluster extended 
information being returned. I state this as, a file system when it 
stats its inode, should get all information regarding the inode, and 
not just the POSIX ones. In the case of other local FS, the inode 
structure has more fields than just what POSIX needs, so when the 
inode is *read* the FS can populate all its internal inode information 
and return to the application/syscall the relevant fields that it needs.


I believe gluster should do the same, so in the cases above, we should 
actually extend our stat information (not elaborating how) to include 
all information from the brick, i.e stat from POSIX and all the 
extended attrs for the inode (file or dir). This can then be consumed 
by any layer as needed.


Currently, each layer adds what it needs in addition to the stat 
information in the xdata, as an xattr request, this can continue or go 
away, if the relevant FOPs return the whole inode information upward.


This also has useful outcomes in readdirp calls, where we get the 
extended stat information for each entry.

You can use "list-xattr" in xdata request to get this.


With the patches referred to, and older patches, this seems to be the 
direction sought (around 2013), any reasons why this is not prevalent 
across the stack and made so? Or am I mistaken?
No reason. We can revive it. There didn't seem to be any interest. So I 
didn't follow up to get it in.


Pranith



PUT: creat(), write()s, setxattr(), fsync(), close(), rename()

This I think should be a new compound fop. Nothing similar exists.

DELETE: getxattr(), unlink()

This can also be clubbed in unlink already because xdata exists on the
wire already.


Compounding some of these ops and exposing them as consumable libgfapi
APIs like glfs_get() and glfs_put() similar to librados compound
APIs[1] would greatly improve performance for object based access.

[1]:
https://github.com/ceph/ceph/blob/master/src/include/rados/librados.h#L2219 




Thanks.

- Prashanth Pai



[Gluster-devel] intermittent failure: tests/basic/afr/split-brain-healing.t

2015-12-09 Thread Michael Adam

https://build.gluster.org/job/rackspace-regression-2GB-triggered/16652/consoleFull

triggered by

http://review.gluster.org/#/c/12826/

Michael



Re: [Gluster-devel] intermittent test failure: tests/basic/afr/sparse-file-self-heal.t

2015-12-09 Thread Michael Adam
On 2015-12-09 at 14:49 +0100, Michael Adam wrote:
> On 2015-12-09 at 13:20 +0100, Michael Adam wrote:
> > On 2015-12-09 at 09:19 +0100, Michael Adam wrote:
> > > Another one:
> > > 
> > > https://build.gluster.org/job/rackspace-regression-2GB-triggered/16601/consoleFull
> > > 
> > > by
> > > 
> > > http://review.gluster.org/#/c/12826/
> > > 
> > > Cheers - Michael
> > 
> > 
> > Again:
> > 
> > https://build.gluster.org/job/rackspace-regression-2GB-triggered/16644/consoleFull
> 
> and again:
> 
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/16652/consoleFull

Forget that -- it is a different test.

Michael




[Gluster-devel] libgfapi compound operations - multiple writes

2015-12-09 Thread Raghavendra Gowdappa
forking off since it muddles the original conversation. I've some questions:

1. Why do multiple writes need to be compounded together?
2. If the reason is aggregation, cant we tune write-behind to do the same?

regards,
Raghavendra.


Re: [Gluster-devel] Netbsd failures on ./tests/basic/afr/arbiter-statfs.t

2015-12-09 Thread Vijay Bellur

On 08/24/2015 07:01 AM, Susant Palai wrote:

Ravi,
  The test case ./tests/basic/afr/arbiter-statfs.t is failing frequently on the
NetBSD machine. Requesting you to take a look.



tests/basic/afr/arbiter-statfs.t seems to be affecting most NetBSD runs
now. Ravi - can you please take a look?


Sample test run that got affected by this test unit:

https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/12516/consoleFull

-Vijay



Re: [Gluster-devel] compound fop design first cut

2015-12-09 Thread Shyam

On 12/09/2015 09:32 AM, Jeff Darcy wrote:




On December 9, 2015 at 7:07:06 AM, Ira Cooper (i...@redhat.com) wrote:

A simple "abort on failure" and let the higher levels clean it up is
probably right for the type of compounding I propose. It is what SMB2
does. So, if you get an error return value, cancel the rest of the
request, and have it return ECOMPOUND as the errno.
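
A minimal sketch of the "abort on failure" semantics described above. ECOMPOUND is only a name proposed in this thread, so the numeric value below is a placeholder, and the convention that each sub-operation is a callable returning 0 or a positive errno is an assumption made for the sketch.

```python
import errno

ECOMPOUND = 1000  # placeholder value; the thread proposes only the name


def run_until_failure(ops):
    """Run sub-operations in order and stop at the first failure.

    ops is a list of zero-argument callables returning 0 on success or a
    positive errno. Returns (status, results): status is 0 or ECOMPOUND,
    and results holds each op's return value, or None if it never ran.
    """
    results = [None] * len(ops)
    for i, op in enumerate(ops):
        results[i] = op()
        if results[i] != 0:
            return ECOMPOUND, results
    return 0, results


if __name__ == "__main__":
    status, results = run_until_failure([
        lambda: 0,              # e.g. open succeeds
        lambda: errno.ENOSPC,   # e.g. write fails
        lambda: 0,              # e.g. close, never issued
    ])
    print(status, results)
```

Returning the per-op results alongside the compound errno is one way to let the caller see which sub-operation failed and which ones were cancelled.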


This is exactly the part that worries me.  If a compound operation
fails, some parts of it will often need to be undone.  “Let the higher
levels clean it up” means that rollback code will be scattered among all
of the translators that use compound operations.  Some of them will do
it right.  Others . . . less so.  ;)  All will have to be tested
separately.  If we centralize dispatch of compound operations into one
piece of code, we can centralize error detection and recovery likewise.
That ensures uniformity of implementation, and facilitates focused
testing (or even formal proof) of that implementation.


My take on this is that whichever layer started the compounding is
responsible for the error handling. I do not see any requirement for undoing
things that are done, and would almost say (without further thought 
(that's the gunslinger in me talking ;) )) that this is not supported as 
a part of the compounding.




Can we gain the same benefits with a more generic design?  Perhaps.  It
would require that the compounding translator know how to reverse each
type of operation, so that it can do so after an error.  That’s
feasible, though it does mean maintaining a stack of undo actions
instead of a simple state.  It might also mean testing combinations and
scenarios that will actually never occur in other components’ usage of
the compounding feature.  More likely it means that people will *think*
they can use the facility in unanticipated ways, until their
unanticipated usage creates a combination or scenario that was never
tested and doesn’t work.  Those are going to be hard problems to debug.
I think it’s better to be explicit about which permutations we actually
expect to work, and have those working earlier.
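
The "stack of undo actions" mentioned above could look roughly like this. It is a minimal sketch under the assumption that every sub-operation comes paired with a callable that reverses it; the function name and the (do, undo) pairing are illustrative and not part of any proposal in this thread.

```python
def run_with_rollback(steps):
    """Run (do, undo) pairs in order; on failure, undo in reverse order.

    do() raises an exception on failure. Only steps that completed are
    undone, and the original exception is re-raised afterwards.
    """
    undo_stack = []
    try:
        for do, undo in steps:
            do()
            undo_stack.append(undo)
    except Exception:
        while undo_stack:
            undo_stack.pop()()
        raise


if __name__ == "__main__":
    log = []

    def failing_write():
        raise OSError("write failed")

    try:
        run_with_rollback([
            (lambda: log.append("create"), lambda: log.append("unlink")),
            (lambda: log.append("setxattr"), lambda: log.append("removexattr")),
            (failing_write, lambda: log.append("never runs")),
        ])
    except OSError:
        pass
    print(log)
```

Even in this small form, the concern raised above is visible: each operation type needs a correct inverse, and every combination that can reach the rollback path needs to be tested.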


Jeff, a clarification, are you suggesting fop_xxx extensions for each 
compound operation supported?

Or,
Suggesting a *single* FOP, that carries compounded requests, but is 
specific about what requests can be compounded? (for example, allows 
open+write, but when building out the compound request, disallows *say* 
anything else)


(If any doubt, I am with the latter and not so gaga about the former as 
it explodes the FOP list)
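
The second option, a single compound FOP that is strict about what may be combined, can be sketched as a builder that checks each added fop against a list of allowed sequences. The class and the data layout are assumptions made for illustration; the sequences themselves (open+write, mkdir+inodelk, xattrop+writev, xattrop+unlock) are ones mentioned in this thread.

```python
# Sequences taken from the thread; the data layout is illustrative.
ALLOWED_SEQUENCES = {
    ("open", "write"),
    ("mkdir", "inodelk"),
    ("xattrop", "writev"),
    ("xattrop", "unlock"),
}


class CompoundRequest:
    """Builds a compound request, rejecting combinations not on the list."""

    def __init__(self):
        self.fops = []

    def add(self, fop):
        candidate = tuple(self.fops) + (fop,)
        # Accept only if the request is still a prefix of an allowed sequence.
        if not any(seq[:len(candidate)] == candidate for seq in ALLOWED_SEQUENCES):
            raise ValueError(f"{'+'.join(candidate)} is not a supported compound")
        self.fops.append(fop)
        return self

    def is_complete(self):
        return tuple(self.fops) in ALLOWED_SEQUENCES


if __name__ == "__main__":
    print(CompoundRequest().add("open").add("write").is_complete())
    try:
        CompoundRequest().add("open").add("unlink")
    except ValueError as err:
        print("rejected:", err)
```

Rejecting at build time keeps the FOP list at one entry while still making the supported permutations explicit.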


Also, I think the compound list has exploded (in this mail conversation) 
and provided a lot of compounding requests... I would say this means we 
need a clear way of doing the latter.






P.S: Ignore this...
gunslinger: "a man who carries a gun and shoots well." I claim to be 
neither... just stating


Re: [Gluster-devel] compound fop design first cut

2015-12-09 Thread Rajesh Joseph


- Original Message -
> From: "Ira Cooper" 
> To: "Jeff Darcy" , "Raghavendra Gowdappa" 
> , "Pranith Kumar Karampuri"
> 
> Cc: "Gluster Devel" 
> Sent: Wednesday, December 9, 2015 5:37:05 PM
> Subject: Re: [Gluster-devel] compound fop design first cut
> 
> Jeff Darcy  writes:
> 
> > However, I’d be even more comfortable with an even simpler approach that
> > avoids the need to solve what the database folks (who have dealt with
> > complex transactions for years) would tell us is a really hard problem.
> > Instead of designing for every case we can imagine, let’s design for the
> > cases that we know would be useful for improving performance.  Open plus
> > read/write plus close is an obvious one.  Raghavendra mentions
> > create+inodelk as well.  For each of those, we can easily define a
> > structure that contains the necessary fields, we don’t need a
> > client-side translator, and the server-side translator can take care of
> > “forwarding” results from one sub-operation to the next.  We could even
> > use GF_FOP_IPC to prototype this.  If we later find that the number of
> > “one-off” compound requests is growing too large, then at least we’ll
> > have some experience to guide our design of a more general alternative.
> > Right now, I think we’re trying to look further ahead than we can see
> > clearly.
> 
> Actually, I'm taking the design I've seen another network protocol use,
> SMB2, and proposing it here. I'd be shocked if NFS doesn't behave in the
> same way.
> 
> Interestingly, all the cases really deal with a single file, and a
> single lock, and a single...
> 
> There's a reason I talked about a single sentinel value, and not
> multiple ones.  Because I wanted to keep it simple.  Yes, the extensions
> you mention are obvious, but they lead to a giant mess, that we may not
> want initially.  (But that we CAN extend into if we want them.  I made
> the choice not to go there because honestly, I found the complexity too
> much for me.)
> 
> A simple "abort on failure" and let the higher levels clean it up is
> probably right for the type of compounding I propose.  It is what SMB2
> does.  So, if you get an error return value, cancel the rest of the
> request, and have it return ECOMPOUND as the errno.
> 
> Note: How you keep the list to be compounded doesn't matter much to me.
> The semantics matter, because those are what I can ask for later, and
> allow us to create ops the original designers hadn't thought of, which
> is usually the hallmark of a good design.
> 
> I think you should look for a simple design you can "grow into" instead
> of creating one off ops, to satisfy a demand today.
> 

I agree with Ira here. This problem is already addressed by NFS and SMB.
So instead of reinventing the wheel, let's pick the best bits from these
solutions and incorporate them in Gluster.

From a multi-protocol point of view, we would like to compound operations like
open + set_leaseID + lk and many more. With the current approach it would 
be really messy to have separate functions for each such combination and a 
dedicated translator to handle them.

As others have mentioned I think it would be better to have a general
fop (fop_compound) which can handle compound fop. Each translator can
choose to implement it or not. Each translator can take a decision 
whether to compound more fops or de-compound them. e.g. currently
you can make the protocol server de-compound all the compound fops.
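
As a rough illustration of the idea above, the sketch below models a translator stack in which the base class de-compounds by default and a translator can opt in to forwarding the compound intact. The class names (Translator, Brick, PassThrough) and the compound() method are assumptions made for this example, not the actual xlator API.

```python
class Translator:
    """Base translator: forwards single fops and de-compounds by default."""

    def __init__(self, child=None):
        self.child = child

    def fop(self, name, *args):
        return self.child.fop(name, *args)

    def compound(self, ops):
        # No native support: unroll into individual fops, in order.
        return [self.fop(name, *args) for name, args in ops]


class Brick(Translator):
    """Bottom of the stack; records every single fop it receives."""

    def __init__(self):
        super().__init__()
        self.calls = []

    def fop(self, name, *args):
        self.calls.append(name)
        return (name, "ok")


class PassThrough(Translator):
    """A translator that chooses to forward compound fops intact."""

    def compound(self, ops):
        return self.child.compound(ops)


brick = Brick()
stack = PassThrough(brick)
replies = stack.compound([("open", ("/foo",)), ("write", (b"foo",))])
```

Here the compound travels through PassThrough unchanged and is only unrolled at the brick, which has no native handling of its own.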

-Rajesh

> My thoughts,
> 
> -Ira

Re: [Gluster-devel] compound fop design first cut

2015-12-09 Thread Shyam

On 12/09/2015 02:37 AM, Soumya Koduri wrote:



On 12/09/2015 11:44 AM, Pranith Kumar Karampuri wrote:



On 12/09/2015 06:37 AM, Vijay Bellur wrote:

On 12/08/2015 03:45 PM, Jeff Darcy wrote:




On December 8, 2015 at 12:53:04 PM, Ira Cooper (i...@redhat.com) wrote:

Raghavendra Gowdappa writes:
I propose that we define a "compound op" that contains ops.

Within each op, there are fields that can be "inherited" from the
previous op, via use of a sentinel value.

Sentinel is -1, for all of these examples.

So:

LOOKUP (1, "foo") (Sets the gfid value to be picked up by
compounding, 1
is the root directory, as a gfid, by convention.)
OPEN(-1, O_RDWR) (Uses the gfid value, sets the glfd compound value.)
WRITE(-1, "foo", 3) (Uses the glfd compound value.)
CLOSE(-1) (Uses the glfd compound value)


So, basically, what the programming-language types would call futures
and promises.  It’s a good and well studied concept, which is necessary
to solve the second-order problem of how to specify an argument in
sub-operation N+1 that’s not known until sub-operation N completes.

To be honest, some of the highly general approaches suggested here
scare
me too.  Wrapping up the arguments for one sub-operation in xdata for
another would get pretty hairy if we ever try to go beyond two
sub-operations and have to nest sub-operation #3’s args within
sub-operation #2’s xdata which is itself encoded within sub-operation
#1’s xdata.  There’s also not much clarity about how to handle
errors in
that model.  Encoding N sub-operations’ arguments in a linear structure
as Shyam proposes seems a bit cleaner that way.  If I were to continue
down that route I’d suggest just having start_compound and end-compound
fops, plus an extra field (or by-convention xdata key) that either the
client-side or server-side translator could use to build whatever
structure it wants and schedule sub-operations however it wants.

However, I’d be even more comfortable with an even simpler approach
that
avoids the need to solve what the database folks (who have dealt with
complex transactions for years) would tell us is a really hard problem.
Instead of designing for every case we can imagine, let’s design for
the
cases that we know would be useful for improving performance. Open plus
read/write plus close is an obvious one.  Raghavendra mentions
create+inodelk as well.  For each of those, we can easily define a
structure that contains the necessary fields, we don’t need a
client-side translator, and the server-side translator can take care of
“forwarding” results from one sub-operation to the next.  We could even
use GF_FOP_IPC to prototype this.  If we later find that the number of
“one-off” compound requests is growing too large, then at least we’ll
have some experience to guide our design of a more general alternative.
Right now, I think we’re trying to look further ahead than we can see
clearly.

Yes Agree. This makes implementation on the client side simpler as well.
So it is welcome.

Just updating the solution.
1) New RPCs are going to be implemented.
2) client stack will use these new fops.
3) On the server side we have server xlator implementing these new fops
to decode the RPC request then resolve_resume and
compound-op-receiver(Better name for this is welcome) which sends one op
after other and send compound fop response.


@Pranith, I assume you would expand on this at a later date (something 
along the lines of what Soumya has done below), right?




List of compound fops identified so far:
Swift/S3:
PUT: creat(), write()s, setxattr(), fsync(), close(), rename()

Dht:
mkdir + inodelk

Afr:
xattrop+writev, xattrop+unlock to begin with.

Could everyone who needs compound fops add to this list?

I see that Niels is back on 14th. Does anyone else know the list of
compound fops he has in mind?


 From the discussions we had with Niels regarding the kerberos support
on GlusterFS, I think below are the set of compound fops which are
required.

set_uid +
set_gid +
set_lkowner (or kerberos principal name) +
actual_fop

Also gfapi does lookup (first time/to refresh inode) before performing
actual fops most of the times. It may really help if we can club such
fops -


@Soumya +5 (just a random number :) )

This came to my mind as well, and is a good candidate for compounding.



LOOKUP + FOP (OPEN etc)

Coming to the design proposed, I agree with Shyam, Ira and Jeff's
thoughts. Defining different compound fops for each specific set of
operations and wrapping up those arguments in xdata seems rather complex
and difficult to maintain going forward. Having worked with NFS,
may I suggest that we follow (or work along similar lines to) the approach
taken by the NFS protocol to define and implement compound procedures.

The basic structure of the NFS COMPOUND procedure is:

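+-----+--------------+--------+-----------+-----------+-----------+--
| tag | minorversion | numops | op + args | op + args | op + args |
+-----+--------------+--------+-----------+-----------+-----------+--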
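
A small sketch of how that layout maps onto bytes may help. It packs tag, minorversion, numops and a list of (opcode, args) entries using XDR-style big-endian 32-bit words and 4-byte padding. The helper names are made up for this example, and only argument-free ops plus LOOKUP (whose argument is a component name) are shown; it is a simplified illustration, not a complete NFS encoder.

```python
import struct

# NFSv4 operation numbers for the three ops used below.
OP_GETFH, OP_LOOKUP, OP_PUTROOTFH = 10, 15, 24


def xdr_uint(value):
    """Encode an unsigned 32-bit integer, big-endian."""
    return struct.pack(">I", value)


def xdr_opaque(data):
    """Encode variable-length bytes: length word, data, zero pad to 4 bytes."""
    pad = (4 - len(data) % 4) % 4
    return xdr_uint(len(data)) + data + b"\x00" * pad


def encode_compound(tag, minorversion, ops):
    """ops: list of (opcode, already-encoded argument bytes)."""
    body = xdr_opaque(tag) + xdr_uint(minorversion) + xdr_uint(len(ops))
    for opcode, args in ops:
        body += xdr_uint(opcode) + args
    return body


# PUTROOTFH; LOOKUP "foo"; GETFH, all in one request.
request = encode_compound(
    b"demo", 0,
    [(OP_PUTROOTFH, b""), (OP_LOOKUP, xdr_opaque(b"foo")), (OP_GETFH, b"")],
)
```

The point for Gluster is the shape: a flat, ordered array of ops with their own arguments, rather than arguments nested inside one another.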

Re: [Gluster-devel] compound fop design first cut

2015-12-09 Thread Shyam

On 12/09/2015 12:52 AM, Pranith Kumar Karampuri wrote:



On 12/09/2015 10:39 AM, Prashanth Pai wrote:

However, I’d be even more comfortable with an even simpler approach that
avoids the need to solve what the database folks (who have dealt with
complex transactions for years) would tell us is a really hard problem.
Instead of designing for every case we can imagine, let’s design for the
cases that we know would be useful for improving performance.  Open plus
read/write plus close is an obvious one.  Raghavendra mentions
create+inodelk as well.

 From object interface (Swift/S3) perspective, this is the fop order
and flow for object operations:

GET: open(), fstat(), fgetxattr()s, read()s, close()

Krutika implemented fstat+fgetxattr(http://review.gluster.org/10180). In
posix there is an implementation of GF_CONTENT_KEY which is used to read
a file in lookup by quick-read. This needs to be exposed for fds as well
I think. So you can do all this using fstat on anon-fd.

HEAD: stat(), getxattr()s

Krutika already implemented this for sharding
http://review.gluster.org/10158. You can do this using stat fop.


I believe we need to fork this part of the conversation, i.e the stat + 
xattr information clubbing.


My view on a stat for gluster is, POSIX stat + gluster extended 
information being returned. I state this as, a file system when it stats 
its inode, should get all information regarding the inode, and not just 
the POSIX ones. In the case of other local FS, the inode structure has 
more fields than just what POSIX needs, so when the inode is *read* the 
FS can populate all its internal inode information and return to the 
application/syscall the relevant fields that it needs.


I believe gluster should do the same, so in the cases above, we should 
actually extend our stat information (not elaborating how) to include 
all information from the brick, i.e stat from POSIX and all the extended 
attrs for the inode (file or dir). This can then be consumed by any 
layer as needed.


Currently, each layer adds what it needs in addition to the stat 
information in the xdata, as an xattr request, this can continue or go 
away, if the relevant FOPs return the whole inode information upward.


This also has useful outcomes in readdirp calls, where we get the 
extended stat information for each entry.


With the patches referred to, and older patches, this seems to be the 
direction sought (around 2013), any reasons why this is not prevalent 
across the stack and made so? Or am I mistaken?



PUT: creat(), write()s, setxattr(), fsync(), close(), rename()

This I think should be a new compound fop. Nothing similar exists.

DELETE: getxattr(), unlink()

This can also be clubbed in unlink already because xdata exists on the
wire already.


Compounding some of these ops and exposing them as consumable libgfapi
APIs like glfs_get() and glfs_put() similar to librados compound
APIs[1] would greatly improve performance for object based access.

[1]:
https://github.com/ceph/ceph/blob/master/src/include/rados/librados.h#L2219


Thanks.

- Prashanth Pai



Re: [Gluster-devel] compound fop design first cut

2015-12-09 Thread Poornima Gurusiddaiah
libgfapi compound fops added inline.

- Original Message -
> From: "Kotresh Hiremath Ravishankar" 
> To: "Pranith Kumar Karampuri" 
> Cc: "Gluster Devel" 
> Sent: Wednesday, December 9, 2015 2:18:47 PM
> Subject: Re: [Gluster-devel] compound fop design first cut
> 
> Geo-rep requirements inline.
> 
> Thanks and Regards,
> Kotresh H R
> 
> - Original Message -
> > From: "Pranith Kumar Karampuri" 
> > To: "Vijay Bellur" , "Jeff Darcy" ,
> > "Raghavendra Gowdappa"
> > , "Ira Cooper" 
> > Cc: "Gluster Devel" 
> > Sent: Wednesday, December 9, 2015 11:44:52 AM
> > Subject: Re: [Gluster-devel] compound fop design first cut
> > 
> > 
> > 
> > On 12/09/2015 06:37 AM, Vijay Bellur wrote:
> > > On 12/08/2015 03:45 PM, Jeff Darcy wrote:
> > >>
> > >>
> > >>
> > >> On December 8, 2015 at 12:53:04 PM, Ira Cooper (i...@redhat.com) wrote:
> > >>> Raghavendra Gowdappa writes:
> > >>> I propose that we define a "compound op" that contains ops.
> > >>>
> > >>> Within each op, there are fields that can be "inherited" from the
> > >>> previous op, via use of a sentinel value.
> > >>>
> > >>> Sentinel is -1, for all of these examples.
> > >>>
> > >>> So:
> > >>>
> > >>> LOOKUP (1, "foo") (Sets the gfid value to be picked up by
> > >>> compounding, 1
> > >>> is the root directory, as a gfid, by convention.)
> > >>> OPEN(-1, O_RDWR) (Uses the gfid value, sets the glfd compound value.)
> > >>> WRITE(-1, "foo", 3) (Uses the glfd compound value.)
> > >>> CLOSE(-1) (Uses the glfd compound value)
> > >>
> > >> So, basically, what the programming-language types would call futures
> > >> and promises.  It’s a good and well studied concept, which is necessary
> > >> to solve the second-order problem of how to specify an argument in
> > >> sub-operation N+1 that’s not known until sub-operation N completes.
> > >>
> > >> To be honest, some of the highly general approaches suggested here scare
> > >> me too.  Wrapping up the arguments for one sub-operation in xdata for
> > >> another would get pretty hairy if we ever try to go beyond two
> > >> sub-operations and have to nest sub-operation #3’s args within
> > >> sub-operation #2’s xdata which is itself encoded within sub-operation
> > >> #1’s xdata.  There’s also not much clarity about how to handle errors in
> > >> that model.  Encoding N sub-operations’ arguments in a linear structure
> > >> as Shyam proposes seems a bit cleaner that way.  If I were to continue
> > >> down that route I’d suggest just having start_compound and end-compound
> > >> fops, plus an extra field (or by-convention xdata key) that either the
> > >> client-side or server-side translator could use to build whatever
> > >> structure it wants and schedule sub-operations however it wants.
> > >>
> > >> However, I’d be even more comfortable with an even simpler approach that
> > >> avoids the need to solve what the database folks (who have dealt with
> > >> complex transactions for years) would tell us is a really hard problem.
> > >> Instead of designing for every case we can imagine, let’s design for the
> > >> cases that we know would be useful for improving performance. Open plus
> > >> read/write plus close is an obvious one.  Raghavendra mentions
> > >> create+inodelk as well.  For each of those, we can easily define a
> > >> structure that contains the necessary fields, we don’t need a
> > >> client-side translator, and the server-side translator can take care of
> > >> “forwarding” results from one sub-operation to the next.  We could even
> > >> use GF_FOP_IPC to prototype this.  If we later find that the number of
> > >> “one-off” compound requests is growing too large, then at least we’ll
> > >> have some experience to guide our design of a more general alternative.
> > >> Right now, I think we’re trying to look further ahead than we can see
> > >> clearly.
> > Yes Agree. This makes implementation on the client side simpler as well.
> > So it is welcome.
> > 
> > Just updating the solution.
> > 1) New RPCs are going to be implemented.
> > 2) client stack will use these new fops.
> > 3) On the server side we have server xlator implementing these new fops
> > to decode the RPC request then resolve_resume and
> > compound-op-receiver(Better name for this is welcome) which sends one op
> > after other and send compound fop response.
> > 
> > List of compound fops identified so far:
> > Swift/S3:
> > PUT: creat(), write()s, setxattr(), fsync(), close(), rename()
> > 
> > Dht:
> > mkdir + inodelk
> > 
> > Afr:
> > xattrop+writev, xattrop+unlock to begin with.
> 
>   Geo-rep:
>   mknod,entrylk,stat(on backend gfid)
>   mkdir,entrylk,stat (on backend gfid)
>   symlink,entrylk,stat(on backend gfid)
>   
libgfapi :
glfs_setfsuid, glfs_setfsgid, glfs_setfsgroups, glfs_set_lkowner and 
leaseid - these are not network fops, hence mostly impact gfapi interface for 
compound fops.
open/create + lease + lk
readdir + stat + getxattrs => already being discussed to replace this with 
readdirplus
M

Re: [Gluster-devel] compound fop design first cut

2015-12-09 Thread Jeff Darcy



On December 9, 2015 at 7:07:06 AM, Ira Cooper (i...@redhat.com) wrote:
> A simple "abort on failure" and let the higher levels clean it up is
> probably right for the type of compounding I propose. It is what SMB2
> does. So, if you get an error return value, cancel the rest of the
> request, and have it return ECOMPOUND as the errno.

This is exactly the part that worries me.  If a compound operation
fails, some parts of it will often need to be undone.  “Let the higher
levels clean it up” means that rollback code will be scattered among all
of the translators that use compound operations.  Some of them will do
it right.  Others . . . less so.  ;)  All will have to be tested
separately.  If we centralize dispatch of compound operations into one
piece of code, we can centralize error detection and recovery likewise.
That ensures uniformity of implementation, and facilitates focused
testing (or even formal proof) of that implementation.

Can we gain the same benefits with a more generic design?  Perhaps.  It
would require that the compounding translator know how to reverse each
type of operation, so that it can do so after an error.  That’s
feasible, though it does mean maintaining a stack of undo actions
instead of a simple state.  It might also mean testing combinations and
scenarios that will actually never occur in other components’ usage of
the compounding feature.  More likely it means that people will *think*
they can use the facility in unanticipated ways, until their
unanticipated usage creates a combination or scenario that was never
tested and doesn’t work.  Those are going to be hard problems to debug.
I think it’s better to be explicit about which permutations we actually
expect to work, and have those working earlier.
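
For what the "stack of undo actions" could look like in a centralized dispatcher, here is a minimal sketch. The function run_with_rollback, the toy namespace dict and the helper callables are all hypothetical; real rollback would have to deal with failures in the undo path as well, which this ignores.

```python
import errno


def run_with_rollback(steps):
    """steps: list of (do, undo) callables. Returns (ok, results_or_error)."""
    undo_stack, results = [], []
    for do, undo in steps:
        try:
            results.append(do())
        except OSError as err:
            # Reverse completed sub-operations, most recent first.
            while undo_stack:
                undo_stack.pop()()
            return False, err
        undo_stack.append(undo)
    return True, results


# Toy namespace standing in for a brick: path -> contents.
namespace = {}


def create_tmp():
    namespace["obj.tmp"] = b""


def unlink_tmp():
    del namespace["obj.tmp"]


def failing_write():
    raise OSError(errno.ENOSPC, "no space left on device")


ok, outcome = run_with_rollback([(create_tmp, unlink_tmp),
                                 (failing_write, lambda: None)])
```

After the failing write, the temporary object created by the first step is gone again, and the caller sees only the original error.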

Re: [Gluster-devel] intermittent test failure: tests/bugs/tier/bug-1279376-rename-demoted-file.t

2015-12-09 Thread Nithya Balachandran
> 
> 
> - Original Message -
> > From: "Michael Adam" 
> > To: gluster-devel@gluster.org
> > Sent: Wednesday, December 9, 2015 1:46:32 PM
> > Subject: [Gluster-devel] intermittent test failure:
> > tests/bugs/tier/bug-1279376-rename-demoted-file.t
> > 
> > Hi,
> > 
> > found another one. See
> > 
> > https://build.gluster.org/job/rackspace-regression-2GB-triggered/16603/consoleFull
> > 

This run failed because the rename operation could not get the inodelk - looks 
like the file was being migrated. I have posted a patch:
http://review.gluster.org/#/c/12926/
which should prevent the demotion from happening too quickly for the dst file.


> > Run by http://review.gluster.org/#/c/12830/
> > which should not change any test result.


> 
> A bug has been filed at:
> https://bugzilla.redhat.com/show_bug.cgi?id=1289845
> 
> > 
> > Michael
> > 


Re: [Gluster-devel] intermittent test failure: tests/basic/afr/sparse-file-self-heal.t

2015-12-09 Thread Michael Adam
On 2015-12-09 at 13:20 +0100, Michael Adam wrote:
> On 2015-12-09 at 09:19 +0100, Michael Adam wrote:
> > Another one:
> > 
> > https://build.gluster.org/job/rackspace-regression-2GB-triggered/16601/consoleFull
> > 
> > by
> > 
> > http://review.gluster.org/#/c/12826/
> > 
> > Cheers - Michael
> 
> 
> Again:
> 
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/16644/consoleFull

and again:

https://build.gluster.org/job/rackspace-regression-2GB-triggered/16652/consoleFull



Re: [Gluster-devel] intermittent test failure: tests/bugs/tier/bug-1279376-rename-demoted-file.t

2015-12-09 Thread Raghavendra Gowdappa


- Original Message -
> From: "Michael Adam" 
> To: gluster-devel@gluster.org
> Sent: Wednesday, December 9, 2015 1:46:32 PM
> Subject: [Gluster-devel] intermittent test failure: 
> tests/bugs/tier/bug-1279376-rename-demoted-file.t
> 
> Hi,
> 
> found another one. See
> 
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/16603/consoleFull
> 
> Run by http://review.gluster.org/#/c/12830/
> which should not change any test result.

A bug has been filed at:
https://bugzilla.redhat.com/show_bug.cgi?id=1289845

> 
> Michael
> 


Re: [Gluster-devel] intermittent test failure: tests/basic/afr/sparse-file-self-heal.t

2015-12-09 Thread Ravishankar N
I'm able to repro the issue (i.e Failed test #36 of 
sparse-file-self-heal.t) on my ancient rhs-2.1 VM but not on newer 
Fedora 21 machines:


Create a 1x2 replica and from the mount, do : `dd if=/dev/zero of=file 
bs=1024 count=1024`

When both bricks are up, `du /brick/file` = 1024
When one of the bricks is killed and the test repeated, `du /brick/file` 
= 1028


I have no idea why. The issue is reproducible on NFS and fuse mounts on 
the rhs-2.1 VM running  2.6.32 kernel, which is incidentally the same 
version running on slave29.cloud.gluster.org
While I try to figure out the issue, I am adding the test case to bad 
tests for the moment @ http://review.gluster.org/#/c/12925/ . Makes me 
wonder if we can upgrade the build machines to at least centos7 if not 
fedora. 2.6 is really an old kernel!
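
For anyone who wants to compare the numbers without du: the 1024 vs 1028 difference is 4 KiB of allocated space, not file length. A quick way to see both figures for a file is sketched below. It assumes a Linux-style st_blocks in 512-byte units, and usage_kib is just a helper name made up for this mail.

```python
import os
import tempfile


def usage_kib(path):
    """Return (apparent KiB, allocated KiB) for path."""
    st = os.stat(path)
    # st_blocks counts 512-byte units on Linux, regardless of FS block size.
    return st.st_size // 1024, st.st_blocks * 512 // 1024


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 1024 * 1024)  # same size as the dd command above
apparent_kib, allocated_kib = usage_kib(tmp.name)
os.unlink(tmp.name)
```

Running usage_kib() against the file on each brick shows whether the apparent size matches and only the allocation differs.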


Thanks,
Ravi


On 12/09/2015 02:40 PM, Ravishankar N wrote:

I'll take a look at this one.
-Ravi

On 12/09/2015 01:49 PM, Michael Adam wrote:

Another one:

https://build.gluster.org/job/rackspace-regression-2GB-triggered/16601/consoleFull

by

http://review.gluster.org/#/c/12826/

Cheers - Michael





--
Ravishankar N
work: +91 80 3924 5143
extension: 8373143
mobile: +91 96118 43905
irc nick: itisravi





--
Ravishankar N
work: +91 80 3924 5143
extension: 8373143
mobile: +91 96118 43905
irc nick: itisravi


[Gluster-devel] Meeting minutes of Gluster community meeting 2015-12-09

2015-12-09 Thread Atin Mukherjee
Minutes:
http://meetbot.fedoraproject.org/gluster-meeting/2015-12-09/gluster_community_weekly_meeting.2015-12-09-12.00.html
Minutes (text):
http://meetbot.fedoraproject.org/gluster-meeting/2015-12-09/gluster_community_weekly_meeting.2015-12-09-12.00.txt
Log:
http://meetbot.fedoraproject.org/gluster-meeting/2015-12-09/gluster_community_weekly_meeting.2015-12-09-12.00.log.html


Meeting summary
---
* Roll Call  (atinm, 12:01:08)

* AIs from last week  (atinm, 12:03:58)
  * ACTION: ndevos to send out a reminder to the maintainers about more
actively enforcing backports of bugfixes  (atinm, 12:05:23)
  * ACTION: raghu to call for volunteers and help from maintainers for
doing backports listed by rwareing to 3.6.8  (atinm, 12:06:52)
  * bug triage meeting doodle poll result to be announced on December
22, need more votes  (atinm, 12:09:10)
  * agenda is right here
https://public.pad.fsfe.org/p/gluster-community-meetings  (atinm,
12:09:49)
  * ACTION: rastar and msvbhat to publish a test exit criterion for
major/minor releases on gluster.org  (atinm, 12:10:39)
  * ACTION: kshlm & csim to set up faux/pseudo user email for gerrit,
bugzilla,  github  (atinm, 12:11:24)
  * ACTION: hagarth to decide on 3.7.7 release manager  (atinm,
12:14:12)
  * ACTION: amye to get on top of discussion on long-term releases.
(atinm, 12:15:28)
  * ACTION: hagarth to post Gluster Monthly News this week  (atinm,
12:17:47)

* GlusterFS 3.7  (atinm, 12:18:36)

* GlusterFS 3.6  (atinm, 12:19:49)
  * raghu to create 3.6.8 tracker  (atinm, 12:20:27)
  * ACTION: hagarth to create 3.6.8 for bugzilla version  (atinm,
12:21:24)
  * ACTION: community needs to find out 3.6.8 release manager  (atinm,
12:23:38)
  * ACTION: raghu to ask for volunteers for release manager for 3.6.8
(atinm, 12:24:22)

* GlusterFS 3.8  (atinm, 12:25:16)

* GlusterFS 4.0  (atinm, 12:27:13)
  * 3.8 feature freeze to happen on mid-last  Jan 2016  (atinm,
12:31:36)
  * ACTION: kkeithley_ to send a mail about using sanity checker tools
in the codebase  (atinm, 12:32:47)
  * Another follow up meeting on 3.8 to take place on first week of
January, 2016  (atinm, 12:33:23)

* Open Floor  (atinm, 12:34:02)
  * LINK:
http://www.gluster.org/pipermail/gluster-devel/2015-November/047125.html
(atinm, 12:37:25)
  * ACTION: rastar to continue the discussion on rebase+fast forward as
an option to gerrit submit type  (atinm, 12:40:49)

Meeting ended at 12:49:26 UTC.




Action Items

* ndevos to send out a reminder to the maintainers about more actively
  enforcing backports of bugfixes
* raghu to call for volunteers and help from maintainers for doing
  backports listed by rwareing to 3.6.8
* rastar and msvbhat to publish a test exit criterion for major/minor
  releases on gluster.org
* kshlm & csim to set up faux/pseudo user email for gerrit, bugzilla,
  github
* hagarth to decide on 3.7.7 release manager
* amye to get on top of discussion on long-term releases.
* hagarth to post Gluster Monthly News this week
* hagarth to create 3.6.8 for bugzilla version
* community needs to find out 3.6.8 release manager
* raghu to ask for volunteers for release manager for 3.6.8
* kkeithley_ to send a mail about using sanity checker tools in the
  codebase
* rastar to continue the discussion on rebase+fast forward as an option
  to gerrit submit type




Action Items, by person
---
* kkeithley_
  * kkeithley_ to send a mail about using sanity checker tools in the
codebase
* msvbhat
  * rastar and msvbhat to publish a test exit criterion for major/minor
releases on gluster.org
* raghu
  * raghu to call for volunteers and help from maintainers for doing
backports listed by rwareing to 3.6.8
  * raghu to ask for volunteers for release manager for 3.6.8
* rastar
  * rastar and msvbhat to publish a test exit criterion for major/minor
releases on gluster.org
  * rastar to continue the discussion on rebase+fast forward as an
option to gerrit submit type
* **UNASSIGNED**
  * ndevos to send out a reminder to the maintainers about more actively
enforcing backports of bugfixes
  * kshlm & csim to set up faux/pseudo user email for gerrit, bugzilla,
github
  * hagarth to decide on 3.7.7 release manager
  * amye to get on top of discussion on long-term releases.
  * hagarth to post Gluster Monthly News this week
  * hagarth to create 3.6.8 for bugzilla version
  * community needs to find out 3.6.8 release manager




People Present (lines said)
---
* atinm (116)
* obnox (21)
* kkeithley_ (15)
* raghu (10)
* rastar (10)
* jiffin (7)
* rafi (5)
* hgowtham (4)
* anoopcs (3)
* zodbot (3)
* pranithk (3)
* Manikandan (2)
* skoduri (1)
* msvbhat (1)
* ggarg (1)
* partner (1)
* rjoseph (1)

Cheers,
Atin

Re: [Gluster-devel] intermittent test failure: tests/basic/afr/sparse-file-self-heal.t

2015-12-09 Thread Michael Adam
On 2015-12-09 at 09:19 +0100, Michael Adam wrote:
> Another one:
> 
> https://build.gluster.org/job/rackspace-regression-2GB-triggered/16601/consoleFull
> 
> by
> 
> http://review.gluster.org/#/c/12826/
> 
> Cheers - Michael


Again:

https://build.gluster.org/job/rackspace-regression-2GB-triggered/16644/consoleFull

same patch (rebased)




Re: [Gluster-devel] compound fop design first cut

2015-12-09 Thread Ira Cooper
Jeff Darcy  writes:

> However, I’d be even more comfortable with an even simpler approach that
> avoids the need to solve what the database folks (who have dealt with
> complex transactions for years) would tell us is a really hard problem.
> Instead of designing for every case we can imagine, let’s design for the
> cases that we know would be useful for improving performance.  Open plus
> read/write plus close is an obvious one.  Raghavendra mentions
> create+inodelk as well.  For each of those, we can easily define a
> structure that contains the necessary fields, we don’t need a
> client-side translator, and the server-side translator can take care of
> “forwarding” results from one sub-operation to the next.  We could even
> use GF_FOP_IPC to prototype this.  If we later find that the number of
> “one-off” compound requests is growing too large, then at least we’ll
> have some experience to guide our design of a more general alternative.
> Right now, I think we’re trying to look further ahead than we can see
> clearly.

Actually, I'm taking the design I've seen another network protocol use,
SMB2, and proposing it here. I'd be shocked if NFS doesn't behave in the
same way.

Interestingly, all the cases really deal with a single file, and a
single lock, and a single...

There's a reason I talked about a single sentinel value, and not
multiple ones.  Because I wanted to keep it simple.  Yes, the extensions
you mention are obvious, but they lead to a giant mess, that we may not
want initially.  (But that we CAN extend into if we want them.  I made
the choice not to go there because honestly, I found the complexity too
much for me.)

A simple "abort on failure" and let the higher levels clean it up is
probably right for the type of compounding I propose.  It is what SMB2
does.  So, if you get an error return value, cancel the rest of the
request, and have it return ECOMPOUND as the errno.
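
A minimal sketch of that behaviour, combined with the sentinel idea, is below. The names are hypothetical: ECOMPOUND is given a made-up numeric value, and READS/SETS is an assumed mapping of which compound value (gfid or glfd) each op inherits or sets. It also assumes the failing op reports its own errno while the cancelled ops report ECOMPOUND.

```python
import errno

SENTINEL = -1
ECOMPOUND = 10001  # hypothetical errno from this thread; no platform defines it

# Assumed mapping: which compound value an op reads when its first
# argument is SENTINEL, and which value it sets on success.
READS = {"open": "gfid", "write": "glfd", "close": "glfd"}
SETS = {"lookup": "gfid", "open": "glfd"}


def run_compound(ops, handlers):
    """Run (name, args) pairs in order; stop doing work at the first error."""
    ctx, replies, failed = {}, [], False
    for name, args in ops:
        if failed:
            replies.append((name, ECOMPOUND))  # cancelled, never executed
            continue
        args = list(args)
        if args and args[0] == SENTINEL and name in READS:
            args[0] = ctx[READS[name]]  # inherit from an earlier sub-operation
        try:
            result = handlers[name](*args)
        except OSError as err:
            failed = True
            replies.append((name, err.errno))
            continue
        if name in SETS:
            ctx[SETS[name]] = result
        replies.append((name, 0))
    return replies


def toy_handlers(names):
    """In-memory stand-ins for the real fops; names maps filename -> gfid."""
    def lookup(parent, name):
        if name not in names:
            raise OSError(errno.ENOENT, "no such file")
        return names[name]
    return {
        "lookup": lookup,
        "open": lambda gfid, flags: ("fd", gfid),
        "write": lambda fd, data, size: size,
        "close": lambda fd: 0,
    }


OPS = [("lookup", (1, "foo")), ("open", (SENTINEL, "O_RDWR")),
       ("write", (SENTINEL, "foo", 3)), ("close", (SENTINEL,))]
```

If the lookup fails, nothing after it runs and the caller gets one real error followed by ECOMPOUND for the rest; any cleanup is left to the higher levels.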

Note: How you keep the list to be compounded doesn't matter much to me.
The semantics matter, because those are what I can ask for later, and
allow us to create ops the original designers hadn't thought of, which
is usually the hallmark of a good design.

I think you should look for a simple design you can "grow into" instead
of creating one off ops, to satisfy a demand today.

My thoughts,

-Ira

Re: [Gluster-devel] intermittent test failure: tests/basic/afr/sparse-file-self-heal.t

2015-12-09 Thread Ravishankar N

I'll take a look at this one.
-Ravi

On 12/09/2015 01:49 PM, Michael Adam wrote:

Another one:

https://build.gluster.org/job/rackspace-regression-2GB-triggered/16601/consoleFull

by

http://review.gluster.org/#/c/12826/

Cheers - Michael





--
Ravishankar N
work: +91 80 3924 5143
extension: 8373143
mobile: +91 96118 43905
irc nick: itisravi


[Gluster-devel] Storing pNFS related state on GlusterFS

2015-12-09 Thread Soumya Koduri

Hi,

pNFS is a feature introduced as part of the NFSv4.1 protocol to allow
direct client access to the storage devices containing file data (in
short, parallel I/O). Clients request layouts for an entire file or for
a specific range. On receiving the layout information, they directly
contact the server containing the data for the I/O.


In a cluster of (NFS) servers,
* Meta-data servers (MDS) are responsible for providing layouts of the
  file and recalling them if the layout changes.
* Data servers (DS) contain the actual data and process the I/O.
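
As a toy illustration of this split (not NFS-Ganesha or GlusterFS
code), the Python sketch below has a client ask an MDS for the layout
of a byte range and then read directly from the data servers. The
class names, the layout_get() and read() methods, and the range-based
segment model are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    offset: int  # start of the byte range
    length: int  # size of the byte range
    ds: str      # name of the data server holding this range


class MetadataServer:
    """Hands out layouts; holds no file data in this toy model."""

    def __init__(self, layouts):
        self._layouts = layouts  # path -> list of Segment

    def layout_get(self, path, offset, length):
        end = offset + length
        return [s for s in self._layouts[path]
                if s.offset < end and offset < s.offset + s.length]


class DataServer:
    """Holds the actual bytes and serves reads."""

    def __init__(self, files):
        self._files = files  # path -> (base offset, bytes stored here)

    def read(self, path, offset, length):
        base, blob = self._files[path]
        return blob[offset - base:offset - base + length]


def client_read(mds, data_servers, path, offset, length):
    """Fetch the layout from the MDS, then do the I/O against each DS."""
    out = bytearray()
    end = offset + length
    for seg in sorted(mds.layout_get(path, offset, length),
                      key=lambda s: s.offset):
        start = max(offset, seg.offset)
        stop = min(end, seg.offset + seg.length)
        out += data_servers[seg.ds].read(path, start, stop - start)
    return bytes(out)
```

The MDS is consulted once for the layout; the data itself never passes
through it.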

For more information, kindly refer to [1].

Currently, with NFS-Ganesha+GlusterFS, we support FILE_LAYOUTs, but
with a single MDS.


So
* to avoid a single point of failure and be able to support multiple
  MDSes, and
* to recall the layout in a cluster of (NFS) servers,
we need to store the layouts on the back-end filesystem (GlusterFS) and
recall them on any conflicting access that may change the file layout.


Since this is along similar lines to storing and recalling lease state
(with slightly different semantics), we are planning to store and
process layouts as a special type of lease ('LAYOUT') in the lease
xlator being worked on as part of [2].
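
To show the intended behaviour in the small, here is a hedged Python
sketch of layouts kept as one more lease type and recalled on a
conflicting access. It is not the lease xlator's API: the LeaseStore
class, its method names, and the READ/WRITE members of the enum are
invented for illustration; only the 'LAYOUT' type comes from this
mail.

```python
from enum import Enum


class LeaseType(Enum):
    READ = "read"      # placeholder for an ordinary lease type
    WRITE = "write"    # placeholder for an ordinary lease type
    LAYOUT = "layout"  # the special lease type proposed here


class LeaseStore:
    """Toy stand-in for the state a lease xlator would keep."""

    def __init__(self):
        self._leases = {}   # gfid -> list of (holder, LeaseType)
        self.recalled = []  # (holder, gfid) pairs that were recalled

    def grant(self, gfid, holder, lease_type):
        self._leases.setdefault(gfid, []).append((holder, lease_type))

    def holders(self, gfid):
        return list(self._leases.get(gfid, []))

    def layout_changing_op(self, gfid, requester):
        """Before an op that may change the layout, recall the LAYOUT
        leases held by anyone other than the requester."""
        kept = []
        for holder, lease_type in self._leases.get(gfid, []):
            if lease_type is LeaseType.LAYOUT and holder != requester:
                self.recalled.append((holder, gfid))
            else:
                kept.append((holder, lease_type))
        self._leases[gfid] = kept
```

Because the state lives with the file on the back end, any MDS in the
cluster can trigger the recall, not only the one that granted the
layout.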


More details are captured in the below spec [3] :
http://review.gluster.org/#/c/12367

Kindly review the same and provide your inputs/comments.

Thanks,
Soumya

[1] https://tools.ietf.org/rfc/rfc5661.txt (Section 12. Parallel NFS (pNFS))

[2] http://review.gluster.org/#/c/11980/

[3] http://review.gluster.org/#/c/12367


Re: [Gluster-devel] compound fop design first cut

2015-12-09 Thread Kotresh Hiremath Ravishankar
Geo-rep requirements inline.

Thanks and Regards,
Kotresh H R

- Original Message -
> From: "Pranith Kumar Karampuri" 
> To: "Vijay Bellur" , "Jeff Darcy" , 
> "Raghavendra Gowdappa"
> , "Ira Cooper" 
> Cc: "Gluster Devel" 
> Sent: Wednesday, December 9, 2015 11:44:52 AM
> Subject: Re: [Gluster-devel] compound fop design first cut
> 
> 
> 
> On 12/09/2015 06:37 AM, Vijay Bellur wrote:
> > On 12/08/2015 03:45 PM, Jeff Darcy wrote:
> >>
> >>
> >>
> >> On December 8, 2015 at 12:53:04 PM, Ira Cooper (i...@redhat.com) wrote:
> >>> Raghavendra Gowdappa writes:
> >>> I propose that we define a "compound op" that contains ops.
> >>>
> >>> Within each op, there are fields that can be "inherited" from the
> >>> previous op, via use of a sentinel value.
> >>>
> >>> Sentinel is -1, for all of these examples.
> >>>
> >>> So:
> >>>
> >>> LOOKUP (1, "foo") (Sets the gfid value to be picked up by
> >>> compounding, 1
> >>> is the root directory, as a gfid, by convention.)
> >>> OPEN(-1, O_RDWR) (Uses the gfid value, sets the glfd compound value.)
> >>> WRITE(-1, "foo", 3) (Uses the glfd compound value.)
> >>> CLOSE(-1) (Uses the glfd compound value)
> >>
> >> So, basically, what the programming-language types would call futures
> >> and promises.  It’s a good and well studied concept, which is necessary
> >> to solve the second-order problem of how to specify an argument in
> >> sub-operation N+1 that’s not known until sub-operation N completes.
> >>
> >> To be honest, some of the highly general approaches suggested here scare
> >> me too.  Wrapping up the arguments for one sub-operation in xdata for
> >> another would get pretty hairy if we ever try to go beyond two
> >> sub-operations and have to nest sub-operation #3’s args within
> >> sub-operation #2’s xdata which is itself encoded within sub-operation
> >> #1’s xdata.  There’s also not much clarity about how to handle errors in
> >> that model.  Encoding N sub-operations’ arguments in a linear structure
> >> as Shyam proposes seems a bit cleaner that way.  If I were to continue
> >> down that route I’d suggest just having start_compound and end_compound
> >> fops, plus an extra field (or by-convention xdata key) that either the
> >> client-side or server-side translator could use to build whatever
> >> structure it wants and schedule sub-operations however it wants.
> >>
> >> However, I’d be even more comfortable with an even simpler approach that
> >> avoids the need to solve what the database folks (who have dealt with
> >> complex transactions for years) would tell us is a really hard problem.
> >> Instead of designing for every case we can imagine, let’s design for the
> >> cases that we know would be useful for improving performance. Open plus
> >> read/write plus close is an obvious one.  Raghavendra mentions
> >> create+inodelk as well.  For each of those, we can easily define a
> >> structure that contains the necessary fields, we don’t need a
> >> client-side translator, and the server-side translator can take care of
> >> “forwarding” results from one sub-operation to the next.  We could even
> >> use GF_FOP_IPC to prototype this.  If we later find that the number of
> >> “one-off” compound requests is growing too large, then at least we’ll
> >> have some experience to guide our design of a more general alternative.
> >> Right now, I think we’re trying to look further ahead than we can see
> >> clearly.
> Yes, agreed. This makes implementation on the client side simpler as well,
> so it is welcome.
> 
> Just updating the solution.
> 1) New RPCs are going to be implemented.
> 2) The client stack will use these new fops.
> 3) On the server side, the server xlator implements these new fops to
> decode the RPC request and then resolve_resume, and a
> compound-op-receiver (a better name for this is welcome) sends one op
> after the other and then sends the compound fop response.
> 
> List of compound fops identified so far:
> Swift/S3:
> PUT: creat(), write()s, setxattr(), fsync(), close(), rename()
> 
> Dht:
> mkdir + inodelk
> 
> Afr:
> xattrop+writev, xattrop+unlock to begin with.

  Geo-rep:
  mknod, entrylk, stat (on backend gfid)
  mkdir, entrylk, stat (on backend gfid)
  symlink, entrylk, stat (on backend gfid)
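
  One way to picture a "well-defined set" of compound fops is a fixed
  table that a request is checked against. The Python sketch below is
  only an illustration: the COMPOUND_FOPS table and is_known_compound()
  are hypothetical names, and the Swift/S3 PUT sequence is left out
  because its number of write()s varies.

```python
# Fixed sequences taken from the list in this thread; the grouping keys
# are just labels for the component that asked for them.
COMPOUND_FOPS = {
    "dht": [("mkdir", "inodelk")],
    "afr": [("xattrop", "writev"), ("xattrop", "unlock")],
    "geo-rep": [
        ("mknod", "entrylk", "stat"),
        ("mkdir", "entrylk", "stat"),
        ("symlink", "entrylk", "stat"),
    ],
}


def is_known_compound(fops):
    """Return True if the fop sequence is one of the predefined compounds."""
    seq = tuple(fops)
    return any(seq in seqs for seqs in COMPOUND_FOPS.values())
```

  A translator can then reject anything outside the table instead of
  introspecting a generic representation.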
  
> 
> Could everyone who needs compound fops add to this list?
> 
> I see that Niels is back on 14th. Does anyone else know the list of
> compound fops he has in mind?
> 
> Pranith.
> >
> > Starting with a well defined set of operations for compounding has its
> > advantages. It would be easier to understand and maintain correctness
> > across the stack. Some of our translators perform transactions &
> > create/update internal metadata for certain fops. It would be easier
> > for such translators if the compound operations are well defined and
> > do not entail deep introspection of a generic representation to
> > ensure that the right behavior gets reflected at the end of a compound
> > operation.
> >
> > -Vijay
> >
> >
> >
> 
> 

[Gluster-devel] intermittent test failure: tests/basic/afr/sparse-file-self-heal.t

2015-12-09 Thread Michael Adam
Another one:

https://build.gluster.org/job/rackspace-regression-2GB-triggered/16601/consoleFull

by

http://review.gluster.org/#/c/12826/

Cheers - Michael



[Gluster-devel] intermittent test failure: tests/bugs/tier/bug-1279376-rename-demoted-file.t

2015-12-09 Thread Michael Adam
Hi,

Found another one. See

https://build.gluster.org/job/rackspace-regression-2GB-triggered/16603/consoleFull

Run by http://review.gluster.org/#/c/12830/
which should not change any test result.

Michael

