Re: [Nfs-ganesha-devel] NFS v4 permission checking - READ/WRITE operations

2015-04-23 Thread J. Bruce Fields
On Wed, Apr 22, 2015 at 04:45:36PM -0700, Frank Filz wrote:
> Bruce Fields just posted a patch for knfsd to do permission checking even
> when an OPEN stateid is presented, to protect against a rogue client
> spoofing stateid.

(Note that's not a new policy, we've always required permissions--I was
fixing a bug that I introduced last year and only just noticed.)

> That does seem like a good point. Some considerations:
> 
> 1. Do we ever really have a non-owner open a file, and then have permissions
> changed to not allow that user access, where a local process would continue
> to have access? If there's no real case scenario there, then we don't lose
> anything by enforcing current permissions.

The usual solution here is the owner-override hack:

> 2. If we go with this, we should still skip permission check with open (or
> lock) stateid when the user is the owner (because we would do owner override
> anyway).

Right.

And I think the case the owner-override hack addresses *is* something
that really comes up in the wild.

(Other cases--changing the owner on an open file?--I haven't heard
complaints about.)

> 3. Could we attach some credentials to the stateid to prevent spoofing?
> That may not be practical unless all we require is the stateid came
> from the same client since open stateids might be shared across
> multiple users (for example Ganesha will certainly do that when doing
> anonymous I/O over FSAL_PROXY even AFTER we implement non-multiplexed
> file descriptors).

knfsd doesn't attempt to track associations between stateid's and
principals; we just do file-based permissions checks.
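
(For what it's worth, the owner-override check amounts to something like the
sketch below--hypothetical helper names, not the actual knfsd or Ganesha
code:)

#include <stdbool.h>
#include <sys/types.h>

/* mode_allows stands in for the ordinary mode-bit/ACL permission check. */
static bool nfs_io_permitted(uid_t caller, uid_t file_owner, bool mode_allows)
{
	/* Owner override: the file's owner keeps I/O access even if the
	 * permissions were tightened after the file was opened. */
	if (caller == file_owner)
		return true;
	return mode_allows;
}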

I *think* the spec may allow (even encourage?) restricting use of the
stateid only to principals that have previously performed an associated
open.  I can't find the language right now, so maybe that's my imagination.

--b.



Re: [Nfs-ganesha-devel] DISCUSSION: V2.3 workflow and how we proceed

2015-04-27 Thread J. Bruce Fields
On Fri, Apr 24, 2015 at 02:59:49PM -0700, Frank Filz wrote:
> 2a. One solution here is use an e-mail review system (like the kernel
> process). This could be e-mail exclusively (I would then have to set up
> e-mail on my Linux VM in order to be able to merge patches via e-mail). I'm
> not fond of this process, but it would be workable.

For what it's worth: my mail client runs on a different machine from my
development trees, and I just scp around mailbox files.  All you need is
the ability to export a set of tagged messages to an mbox file.  I'd
think any client could be coerced into doing that somehow, but I don't
know.

--b.



Re: [Nfs-ganesha-devel] What's the status of cluster ganesha?

2015-05-28 Thread J. Bruce Fields
On Wed, May 27, 2015 at 10:10:26PM -0700, Frank Filz wrote:
> > On Wed, May 27, 2015 at 9:30 PM, Frank Filz wrote:
> > >> Frank, thanks for your reply; it really helps me to understand how
> > >> the NFS HA works.
> > >>
> > >> Another question coming to my mind is: the "state reclaim" only
> > >> talks about reclaiming locks; what about the reply cache, which is
> > >> used to implement exactly-once semantics?  A previous email states
> > >> that nfs-ganesha will not implement a distributed reply cache (DRC),
> > >> so exactly-once semantics will not be maintained during
> > >> failover?
> > >
> > > Correct. It's impossible to guarantee anyway, unless you make the
> > > DRC transactional with the operations.
> > >
> > > Consider this:
> > >
> > > Client sends a REMOVE to delete a file.
> > > Server processes the unlink system call, but fails before the response
> > > is saved in the clustered DRC and thus before the response is sent to
> > > the client.
> > > Client retries REMOVE, which is not detected by the DRC, and thus an
> > > unlink system call is made that fails.
> > >
> > > Only way to make that perfect is to make DRC atomic with the unlink
> > > system call...
> > 
> > Maybe something like transaction log/journal can help here: before
> > calling unlink system call, nfs server should persistent a log to
> > somewhere saying that server is calling unlink system call with
> > certain parameters.  If unlink system call succeed, then add another
> > transaction log saying this operation succeed.  If unlink failed,
> > then add another transaction log saying this operation failed.  In
> > any case, if server is crashed, then another nfs server can check
> > the transaction log to find out what exactly happened for this
> > request.
> 
> That would help, though we would still be left guessing if we find:
> 
> An entry in the transaction log indicating an unlink call is to be made
> The file is deleted
> 
> The file may or may not have actually been deleted by the unlink call
> in the transaction log
> 
> UNLESS we don't expect any other process to be deleting files (either
> every process uses the same transaction log, or Ganesha is the only
> process touching the files).
> 
> But if we have every process using the same transaction log, then we
> basically get back to my statement of being able to make the DRC and
> unlink atomic.

Most filesystems already do this kind of thing somewhere under the
covers, so in theory it should be possible to solve this in cooperation
with the filesystem, but nobody's looked into it seriously that I'm
aware of.

The same problem can happen over a crash/restart.

One useful step might be to work out how to reproduce the problem easily
and reliably.
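
(For what it's worth, the intent-log idea above would look roughly like the
sketch below--txnlog_append() is a made-up helper, assumed to fsync a record
to shared storage; this is not an existing Ganesha or filesystem API:)

#include <errno.h>
#include <unistd.h>

void txnlog_append(unsigned long xid, const char *event, const char *path);

int logged_unlink(const char *path, unsigned long xid)
{
	int ret;

	/* 1. Persist the intent before touching the filesystem. */
	txnlog_append(xid, "UNLINK-intent", path);

	/* 2. Do the operation. */
	ret = unlink(path);

	/* 3. Persist the outcome, so a successor that sees a retry of the
	 *    same request (xid) can replay the saved result instead of
	 *    issuing a second unlink.  As noted above, a crash between
	 *    steps 1 and 3 still leaves the outcome ambiguous unless the
	 *    log and the unlink are made atomic. */
	txnlog_append(xid, ret == 0 ? "UNLINK-ok" : "UNLINK-failed", path);

	return ret == 0 ? 0 : -errno;
}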

--b.



Re: [Nfs-ganesha-devel] What's the status of cluster ganesha?

2015-06-01 Thread 'J. Bruce Fields'
On Thu, May 28, 2015 at 11:20:31AM -0700, Frank Filz wrote:
> > From: J. Bruce Fields [mailto:bfie...@fieldses.org]
> > On Wed, May 27, 2015 at 10:10:26PM -0700, Frank Filz wrote:
> > > UNLESS we don't expect any other process to be deleting files (either
> > > every process uses the same transaction log, or Ganesha is the only
> > > process touching the files).
> > >
> > > But if we have every process using the same transaction log, then we
> > > basically get back to my statement of being able to make the DRC and
> > > unlink atomic.
> > 
> > Most filesystems already do this kind of thing somewhere under the covers,
> > so in theory it should be possible to solve this in cooperation with the
> > filesystem, but nobody's looked into it seriously that I'm aware of.
> > 
> > The same problem can happen over a crash/restart.
> > 
> > One useful step might be to work out how to reproduce the problem easily
> > and reliably.
> 
> Yea, if the filesystems provided an interface to hook into their transaction
> log, we could implement this pretty easily.
> 
> The catch might be in how the filesystem would want to handle transaction
> recovery when a user space application initiated a transaction. There would
> have to be some mechanism to abort the user space transaction if it was
> unable to perform its recovery, with a definition of what happens to the
> filesystem portions of that transaction (does the unlink take effect
> yes/no?).

I don't understand the need for an abort mechanism?

You need some way to retry (or query the status of) some operation that
was attempted before a restart, that's all.

--b.



Re: [Nfs-ganesha-devel] BIND_CONN_TO_SESSION

2015-11-09 Thread J. Bruce Fields
On Sun, Nov 08, 2015 at 09:32:08PM -0800, Frank Filz wrote:
> It's on my "do sometime" list.
> 
>  
> 
> I'm not sure if any of the other developers have more immediate plans to
> look at BIND_CONN_TO_SESSION.

Note this is mandatory for any server 4.1 implementation.

Trunking isn't the only use: it's also necessary whenever a client wants
a series of (not necessarily simultaneous) connections to be associated
with the same session.  E.g. if they want to restore the backchannel
after a lost tcp connection without losing the reply cache.
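
For reference, the arguments look roughly like this in C terms (an
illustrative rendering of the RFC 5661 XDR, not Ganesha's generated types):

enum channel_dir_from_client4 {
	CDFC4_FORE = 0x1,
	CDFC4_BACK = 0x2,
	CDFC4_FORE_OR_BOTH = 0x3,
	CDFC4_BACK_OR_BOTH = 0x7
};

struct BIND_CONN_TO_SESSION4args {
	char	bctsa_sessid[16];		/* existing session to bind to */
	enum channel_dir_from_client4 bctsa_dir;/* e.g. CDFC4_BACK_OR_BOTH to
						 * restore a lost backchannel */
	int	bctsa_use_conn_in_rdma_mode;	/* boolean in the XDR */
};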

--b.

> 
>  
> 
> Frank
> 
>  
> 
> From: Olivier Hault (Level IT) [mailto:olivier.ha...@level-it.be] 
> Sent: Saturday, November 7, 2015 4:04 AM
> To: ffilz...@mindspring.com
> Subject: BIND_CONN_TO_SESSION
> 
>  
> 
> Hello Frank,
> 
>  
> 
> I just wonder what is currently planned for BIND_CONN_TO_SESSION
> implementation to support session trunking in Ganesha?
> 
>  
> 
> Kind regards,
> 
> Olivier
> 
> 
> 





Re: [Nfs-ganesha-devel] FOPs to be blocked during grace

2015-11-09 Thread J. Bruce Fields
On Mon, Nov 09, 2015 at 09:36:03PM +0530, Soumya Koduri wrote:
> Hi,
> 
> From the code it looks like we block the following FOPs while the
> NFS server is in grace (which have 'nfs_in_grace' check)-
> 
> NFSv3 -
> SETATTR
> 
> NLM -
> LOCK
> UNLOCK
> 
> NFSv4 -
> OPEN
> LOCK
> REMOVE
> RENAME
> SETATTR
> 
> Requesting clarification on the reasoning behind selecting these fops. Dan
> confirmed that, as per RFC 5661, RENAME and REMOVE should be denied during
> grace to support volatile file handles (which we don't support...).

RENAME and REMOVE also conflict with delegations.  So I think you don't
want to allow those till clients have recovered their delegations (or
discovered that they can't).

I think LINK belongs on that list for similar reasons.

--b.

> 
> And from RFC 3530 -
> 
> 8.6.2.  Server Failure and Recovery
> 
> If the server loses locking state (usually as a result of a restart
> or reboot), it must allow clients time to discover this fact and re-
> establish the lost locking state.  The client must be able to re-
> establish the locking state without having the server deny valid
> requests because the server has granted conflicting access to another
> client.  Likewise, if there is the possibility that clients have not
> yet re-established their locking state for a file, the server must
> disallow READ and WRITE operations for that file.  The duration of
> this recovery period is equal to the duration of the lease period.
> 
> .
> .
> .
> 
> The period of special handling of locking and READs and WRITEs, equal
> in duration to the lease period, is referred to as the "grace
> period".  During the grace period, clients recover locks and the
> associated state by reclaim-type locking requests (i.e., LOCK
> requests with reclaim set to true and OPEN operations with a claim
> type of CLAIM_PREVIOUS).  During the grace period, the server must
> reject READ and WRITE operations and non-reclaim locking requests
> (i.e., other LOCK and OPEN operations) with an error of
> NFS4ERR_GRACE.
> 
> 
> 
> Does it mean that the NFS server needs to reject I/Os as well unless it
> is sure that there can be no other reclaim-type LOCK/OPEN requests? Also,
> why is SETATTR handled specially, unlike the WRITE fop?
> 
> Thanks,
> Soumya
> 



Re: [Nfs-ganesha-devel] FOPs to be blocked during grace

2015-11-10 Thread J. Bruce Fields
On Tue, Nov 10, 2015 at 05:13:53PM +0530, Soumya Koduri wrote:
> 
> 
> On 11/10/2015 01:50 AM, J. Bruce Fields wrote:
> >On Mon, Nov 09, 2015 at 09:36:03PM +0530, Soumya Koduri wrote:
> >>Hi,
> >>
> >> From the code it looks like we block the following FOPs while the
> >>NFS server is in grace (which have 'nfs_in_grace' check)-
> >>
> >>NFSv3 -
> >>SETATTR
> >>
> >>NLM -
> >>LOCK
> >>UNLOCK
> >>
> >>NFSv4 -
> >>OPEN
> >>LOCK
> >>REMOVE
> >>RENAME
> >>SETATTR
> >>
> >>Request clarification behind selecting these fops. Dan confirmed that
> >>'as per 5661 RFC, RENAME and REMOVE should be denied during grace to
> >>support volatile file handles (which we don't support...).
> >
> >RENAME and REMOVE also conflict with delegations.  So I think you don't
> >want to allow those till clients have recovered their delegations (or
> >discovered that they can't).
> >
> >I think LINK belongs on that list for similar reasons.
> >
> 
> Thanks Bruce. What about NFSv3 fops?  I can see I/Os going on even
> with kernel-NFS while the server is in grace (haven't checked with
> delegations though).

Yes, the grace period should really block v3 ops too.  It can't block
opens, obviously (there aren't any v3 opens), but it should block
specific operations that would conflict with recoverable v4 state.

That said, Linux knfsd currently *doesn't* do most of that.  (I believe
it's only blocking NLM lock/unlock).  I think that's a bug.  But that
means I don't have much experience yet with what it would mean to turn
this on.  People aren't used to v3 blocking on the grace period (unless
they do a lot of file locking), so you'd want to be careful to minimize
the impact, I think--e.g. make sure the server knows not to enforce
these things if there are no NFSv4 clients to recover.
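
(A rough sketch of the kind of gating I mean, with made-up helper names--not
actual knfsd or Ganesha code:)

#include <stdbool.h>

bool nfs_in_grace(void);
unsigned int clients_with_unreclaimed_v4_state(void);

/* Would apply to the v3 ops that can conflict with recoverable v4 state:
 * READ, WRITE, SETATTR, LINK, REMOVE, RENAME (plus the NLM lock ops). */
static bool v3_op_must_wait_for_grace(void)
{
	if (!nfs_in_grace())
		return false;
	/* Don't penalize v3 traffic when there is no v4 state left that a
	 * client could still reclaim. */
	return clients_with_unreclaimed_v4_state() > 0;
}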

--b.

> 
> -Soumya
> 
> >--b.
> 
> 
> >
> >>
> >>And from RFC 3530 -
> >>
> >>8.6.2.  Server Failure and Recovery
> >>
> >> If the server loses locking state (usually as a result of a restart
> >> or reboot), it must allow clients time to discover this fact and re-
> >> establish the lost locking state.  The client must be able to re-
> >> establish the locking state without having the server deny valid
> >> requests because the server has granted conflicting access to another
> >> client.  Likewise, if there is the possibility that clients have not
> >> yet re-established their locking state for a file, the server must
> >> disallow READ and WRITE operations for that file.  The duration of
> >> this recovery period is equal to the duration of the lease period.
> >>
> >>.
> >>.
> >>.
> >>
> >> The period of special handling of locking and READs and WRITEs, equal
> >> in duration to the lease period, is referred to as the "grace
> >> period".  During the grace period, clients recover locks and the
> >> associated state by reclaim-type locking requests (i.e., LOCK
> >> requests with reclaim set to true and OPEN operations with a claim
> >> type of CLAIM_PREVIOUS).  During the grace period, the server must
> >> reject READ and WRITE operations and non-reclaim locking requests
> >> (i.e., other LOCK and OPEN operations) with an error of
> >> NFS4ERR_GRACE.
> >>
> >>
> >>
> >>Does it mean that the NFS server need to reject I/Os as well unless it
> >>is sure that there can be no other reclaim-type LOCK/OPEN requests? Also
> >>why is SETATTR handled specially unlike WRITE fop.
> >>
> >>Thanks,
> >>Soumya
> >>



Re: [Nfs-ganesha-devel] FOPs to be blocked during grace

2015-11-16 Thread J. Bruce Fields
On Mon, Nov 16, 2015 at 05:05:02PM +0530, Soumya Koduri wrote:
> 
> 
> On 11/10/2015 08:21 PM, J. Bruce Fields wrote:
> >On Tue, Nov 10, 2015 at 05:13:53PM +0530, Soumya Koduri wrote:
> >>
> >>
> >>On 11/10/2015 01:50 AM, J. Bruce Fields wrote:
> >>>On Mon, Nov 09, 2015 at 09:36:03PM +0530, Soumya Koduri wrote:
> >>>>Hi,
> >>>>
> >>>> From the code it looks like we block the following FOPs while the
> >>>>NFS server is in grace (which have 'nfs_in_grace' check)-
> >>>>
> >>>>NFSv3 -
> >>>>SETATTR
> >>>>
> >>>>NLM -
> >>>>LOCK
> >>>>UNLOCK
> >>>>
> >>>>NFSv4 -
> >>>>OPEN
> >>>>LOCK
> >>>>REMOVE
> >>>>RENAME
> >>>>SETATTR
> >>>>
> >>>>Request clarification behind selecting these fops. Dan confirmed that
> >>>>'as per 5661 RFC, RENAME and REMOVE should be denied during grace to
> >>>>support volatile file handles (which we don't support...).
> >>>
> >>>RENAME and REMOVE also conflict with delegations.  So I think you don't
> >>>want to allow those till clients have recovered their delegations (or
> >>>discovered that they can't).
> >>>
> >>>I think LINK belongs on that list for similar reasons.
> >>>
> >>
> >>Thanks Bruce. What about NFSv3 fops?  I can see I/Os going on even
> >>with kernel-NFS while the server is in grace (haven't checked with
> >>delegations though)
> >
> >Yes, the grace period should really block v3 ops too.  It can't block
> >opens, obviously (there aren't any v3 opens), but it should block
> >specific operations that would conflict with recoverable v4 state.
> >
> >That said, Linux knfsd currently *doesn't* do most of that.  (I believe
> >it's only blocking NLM lock/unlock).  I think that's a bug.  But that
> >means I don't have much experience yet with what it would mean to turn
> >this on.  People aren't used to v3 blocking on the grace period (unless
> >they do a lot of file locking), so you'd want to be careful to minimize
> >the impact, I think--e.g. make sure the server knows not to enforce
> >these things if there are no NFSv4 clients to recover.
> 
> Alright. So if we wish to have this behavior, we can have a flag to
> check if there is any earlier NFSv4 persistent state which can be
> reclaimed (not sure if we can specifically check for only
> delegation state) and then accordingly block/unblock NFSv3 fops -
> the list should be READ, WRITE, SETATTR, LINK, REMOVE, RENAME. Right?

Yes, sounds right to me!

--b.



Re: [Nfs-ganesha-devel] NFS Transactional Compound Project

2016-04-08 Thread J. Bruce Fields
On Wed, Apr 06, 2016 at 10:37:51AM -0400, Ming Chen wrote:
> We have just created a github repo about our transactional NFS
> compounds project: https://github.com/sbu-fsl/txn-compound

To me a "transaction" is an atomic operation that succeeds or fails as a
unit, which an NFSv4 compound isn't.

I know that's not what you're aiming for, you're just looking for
performance improvements, but the name might confuse people (on a point
that people are already inclined to get confused about).

--b.

> 
> The idea of the project is to take full advantage of NFSv4 compound
> procedures for better performance by using higher-level FS APIs. NFS
> compound procedures are great but they are underutilized because of
> the low-level nature of the POSIX API, and this project provides a
> user-space NFS client library with high-level operations. For example,
> performing open, read, and close of multiple files in one compound
> procedure, or setting multiple attributes of multiple files in
> one compound. In the future, we would also like to provide optional
> transactional execution of all operations inside a compound.
> 
> We are very interested in knowing real applications that can benefit
> from this higher-level transactional FS API. Please let us know the
> APIs you think will be useful to your applications. The current draft
> txn-compound API is defined in
> https://github.com/sbu-fsl/txn-compound/blob/master/tc_client/include/tc_api.h
> 
> The code is adapted from NFS-Ganesha's FSAL_PROXY. The repo has a
> working example at
> https://github.com/sbu-fsl/txn-compound/blob/master/tc_client/MainNFSD/tc_test_read.c.
> At this stage, no transactional support is added yet.
> 
> Any feedback is warmly welcome. Thanks a lot!
> 
> Best,
> Ming
> 



Re: [Nfs-ganesha-devel] NFSv4 open close operations are slow compared to Linux kernel NFSv4

2016-04-08 Thread J. Bruce Fields
On Sun, Apr 03, 2016 at 07:40:24PM +0530, Tushar Shinde wrote:
> I am facing an issue where NFSv4 Ganesha is slower than the kernel v4 server. I
> found it to be about 20-30 percent slower.

What are your export options in the kernel case?

(We just want to make sure that you're not exporting with "async", which
is not recommended, and would give the kernel an unfair advantage on
create.)

--b.



Re: [Nfs-ganesha-devel] NFSv4 open close operations are slow compared to Linux kernel NFSv4

2016-04-11 Thread J. Bruce Fields
On Mon, Apr 11, 2016 at 06:57:19PM +0530, Tushar Shinde wrote:
> I had used async with kernel

By that you mean the async export option, not the async mount option,
right?

knfsd's "async" export option is almost always a bad idea--it's off by
default, and I'd remove it completely if I thought we could get away
with it.

> and NFS_Commit = FALSE; with Ganesha.

From some quick greps of the source code: it appears that turning on
NFS_Commit makes writes synchronous even if the client doesn't request
that.  With NFS_Commit off it looks like the behavior is still safe and
standards-compliant, and in particular will still require a sync to disk
on metadata operations (such as file creates).
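
(In other words, the decision seems to amount to something like this sketch;
hypothetical, not Ganesha's actual code:)

#include <stdbool.h>

/* Whether to sync a WRITE to disk before replying to the client. */
static bool sync_before_reply(bool client_asked_stable, bool nfs_commit_opt)
{
	/* With NFS_Commit = TRUE every write is synced, even UNSTABLE ones;
	 * otherwise only when the client asks (FILE_SYNC/DATA_SYNC), with
	 * UNSTABLE data synced later at COMMIT time. */
	return nfs_commit_opt || client_asked_stable;
}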

> Do you think still it will be unfair comparison?

So if I'm understanding everything correctly, yes, that's an unfair
comparison, and the fact that knfsd with async is faster on a workload
with lots of file creates is unsurprising.

I definitely recommend sticking with the defaults wherever possible.
That's what most users will (and should) be using, so if performance
differs significantly between the two under default options then there's
likely something worth investigating.

--b.

> 
> On Sat, Apr 9, 2016 at 2:45 AM, J. Bruce Fields  wrote:
> > On Sun, Apr 03, 2016 at 07:40:24PM +0530, Tushar Shinde wrote:
> >> I am facing issue that NFS v4 Ganesha is slower than kernel v4 server. I
> >> found it about 20-30 percent slower.
> >
> > What are your export options in the kernel case?
> >
> > (We just want to make sure that you're not exporting with "async", which
> > is not recommended, and would give the kernel an unfair advantage on
> > create.)
> >
> > --b.



Re: [Nfs-ganesha-devel] NFSv4 open close operations are slow compared to Linux kernel NFSv4

2016-04-13 Thread J. Bruce Fields
On Wed, Apr 13, 2016 at 03:50:43PM +0530, Tushar Shinde wrote:
> I did a test with kernel NFS and Ganesha, with sync and NFS_Commit =
> True;

You want "sync" for knfsd and "NFS_Commit = False" for Ganesha for an
apples-to-apples comparison.

(Which, happily, is also the default in each case.)

That said,

> Following are results,
> 
> KNFSv4
> -
> Operation    total_ops    nsec             ops/s
> -
> create       11-          668159357063     164.631378
> open         11-          266032790536     413.482880
> close        22-          872624163550     252.113129
> 
> Ganesha V4
> -
> Operation    total_ops    nsec             ops/s
> -
> create       11-          694286669184     158.435989
> open         11-          408155570824     269.505066
> close        22-          1028802481561    213.840851
> 
> The time measured is in nsec by using clock_gettime(CLOCK_REALTIME) like,
> start_timer();
> open() or operation
> stop_timer();
> 
> The open operation is very slow (about 30-34% slower); I am finding it
> consistently slow across multiple runs. Please note there is no I/O
> done, no read and no write, just open and close.

If there are no writes then the results may turn out to be the same.

So I think it's likely there's some Ganesha problem here, but it might
still be worth rerunning with NFS_Commit off, just to make sure.

--b.

> In this open I had not used O_CREAT; it's only O_RDWR.
> I observed cih_fh_cmpf is the most frequently called function, so I tried
> to increase cache_sz and nparts but that did not result in a
> significant change.
> 
> cache_inode/cache_inode_hash.c
> 75 uint32_t cache_sz = 32767;  /* XXX */
> 85 npart = cache_param.nparts;
> 
> I will debug this again, but till then, if you know any tunables which
> can affect this code path please let me know.
> 
> 
> Thank you,
> Tushar.
> 
> 
> 
> 
> On Tue, Apr 12, 2016 at 11:08 PM, Malahal Naineni  wrote:
> > Tushar Shinde [mtk.tus...@gmail.com] wrote:
> >> I had used async with kernel and NFS_Commit = FALSE; with Ganesha.
> >> Do you think still it will be unfair comparison?
> >
> > Usually NFS clients are smart and they send write with FILE_SYNC if
> > they don't have any more writes. There are some clients (VMware I think)
> > that do "write followed by commit" even if they have just one write.
> > To avoid this follow on commit, ganesha syncs the data and tells the
> > client about it with "NFS_Commit = True".
> >
> > Usually, we don't recommend "NFS_Commit = True" unless you have such a
> > client. Ganesha always honors commits and there is no way to mimic
> > kernel NFS's async option.
> >
> > As others have commented, use "sync" and compare.
> >
> > Regards, Malahal.
> >



Re: [Nfs-ganesha-devel] Request: I'm looking for good file system test programs

2017-01-17 Thread J. Bruce Fields
On Tue, Jan 17, 2017 at 10:17:04PM +0530, sriram patil wrote:
> You can take a look at xfstests. It does data correctness checks as
> well.  You will have to google to figure out the download and build
> procedure.

Yes, xfstests seems to be the suite most commonly used by linux
filesystem developers.  NFS developers have also traditionally used
cthon:

http://wiki.linux-nfs.org/wiki/index.php/Connectathon_test_suite

--b.



Re: [Nfs-ganesha-devel] Segfault seen in libntirpc code even when values of the input arguments to function 'recvmsg' look fine

2017-09-06 Thread J. Bruce Fields
On Wed, Sep 06, 2017 at 10:18:36AM -0400, Daniel Gryniewicz wrote:
> On 09/06/2017 09:09 AM, William Allen Simpson wrote:
> >On 9/5/17 10:44 AM, Daniel Gryniewicz wrote:
> >>I'm stumped, then.  It all looks fine to me.
> >>
> >I think you'll find that things work better by switching to TCP.
> >
> >The UDP client code (clnt_dg) is badly bugged in general.  That needs
> >re-writing, but UDP hasn't been a priority.
> >
> >There were unlocked critical regions fixes in ganesha 2.5.
> >
> >In 2.6, we're planning on re-doing the entire call interface to use
> >async callbacks.  There's an XXX in clnt_vc circa 2012 mentioning it.
> 
> 
> Switching to TCP may not be an option,

Whatever you need to do for this bug, eventually I suspect you want to
drop support for UDP.

--b.

> and this particular crash
> should not be affected by locks in any way, since it's passing local
> stack data.  It would be nice to figure out what's causing the
> crash.
> 
> Daniel
> 



Re: [Nfs-ganesha-devel] Segfault seen in libntirpc code even when values of the input arguments to function 'recvmsg' look fine

2017-09-06 Thread 'J. Bruce Fields'
On Wed, Sep 06, 2017 at 12:27:38PM -0700, Frank Filz wrote:
> > On Wed, Sep 06, 2017 at 10:18:36AM -0400, Daniel Gryniewicz wrote:
> > > On 09/06/2017 09:09 AM, William Allen Simpson wrote:
> > > >On 9/5/17 10:44 AM, Daniel Gryniewicz wrote:
> > > >>I'm stumped, then.  It all looks fine to me.
> > > >>
> > > >I think you'll find that things work better by switching to TCP.
> > > >
> > > >The UDP client code (clnt_dg) is badly bugged in general.  That needs
> > > >re-writing, but UDP hasn't been a priority.
> > > >
> > > >There were unlocked critical regions fixes in ganesha 2.5.
> > > >
> > > >In 2.6, we're planning on re-doing the entire call interface to use
> > > >async callbacks.  There's an XXX in clnt_vc circa 2012 mentioning it.
> > >
> > >
> > > Switching to TCP may not be an option,
> > 
> > Whatever you need to do for this bug, eventually I suspect you want to
> > drop support for UDP.
> 
> Do we genuinely not have any supported clients out there that will end up
> using UDP? Until we do, I think we need to continue to support UDP as best
> we can.

I haven't heard of a distro with a client that supports only UDP.  Or
even that defaults to UDP.

I kind of doubt there's anything at all out there that only supports
UDP--if there is, my bet would be odd one-off userspace clients or old
embedded devices?

So if you're seeing UDP it's probably because they explicitly asked for
it on the mount commandline.

--b.



Re: [Nfs-ganesha-devel] Segfault seen in libntirpc code even when values of the input arguments to function 'recvmsg' look fine

2017-09-06 Thread 'J. Bruce Fields'
On Wed, Sep 06, 2017 at 01:47:42PM -0700, Frank Filz wrote:
> > On Wed, Sep 06, 2017 at 12:27:38PM -0700, Frank Filz wrote:
> > > > On Wed, Sep 06, 2017 at 10:18:36AM -0400, Daniel Gryniewicz wrote:
> > > > > On 09/06/2017 09:09 AM, William Allen Simpson wrote:
> > > > > >On 9/5/17 10:44 AM, Daniel Gryniewicz wrote:
> > > > > >>I'm stumped, then.  It all looks fine to me.
> > > > > >>
> > > > > >I think you'll find that things work better by switching to TCP.
> > > > > >
> > > > > >The UDP client code (clnt_dg) is badly bugged in general.  That
> > > > > >needs re-writing, but UDP hasn't been a priority.
> > > > > >
> > > > > >There were unlocked critical regions fixes in ganesha 2.5.
> > > > > >
> > > > > >In 2.6, we're planning on re-doing the entire call interface to
> > > > > >use async callbacks.  There's an XXX in clnt_vc circa 2012
> > > > > >mentioning it.
> > > > >
> > > > >
> > > > > Switching to TCP may not be an option,
> > > >
> > > > Whatever you need to do for this bug, eventually I suspect you want
> > > > to drop support for UDP.
> > >
> > > Do we genuinely not have any supported clients out there that will end
> > > up using UDP? Until we do, I think we need to continue to support UDP
> > > as best we can.
> > 
> > I haven't heard of a distro with a client that supports only UDP.  Or
> > even that defaults to UDP.
> > 
> > I kind of doubt there's anything at all out there that only supports
> > UDP--if there is, my bet would be odd one-off userspace clients or old
> > embedded devices?
> > 
> > So if you're seeing UDP it's probably because they explicitly asked for
> > it on the mount commandline.
> 
> I know at one time, some of the NFS v3 mount activity would wind up on UDP
> even when TCP was specified for mount. Hopefully that has gone away
> long enough that it doesn't remain on any supported clients anymore.

I'm just thinking about the main NFS protocol, I don't know what the
deal is with the sideband protocols.

But are any of those sideband protocols actually being handled by ntirpc
anyway?

--b.



Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-13 Thread J. Bruce Fields
On Tue, Feb 13, 2018 at 12:01:04AM +, Deepak Jagtap wrote:
> I ran a few performance tests to compare nfs ganesha and the nfs kernel server and
> noticed a significant difference.

What is the actual test?

What are the export options in the knfsd case?

Is there any reason you're testing ext3 instead of ext4 or xfs?

What's the client?  What are the kernel versions on server and client?

--b.

> Please find my test result:
> 
> 
> SSD formatted with EXT3 exported using nfs ganesha:  ~18K IOPS    Avg latency: ~8ms      Throughput: ~60MBPS
> 
> same directory exported using nfs kernel server:     ~75K IOPS    Avg latency: ~0.8ms    Throughput: ~300MBPS
> 
> 
> nfs kernel and nfs ganesha are both configured with 128 worker
> threads. nfs ganesha is configured with VFS FSAL.
> 
> 
> Am I missing something major in the nfs ganesha config, or is this expected
> behavior?
> 
> Appreciate any inputs as to how the performance can be improved for nfs ganesha.
> 
> 
> 
> Please find following ganesha config file that I am using:
> 
> 
> NFS_Core_Param
> {
> Nb_Worker = 128 ;
> }
> 
> EXPORT
> {
> # Export Id (mandatory, each EXPORT must have a unique Export_Id)
>Export_Id = 77;
># Exported path (mandatory)
>Path = /host/test;
>Protocols = 3;
># Pseudo Path (required for NFS v4)
>Pseudo = /host/test;
># Required for access (default is None)
># Could use CLIENT blocks instead
>Access_Type = RW;
># Exporting FSAL
>FSAL {
> Name = VFS;
>}
>CLIENT
>{
> Clients = *;
> Squash = None;
> Access_Type = RW;
>}
> }
> 
> 
> 
> 
> Thanks & Regards,
> 
> Deepak





Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-15 Thread J. Bruce Fields
On Wed, Feb 14, 2018 at 12:00:52AM +, Deepak Jagtap wrote:
> > What is the actual test?
> 
>- nfs server: RHEL based host exporting ext3 formated SSD.
> 
>- nfs client: RHEL based host running windows vm IOmeter with 70% read, 
> 30% random workload, 2 worker threads doing IOs with 64 outstanding IO queue 
> depth each.
> 
> 
> > What are the export options in the knfsd case?
> 
>- I am using this for exporting using knfsd.
> 
>  exportfs -oinsecure,rw,sync,no_root_squash,fsid=77 *:/host/test
> 
> > Is there any reason you're testing ext3 instead of ext4 or xfs?
>- Not in particular. I could have used ext4.  I was evaluating nfs server 
> impact in both scenarios.
> 
> >What's the client?  What are the kernel versions on server and client?
> 
>- Both of them are at 3.10.0-693.17.1.el7.x86_64 kernel version

That all sounds reasonable to me, thanks.

--b.



Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance

2018-02-18 Thread J. Bruce Fields
On Wed, Feb 14, 2018 at 08:32:19AM -0500, Daniel Gryniewicz wrote:
> How many clients are you using?  Each client op can only (currently)
> be handled in a single thread, and clients won't send more ops
> until the current one is ack'd,

What version of NFS was the test run over?

I don't see how the server can limit the number of outstanding requests
for NFS versions less than 4.1.

So if the client's doing NFSv3 or v4.0, then there should still always
be a queue of requests in the server's receive buffer, so it shouldn't
have to wait for a round trip back to the client to process the next
request.

But the reads have to be synchronous assuming the working set isn't in
cache, so you're completely bound by the SSD's read latency there.  If
the writes are asynchronous (so, sent without the "stable" bit set),
there might be a little opportunity for parallelism writing to the drive
as well.

So it might be interesting to know exactly what model of SSD is being used.

> so Ganesha can basically only
> parallelize on a per-client basis at the moment.

Still something worth fixing, of course.

--b.

> 
> I'm sure there are locking issues; so far we've mostly worked on
> correctness rather than performance.  2.6 has changed the threading
> model a fair amount, and 2.7 will have more improvements, but it's a
> slow process.
> 
> Daniel
> 
> On 02/13/2018 06:38 PM, Deepak Jagtap wrote:
> >Thanks Daniel!
> >
> >Yeah user-kernel context switching is definitely adding up
> >latency, but I wonder if RPC or some locking overhead is also in
> >the picture.
> >
> >With 70% read 30% random workload nfs ganesha CPU usage was close
> >to 170% while remaining 2 cores were pretty much unused (~18K
> >IOPS, latency ~8ms)
> >
> >With 100% read 30% random nfs ganesha CPU usage ~250% ( ~50K IOPS,
> >latency ~2ms).
> >
> >
> >-Deepak
> >
> >
> >*From:* Daniel Gryniewicz 
> >*Sent:* Tuesday, February 13, 2018 6:15:47 AM
> >*To:* nfs-ganesha-devel@lists.sourceforge.net
> >*Subject:* Re: [Nfs-ganesha-devel] nfs ganesha vs nfs kernel performance
> >Also keep in mind that FSAL VFS can never, by its very nature, beat
> >knfsd, since it has to do everything knfsd does, but also has userspace
> ><-> kernespace transitions.  Ganesha's strength is exporting
> >userspace-based cluster filesystems.
> >
> >That said, we're always working to make Ganesha faster, and I'm sure
> >there's gains to be made, even in these circumstances.
> >
> >Daniel
> >
> >On 02/12/2018 07:01 PM, Deepak Jagtap wrote:
> >>Hey Guys,
> >>
> >>
> >>I ran few performance tests to compare nfs gansha and nfs kernel
> >>server and noticed significant difference.
> >>
> >>
> >>Please find my test result:
> >>
> >>
> >>SSD formatted with EXT3 exported using nfs ganesha:  ~18K IOPS    Avg latency: ~8ms      Throughput: ~60MBPS
> >>
> >>same directory exported using nfs kernel server:     ~75K IOPS    Avg latency: ~0.8ms    Throughput: ~300MBPS
> >>
> >>
> >>nfs kernel and nfs ganesha both of them are configured with 128
> >>worker threads. nfs ganesha is configured with VFS FSAL.
> >>
> >>
> >>Am I missing something major in nfs ganesha config or this is
> >>expected behavior.
> >>
> >>Appreciate any inputs as how the performance can be improved for
> >>nfs ganesha.
> >>
> >>
> >>
> >>Please find following ganesha config file that I am using:
> >>
> >>
> >>NFS_Core_Param
> >>{
> >>          Nb_Worker = 128 ;
> >>}
> >>
> >>EXPORT
> >>{
> >>      # Export Id (mandatory, each EXPORT must have a unique Export_Id)
> >>     Export_Id = 77;
> >>     # Exported path (mandatory)
> >>     Path = /host/test;
> >>     Protocols = 3;
> >>     # Pseudo Path (required for NFS v4)
> >>     Pseudo = /host/test;
> >>     # Required for access (default is None)
> >>     # Could use CLIENT blocks instead
> >>     Access_Type = RW;
> >>     # Exporting FSAL
> >>     FSAL {
> >>          Name = VFS;
> >>     }
> >>     CLIENT
> >>     {
> >>          Clients = *;
> >>          Squash = None;
> >>          Access_Type = RW;
> >>     }
> >>}
> >>
> >>
> >>
> >>Thanks & Regards,
> >>
> >>Deepak
> >>
> >>
> >>

Re: [Nfs-ganesha-devel] ACL support

2018-02-22 Thread J. Bruce Fields
On Thu, Feb 22, 2018 at 06:18:52AM -0800, Frank Filz wrote:
> Ah, that might be an issue. It's hard to get the POSIX<->NFS V4 ACL
> conversion as good as possible (again, impossible to make it perfect,
> even for POSIX->NFS V4).

Well, POSIX->NFSv4 should be very close to perfect.  (Name mapping might
be the most likely problem in practice.)

> It would be good to fix all these conversion issues (without copying
> code from the kernel – note the license differences…)

The original ACL mapping code was all written while I was at UM/CITI by
me and a couple students, contributed under a permissive BSD-like
license, as you can see from the license header on fs/nfsd/nfs4acl.c.

So you should verify the license and git history to be sure, but I doubt
licensing would be an obstacle.

git://linux-nfs.org/bfields/acl.git also has patches implementing the
same mapping in libacl, written entirely while I was at citi.  They were
never upstreamed.  I'd recommend taking the kernel code instead as it's
gotten more bugfixes.

https://tools.ietf.org/html/draft-ietf-nfsv4-acl-mapping-05 has the best
documentation of the mapping.
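
(Very roughly, the allow-mask side of that mapping looks like the sketch
below--illustrative only; the draft and fs/nfsd/nfs4acl.c have the real
tables plus all of the deny-ACE and mask handling that makes this hard.  The
ACE4_* values are the mask bits from the NFSv4 spec:)

#define ACE4_READ_DATA		0x00000001
#define ACE4_WRITE_DATA		0x00000002
#define ACE4_APPEND_DATA	0x00000004
#define ACE4_EXECUTE		0x00000020
#define ACE4_READ_ATTRIBUTES	0x00000080
#define ACE4_READ_ACL		0x00020000
#define ACE4_SYNCHRONIZE	0x00100000

/* Map one POSIX rwx triple to an NFSv4 allow mask. */
static unsigned int posix_to_nfs4_mask(int r, int w, int x)
{
	/* Bits every principal effectively has under POSIX semantics. */
	unsigned int mask = ACE4_READ_ATTRIBUTES | ACE4_READ_ACL |
			    ACE4_SYNCHRONIZE;

	if (r)
		mask |= ACE4_READ_DATA;
	if (w)
		mask |= ACE4_WRITE_DATA | ACE4_APPEND_DATA;
	if (x)
		mask |= ACE4_EXECUTE;
	return mask;
}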

All that aside, I agree with Frank that this is all complicated and
error-prone.  But the richacl patches seem stuck.  The only other
alternative I can think of at this point is to go back to the ietf nfsv4
working group with a proposal to add POSIX-like ACLs to NFSv4.2.

--b.



Re: [Nfs-ganesha-devel] ACE permission check

2018-05-25 Thread J. Bruce Fields
On Fri, May 25, 2018 at 08:10:07PM +0530, Sagar M D wrote:
>  Hi,
> 
> By looking at the nfs-ganesha code, the permission check (ACL) happens in
> access_check.c. Our FSAL (not an in-tree FSAL) is storing and serving the ACLs
> to Ganesha.
> 
> I see an issue with rename:
> Even though I set a deny ACE for "delete child" on folder1 for user1, user1
> is able to rename a file belonging to user2.

What's the ACL on the child?  The rule from Windows at least is that
you only need DELETE or DELETE_CHILD, not both.

> I see below RPC:-
> ACCESS request folder1
> ACCESS denied (as expected.) (denied for DELETE_CHILD permission)
> Rename request
> Rename succeed
> 
> I'm not sure why the client is sending the rename even after receiving ACCESS
> Denied.
> 
> Native nfs denies rename though.

knfsd implements everything in terms of posix ACLs which never consider
DELETE_CHILD part of write permissions, and never allow DELETE.

--b.
