from:"Oleg Drokin"

Re: [PATCH] staging: lustre: delete the filesystem from the tree.

2018-06-01 Thread Oleg Drokin

> delete mode 100644 drivers/staging/lustre/lustre/lov/Makefile
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lov_cl_internal.h
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lov_dev.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lov_ea.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lov_internal.h
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lov_io.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lov_lock.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lov_merge.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lov_obd.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lov_object.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lov_offset.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lov_pack.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lov_page.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lov_pool.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lov_request.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lovsub_dev.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lovsub_lock.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lovsub_object.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lovsub_page.c
>> delete mode 100644 drivers/staging/lustre/lustre/lov/lproc_lov.c
>> delete mode 100644 drivers/staging/lustre/lustre/mdc/Makefile
>> delete mode 100644 drivers/staging/lustre/lustre/mdc/lproc_mdc.c
>> delete mode 100644 drivers/staging/lustre/lustre/mdc/mdc_internal.h
>> delete mode 100644 drivers/staging/lustre/lustre/mdc/mdc_lib.c
>> delete mode 100644 drivers/staging/lustre/lustre/mdc/mdc_locks.c
>> delete mode 100644 drivers/staging/lustre/lustre/mdc/mdc_reint.c
>> delete mode 100644 drivers/staging/lustre/lustre/mdc/mdc_request.c
>> delete mode 100644 drivers/staging/lustre/lustre/mgc/Makefile
>> delete mode 100644 drivers/staging/lustre/lustre/mgc/lproc_mgc.c
>> delete mode 100644 drivers/staging/lustre/lustre/mgc/mgc_internal.h
>> delete mode 100644 drivers/staging/lustre/lustre/mgc/mgc_request.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/Makefile
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/cl_internal.h
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/cl_io.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/cl_lock.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/cl_object.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/cl_page.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/class_obd.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/debug.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/genops.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/kernelcomm.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/linkea.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/linux/linux-module.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/linux/linux-sysctl.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/llog.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/llog_cat.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/llog_internal.h
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/llog_obd.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/llog_swab.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/lprocfs_counters.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/lu_object.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/lu_ref.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/lustre_handles.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/lustre_peer.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/obd_config.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/obd_mount.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/obdo.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/statfs_pack.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdclass/uuid.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdecho/Makefile
>> delete mode 100644 drivers/staging/lustre/lustre/obdecho/echo_client.c
>> delete mode 100644 drivers/staging/lustre/lustre/obdecho/echo_internal.h
>> delete mode 100644 drivers/staging/lustre/lustre/osc/Makefile
>> delete mode 100644 drivers/staging/lustre/lustre/osc/lp

Re: [lustre-devel] [PATCH 41/80] staging: lustre: lmv: separate master object with master stripe

2018-02-11 Thread Oleg Drokin


> On Feb 11, 2018, at 6:44 PM, NeilBrown  wrote:
> 
> On Thu, Feb 08 2018, Oleg Drokin wrote:
>> 
>> Certain things that sound useless (like the debug subsystem in Lustre)
>> are very useful when you have 10k nodes in a cluster and need to selectively
>> pull stuff from a run to debug a complicated cross-node interaction.
>> I asked NFS people how they do it, and they don’t have anything that scales;
>> their approach usually involves reducing the problem to a much smaller set of
>> nodes first.
> 
> the "rpcdebug" stuff that Linux/nfs has is sometimes useful, but some parts
> are changing to tracepoints and some parts have remained, which is a
> little confusing.
> 
> The fact that lustre tracing seems to *always* log everything so that if
> something goes wrong you can extract that last few meg(?) of logs seems
> really useful.

Not really. Lustre also has a bitmask for logs (since otherwise all those prints
are pretty cpu taxing), but what makes those logs better is:
the size is unlimited, not constrained by dmesg buffer size.
You can capture those logs from a crashdump (something I really wish
somebody would implement for tracepoint buffers, but alas, I have not
found anything for this yet - we have a crash plugin to extract lustre
debug logs from a kernel crashdump).
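
The mask-gated logging described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: the names D_RPC, D_DLM, D_VFS, dbg_mask and dbg_log are hypothetical and are not Lustre's debug API, and the buffer here is a small fixed array, whereas the real log is described above as not being size-constrained.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subsystem bits; not Lustre's real debug flags. */
#define D_RPC 0x1u
#define D_DLM 0x2u
#define D_VFS 0x4u

#define DBG_BUF_SIZE 4096

static unsigned int dbg_mask = D_RPC;  /* which message classes are recorded */
static char dbg_buf[DBG_BUF_SIZE];     /* in-memory log, separate from dmesg */
static size_t dbg_len;

/* Record a message only if its bit is enabled. Returns 1 if recorded. */
static int dbg_log(unsigned int bit, const char *fmt, ...)
{
	size_t room = DBG_BUF_SIZE - dbg_len;
	va_list ap;
	int n;

	assert(fmt != NULL);
	if (!(dbg_mask & bit))
		return 0;	/* masked out: the formatting cost is skipped */

	va_start(ap, fmt);
	n = vsnprintf(dbg_buf + dbg_len, room, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= room) {
		dbg_buf[dbg_len] = '\0';	/* does not fit: drop the message */
		return 0;
	}
	dbg_len += (size_t)n;
	return 1;
}
```

With dbg_mask set to D_RPC, a call such as dbg_log(D_DLM, ...) returns immediately without formatting anything, which is the point of the mask: disabled classes cost one bit test.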
>>> 
>>> Even if it is horrible it would be nice to have it in staging... I guess
>>> the changes required to ext4 prohibit that... I don't suppose it can be
>>> made to work with mainline ext4 in a reduced-functionality-and-performance
>>> way??
>> 
>> We support unpatched ZFS as a server too! ;)
> 
> So does that mean you would expect lustre-server to work with unpatched
> ext4? In that case I won't give up hope of seeing the server in mainline
> in my lifetime.  Client first though.

While unpatched ext4 might in theory be possible, currently it does not export
everything we need from the transaction/fs control perspective.

Bye,
Oleg
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

Re: [lustre-devel] [PATCH 41/80] staging: lustre: lmv: separate master object with master stripe

2018-02-11 Thread Oleg Drokin


> On Feb 11, 2018, at 6:50 PM, NeilBrown  wrote:
> 
> Maybe - as you suggest in another email - it is due to some
> client/server incompatibility.  I guess it is unavoidable with an fs
> like lustre to have incompatible protocol changes.  Is there any
> mechanism for detecting the version of other peers in the cluster and
> refusing to run if versions are incompatible?

Yes, client and server exchange “feature bits” at connect time
and only use the subset of features that both can understand.

Bye,
Oleg
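
The feature-bit exchange described above can be sketched as follows. This is a minimal illustration, not the Lustre implementation: negotiate() and granted_is_subset() are hypothetical names. The subset test has the same shape as the check quoted from ptlrpc_connect_interpret() in the "grranted" patch elsewhere in this archive.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical server side: grant only the features that the client
 * asked for and that the server itself supports.
 */
static uint64_t negotiate(uint64_t client_asked, uint64_t server_supported)
{
	uint64_t granted = client_asked & server_supported;

	/* By construction the grant never exceeds the request. */
	assert((granted & client_asked) == granted);
	return granted;
}

/*
 * Hypothetical client side: accept the reply only if every granted bit
 * was actually asked for. Returns 1 if the grant is acceptable.
 */
static int granted_is_subset(uint64_t asked, uint64_t granted)
{
	return (granted & asked) == granted;
}
```

For example, a client asking for bits 0x7 from a server supporting 0x5 ends up using 0x5, and a reply granting a bit the client never requested fails granted_is_subset().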

Re: [lustre-devel] [PATCH 41/80] staging: lustre: lmv: separate master object with master stripe

2018-02-08 Thread Oleg Drokin


> On Feb 8, 2018, at 10:10 PM, NeilBrown  wrote:
> 
> On Thu, Feb 08 2018, Oleg Drokin wrote:
> 
>>> On Feb 8, 2018, at 8:39 PM, NeilBrown  wrote:
>>> 
>>> On Tue, Aug 16 2016, James Simmons wrote:
>> 
>> my that’s an old patch
>> 
>>> 
> ...
>>> 
>>> Whoever converted it to "!strcmp()" inverted the condition.  This is a
>>> perfect example of why I absolutely *loathe* the "!strcmp()" construct!!
>>> 
>>> This causes many tests in the 'sanity' test suite to return
>>> -ENOMEM (that had me puzzled for a while!!).
>> 
>> huh? I am not seeing anything of the sort and I was running sanity
>> all the time until a recent pause (but going to resume).
> 
> That does surprise me - I reproduce it every time.
> I have two VMs running a SLE12-SP2 kernel with patches from
> lustre-release applied.  These are servers. They have 2 3G virtual disks
> each.
> I have two other VMs running current mainline.  These are clients.
> 
> I guess your 'recent pause' included between v4.15-rc1 (8e55b6fd0660)
> and v4.15-rc6 (a93639090a27) - a full month when lustre wouldn't work at
> all :-(

More than that, but I am pretty sure James Simmons is running tests all the
time too (he has a different config, I only have tcp).

>>> This seems to suggest that no-one has been testing the mainline linux
>>> lustre.
>>> It also seems to suggest that there is a good chance that there
>>> are other bugs that have crept in while no-one has really been caring.
>>> Given that the sanity test suite doesn't complete for me, but just
>>> hangs (in test_27z I think), that seems particularly likely.
>> 
>> Works for me, here’s a run from earlier today on 4.15.0:
> 
> Well that's encouraging .. I haven't looked into this one yet - I'm not
> even sure where to start.

m… debug logs for example (greatly neutered in staging tree, but still useful)?
try lctl dk and see what’s in there.

>> Instead the plan was to clean up the staging client into acceptable state,
>> move it out of staging, bring in all the missing features and then
>> drop the client (more or less) from the lustre-release.
> 
> That sounds like a great plan.  Any idea why it didn't happen?

Because meeting open-ended demands is hard and certain demands sound like
“throw away your X and rewrite it from scratch" (e.g. everything IB-related).

Certain things that sound useless (like the debug subsystem in Lustre)
are very useful when you have 10k nodes in a cluster and need to selectively
pull stuff from a run to debug a complicated cross-node interaction.
I asked NFS people how they do it, and they don’t have anything that scales;
their approach usually involves reducing the problem to a much smaller set of
nodes first.

> It seems there is a lot of upstream work mixed in with the clean up, and
> I don't think that really helps anyone.

I don’t understand what you mean here.

> Is it at all realistic that the client might be removed from
> lustre-release?  That might be a good goal to work towards.

Assuming we can bring the whole functionality over - sure.

Of course there’d still be some separate development place and we would
need to create patches (new features?) for like SuSE and other distros
and for testing of server features, I guess, but that could be just that -
a side branch somewhere I hope.

It’s not that we are super glad to chase every kernel that vendors put out,
of course it would be much easier if the kernels already included
a very functional Lustre client.

>>> Might it make sense to instead start cleaning up the code in
>>> lustre-release so as to make it meet the upstream kernel standards.
>>> Then when the time is right, the kernel code can be moved *out* of
>>> lustre-release and *in* to linux.  Then development can continue in
>>> Linux (just like it does with other Linux filesystems).
>> 
>> While we can be cleaning lustre in lustre-release, there are some things
>> we cannot do as easily, e.g. decoupling Lustre client from the server.
>> Also it would not attract any reviews from all the janitors or
>> (more importantly) Al Viro and other people with sharp eyes.
>> 
>>> An added bonus of this is that there is an obvious path to getting
>>> server support in mainline Linux.  The current situation of client-only
>>> support seems weird given how interdependent the two are.
>> 
>> Given the pushback Lustre client was given I have no hope Lustre server
>> will get into mainline in my lifetime.
> 
> Even if it is horrible it would be nice to have it in staging... I guess
> the changes required t

Re: [PATCH 41/80] staging: lustre: lmv: separate master object with master stripe

2018-02-08 Thread Oleg Drokin

> On Feb 8, 2018, at 8:39 PM, NeilBrown  wrote:
> 
> On Tue, Aug 16 2016, James Simmons wrote:

my that’s an old patch

> 
>> 
>> +static inline bool
>> +lsm_md_eq(const struct lmv_stripe_md *lsm1, const struct lmv_stripe_md *lsm2)
>> +{
>> +int idx;
>> +
>> +if (lsm1->lsm_md_magic != lsm2->lsm_md_magic ||
>> +lsm1->lsm_md_stripe_count != lsm2->lsm_md_stripe_count ||
>> +lsm1->lsm_md_master_mdt_index != lsm2->lsm_md_master_mdt_index ||
>> +lsm1->lsm_md_hash_type != lsm2->lsm_md_hash_type ||
>> +lsm1->lsm_md_layout_version != lsm2->lsm_md_layout_version ||
>> +!strcmp(lsm1->lsm_md_pool_name, lsm2->lsm_md_pool_name))
>> +return false;
> 
> Hi James and all,
> This patch (8f18c8a48b736c2f in linux) is different from the
> corresponding patch in lustre-release (60e07b972114df).
> 
> In that patch, the last clause in the 'if' condition is
> 
> +   strcmp(lsm1->lsm_md_pool_name,
> + lsm2->lsm_md_pool_name) != 0)
> 
> Whoever converted it to "!strcmp()" inverted the condition.  This is a
> perfect example of why I absolutely *loathe* the "!strcmp()" construct!!
> 
> This causes many tests in the 'sanity' test suite to return
> -ENOMEM (that had me puzzled for a while!!).

huh? I am not seeing anything of the sort and I was running sanity
all the time until a recent pause (but going to resume).
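
The inversion NeilBrown points out follows from strcmp() returning 0 when the strings are equal, so "!strcmp(a, b)" is true for equal names, not for differing ones. A minimal sketch, using a hypothetical cut-down struct stripe_md in place of struct lmv_stripe_md:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for struct lmv_stripe_md. */
struct stripe_md {
	unsigned int magic;
	char pool_name[16];
};

/* Intended logic: report "not equal" as soon as any field differs. */
static bool md_eq(const struct stripe_md *a, const struct stripe_md *b)
{
	if (a->magic != b->magic ||
	    strcmp(a->pool_name, b->pool_name) != 0)	/* names differ */
		return false;
	return true;
}

/* Inverted logic: !strcmp() is true when the names are EQUAL. */
static bool md_eq_inverted(const struct stripe_md *a,
			   const struct stripe_md *b)
{
	if (a->magic != b->magic ||
	    !strcmp(a->pool_name, b->pool_name))	/* names match */
		return false;
	return true;
}
```

With the inverted form, two descriptors with identical pool names compare as different, and two with different pool names compare as equal.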

> This seems to suggest that no-one has been testing the mainline linux
> lustre.
> It also seems to suggest that there is a good chance that there
> are other bugs that have crept in while no-one has really been caring.
> Given that the sanity test suite doesn't complete for me, but just
> hangs (in test_27z I think), that seems particularly likely.

Works for me, here’s a run from earlier today on 4.15.0:
== sanity test 27z: check SEQ/OID on the MDT and OST filesystems = 16:43:58 (1518126238)
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0169548 s, 61.8 MB/s
2+0 records in
2+0 records out
2097152 bytes (2.1 MB, 2.0 MiB) copied, 0.02782 s, 75.4 MB/s
check file /mnt/lustre/d27z.sanity/f27z.sanity-1
FID seq 0x20401, oid 0x4640 ver 0x0
LOV seq 0x20401, oid 0x4640, count: 1
want: stripe:0 ost:0 oid:314/0x13a seq:0
Stopping /mnt/lustre-ost1 (opts:) on centos6-17
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Starting ost1:   -o loop /tmp/lustre-ost1 /mnt/lustre-ost1
Failed to initialize ZFS library: 256
h2tcp: deprecated, use h2nettype instead
centos6-17.localnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super all -lnet -lnd -pinger 16
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Started lustre-OST
/mnt/lustre-ost1/O/0/d26/314: parent=[0x20401:0x4640:0x0] stripe=0 stripe_size=0 stripe_count=0
check file /mnt/lustre/d27z.sanity/f27z.sanity-2
FID seq 0x20401, oid 0x4642 ver 0x0
LOV seq 0x20401, oid 0x4642, count: 2
want: stripe:0 ost:1 oid:1187/0x4a3 seq:0
Stopping /mnt/lustre-ost2 (opts:) on centos6-17
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Starting ost2:   -o loop /tmp/lustre-ost2 /mnt/lustre-ost2
Failed to initialize ZFS library: 256
h2tcp: deprecated, use h2nettype instead
centos6-17.localnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super all -lnet -lnd -pinger 16
pdsh@fedora1: centos6-17: ssh exited with exit code 1
pdsh@fedora1: centos6-17: ssh exited with exit code 1
Started lustre-OST0001
/mnt/lustre-ost2/O/0/d3/1187: parent=[0x20401:0x4642:0x0] stripe=0 stripe_size=0 stripe_count=0
want: stripe:1 ost:0 oid:315/0x13b seq:0
got: objid=0 seq=0 parent=[0x20401:0x4642:0x0] stripe=1
Resetting fail_loc on all nodes...done.
16:44:32 (1518126272) waiting for centos6-16 network 5 secs ...
16:44:32 (1518126272) network interface is UP
16:44:33 (1518126273) waiting for centos6-17 network 5 secs ...
16:44:33 (1518126273) network interface is UP

> So my real question - to anyone interested in lustre for mainline linux
> - is: can we actually trust this code at all?

Absolutely. Seems that you just stumbled upon a corner case that was not
being hit by people that do the testing, so you have something unique about
your setup, I guess.

> I'm seriously tempted to suggest that we just
>  rm -r drivers/staging/lustre
> 
> drivers/staging is great for letting the community work on code that has
> been "thrown over the wall" and is not openly developed elsewhere, but
> that is not the case for lustre.  lustre has (or seems to have) an open
> development process.  Having on-going development happen both there and
> in drivers/staging seems a waste of resources.

It is a bit of

Re: [PATCH][V2] staging: lustre: fix spelling mistake, "grranted" -> "granted"

2017-07-14 Thread Oleg Drokin


On Jul 14, 2017, at 5:33 PM, Colin King wrote:

> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in CERROR error message. Also
> clean up the grammar.
> 
> Signed-off-by: Colin Ian King 

Reviewed-by: Oleg Drokin 

> ---
> drivers/staging/lustre/lustre/ptlrpc/import.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c b/drivers/staging/lustre/lustre/ptlrpc/import.c
> index 52cb1f0c9c94..b19dac15e901 100644
> --- a/drivers/staging/lustre/lustre/ptlrpc/import.c
> +++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
> @@ -1026,7 +1026,7 @@ static int ptlrpc_connect_interpret(const struct lu_env *env,
>   /* check that server granted subset of flags we asked for. */
>   if ((ocd->ocd_connect_flags & imp->imp_connect_flags_orig) !=
>   ocd->ocd_connect_flags) {
> - CERROR("%s: Server didn't granted asked subset of flags: asked=%#llx grranted=%#llx\n",
> + CERROR("%s: Server didn't grant the asked for subset of flags: asked=%#llx granted=%#llx\n",
>  imp->imp_obd->obd_name, imp->imp_connect_flags_orig,
>  ocd->ocd_connect_flags);
>   rc = -EPROTO;
> -- 
> 2.11.0


Re: [PATCH] staging: lustre: fix spelling mistake, "grranted" -> "granted"

2017-07-14 Thread Oleg Drokin


On Jul 14, 2017, at 9:26 AM, Colin King wrote:

> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in CERROR error message
> 
> Signed-off-by: Colin Ian King 
> ---
> drivers/staging/lustre/lustre/ptlrpc/import.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c b/drivers/staging/lustre/lustre/ptlrpc/import.c
> index 52cb1f0c9c94..99877aa10d6f 100644
> --- a/drivers/staging/lustre/lustre/ptlrpc/import.c
> +++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
> @@ -1026,7 +1026,7 @@ static int ptlrpc_connect_interpret(const struct lu_env *env,
>   /* check that server granted subset of flags we asked for. */
>   if ((ocd->ocd_connect_flags & imp->imp_connect_flags_orig) !=
>   ocd->ocd_connect_flags) {
> - CERROR("%s: Server didn't granted asked subset of flags: asked=%#llx grranted=%#llx\n",
> + CERROR("%s: Server didn't granted asked subset of flags: asked=%#llx granted=%#llx\n",

While we are at it, can you also address the grammar problem:
"didn't granted" -> "didn't grant"?

Thanks!

>  imp->imp_obd->obd_name, imp->imp_connect_flags_orig,
>  ocd->ocd_connect_flags);
>   rc = -EPROTO;
> -- 
> 2.11.0


Re: [PATCH] stating: lustre: fix sparse error: incompatible types in comparison expression

2017-07-12 Thread Oleg Drokin


On Jul 12, 2017, at 10:10 PM, Rui Teng wrote:

> Comparing two user space addresses to avoid sparse error:
> 
> drivers/staging//lustre/lnet/selftest/conrpc.c:490:30: error:
> incompatible types in comparison expression (different address spaces)
> 
> Signed-off-by: Rui Teng 
> ---
> drivers/staging/lustre/lnet/selftest/conrpc.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/selftest/conrpc.c b/drivers/staging/lustre/lnet/selftest/conrpc.c
> index da36c55b86d3..ae7c2772825e 100644
> --- a/drivers/staging/lustre/lnet/selftest/conrpc.c
> +++ b/drivers/staging/lustre/lnet/selftest/conrpc.c
> @@ -487,10 +487,9 @@ lstcon_rpc_trans_interpreter(struct lstcon_rpc_trans *trans,
>  sizeof(struct list_head)))
>   return -EFAULT;
> 
> - if (tmp.next == head_up)
> - return 0;
> -
>   next = tmp.next;

So the assignment is fine, but comparison is not? Strange.

I guess this is fine by me if that makes the warning go away.

Acked-by: Oleg Drokin 

> + if (next == head_up)
> + return 0;
> 
>   ent = list_entry(next, struct lstcon_rpc_ent, rpe_link);
> 
> -- 
> 2.11.0


Re: [PATCH V2] libcfs: Fix a sleep-in-atomic bug in cfs_wi_exit

2017-05-31 Thread Oleg Drokin

Hello!

On May 31, 2017, at 4:00 AM, Jia-Ju Bai wrote:

> The driver may sleep under a spin lock, and the function call path is:
> cfs_wi_exit (acquire the lock by spin_lock)
>  LASSERT
>lbug_with_loc
>  libcfs_debug_dumplog
>schedule and kthread_run --> may sleep
> 
> To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock.
> 
> Signed-off-by: Jia-Ju Bai 
> ---
> drivers/staging/lustre/lnet/libcfs/workitem.c |   13 +++--
> 1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c b/drivers/staging/lustre/lnet/libcfs/workitem.c
> index dbc2a9b..928d06d 100644
> --- a/drivers/staging/lustre/lnet/libcfs/workitem.c
> +++ b/drivers/staging/lustre/lnet/libcfs/workitem.c
> @@ -111,22 +111,23 @@ struct cfs_wi_sched {
> {
>   LASSERT(!in_interrupt()); /* because we use plain spinlock */
>   LASSERT(!sched->ws_stopping);
> + LASSERT(wi->wi_running);
> + if (wi->wi_scheduled) {
> + LASSERT(!list_empty(&wi->wi_list));
> + LASSERT(sched->ws_nscheduled > 0);
> + }

Similarly here and in all other patches from you about LASSERT calls under
spinlocks: just think of them as a panic() call; no operations are expected to
continue after it triggers.

Thanks.

> 
>   spin_lock(&sched->ws_lock);
> 
> - LASSERT(wi->wi_running);
>   if (wi->wi_scheduled) { /* cancel pending schedules */
> - LASSERT(!list_empty(&wi->wi_list));
>   list_del_init(&wi->wi_list);
> -
> - LASSERT(sched->ws_nscheduled > 0);
>   sched->ws_nscheduled--;
>   }
> 
> - LASSERT(list_empty(&wi->wi_list));
> -
>   wi->wi_scheduled = 1; /* LBUG future schedule attempts */
>   spin_unlock(&sched->ws_lock);
> +
> + LASSERT(list_empty(&wi->wi_list));
> }
> EXPORT_SYMBOL(cfs_wi_exit);
> 
> -- 
> 1.7.9.5
> 


Re: [PATCH V2] libcfs: Fix a sleep-in-atomic bug in cfs_wi_deschedule

2017-05-31 Thread Oleg Drokin

Hello!

On May 31, 2017, at 3:57 AM, Jia-Ju Bai wrote:

> The driver may sleep under a spin lock, and the function call path is:
> cfs_wi_deschedule (acquire the lock by spin_lock)
>  LASSERT
>lbug_with_loc
>  libcfs_debug_dumplog
>schedule and kthread_run --> may sleep
> 
> To fix it, all "LASSERT" is placed out of the spin_lock and spin_unlock.
> 
> Signed-off-by: Jia-Ju Bai 
> ---
> drivers/staging/lustre/lnet/libcfs/workitem.c |   12 ++--
> 1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c b/drivers/staging/lustre/lnet/libcfs/workitem.c
> index dbc2a9b..9c530cf 100644
> --- a/drivers/staging/lustre/lnet/libcfs/workitem.c
> +++ b/drivers/staging/lustre/lnet/libcfs/workitem.c
> @@ -140,6 +140,10 @@ struct cfs_wi_sched {
> 
>   LASSERT(!in_interrupt()); /* because we use plain spinlock */
>   LASSERT(!sched->ws_stopping);
> + if (wi->wi_scheduled) {
> + LASSERT(!list_empty(&wi->wi_list));
> + LASSERT(sched->ws_nscheduled > 0);
> + }

I don't think you can do this:
this was under the spinlock because those values could change from a different
thread, and we do need to look at them all together.

You are correct that LASSERT/LBUG might schedule to dump a debug log
into a file, and even if not, it does sleep indefinitely after that.
But in reality the default option is "panic_on_lbug=1", which simply converts
LASSERT() into panic().

This is certainly not a normal condition, and as such I think we can leave the
code as is.

Thanks.


> 
>   /*
>* return 0 if it's running already, otherwise return 1, which
> @@ -151,18 +155,14 @@ struct cfs_wi_sched {
>   rc = !(wi->wi_running);
> 
>   if (wi->wi_scheduled) { /* cancel pending schedules */
> - LASSERT(!list_empty(&wi->wi_list));
>   list_del_init(&wi->wi_list);
> -
> - LASSERT(sched->ws_nscheduled > 0);
>   sched->ws_nscheduled--;
> -
>   wi->wi_scheduled = 0;
>   }
> 
> - LASSERT(list_empty(&wi->wi_list));
> -
>   spin_unlock(&sched->ws_lock);
> +
> + LASSERT(list_empty(&wi->wi_list));
>   return rc;
> }
> EXPORT_SYMBOL(cfs_wi_deschedule);
> -- 
> 1.7.9.5
> 
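
The objection in the reply above is that wi_scheduled and ws_nscheduled can be changed by another thread, so the consistency checks only mean something while the lock is held. A userspace sketch of that pattern, assuming a pthread mutex in place of the kernel spinlock and assert() in place of LASSERT(); struct sched, struct item and item_deschedule() are hypothetical stand-ins, not the libcfs code:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace stand-ins for the scheduler and work item. */
struct sched {
	pthread_mutex_t lock;
	int nscheduled;		/* number of queued items */
};

struct item {
	int scheduled;		/* 1 while queued */
	int running;		/* 1 while being executed */
};

/* Returns 0 if the item is already running, otherwise 1. */
static int item_deschedule(struct sched *s, struct item *it)
{
	int rc;

	pthread_mutex_lock(&s->lock);
	rc = !it->running;
	if (it->scheduled) {
		/*
		 * Checked under the lock, together with the update:
		 * no other thread can change these values in between.
		 */
		assert(s->nscheduled > 0);
		s->nscheduled--;
		it->scheduled = 0;
	}
	pthread_mutex_unlock(&s->lock);
	return rc;
}
```

Moving the assert() ahead of pthread_mutex_lock() would test a snapshot that may already be stale by the time the update runs, which is the concern raised about the proposed patch.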


Re: [PATCH] staging/lustre/lov: remove set_fs() call from lov_getstripe()

2017-05-29 Thread Oleg Drokin


On May 29, 2017, at 10:28 AM, Greg Kroah-Hartman wrote:

> On Fri, May 26, 2017 at 11:40:33PM -0400, Oleg Drokin wrote:
>> lov_getstripe() calls set_fs(KERNEL_DS) so that it can handle a struct
>> lov_user_md pointer from user- or kernel-space.  This changes the
>> behavior of copy_from_user() on SPARC and may result in a misaligned
>> access exception which in turn oopses the kernel.  In fact the
>> relevant argument to lov_getstripe() is never called with a
>> kernel-space pointer and so changing the address limits is unnecessary
>> and so we remove the calls to save, set, and restore the address
>> limits.
>> 
>> Signed-off-by: John L. Hammond 
>> Reviewed-on: http://review.whamcloud.com/6150
>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3221
>> Reviewed-by: Andreas Dilger 
>> Reviewed-by: Li Wei 
>> Signed-off-by: Oleg Drokin 
>> ---
>> drivers/staging/lustre/lustre/lov/lov_pack.c | 9 -
>> 1 file changed, 9 deletions(-)
> 
> So is this the patch that you want applied to the staging tree(s) as
> well?  If so, please let me know, otherwise I have no clue…

Yes, this is it.
Thanks!


[PATCH] staging/lustre/lov: remove set_fs() call from lov_getstripe()

2017-05-26 Thread Oleg Drokin

lov_getstripe() calls set_fs(KERNEL_DS) so that it can handle a struct
lov_user_md pointer from user- or kernel-space.  This changes the
behavior of copy_from_user() on SPARC and may result in a misaligned
access exception which in turn oopses the kernel.  In fact the
relevant argument to lov_getstripe() is never called with a
kernel-space pointer and so changing the address limits is unnecessary
and so we remove the calls to save, set, and restore the address
limits.

Signed-off-by: John L. Hammond 
Reviewed-on: http://review.whamcloud.com/6150
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3221
Reviewed-by: Andreas Dilger 
Reviewed-by: Li Wei 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/lov/lov_pack.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 2e1bd47..e6727ce 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -293,18 +293,10 @@ int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
size_t lmmk_size;
size_t lum_size;
int rc;
-   mm_segment_t seg;
 
if (!lsm)
return -ENODATA;
 
-   /*
-* "Switch to kernel segment" to allow copying from kernel space by
-* copy_{to,from}_user().
-*/
-   seg = get_fs();
-   set_fs(KERNEL_DS);
-
if (lsm->lsm_magic != LOV_MAGIC_V1 && lsm->lsm_magic != LOV_MAGIC_V3) {
CERROR("bad LSM MAGIC: 0x%08X != 0x%08X nor 0x%08X\n",
   lsm->lsm_magic, LOV_MAGIC_V1, LOV_MAGIC_V3);
@@ -406,6 +398,5 @@ int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 out_free:
kvfree(lmmk);
 out:
-   set_fs(seg);
return rc;
 }
-- 
2.9.3


Re: [PATCH 1/1] Staging: lustre: lnet: libcfs: Fixed checkpatch.pl coding style errors

2017-03-28 Thread Oleg Drokin


On Mar 28, 2017, at 6:10 AM,  wrote:

> From: Vaibhav Kothari 
> 
> Shifted open brace { to previous line for 8 functions as indicated by
> checkpatch.pl
> 
> Signed-off-by: Vaibhav Kothari 
> ---
> drivers/staging/lustre/lnet/libcfs/hash.c | 43 +++
> 1 file changed, 15 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lnet/libcfs/hash.c b/drivers/staging/lustre/lnet/libcfs/hash.c
> index 5c2ce2e..bb966e2 100644
> --- a/drivers/staging/lustre/lnet/libcfs/hash.c
> +++ b/drivers/staging/lustre/lnet/libcfs/hash.c
> @@ -1348,8 +1348,7 @@ void cfs_hash_putref(struct cfs_hash *hs)
> EXPORT_SYMBOL(cfs_hash_lookup);
> 
> static void
> -cfs_hash_for_each_enter(struct cfs_hash *hs)
> -{
> +cfs_hash_for_each_enter(struct cfs_hash *hs) {

Ugh, no.
This is obviously a false positive in checkpatch.
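
For context, the kernel coding style puts the opening brace of a function definition on its own line, while control statements keep the brace on the same line, so the original layout was already correct. A minimal sketch with a hypothetical function, count_positive(), that is not part of libcfs:

```c
#include <assert.h>
#include <stddef.h>

/* Function definition: the opening brace goes on its own line. */
static int
count_positive(const int *v, size_t n)
{
	int count = 0;
	size_t i;

	/* Control statement: the opening brace stays on the same line. */
	for (i = 0; i < n; i++) {
		if (v[i] <= 0)
			continue;
		count++;
	}
	return count;
}
```

The patch quoted here moves function braces up to the signature line, which is the layout the kernel style reserves for if, for, while and similar statements.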


>   LASSERT(!cfs_hash_is_exiting(hs));
> 
>   if (!cfs_hash_with_rehash(hs))
> @@ -1375,8 +1374,7 @@ void cfs_hash_putref(struct cfs_hash *hs)
> }
> 
> static void
> -cfs_hash_for_each_exit(struct cfs_hash *hs)
> -{
> +cfs_hash_for_each_exit(struct cfs_hash *hs) {
>   int remained;
>   int bits;
> 
> @@ -1407,8 +1405,7 @@ void cfs_hash_putref(struct cfs_hash *hs)
>  */
> static u64
> cfs_hash_for_each_tight(struct cfs_hash *hs, cfs_hash_for_each_cb_t func,
> - void *data, int remove_safe)
> -{
> + void *data, int remove_safe) {
>   struct hlist_node *hnode;
>   struct hlist_node *pos;
>   struct cfs_hash_bd bd;
> @@ -1465,8 +1462,7 @@ struct cfs_hash_cond_arg {
> 
> static int
> cfs_hash_cond_del_locked(struct cfs_hash *hs, struct cfs_hash_bd *bd,
> -  struct hlist_node *hnode, void *data)
> -{
> +  struct hlist_node *hnode, void *data) {
>   struct cfs_hash_cond_arg *cond = data;
> 
>   if (cond->func(cfs_hash_object(hs, hnode), cond->arg))
> @@ -1480,8 +1476,8 @@ struct cfs_hash_cond_arg {
>  * any object be reference.
>  */
> void
> -cfs_hash_cond_del(struct cfs_hash *hs, cfs_hash_cond_opt_cb_t func, void *data)
> -{
> +cfs_hash_cond_del(struct cfs_hash *hs, cfs_hash_cond_opt_cb_t func,
> + void *data) {
>   struct cfs_hash_cond_arg arg = {
>   .func   = func,
>   .arg= data,
> @@ -1493,31 +1489,27 @@ struct cfs_hash_cond_arg {
> 
> void
> cfs_hash_for_each(struct cfs_hash *hs, cfs_hash_for_each_cb_t func,
> -   void *data)
> -{
> +   void *data) {
>   cfs_hash_for_each_tight(hs, func, data, 0);
> }
> EXPORT_SYMBOL(cfs_hash_for_each);
> 
> void
> cfs_hash_for_each_safe(struct cfs_hash *hs, cfs_hash_for_each_cb_t func,
> -void *data)
> -{
> +void *data) {
>   cfs_hash_for_each_tight(hs, func, data, 1);
> }
> EXPORT_SYMBOL(cfs_hash_for_each_safe);
> 
> static int
> cfs_hash_peek(struct cfs_hash *hs, struct cfs_hash_bd *bd,
> -   struct hlist_node *hnode, void *data)
> -{
> +   struct hlist_node *hnode, void *data) {
>   *(int *)data = 0;
>   return 1; /* return 1 to break the loop */
> }
> 
> int
> -cfs_hash_is_empty(struct cfs_hash *hs)
> -{
> +cfs_hash_is_empty(struct cfs_hash *hs) {
>   int empty = 1;
> 
>   cfs_hash_for_each_tight(hs, cfs_hash_peek, &empty, 0);
> @@ -1526,8 +1518,7 @@ struct cfs_hash_cond_arg {
> EXPORT_SYMBOL(cfs_hash_is_empty);
> 
> u64
> -cfs_hash_size_get(struct cfs_hash *hs)
> -{
> +cfs_hash_size_get(struct cfs_hash *hs) {
>   return cfs_hash_with_counter(hs) ?
>  atomic_read(&hs->hs_count) :
>  cfs_hash_for_each_tight(hs, NULL, NULL, 0);
> @@ -1551,8 +1542,7 @@ struct cfs_hash_cond_arg {
>  */
> static int
> cfs_hash_for_each_relax(struct cfs_hash *hs, cfs_hash_for_each_cb_t func,
> - void *data, int start)
> -{
> + void *data, int start) {
>   struct hlist_node *hnode;
>   struct hlist_node *tmp;
>   struct cfs_hash_bd bd;
> @@ -1629,8 +1619,7 @@ struct cfs_hash_cond_arg {
> 
> int
> cfs_hash_for_each_nolock(struct cfs_hash *hs, cfs_hash_for_each_cb_t func,
> -  void *data, int start)
> -{
> +  void *data, int start) {
>   if (cfs_hash_with_no_lock(hs) ||
>   cfs_hash_with_rehash_key(hs) ||
>   !cfs_hash_with_no_itemref(hs))
> @@ -1661,8 +1650,7 @@ struct cfs_hash_cond_arg {
>  */
> int
> cfs_hash_for_each_empty(struct cfs_hash *hs, cfs_hash_for_each_cb_t func,
> - void *data)
> -{
> + void *data) {
>   unsigned int i = 0;
> 
>   if (cfs_hash_with_no_lock(hs))
> @@ -1718,8 +1706,7 @@ struct cfs_hash_cond_arg {
>  */
> void
> cfs_hash_for_each_key(struct cfs_hash *hs, const void *key,
> -   cfs_hash_for_each_cb_t func, void *data)
> -{
> +   cfs_hash_for_each_cb_t func, void *data) {
>   struct hlist_node *hnode;
>   struct cfs_hash_bd bds[2];
>   unsi

Re: [PATCH/RFC] staging/lustre: Rework class_process_proc_param

2017-03-18 Thread Oleg Drokin


On Mar 19, 2017, at 12:41 AM, Greg Kroah-Hartman wrote:

> On Sat, Mar 18, 2017 at 02:24:08AM -0400, Oleg Drokin wrote:
>> Ever since sysfs migration, class_process_proc_param stopped working
>> correctly as all the useful params were no longer present as lvars.
>> Replace all the nasty fake proc writes with hopefully less nasty
>> kobject attribute search and then update the attributes as needed.
>> 
>> Signed-off-by: Oleg Drokin 
>> Reported-by: Al Viro 
>> ---
>> Al has quite rightfully complained in the past that class_process_proc_param
>> is a terrible piece of code and needs to go.
>> This patch is an attempt at improving it somewhat and in process drop
>> all the user/kernel address space games we needed to play to make it work
>> in the past (and which I suspect attracted Al's attention in the first 
>> place).
>> 
>> Now I wonder if iterating kobject attributes like that would be ok with
>> you Greg, or do you think there is a better way?
>> class_find_write_attr could be turned into something generic since it's
>> certainly convenient to reuse same table of name-write_method pairs,
>> but I did some cursory research and nobody else seems to need anything
>> of the sort in-tree.
>> 
>> I know ll_process_config is still awful and I will likely just
>> replace the current hack with kset_find_obj, but I just wanted to make
>> sure this new approach would be ok before spending too much time on it.
>> 
>> Thanks!
>> 
>> drivers/staging/lustre/lustre/include/obd_class.h  |  4 +-
>> drivers/staging/lustre/lustre/llite/llite_lib.c| 10 +--
>> drivers/staging/lustre/lustre/lov/lov_obd.c|  3 +-
>> drivers/staging/lustre/lustre/mdc/mdc_request.c|  3 +-
>> .../staging/lustre/lustre/obdclass/obd_config.c| 78 
>> ++
>> drivers/staging/lustre/lustre/osc/osc_request.c|  3 +-
>> 6 files changed, 44 insertions(+), 57 deletions(-)
>> 
>> diff --git a/drivers/staging/lustre/lustre/include/obd_class.h 
>> b/drivers/staging/lustre/lustre/include/obd_class.h
>> index 083a6ff..badafb8 100644
>> --- a/drivers/staging/lustre/lustre/include/obd_class.h
>> +++ b/drivers/staging/lustre/lustre/include/obd_class.h
>> @@ -114,8 +114,8 @@ typedef int (*llog_cb_t)(const struct lu_env *, struct 
>> llog_handle *,
>>   struct llog_rec_hdr *, void *);
>> /* obd_config.c */
>> int class_process_config(struct lustre_cfg *lcfg);
>> -int class_process_proc_param(char *prefix, struct lprocfs_vars *lvars,
>> - struct lustre_cfg *lcfg, void *data);
>> +int class_process_attr_param(char *prefix, struct kobject *kobj,
>> + struct lustre_cfg *lcfg);
> 
> As you are exporting these functions, they will need to end up with a
> lustre_* prefix eventually :)

ok.

> 
>> struct obd_device *class_incref(struct obd_device *obd,
>>  const char *scope, const void *source);
>> void class_decref(struct obd_device *obd,
>> diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c 
>> b/drivers/staging/lustre/lustre/llite/llite_lib.c
>> index 7b80040..192b877 100644
>> --- a/drivers/staging/lustre/lustre/llite/llite_lib.c
>> +++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
>> @@ -2259,7 +2259,7 @@ int ll_obd_statfs(struct inode *inode, void __user 
>> *arg)
>> int ll_process_config(struct lustre_cfg *lcfg)
>> {
>>  char *ptr;
>> -void *sb;
>> +struct super_block *sb;
>>  struct lprocfs_static_vars lvars;
>>  unsigned long x;
>>  int rc = 0;
>> @@ -2273,15 +2273,15 @@ int ll_process_config(struct lustre_cfg *lcfg)
>>  rc = kstrtoul(ptr, 16, &x);
>>  if (rc != 0)
>>  return -EINVAL;
>> -sb = (void *)x;
>> +sb = (struct super_block *)x;
>>  /* This better be a real Lustre superblock! */
>> -LASSERT(s2lsi((struct super_block *)sb)->lsi_lmd->lmd_magic == 
>> LMD_MAGIC);
>> +LASSERT(s2lsi(sb)->lsi_lmd->lmd_magic == LMD_MAGIC);
>> 
>>  /* Note we have not called client_common_fill_super yet, so
>>   * proc fns must be able to handle that!
>>   */
>> -rc = class_process_proc_param(PARAM_LLITE, lvars.obd_vars,
>> -  lcfg, sb);
>> +rc = class_process_attr_param(PARAM_LLITE, &ll_s2sbi(sb)->ll_kobj,
>> +  lcfg);
>>  if (rc > 0)
>>  rc = 0;
>>  return rc;
>> diff --git

Re: [PATCH/RFC] staging/lustre: Rework class_process_proc_param

2017-03-18 Thread Oleg Drokin


On Mar 19, 2017, at 12:29 AM, Greg Kroah-Hartman wrote:

> On Sat, Mar 18, 2017 at 11:17:55AM -0400, Oleg Drokin wrote:
>> 
>> On Mar 18, 2017, at 6:34 AM, Greg Kroah-Hartman wrote:
>> 
>>> On Sat, Mar 18, 2017 at 02:24:08AM -0400, Oleg Drokin wrote:
>>>> Ever since sysfs migration, class_process_proc_param stopped working
>>>> correctly as all the useful params were no longer present as lvars.
>>>> Replace all the nasty fake proc writes with hopefully less nasty
>>>> kobject attribute search and then update the attributes as needed.
>>>> 
>>>> Signed-off-by: Oleg Drokin 
>>>> Reported-by: Al Viro 
>>>> ---
>>>> Al has quite rightfully complained in the past that 
>>>> class_process_proc_param
>>>> is a terrible piece of code and needs to go.
>>>> This patch is an attempt at improving it somewhat and in process drop
>>>> all the user/kernel address space games we needed to play to make it work
>>>> in the past (and which I suspect attracted Al's attention in the first 
>>>> place).
>>>> 
>>>> Now I wonder if iterating kobject attributes like that would be ok with
>>>> you Greg, or do you think there is a better way?
>>>> class_find_write_attr could be turned into something generic since it's
>>>> certainly convenient to reuse same table of name-write_method pairs,
>>>> but I did some cursory research and nobody else seems to need anything
>>>> of the sort in-tree.
>>>> 
>>>> I know ll_process_config is still awful and I will likely just
>>>> replace the current hack with kset_find_obj, but I just wanted to make
>>>> sure this new approach would be ok before spending too much time on it.
>>> 
>>> I'm not quite sure what exactly you are even trying to do here.  What is
>>> this interface?  Who calls it, and how?  What does it want to do?
>> 
>> This is a configuration client code.
>> Management server has ability to pass down config information in the form of:
>> fsname.subsystem.attribute=value to clients and other servers
>> (subsystem determines if it's something of interest of clients or servers or
>> both).
>> This could be changed in the real time - i.e. you update it on the server and
>> that gets propagated to all the clients/servers, so no need to ssh into
>> every node to manually apply those changes and worry about client restarts
>> (the config is remembered at the management server and would be applied to 
>> any
>> new nodes connecting/across server restarts and such).
>> 
>> So the way it then works then is once the string 
>> fsname.subsystem.attribute=value is delivered to the client, we find all 
>> instances of filesystem with fsname and then
>> all subsystems within it (one kobject per subsystem instance) and then 
>> update the
>> attributes to the value supplied.
>> 
>> The same filesystem might be mounted more than once and then some layers 
>> might have
>> multiple instances inside a single filesystems.
>> 
>> In the end it would turn something like lustre.osc.max_dirty_mb=128 into
>> writes to
>> /sys/fs/lustre/osc/lustre-OST-osc-8800d75ca000/max_dirty_mb and
>> /sys/fs/lustre/osc/lustre-OST0001-osc-8800d75ca000/max_dirty_mb
>> without actually iterating in sysfs namespace.
> 
> Wait, who is doing the "write"?  From within the kernel?  Or some
> userspace app?  I'm guessing from within the kernel, you are receiving
> the data from some other transport within the filesystem and then need
> to apply the settings?

Yes, kernel code gets the notification "hey, there's a config change, come get it",
then it requests the diff in the config and does the "write" by updating the
attributes.

>> The alternative we considered is we can probably just do an upcall and have
>> a userspace tool called with the parameter verbatim and try to figure it out,
>> but that seems a lot less ideal, and also we'll get a bunch of complications 
>> from
>> containers and such too, I imagine.
> 
> Yeah, no, don't do an upcall, that's a mess.
> 
>> The function pre-this patch is assuming that all these values are part of
>> a list of procfs values (no longer true after sysfs migration) so just 
>> iterates
>> that list and calls the write for matched names (but also needs to supply a 
>> userspace
>> buffer so looks much uglier too).
> 
> For kobjects, you don't need userspace buffers, so t

Re: [PATCH/RFC] staging/lustre: Rework class_process_proc_param

2017-03-18 Thread Oleg Drokin

On Mar 18, 2017, at 6:34 AM, Greg Kroah-Hartman wrote:

> On Sat, Mar 18, 2017 at 02:24:08AM -0400, Oleg Drokin wrote:
>> Ever since sysfs migration, class_process_proc_param stopped working
>> correctly as all the useful params were no longer present as lvars.
>> Replace all the nasty fake proc writes with hopefully less nasty
>> kobject attribute search and then update the attributes as needed.
>> 
>> Signed-off-by: Oleg Drokin 
>> Reported-by: Al Viro 
>> ---
>> Al has quite rightfully complained in the past that class_process_proc_param
>> is a terrible piece of code and needs to go.
>> This patch is an attempt at improving it somewhat and in process drop
>> all the user/kernel address space games we needed to play to make it work
>> in the past (and which I suspect attracted Al's attention in the first 
>> place).
>> 
>> Now I wonder if iterating kobject attributes like that would be ok with
>> you Greg, or do you think there is a better way?
>> class_find_write_attr could be turned into something generic since it's
>> certainly convenient to reuse same table of name-write_method pairs,
>> but I did some cursory research and nobody else seems to need anything
>> of the sort in-tree.
>> 
>> I know ll_process_config is still awful and I will likely just
>> replace the current hack with kset_find_obj, but I just wanted to make
>> sure this new approach would be ok before spending too much time on it.
> 
> I'm not quite sure what exactly you are even trying to do here.  What is
> this interface?  Who calls it, and how?  What does it want to do?

This is the configuration client code.
The management server can pass down config information in the form of
fsname.subsystem.attribute=value to clients and other servers
(the subsystem determines whether it's of interest to clients, servers, or
both).
This can be changed in real time, i.e. you update it on the server and
that gets propagated to all the clients/servers, so there is no need to ssh into
every node to manually apply those changes and worry about client restarts
(the config is remembered at the management server and is applied to any
new nodes connecting, across server restarts and such).

So the way it works is: once the string
fsname.subsystem.attribute=value is delivered to the client, we find all
instances of the filesystem with that fsname, then all subsystems within it
(one kobject per subsystem instance), and then update the attributes to the
value supplied.

The same filesystem might be mounted more than once, and some layers might
have multiple instances inside a single filesystem.

In the end it would turn something like lustre.osc.max_dirty_mb=128 into writes to
/sys/fs/lustre/osc/lustre-OST-osc-8800d75ca000/max_dirty_mb and
/sys/fs/lustre/osc/lustre-OST0001-osc-8800d75ca000/max_dirty_mb
without actually iterating in sysfs namespace.
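
To make the mapping concrete, here is a minimal userspace sketch of splitting such a string into its parts. It is only an illustration under stated assumptions: struct demo_param, demo_copy_part() and demo_parse_param() are hypothetical names that do not come from the patch, and the real code works on struct lustre_cfg buffers rather than on a plain C string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for the four parts of a config string. */
struct demo_param {
	char fsname[32];
	char subsystem[32];
	char attr[64];
	char value[64];
};

/* Copy [start, end) into dst; fail on empty or oversized parts. */
static int demo_copy_part(char *dst, size_t dstlen,
			  const char *start, const char *end)
{
	size_t len = (size_t)(end - start);

	if (len == 0 || len >= dstlen)
		return -1;
	memcpy(dst, start, len);
	dst[len] = '\0';
	return 0;
}

/*
 * Split "fsname.subsystem.attribute=value" into its parts.
 * Returns 0 on success, -1 if the string is malformed.
 */
static int demo_parse_param(const char *s, struct demo_param *p)
{
	const char *dot1 = strchr(s, '.');
	const char *dot2 = dot1 ? strchr(dot1 + 1, '.') : NULL;
	const char *eq = dot2 ? strchr(dot2 + 1, '=') : NULL;

	if (!eq)
		return -1;
	if (demo_copy_part(p->fsname, sizeof(p->fsname), s, dot1) ||
	    demo_copy_part(p->subsystem, sizeof(p->subsystem), dot1 + 1, dot2) ||
	    demo_copy_part(p->attr, sizeof(p->attr), dot2 + 1, eq) ||
	    demo_copy_part(p->value, sizeof(p->value), eq + 1,
			   eq + 1 + strlen(eq + 1)))
		return -1;
	return 0;
}
```

With the example above, demo_parse_param("lustre.osc.max_dirty_mb=128", &p) yields fsname "lustre", subsystem "osc", attribute "max_dirty_mb" and value "128"; a caller would then visit every matching subsystem instance and update that attribute on each.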

The alternative we considered is to just do an upcall and have a userspace
tool called with the parameter verbatim, letting it try to figure things out,
but that seems a lot less ideal, and we'd also get a bunch of complications
from containers and such, I imagine.

Before this patch, the function assumes that all these values are part of
a list of procfs values (no longer true after the sysfs migration), so it just
iterates that list and calls the write method for matched names (but it also
needs to supply a userspace buffer, so it looks much uglier too).

Hopefully this makes at least some sense.

> You can look up attributes for a kobject easily in the show/store
> functions (and some drivers just have a generic one and then you look at
> the string to see which attribute you are wanting to reference.)  But
> you seem to be working backwards here, why do you have to look up a
> kobject?

But that leads to the need to list attribute names essentially twice:
once for the attributes list, once in the show/set function to figure
out how to deal with that name.
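
The point about listing names twice can be seen in a small userspace sketch. Everything here is made up for illustration (struct demo_attr, demo_osc_attrs and demo_find_and_store() are hypothetical; this is neither the kernel's kobject API nor the patch's class_find_write_attr()). The idea is only that one table of name/store pairs can serve both as the attribute list and as the lookup by name:

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-instance state that the attributes update. */
struct demo_osc {
	unsigned int max_dirty_mb;
	unsigned int max_rpcs_in_flight;
};

/* One table entry: attribute name plus its store method. */
struct demo_attr {
	const char *name;
	int (*store)(struct demo_osc *osc, const char *buf);
};

/* Parse a decimal unsigned int from a kernel-side string. */
static int demo_parse_uint(const char *buf, unsigned int *out)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(buf, &end, 10);
	if (errno || end == buf || *end != '\0' || v > UINT_MAX)
		return -EINVAL;
	*out = (unsigned int)v;
	return 0;
}

static int demo_max_dirty_mb_store(struct demo_osc *osc, const char *buf)
{
	return demo_parse_uint(buf, &osc->max_dirty_mb);
}

static int demo_max_rpcs_store(struct demo_osc *osc, const char *buf)
{
	return demo_parse_uint(buf, &osc->max_rpcs_in_flight);
}

/* Each name appears exactly once, next to its store method. */
static const struct demo_attr demo_osc_attrs[] = {
	{ "max_dirty_mb", demo_max_dirty_mb_store },
	{ "max_rpcs_in_flight", demo_max_rpcs_store },
};

/* Find the attribute by name and call its store method directly. */
static int demo_find_and_store(const struct demo_attr *attrs, size_t nattrs,
			       struct demo_osc *osc, const char *name,
			       const char *value)
{
	size_t i;

	for (i = 0; i < nattrs; i++)
		if (strcmp(attrs[i].name, name) == 0)
			return attrs[i].store(osc, value);
	return -ENOENT;
}
```

Note that the value is passed as an ordinary kernel-side string, so no fake userspace buffer is needed in this model.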

> What is wrong with the "normal" way to interact with kobject attributes
> from sysfs?
> 
> What does your "process proc" function do?  Where does it get called
> from?
> 
> totally confused,
> 
> greg k-h

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

[PATCH/RFC] staging/lustre: Rework class_process_proc_param

2017-03-17 Thread Oleg Drokin

Ever since sysfs migration, class_process_proc_param stopped working
correctly as all the useful params were no longer present as lvars.
Replace all the nasty fake proc writes with hopefully less nasty
kobject attribute search and then update the attributes as needed.

Signed-off-by: Oleg Drokin 
Reported-by: Al Viro 
---
Al has quite rightfully complained in the past that class_process_proc_param
is a terrible piece of code and needs to go.
This patch is an attempt at improving it somewhat and in process drop
all the user/kernel address space games we needed to play to make it work
in the past (and which I suspect attracted Al's attention in the first place).

Now I wonder if iterating kobject attributes like that would be ok with
you Greg, or do you think there is a better way?
class_find_write_attr could be turned into something generic since it's
certainly convenient to reuse same table of name-write_method pairs,
but I did some cursory research and nobody else seems to need anything
of the sort in-tree.

I know ll_process_config is still awful and I will likely just
replace the current hack with kset_find_obj, but I just wanted to make
sure this new approach would be ok before spending too much time on it.

Thanks!

 drivers/staging/lustre/lustre/include/obd_class.h  |  4 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c| 10 +--
 drivers/staging/lustre/lustre/lov/lov_obd.c|  3 +-
 drivers/staging/lustre/lustre/mdc/mdc_request.c|  3 +-
 .../staging/lustre/lustre/obdclass/obd_config.c| 78 ++
 drivers/staging/lustre/lustre/osc/osc_request.c|  3 +-
 6 files changed, 44 insertions(+), 57 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 083a6ff..badafb8 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -114,8 +114,8 @@ typedef int (*llog_cb_t)(const struct lu_env *, struct llog_handle *,
 struct llog_rec_hdr *, void *);
 /* obd_config.c */
 int class_process_config(struct lustre_cfg *lcfg);
-int class_process_proc_param(char *prefix, struct lprocfs_vars *lvars,
-struct lustre_cfg *lcfg, void *data);
+int class_process_attr_param(char *prefix, struct kobject *kobj,
+struct lustre_cfg *lcfg);
 struct obd_device *class_incref(struct obd_device *obd,
const char *scope, const void *source);
 void class_decref(struct obd_device *obd,
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 7b80040..192b877 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -2259,7 +2259,7 @@ int ll_obd_statfs(struct inode *inode, void __user *arg)
 int ll_process_config(struct lustre_cfg *lcfg)
 {
char *ptr;
-   void *sb;
+   struct super_block *sb;
struct lprocfs_static_vars lvars;
unsigned long x;
int rc = 0;
@@ -2273,15 +2273,15 @@ int ll_process_config(struct lustre_cfg *lcfg)
rc = kstrtoul(ptr, 16, &x);
if (rc != 0)
return -EINVAL;
-   sb = (void *)x;
+   sb = (struct super_block *)x;
/* This better be a real Lustre superblock! */
-   LASSERT(s2lsi((struct super_block *)sb)->lsi_lmd->lmd_magic == LMD_MAGIC);
+   LASSERT(s2lsi(sb)->lsi_lmd->lmd_magic == LMD_MAGIC);
 
/* Note we have not called client_common_fill_super yet, so
 * proc fns must be able to handle that!
 */
-   rc = class_process_proc_param(PARAM_LLITE, lvars.obd_vars,
- lcfg, sb);
+   rc = class_process_attr_param(PARAM_LLITE, &ll_s2sbi(sb)->ll_kobj,
+ lcfg);
if (rc > 0)
rc = 0;
return rc;
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index b3161fb..c33a327 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -926,8 +926,7 @@ int lov_process_config_base(struct obd_device *obd, struct lustre_cfg *lcfg,
 
lprocfs_lov_init_vars(&lvars);
 
-   rc = class_process_proc_param(PARAM_LOV, lvars.obd_vars,
- lcfg, obd);
+   rc = class_process_attr_param(PARAM_LOV, &obd->obd_kobj, lcfg);
if (rc > 0)
rc = 0;
goto out;
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 6bc2fb8..00387b8 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -2670,8 +2670,7 @@ static int mdc_process_

Re: [lustre-devel] [PATCH] staging: lustre: fix sparse warning about different address spaces

2017-03-06 Thread Oleg Drokin


On Mar 1, 2017, at 6:57 PM, Mario Bambagini wrote:

> fixed the following sparse warning by adding proper cast:
> drivers/staging//lustre/lustre/obdclass/obd_config.c:1055:74: warning: incorrect type in argument 2 (different address spaces)
> drivers/staging//lustre/lustre/obdclass/obd_config.c:1055:74:    expected char const [noderef] *
> drivers/staging//lustre/lustre/obdclass/obd_config.c:1055:74:    got char *[assigned] sval
> 
> Signed-off-by: Mario Bambagini 

The patch is fine, but just be advised that this whole function is going away
real soon now, per Al Viro's request (and also because it no longer does what
it should).

> ---
> drivers/staging/lustre/lustre/obdclass/obd_config.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lustre/obdclass/obd_config.c 
> b/drivers/staging/lustre/lustre/obdclass/obd_config.c
> index 9ca84c7..8fce88f 100644
> --- a/drivers/staging/lustre/lustre/obdclass/obd_config.c
> +++ b/drivers/staging/lustre/lustre/obdclass/obd_config.c
> @@ -1052,7 +1052,8 @@ int class_process_proc_param(char *prefix, struct 
> lprocfs_vars *lvars,
> 
>   oldfs = get_fs();
>   set_fs(KERNEL_DS);
> - rc = var->fops->write(&fakefile, sval,
> + rc = var->fops->write(&fakefile,
> + (const char __user *)sval,
>   vallen, NULL);
>   set_fs(oldfs);
>   }
> -- 
> 2.1.4
> 
> ___
> lustre-devel mailing list
> lustre-de...@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

Re: [PATCH 13/14] staging: lustre: llog: limit file size of plain logs

2017-02-24 Thread Oleg Drokin


On Feb 24, 2017, at 11:59 AM, Greg Kroah-Hartman wrote:

> On Sat, Feb 18, 2017 at 04:47:14PM -0500, James Simmons wrote:
>> From: Alex Zhuravlev 
>> 
>> on small filesystems plain log can grow dramatically. especially
>> given large record sizes produced by DNE and extended chunksize.
>> I saw >50% of space consumed by a single llog file which was still
>> in use. this leads to test failures (sanityn, etc).
>> the patch introduces additional limit on plain llog size, which
>> is calculated as /64 (128MB at most) at llog creation
>> time.
>> 
>> Signed-off-by: Alex Zhuravlev 
>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6838
>> Reviewed-on: https://review.whamcloud.com/18028
>> Reviewed-by: Andreas Dilger 
>> Reviewed-by: wangdi 
>> Reviewed-by: Mike Pershin 
>> Reviewed-by: Oleg Drokin 
>> Signed-off-by: James Simmons 
>> ---
>> drivers/staging/lustre/lustre/obdclass/llog.c | 16 
>> 1 file changed, 16 insertions(+)
>> 
>> diff --git a/drivers/staging/lustre/lustre/obdclass/llog.c 
>> b/drivers/staging/lustre/lustre/obdclass/llog.c
>> index 83c5b62..320ff6b 100644
>> --- a/drivers/staging/lustre/lustre/obdclass/llog.c
>> +++ b/drivers/staging/lustre/lustre/obdclass/llog.c
>> @@ -319,10 +319,26 @@ static int llog_process_thread(void *arg)
>>   * the case and re-read the current chunk
>>   * otherwise.
>>   */
>> +int records;
>> +
>>  if (index > loghandle->lgh_last_idx) {
>>  rc = 0;
>>  goto out;
>>  }
>> +/* <2 records means no more records
>> + * if the last record we processed was
>> + * the final one, then the underlying
>> + * object might have been destroyed yet.
>> + * we better don't access that..
>> + */
>> +mutex_lock(&loghandle->lgh_hdr_mutex);
>> +records = loghandle->lgh_hdr->llh_count;
>> +mutex_unlock(&loghandle->lgh_hdr_mutex);
>> +if (records <= 1) {
>> +rc = 0;
>> +goto out;
>> +}
> 
> 
> So you now use the lock, in only one place, when reading a single value?
> That makes no sense, it's obviously wrong, or not needed.
> 
> Please fix up these two patches…

Ah, this is in fact a server-side fix, so all the other users were in the
parts not really present in the client.
James, we don't really need this patch in the client, I guess.

[PATCH] staging/lustre/lnet: Fix allocation size for sv_cpt_data

2017-02-19 Thread Oleg Drokin

This is unbreaking another of those "stealth" janitor
patches that got in and subtly broke some things.

sv_cpt_data is a pointer to a pointer, so we need to dereference it
twice to get the correct structure size for the allocation.

Fixes: 9899cb68c6c23d58b27035c237b2d425f4c6133c
CC: Sandhya Bankar 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lnet/selftest/rpc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/selftest/rpc.c b/drivers/staging/lustre/lnet/selftest/rpc.c
index 92cd411..87fe366 100644
--- a/drivers/staging/lustre/lnet/selftest/rpc.c
+++ b/drivers/staging/lustre/lnet/selftest/rpc.c
@@ -255,7 +255,7 @@ srpc_service_init(struct srpc_service *svc)
svc->sv_shuttingdown = 0;
 
svc->sv_cpt_data = cfs_percpt_alloc(lnet_cpt_table(),
-   sizeof(*svc->sv_cpt_data));
+   sizeof(**svc->sv_cpt_data));
if (!svc->sv_cpt_data)
return -ENOMEM;
 
-- 
2.9.3
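
The difference between the two sizeof expressions in this fix can be shown with a small standalone sketch. The types below are hypothetical stand-ins (struct demo_scd and struct demo_service are not the real srpc structures); only the pointer-to-pointer shape of sv_cpt_data is taken from the patch.

```c
#include <stddef.h>

/* Hypothetical per-partition data; the size is chosen arbitrarily. */
struct demo_scd {
	char payload[200];
};

/* Hypothetical service holding an array of pointers to per-partition data. */
struct demo_service {
	struct demo_scd **sv_cpt_data;
};

/* One dereference: the type is 'struct demo_scd *', i.e. a pointer. */
static size_t demo_size_one_deref(const struct demo_service *svc)
{
	/* sizeof does not evaluate its operand, so no pointer is followed. */
	return sizeof(*svc->sv_cpt_data);
}

/* Two dereferences: the type is 'struct demo_scd', the real element. */
static size_t demo_size_two_derefs(const struct demo_service *svc)
{
	return sizeof(**svc->sv_cpt_data);
}
```

Allocating with the first size would reserve only pointer-sized storage for each element, which is too small for any structure larger than a pointer.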

Re: [lustre-devel] [patch] staging: lustre: ptlrpc: silence a shift wrapping warning

2017-01-15 Thread Oleg Drokin


On Jan 15, 2017, at 3:14 AM, Dan Carpenter wrote:

> "svcpt->scp_hist_seq" is a u64 so static checkers complain that 1U
> should be 1ULL.  I looked at REQS_SEQ_SHIFT() a little and it seems to
> be capped by the number of CPUs online and the amount of memory, but I
> think it could go above 32 possibly.
> 
> Signed-off-by: Dan Carpenter 

Reviewed-by: Oleg Drokin 

> ---
> I have not tested this change.
> 
> diff --git a/drivers/staging/lustre/lustre/ptlrpc/events.c 
> b/drivers/staging/lustre/lustre/ptlrpc/events.c
> index 49f3e63..ae1650d 100644
> --- a/drivers/staging/lustre/lustre/ptlrpc/events.c
> +++ b/drivers/staging/lustre/lustre/ptlrpc/events.c
> @@ -277,7 +277,7 @@ static void ptlrpc_req_add_history(struct 
> ptlrpc_service_part *svcpt,
>* then we hope there will be less RPCs per bucket at some
>* point, and sequence will catch up again
>*/
> - svcpt->scp_hist_seq += (1U << REQS_SEQ_SHIFT(svcpt));
> + svcpt->scp_hist_seq += (1ULL << REQS_SEQ_SHIFT(svcpt));
>   new_seq = svcpt->scp_hist_seq;
>   }
> 
> ___
> lustre-devel mailing list
> lustre-de...@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
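
For readers wondering why the constant's type matters here: 1U has type unsigned int, which is 32 bits wide on the usual Linux targets, and in C shifting a value by its width or more is undefined. Widening the constant before the shift avoids that. The sketch below uses hypothetical helper names (demo_seq_step() and demo_advance_seq() are not from the Lustre tree) and assumes the shift stays below 64.

```c
#include <stdint.h>

/* Build the step in 64 bits, so shifts of 32..63 are well defined. */
static uint64_t demo_seq_step(unsigned int shift)
{
	return 1ULL << shift;
}

/* Advance a 64-bit sequence number by one step, as the patched line does. */
static uint64_t demo_advance_seq(uint64_t seq, unsigned int shift)
{
	return seq + demo_seq_step(shift);
}
```

With a 32-bit constant, a shift of 32 or more could never produce the intended step, even though the result is stored in a u64.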

Re: [PATCH 4/5] staging/lustre/obdclass: Combine two seq_printf() calls into one call in lprocfs_rd_state()

2017-01-12 Thread Oleg Drokin


On Jan 1, 2017, at 11:38 AM, SF Markus Elfring wrote:

> From: Markus Elfring 
> Date: Sun, 1 Jan 2017 16:26:36 +0100
> 
> Some data were printed into a sequence by two separate function calls.
> Print the same data by a single function call instead.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring 
> ---
> drivers/staging/lustre/lustre/obdclass/lprocfs_status.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c 
> b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
> index 3f6fcab5a1fc..a167cbe8a50e 100644
> --- a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
> +++ b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
> @@ -853,10 +853,8 @@ int lprocfs_rd_state(struct seq_file *m, void *data)
>   return rc;
> 
>   imp = obd->u.cli.cl_import;
> -
> - seq_printf(m, "current_state: %s\n",
> + seq_printf(m, "current_state: %s\nstate_history:\n",
>  ptlrpc_import_state_name(imp->imp_state));
> - seq_printf(m, "state_history:\n");

Same as in that other patch: this actually makes the code a bit harder to read.
What's the perceived benefit of making a change like this?

>   k = imp->imp_state_hist_idx;
>   for (j = 0; j < IMP_STATE_HIST_LEN; j++) {
>   struct import_state_hist *ish =
> -- 
> 2.11.0

Re: [PATCH 2/5] staging/lustre/mgc: Combine two seq_printf() calls into one call in lprocfs_mgc_rd_ir_state()

2017-01-12 Thread Oleg Drokin


On Jan 1, 2017, at 11:35 AM, SF Markus Elfring wrote:

> From: Markus Elfring 
> Date: Sun, 1 Jan 2017 15:40:29 +0100
> 
> Some data were printed into a sequence by two separate function calls.
> Print the same data by a single function call instead.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring 
> ---
> drivers/staging/lustre/lustre/mgc/mgc_request.c | 5 +
> 1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c 
> b/drivers/staging/lustre/lustre/mgc/mgc_request.c
> index b9c522a3c7a4..a6ca48d7e96b 100644
> --- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
> +++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
> @@ -460,11 +460,8 @@ int lprocfs_mgc_rd_ir_state(struct seq_file *m, void 
> *data)
> 
>   imp = obd->u.cli.cl_import;
>   ocd = &imp->imp_connect_data;
> -
> - seq_printf(m, "imperative_recovery: %s\n",
> + seq_printf(m, "imperative_recovery: %s\nclient_state:\n",
>  OCD_HAS_FLAG(ocd, IMP_RECOV) ? "ENABLED" : "DISABLED");
> - seq_printf(m, "client_state:\n");
> -

Ugh, do we really need this?
I know it saves one call to seq_printf, but this is not super
performance-critical code, and two calls are actually easier to read,
don't you think?

>   spin_lock(&config_list_lock);
>   list_for_each_entry(cld, &config_llog_list, cld_list_chain) {
>   if (!cld->cld_recover)
> -- 
> 2.11.0

Re: [PATCH v2 5/5] staging: lustre: headers: use proper byteorder functions in lustre_idl.h

2016-12-13 Thread Oleg Drokin

On Dec 13, 2016, at 3:31 AM, Dan Carpenter wrote:

> It used to be that great swathes of Lustre were used in both user space
> and kernel space.  We had huge unused modules in the kernel that were
> only used for user space.

Huh?
There was nothing of the sort.
There were huge parts of code that were used by the server and, due to there
being no server in the staging client, ended up being unused, though.

There were also (much smaller) bits that were supporting userspace client
(that is, a library that was able to mount lustre servers completely from
userspace by hijacking libc calls), but that was mostly gone by the time
we got into staging anyway.


Re: [PATCH 0/8] Sparse warning fixes in Lustre.

2016-12-12 Thread Oleg Drokin

On Dec 7, 2016, at 6:46 PM, Al Viro wrote:

> On Wed, Dec 07, 2016 at 05:41:26PM -0500, Oleg Drokin wrote:
>> This set of fixes aims at sparse warnings.
> 
> Speaking of the stuff sparse catches there: class_process_proc_param().
> I've tried to describe what I think of that Fine Piece Of Software
> several times, but I had to give up - my command of obscenity is not
> up to the task, neither in English nor in Russian.  Please, take it
> out.  Preferably - along with the ->ldo_process_config()/->process_config()
> thing.

Well, I can guess what you don't like in the remnants of the
"well, we have uniform procfs, so let's use that to our advantage and simplify
our config parsing".

But what's your beef with ldo_process_config()/->process_config(), I wonder?
Just a way to propagate config info across the layers.


[PATCH 7/8] staging/lustre: Move lov_read_and_clear_async_rc declaration

2016-12-07 Thread Oleg Drokin

Move it to obd.h, so that it's included from both the users and
the actual definition, making sure they never get out of sync.
This also silences a sparse warning.
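
As a rough illustration of why a single shared declaration helps, here is a self-contained sketch. The names (demo_object, demo_read_and_clear_async_rc) are hypothetical and the header is simulated inside one file; the point is only that when the definition sees the same prototype as the callers, the compiler rejects any mismatch between them.

```c
/* --- would live in a shared header, e.g. a hypothetical demo_obd.h --- */
struct demo_object {
	int async_rc;	/* last asynchronous error, 0 if none */
};

int demo_read_and_clear_async_rc(struct demo_object *obj);

/* --- definition; it sees the prototype above, so a signature that
 *     drifted from the declaration would fail to compile --- */
int demo_read_and_clear_async_rc(struct demo_object *obj)
{
	int rc = obj->async_rc;

	obj->async_rc = 0;
	return rc;
}
```

If the prototype were instead copied into a caller's private header, the definition could change its argument or return type without any diagnostic at the call sites.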

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/obd.h| 3 +++
 drivers/staging/lustre/lustre/llite/vvp_internal.h | 2 --
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h 
b/drivers/staging/lustre/lustre/include/obd.h
index 0f48e9c..1839f4f 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -43,6 +43,7 @@
 #include "lustre_fld.h"
 #include "lustre_handles.h"
 #include "lustre_intent.h"
+#include "cl_object.h"
 
 #define MAX_OBD_DEVICES 8192
 
@@ -76,6 +77,8 @@ static inline void loi_init(struct lov_oinfo *loi)
 struct lov_stripe_md;
 struct obd_info;
 
+int lov_read_and_clear_async_rc(struct cl_object *clob);
+
 typedef int (*obd_enqueue_update_f)(void *cookie, int rc);
 
 /* obd info for a particular level (lov, osc). */
diff --git a/drivers/staging/lustre/lustre/llite/vvp_internal.h 
b/drivers/staging/lustre/lustre/llite/vvp_internal.h
index c60d041..f40fd7f 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_internal.h
+++ b/drivers/staging/lustre/lustre/llite/vvp_internal.h
@@ -301,8 +301,6 @@ static inline struct vvp_lock *cl2vvp_lock(const struct 
cl_lock_slice *slice)
 # define CLOBINVRNT(env, clob, expr)   \
((void)sizeof(env), (void)sizeof(clob), (void)sizeof(!!(expr)))
 
-int lov_read_and_clear_async_rc(struct cl_object *clob);
-
 int vvp_io_init(const struct lu_env *env, struct cl_object *obj,
struct cl_io *io);
 int vvp_io_write_commit(const struct lu_env *env, struct cl_io *io);
-- 
2.7.4


[PATCH 8/8] staging/lustre/ptlrpc: Move nrs_conf_fifo extern to a header

2016-12-07 Thread Oleg Drokin

This avoids having an extern declaration in a C file, which is bad,
and also silences a sparse complaint.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/ptlrpc/nrs.c | 3 ---
 drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h | 3 +++
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/nrs.c 
b/drivers/staging/lustre/lustre/ptlrpc/nrs.c
index 7b6ffb1..ef19dbe 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/nrs.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/nrs.c
@@ -1559,9 +1559,6 @@ int ptlrpc_nrs_policy_control(const struct ptlrpc_service 
*svc,
return rc;
 }
 
-/* ptlrpc/nrs_fifo.c */
-extern struct ptlrpc_nrs_pol_conf nrs_conf_fifo;
-
 /**
  * Adds all policies that ship with the ptlrpc module, to NRS core's list of
  * policies \e nrs_core.nrs_policies.
diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h 
b/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
index e0f859c..8e6a805 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
+++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
@@ -226,6 +226,9 @@ struct ptlrpc_nrs_policy *nrs_request_policy(struct 
ptlrpc_nrs_request *nrq)
  sizeof(NRS_LPROCFS_QUANTUM_NAME_REG __stringify(LPROCFS_NRS_QUANTUM_MAX) " "  
\
NRS_LPROCFS_QUANTUM_NAME_HP __stringify(LPROCFS_NRS_QUANTUM_MAX))
 
+/* ptlrpc/nrs_fifo.c */
+extern struct ptlrpc_nrs_pol_conf nrs_conf_fifo;
+
 /* recovd_thread.c */
 
 int ptlrpc_expire_one_request(struct ptlrpc_request *req, int async_unlink);
-- 
2.7.4


[PATCH 0/8] Sparse warning fixes in Lustre.

2016-12-07 Thread Oleg Drokin

This set of fixes aims at sparse warnings.
Most of the patches are just moving declarations around
to deal with the
warning: symbol 'xxx' was not declared. Should it be static?
kind of messages.

Also a screwup with root_squash sysfs control is fixed.

Oleg Drokin (8):
  staging/lustre/llite: move root_squash from sysfs to debugfs
  staging/lustre/ldlm: Correct itree_overlap_cb return type
  staging/lustre/llite: mark ll_io_init() static
  staging/lustre/lov: make lov_lsm_alloc() static
  staging/lustre/osc: extern declare osc_caches in a header
  staging/lustre: Declare lu_context/session_tags_default
  staging/lustre: Move lov_read_and_clear_async_rc declaration
  staging/lustre/ptlrpc: Move nrs_conf_fifo extern to a header

 drivers/staging/lustre/lustre/include/lu_object.h  |  3 +++
 drivers/staging/lustre/lustre/include/obd.h|  3 +++
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c |  2 +-
 drivers/staging/lustre/lustre/llite/file.c |  2 +-
 drivers/staging/lustre/lustre/llite/lproc_llite.c  | 27 --
 drivers/staging/lustre/lustre/llite/vvp_internal.h |  2 --
 drivers/staging/lustre/lustre/lov/lov_pack.c   |  3 ++-
 drivers/staging/lustre/lustre/obdclass/cl_object.c |  3 +--
 drivers/staging/lustre/lustre/osc/osc_internal.h   |  2 ++
 drivers/staging/lustre/lustre/osc/osc_request.c|  2 --
 drivers/staging/lustre/lustre/ptlrpc/nrs.c |  3 ---
 .../staging/lustre/lustre/ptlrpc/ptlrpc_internal.h |  3 +++
 12 files changed, 31 insertions(+), 24 deletions(-)

-- 
2.7.4


[PATCH 5/8] staging/lustre/osc: extern declare osc_caches in a header

2016-12-07 Thread Oleg Drokin

This avoids a frowned-upon extern in the C file, and also
silences this sparse warning:
drivers/staging/lustre/lustre/osc/osc_dev.c:55:22: warning: symbol 'osc_caches' was not declared. Should it be static?

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/osc/osc_internal.h | 2 ++
 drivers/staging/lustre/lustre/osc/osc_request.c  | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h 
b/drivers/staging/lustre/lustre/osc/osc_internal.h
index 688783d..5cce82b 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -181,6 +181,8 @@ static inline struct osc_device *obd2osc_dev(const struct 
obd_device *d)
return container_of0(d->obd_lu_dev, struct osc_device, od_cl.cd_lu_dev);
 }
 
+extern struct lu_kmem_descr osc_caches[];
+
 extern struct kmem_cache *osc_quota_kmem;
 struct osc_quota_info {
/** linkage for quota hash table */
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c 
b/drivers/staging/lustre/lustre/osc/osc_request.c
index 7143564..f691297 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -2766,8 +2766,6 @@ static struct obd_ops osc_obd_ops = {
.quotactl   = osc_quotactl,
 };
 
-extern struct lu_kmem_descr osc_caches[];
-
 static int __init osc_init(void)
 {
struct lprocfs_static_vars lvars = { NULL };
-- 
2.7.4


[PATCH 6/8] staging/lustre: Declare lu_context/session_tags_default

2016-12-07 Thread Oleg Drokin

Make the declaration in a header, not as an extern in a C file,
which is frowned upon.
This also makes sparse a little happier.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/lu_object.h  | 3 +++
 drivers/staging/lustre/lustre/obdclass/cl_object.c | 3 +--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lu_object.h 
b/drivers/staging/lustre/lustre/include/lu_object.h
index 260643e..69b2812 100644
--- a/drivers/staging/lustre/lustre/include/lu_object.h
+++ b/drivers/staging/lustre/lustre/include/lu_object.h
@@ -1326,5 +1326,8 @@ void lu_buf_realloc(struct lu_buf *buf, size_t size);
 int lu_buf_check_and_grow(struct lu_buf *buf, size_t len);
 struct lu_buf *lu_buf_check_and_alloc(struct lu_buf *buf, size_t len);
 
+extern __u32 lu_context_tags_default;
+extern __u32 lu_session_tags_default;
+
 /** @} lu */
 #endif /* __LUSTRE_LU_OBJECT_H */
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_object.c 
b/drivers/staging/lustre/lustre/obdclass/cl_object.c
index f5d4e23..703cb67 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_object.c
@@ -54,6 +54,7 @@
 #include 
 #include "../../include/linux/libcfs/libcfs_hash.h"/* for cfs_hash stuff */
 #include "../include/cl_object.h"
+#include "../include/lu_object.h"
 #include "cl_internal.h"
 
 static struct kmem_cache *cl_env_kmem;
@@ -61,8 +62,6 @@ static struct kmem_cache *cl_env_kmem;
 /** Lock class of cl_object_header::coh_attr_guard */
 static struct lock_class_key cl_attr_guard_class;
 
-extern __u32 lu_context_tags_default;
-extern __u32 lu_session_tags_default;
 /**
  * Initialize cl_object_header.
  */
-- 
2.7.4


[PATCH 1/8] staging/lustre/llite: move root_squash from sysfs to debugfs

2016-12-07 Thread Oleg Drokin

The root_squash control was accidentally moved to sysfs instead of
debugfs, and the write side of it was also broken because it expected a
userspace buffer.
It contains both uid and gid values in a single file, so debugfs
is clearly the place for it.

Reported-by: Al Viro 
Fixes: c948390f10ccc "fix inconsistencies of root squash feature"
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/lproc_llite.c | 27 +--
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/lproc_llite.c 
b/drivers/staging/lustre/lustre/llite/lproc_llite.c
index 03682c1..f3ee584 100644
--- a/drivers/staging/lustre/lustre/llite/lproc_llite.c
+++ b/drivers/staging/lustre/lustre/llite/lproc_llite.c
@@ -924,27 +924,29 @@ static ssize_t ll_unstable_stats_seq_write(struct file 
*file,
 }
 LPROC_SEQ_FOPS(ll_unstable_stats);
 
-static ssize_t root_squash_show(struct kobject *kobj, struct attribute *attr,
-   char *buf)
+static int ll_root_squash_seq_show(struct seq_file *m, void *v)
 {
-   struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info,
- ll_kobj);
+   struct super_block *sb = m->private;
+   struct ll_sb_info *sbi = ll_s2sbi(sb);
struct root_squash_info *squash = &sbi->ll_squash;
 
-   return sprintf(buf, "%u:%u\n", squash->rsi_uid, squash->rsi_gid);
+   seq_printf(m, "%u:%u\n", squash->rsi_uid, squash->rsi_gid);
+   return 0;
 }
 
-static ssize_t root_squash_store(struct kobject *kobj, struct attribute *attr,
-const char *buffer, size_t count)
+static ssize_t ll_root_squash_seq_write(struct file *file,
+   const char __user *buffer,
+   size_t count, loff_t *off)
 {
-   struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info,
- ll_kobj);
+   struct seq_file *m = file->private_data;
+   struct super_block *sb = m->private;
+   struct ll_sb_info *sbi = ll_s2sbi(sb);
struct root_squash_info *squash = &sbi->ll_squash;
 
return lprocfs_wr_root_squash(buffer, count, squash,
- ll_get_fsname(sbi->ll_sb, NULL, 0));
+ ll_get_fsname(sb, NULL, 0));
 }
-LUSTRE_RW_ATTR(root_squash);
+LPROC_SEQ_FOPS(ll_root_squash);
 
 static int ll_nosquash_nids_seq_show(struct seq_file *m, void *v)
 {
@@ -997,6 +999,8 @@ static struct lprocfs_vars lprocfs_llite_obd_vars[] = {
{ "statahead_stats",  &ll_statahead_stats_fops, NULL, 0 },
{ "unstable_stats",   &ll_unstable_stats_fops, NULL },
{ "sbi_flags",&ll_sbi_flags_fops, NULL, 0 },
+   { .name =   "root_squash",
+ .fops =   &ll_root_squash_fops},
{ .name =   "nosquash_nids",
  .fops =   &ll_nosquash_nids_fops  },
{ NULL }
@@ -1027,7 +1031,6 @@ static struct attribute *llite_attrs[] = {
&lustre_attr_max_easize.attr,
&lustre_attr_default_easize.attr,
&lustre_attr_xattr_cache.attr,
-   &lustre_attr_root_squash.attr,
NULL,
 };
 
-- 
2.7.4


[PATCH 3/8] staging/lustre/llite: mark ll_io_init() static

2016-12-07 Thread Oleg Drokin

It's not used anywhere outside of this file.
Highlighted by sparse.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c 
b/drivers/staging/lustre/lustre/llite/file.c
index f634c11..d93f06a 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1016,7 +1016,7 @@ static bool file_is_noatime(const struct file *file)
return false;
 }
 
-void ll_io_init(struct cl_io *io, const struct file *file, int write)
+static void ll_io_init(struct cl_io *io, const struct file *file, int write)
 {
struct inode *inode = file_inode(file);
 
-- 
2.7.4


[PATCH 2/8] staging/lustre/ldlm: Correct itree_overlap_cb return type

2016-12-07 Thread Oleg Drokin

As per the interval_search() prototype, the callback should return
an enum, not int.
This fixes the corresponding sparse warning.
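
For illustration, a minimal sketch of an iterator whose callback type returns an enum. The names (demo_iter, demo_search, demo_stop_at_match) are hypothetical and only mimic the general shape of such an interface; they are not the Lustre interval tree API.

```c
#include <stddef.h>

/* Hypothetical iteration verdict, similar in spirit to INTERVAL_ITER_*. */
enum demo_iter {
	DEMO_ITER_CONT = 1,
	DEMO_ITER_STOP = 2,
};

/* The callback type returns the enum, so callbacks must match it. */
typedef enum demo_iter (*demo_iter_cb_t)(int value, void *args);

/* Walk the array until the callback asks to stop. */
static enum demo_iter demo_search(const int *vals, size_t n,
				  demo_iter_cb_t cb, void *args)
{
	enum demo_iter rc = DEMO_ITER_CONT;
	size_t i;

	for (i = 0; i < n && rc == DEMO_ITER_CONT; i++)
		rc = cb(vals[i], args);

	return rc;
}

/* Callback declared with the enum return type the prototype expects. */
static enum demo_iter demo_stop_at_match(int value, void *args)
{
	const int *wanted = args;

	return value == *wanted ? DEMO_ITER_STOP : DEMO_ITER_CONT;
}
```

A callback declared as returning unsigned int would usually still work at runtime, but its type no longer matches the callback pointer type, which is the kind of mismatch a checker such as sparse reports.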

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c 
b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
index a4a291a..f4cbc89 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
@@ -1148,7 +1148,7 @@ static int lock_matches(struct ldlm_lock *lock, struct 
lock_match_data *data)
return INTERVAL_ITER_STOP;
 }
 
-static unsigned int itree_overlap_cb(struct interval_node *in, void *args)
+static enum interval_iter itree_overlap_cb(struct interval_node *in, void 
*args)
 {
struct ldlm_interval *node = to_ldlm_interval(in);
struct lock_match_data *data = args;
-- 
2.7.4


[PATCH 4/8] staging/lustre/lov: make lov_lsm_alloc() static

2016-12-07 Thread Oleg Drokin

It's not used anywhere outside of this file.
Highlighted by sparse.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/lov/lov_pack.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c 
b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 6c93d18..68fa2de 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -198,7 +198,8 @@ static int lov_verify_lmm(void *lmm, int lmm_bytes, __u16 
*stripe_count)
return rc;
 }
 
-struct lov_stripe_md *lov_lsm_alloc(u16 stripe_count, u32 pattern, u32 magic)
+static struct lov_stripe_md *lov_lsm_alloc(u16 stripe_count, u32 pattern,
+  u32 magic)
 {
struct lov_stripe_md *lsm;
unsigned int i;
-- 
2.7.4


Re: [lustre-devel] [PATCH] staging/lustre/osc: Revert erroneous list_for_each_entry_safe use

2016-12-07 Thread Oleg Drokin


On Dec 7, 2016, at 3:37 PM, Greg Kroah-Hartman wrote:

> On Wed, Dec 07, 2016 at 11:29:36AM -0500, Oleg Drokin wrote:
>> 
>> On Dec 7, 2016, at 5:40 AM, Greg Kroah-Hartman wrote:
>> 
>>> On Tue, Dec 06, 2016 at 10:53:48PM -0500, Oleg Drokin wrote:
>>>> I have been having a lot of unexplainable crashes in osc_lru_shrink
>>>> lately that I could not see a good explanation for and then I found
>>>> this patch that slip under the radar somehow that incorrectly
>>>> converted while loop for lru list iteration into
>>>> list_for_each_entry_safe totally ignoring that in the body of
>>>> the loop we drop spinlocks guarding this list and move list entries
>>>> around.
>>>> Not sure why it was not showing up right away, perhaps some of the
>>>> more recent LRU changes committed caused some extra pressure on this
>>>> code that finally highlighted the breakage.
>>>> 
>>>> Reverts: 8adddc36b1fc ("staging: lustre: osc: Use list_for_each_entry_safe")
>>>> CC: Bhaktipriya Shridhar 
>>>> Signed-off-by: Oleg Drokin 
>>>> ---
>>>> I also do not see this patch in any of the mailing lists I am subscribed to.
>>>> I wonder if there's a way to subscribe to those Greg's
>>>> "This is a note to let you know that I've just added the patch "
>>>> emails that concern Lustre to get them even if I am not on the CC list in
>>>> the patch itself?
>>> 
>>> This came in from the Outreachy application process, which now requires
>>> that they cc: the maintainers to catch this type of issue.  So you
>>> should have seen these types of patches this last round, the commit you
>>> reference was done before that change happened, sorry.
>> 
>> Do you know the approximate date range of when these patches were sneaking in?
> 
> Anytime before a few months ago.

Ugh, I see.

>> I'd like to take a look at the rest of it proactively just to see if there
>> are more undiscovered surprises?
> 
> If your testing isn't finding any problems, all should be good, right?
> :)

I see processes hanging while waiting for an RPC response (rarely), which is
very suspicious, but I have not gotten to the root of it yet.
Also my test system is limited in capacity; they don't let me anywhere near
those TOP100 systems with the staging client ;)



Re: [lustre-devel] [PATCH] staging: lustre: Fix a spatch warning due to an assignment from kernel to user space

2016-12-07 Thread Oleg Drokin


On Dec 7, 2016, at 10:20 AM, Quentin Lambert wrote:

> Hi all,
> 
> I am looking at the drivers/staging/lustre/lustre/llite/dir.c:
> 
> 1469 /* Call mdc_iocontrol */
> 1470 rc = obd_iocontrol(LL_IOC_FID2MDTIDX, exp, sizeof(fid), 
> &fid,
> 1471&index);
> 1472 if (rc)
> 
> and sparse says:
> 
> drivers/staging/lustre/lustre/llite/dir.c:1471:37: warning: incorrect type in 
> argument 5 (different address spaces)
> 
> I was wondering if there was any value to add a cast to fix the warning?

There's a sister warning to this one, btw, in
drivers/staging/lustre/lustre/lmv/lmv_obd.c:996:19: warning: cast removes address space of expression

It is an ugly kludge and I guess it needs to be reworked somehow to avoid
these ugly games.


Re: [lustre-devel] [PATCH] staging: lustre: Fix a spatch warning due to an assignment from kernel to user space

2016-12-07 Thread Oleg Drokin

On Dec 7, 2016, at 10:33 AM, Dan Carpenter wrote:

> Lustre is kind of a mess with regards to keeping user and kernel
> pointers separate.  It's not going to be easy to fix.

Actually I believe I made significant inroads in properly cleaning (almost?)
everything in this area about a year ago (to the point that only false
positives were left).
I guess some more stuff crept in; I'll just make another run through and see
what else I can improve.

Re: [PATCH] staging/lustre/osc: Revert erroneous list_for_each_entry_safe use

2016-12-07 Thread Oleg Drokin


On Dec 7, 2016, at 5:40 AM, Greg Kroah-Hartman wrote:

> On Tue, Dec 06, 2016 at 10:53:48PM -0500, Oleg Drokin wrote:
>> I have been having a lot of unexplainable crashes in osc_lru_shrink
>> lately that I could not see a good explanation for and then I found
>> this patch that slip under the radar somehow that incorrectly
>> converted while loop for lru list iteration into
>> list_for_each_entry_safe totally ignoring that in the body of
>> the loop we drop spinlocks guarding this list and move list entries
>> around.
>> Not sure why it was not showing up right away, perhaps some of the
>> more recent LRU changes committed caused some extra pressure on this
>> code that finally highlighted the breakage.
>> 
>> Reverts: 8adddc36b1fc ("staging: lustre: osc: Use list_for_each_entry_safe")
>> CC: Bhaktipriya Shridhar 
>> Signed-off-by: Oleg Drokin 
>> ---
>> I also do not see this patch in any of the mailing lists I am subscribed to.
>> I wonder if there's a way to subscribe to those Greg's
>> "This is a note to let you know that I've just added the patch "
>> emails that concern Lustre to get them even if I am not on the CC list in
>> the patch itself?
> 
> This came in from the Outreachy application process, which now requires
> that they cc: the maintainers to catch this type of issue.  So you
> should have seen these types of patches this last round, the commit you
> reference was done before that change happened, sorry.

Do you know the approximate date range of when these patches were sneaking in?
I'd like to take a look at the rest of them proactively, just to see if there
are more undiscovered surprises.

> This change should go to stable kernels, so I'll mark it that way.

Thanks!
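
For readers following the revert discussed above, here is a minimal userspace sketch of the safer loop shape. It assumes a tiny hand-rolled circular list (the demo_* names are hypothetical, and no real locking is done); it is not the osc_lru_shrink code. The idea is that a loop which drops the list lock in its body should re-read the first entry from the list head on every pass, rather than rely on a cached "next" pointer the way list_for_each_entry_safe does, because that cached entry may be moved or freed while the lock is not held.

```c
#include <stddef.h>

/* Hypothetical circular doubly linked list, in the style of list_head. */
struct demo_node {
	struct demo_node *prev, *next;
	int busy;	/* nonzero: cannot be reclaimed right now */
};

static void demo_list_init(struct demo_node *head)
{
	head->prev = head->next = head;
}

static int demo_list_empty(const struct demo_node *head)
{
	return head->next == head;
}

static void demo_list_del(struct demo_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

static void demo_list_add_tail(struct demo_node *n, struct demo_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Reclaim up to 'target' idle entries; busy ones rotate to the tail. */
static int demo_lru_shrink(struct demo_node *head, int target)
{
	struct demo_node *n;
	int maxscan = 0;
	int freed = 0;

	/* Bound the scan so rotating busy entries cannot loop forever. */
	for (n = head->next; n != head; n = n->next)
		maxscan++;

	while (!demo_list_empty(head) && freed < target && maxscan-- > 0) {
		/* Always start from the head: no cached next pointer. */
		n = head->next;
		demo_list_del(n);

		/* A real implementation might drop its lock here. */

		if (n->busy) {
			demo_list_add_tail(n, head);
			continue;
		}
		freed++;
	}

	return freed;
}
```

With entries a (idle), b (busy) and c (idle) queued in that order, a call with a large target reclaims a and c and leaves b on the list.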

[PATCH] staging/lustre/lnetselftest: Fix potential integer overflow

2016-12-06 Thread Oleg Drokin

It looks like if the passed-in parameter is not present but the
parameter length is nonzero, then the sanity checks on the length
are skipped and lstcon_test_add() might then compute an allocation
size that is prone to integer overflow.

This patch ensures that parameter len is zero if parameter is
not present.
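
A minimal sketch of the validation order, under assumed names (demo_test_args, demo_check_param and max_len are hypothetical stand-ins, not the selftest structures):

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical argument block: an optional blob plus its length. */
struct demo_test_args {
	const void *param;	/* may be NULL when no parameter is passed */
	int param_len;		/* caller-supplied length of that blob */
};

static int demo_check_param(const struct demo_test_args *args, int max_len)
{
	/* Range check, which only runs when a parameter is present. */
	if (args->param &&
	    (args->param_len <= 0 || args->param_len > max_len))
		return -EINVAL;

	/* The extra check: no parameter means the length must be zero. */
	if (!args->param && args->param_len)
		return -EINVAL;

	return 0;
}
```

Without the second check, a NULL parameter paired with a huge length would pass validation and the unchecked length could later feed into a size computation.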

Reported-by: Dan Carpenter 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lnet/selftest/conctl.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/staging/lustre/lnet/selftest/conctl.c 
b/drivers/staging/lustre/lnet/selftest/conctl.c
index 02847bf..9438302 100644
--- a/drivers/staging/lustre/lnet/selftest/conctl.c
+++ b/drivers/staging/lustre/lnet/selftest/conctl.c
@@ -742,6 +742,10 @@ static int lst_test_add_ioctl(lstio_test_args_t *args)
 PAGE_SIZE - sizeof(struct lstcon_test)))
return -EINVAL;
 
+   /* Enforce zero parameter length if there's no parameter */
+   if (!args->lstio_tes_param && args->lstio_tes_param_len)
+   return -EINVAL;
+
LIBCFS_ALLOC(batch_name, args->lstio_tes_bat_nmlen + 1);
if (!batch_name)
return rc;
-- 
2.7.4


[PATCH 3/5] staging/lustre: Convert all bare unsigned to unsigned int

2016-12-06 Thread Oleg Drokin

Highlighted by relatively new checkpatch test, warnings like:
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/include/linux/lnet/lnetst.h |  6 +-
 .../staging/lustre/lustre/include/lprocfs_status.h |  3 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |  8 +-
 drivers/staging/lustre/lustre/llite/llite_nfs.c|  2 +-
 drivers/staging/lustre/lustre/llite/rw26.c |  4 +-
 drivers/staging/lustre/lustre/llite/xattr_cache.c  |  6 +-
 drivers/staging/lustre/lustre/lov/lov_pool.c   |  3 +-
 .../lustre/lustre/obdclass/lprocfs_status.c|  3 +-
 drivers/staging/lustre/lustre/obdclass/lu_object.c |  6 +-
 .../staging/lustre/lustre/obdclass/obd_config.c|  4 +-
 drivers/staging/lustre/lustre/osc/osc_lock.c   |  2 +-
 drivers/staging/lustre/lustre/osc/osc_quota.c  |  4 +-
 drivers/staging/lustre/lustre/osc/osc_request.c|  6 +-
 drivers/staging/lustre/lustre/ptlrpc/connection.c  |  4 +-
 .../staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c|  4 +-
 drivers/staging/lustre/lustre/ptlrpc/service.c |  6 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c| 92 +++---
 17 files changed, 83 insertions(+), 80 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/lnetst.h 
b/drivers/staging/lustre/include/linux/lnet/lnetst.h
index 78f825d..8a84888 100644
--- a/drivers/staging/lustre/include/linux/lnet/lnetst.h
+++ b/drivers/staging/lustre/include/linux/lnet/lnetst.h
@@ -244,7 +244,7 @@ typedef struct {
int  lstio_ses_timeout; /* IN: session timeout */
int  lstio_ses_force;   /* IN: force create ? */
/** IN: session features */
-   unsigned lstio_ses_feats;
+   unsigned int lstio_ses_feats;
lst_sid_t __user *lstio_ses_idp;/* OUT: session id */
int  lstio_ses_nmlen;   /* IN: name length */
char __user  *lstio_ses_namep;  /* IN: session name */
@@ -255,7 +255,7 @@ typedef struct {
lst_sid_t __user*lstio_ses_idp; /* OUT: session id */
int __user  *lstio_ses_keyp;/* OUT: local key */
/** OUT: session features */
-   unsigned __user *lstio_ses_featp;
+   unsigned int __user *lstio_ses_featp;
lstcon_ndlist_ent_t __user *lstio_ses_ndinfo;   /* OUT: */
int  lstio_ses_nmlen;   /* IN: name length */
char __user *lstio_ses_namep;   /* OUT: session name */
@@ -328,7 +328,7 @@ typedef struct {
char __user *lstio_grp_namep;   /* IN: group name */
int  lstio_grp_count;   /* IN: # of nodes */
/** OUT: session features */
-   unsigned __user *lstio_grp_featp;
+   unsigned int __user *lstio_grp_featp;
lnet_process_id_t __user *lstio_grp_idsp;   /* IN: nodes */
struct list_head __user *lstio_grp_resultp; /* OUT: list head of
result buffer */
diff --git a/drivers/staging/lustre/lustre/include/lprocfs_status.h 
b/drivers/staging/lustre/lustre/include/lprocfs_status.h
index adef2d2..62753da 100644
--- a/drivers/staging/lustre/lustre/include/lprocfs_status.h
+++ b/drivers/staging/lustre/lustre/include/lprocfs_status.h
@@ -542,7 +542,8 @@ lprocfs_alloc_stats(unsigned int num, enum 
lprocfs_stats_flags flags);
 void lprocfs_clear_stats(struct lprocfs_stats *stats);
 void lprocfs_free_stats(struct lprocfs_stats **stats);
 void lprocfs_counter_init(struct lprocfs_stats *stats, int index,
- unsigned conf, const char *name, const char *units);
+ unsigned int conf, const char *name,
+ const char *units);
 struct obd_export;
 int lprocfs_exp_cleanup(struct obd_export *exp);
 struct dentry *ldebugfs_add_simple(struct dentry *root,
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c 
b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
index 1095331..b22f5ba 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
@@ -445,8 +445,8 @@ static struct ldlm_resource *ldlm_resource_getref(struct 
ldlm_resource *res)
return res;
 }
 
-static unsigned ldlm_res_hop_hash(struct cfs_hash *hs,
- const void *key, unsigned mask)
+static unsigned int ldlm_res_hop_hash(struct cfs_hash *hs,
+ const void *key, unsigned int mask)
 {
const struct ldlm_res_id *id  = key;
unsigned intval = 0;
@@ -457,8 +457,8 @@ static unsigned ldlm_res_hop_hash(struct cfs_hash *hs,
return val & mask;
 }
 
-static unsigned ldlm_res_hop_fid_hash(struct cfs_hash *hs,
- const void *key, unsigned ma

[PATCH 0/5] Lustre style fixes

2016-12-06 Thread Oleg Drokin

These patches fix some more of the low-hanging fruit
among the style problems highlighted by checkpatch.
Now only false positive ERRORs are left.
The series also converts all bare unsigneds into unsigned ints
and includes a couple of spelling fixes.

Please consider.

Oleg Drokin (5):
  staging/lustre/o2iblnd: Add missing space
  staging/lustre/socklnd: Fix whitespace problem
  staging/lustre: Convert all bare unsigned to unsigned int
  staging/lustre/o2iblnd: Fix misspelling intialized->initialized
  staging/lustre/o2iblnd: Fix misspelled attemps->attempts

 drivers/staging/lustre/include/linux/lnet/lnetst.h |  6 +-
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c|  4 +-
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c |  4 +-
 .../staging/lustre/lnet/klnds/socklnd/socklnd.h|  2 +-
 .../staging/lustre/lustre/include/lprocfs_status.h |  3 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |  8 +-
 drivers/staging/lustre/lustre/llite/llite_nfs.c|  2 +-
 drivers/staging/lustre/lustre/llite/rw26.c |  4 +-
 drivers/staging/lustre/lustre/llite/xattr_cache.c  |  6 +-
 drivers/staging/lustre/lustre/lov/lov_pool.c   |  3 +-
 .../lustre/lustre/obdclass/lprocfs_status.c|  3 +-
 drivers/staging/lustre/lustre/obdclass/lu_object.c |  6 +-
 .../staging/lustre/lustre/obdclass/obd_config.c|  4 +-
 drivers/staging/lustre/lustre/osc/osc_lock.c   |  2 +-
 drivers/staging/lustre/lustre/osc/osc_quota.c  |  4 +-
 drivers/staging/lustre/lustre/osc/osc_request.c|  6 +-
 drivers/staging/lustre/lustre/ptlrpc/connection.c  |  4 +-
 .../staging/lustre/lustre/ptlrpc/lproc_ptlrpc.c|  4 +-
 drivers/staging/lustre/lustre/ptlrpc/service.c |  6 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c| 92 +++---
 20 files changed, 88 insertions(+), 85 deletions(-)

-- 
2.7.4


[PATCH 5/5] staging/lustre/o2iblnd: Fix misspelled attemps->attempts

2016-12-06 Thread Oleg Drokin

Highlighted by checkpatch:
WARNING: 'attemps' may be misspelled - perhaps 'attempts'?
#20278: FILE: drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c:3272:
+ * reconnection attemps.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index bea408d..c7917ab 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -3269,7 +3269,7 @@ kiblnd_disconnect_conn(struct kib_conn *conn)
 #define KIB_RECONN_HIGH_RACE   10
 /**
  * Allow connd to take a break and handle other things after consecutive
- * reconnection attemps.
+ * reconnection attempts.
  */
 #define KIB_RECONN_BREAK   100
 
-- 
2.7.4


[PATCH 4/5] staging/lustre/o2iblnd: Fix misspelling intialized->initialized

2016-12-06 Thread Oleg Drokin

Highlighted by checkpatch:
+   if (!ps->ps_net) /* intialized? */

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
index e2fc65f..7f761b3 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
@@ -1489,7 +1489,7 @@ static int kiblnd_create_fmr_pool(struct kib_fmr_poolset 
*fps,
 static void kiblnd_fail_fmr_poolset(struct kib_fmr_poolset *fps,
struct list_head *zombies)
 {
-   if (!fps->fps_net) /* intialized? */
+   if (!fps->fps_net) /* initialized? */
return;
 
spin_lock(&fps->fps_lock);
@@ -1812,7 +1812,7 @@ static void kiblnd_destroy_pool_list(struct list_head 
*head)
 
 static void kiblnd_fail_poolset(struct kib_poolset *ps, struct list_head 
*zombies)
 {
-   if (!ps->ps_net) /* intialized? */
+   if (!ps->ps_net) /* initialized? */
return;
 
spin_lock(&ps->ps_lock);
-- 
2.7.4

[PATCH 1/5] staging/lustre/o2iblnd: Add missing space

2016-12-06 Thread Oleg Drokin

checkpatch highlighted missing space before assignment
for lock variable.

+   spinlock_t *lock= &kiblnd_data.kib_connd_lock;

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 92692a2..bea408d 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -3276,7 +3276,7 @@ kiblnd_disconnect_conn(struct kib_conn *conn)
 int
 kiblnd_connd(void *arg)
 {
-   spinlock_t *lock= &kiblnd_data.kib_connd_lock;
+   spinlock_t *lock = &kiblnd_data.kib_connd_lock;
wait_queue_t wait;
unsigned long flags;
struct kib_conn *conn;
-- 
2.7.4

[PATCH 2/5] staging/lustre/socklnd: Fix whitespace problem

2016-12-06 Thread Oleg Drokin

checkpatch highlighted there are 8 spaces that could be converted to a tab:
ERROR: code indent should use tabs where possible
+^I^I^I^I^I */$

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h 
b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h
index 2978014..842c453 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h
@@ -334,7 +334,7 @@ struct ksock_conn {
atomic_t   ksnc_conn_refcount;/* conn refcount */
atomic_t   ksnc_sock_refcount;/* sock refcount */
struct ksock_sched *ksnc_scheduler; /* who schedules this connection
-*/
+*/
__u32  ksnc_myipaddr; /* my IP */
__u32  ksnc_ipaddr;   /* peer's IP */
intksnc_port; /* peer's port */
-- 
2.7.4

[PATCH] staging/lustre/osc: Revert erroneous list_for_each_entry_safe use

2016-12-06 Thread Oleg Drokin

I have been having a lot of unexplainable crashes in osc_lru_shrink
lately that I could not see a good explanation for, and then I found
this patch that somehow slipped under the radar. It incorrectly
converted the while loop for LRU list iteration into
list_for_each_entry_safe, totally ignoring that in the body of
the loop we drop the spinlocks guarding this list and move list entries
around.
Not sure why it was not showing up right away; perhaps some of the
more recent LRU changes put some extra pressure on this
code that finally highlighted the breakage.

Reverts: 8adddc36b1fc ("staging: lustre: osc: Use list_for_each_entry_safe")
CC: Bhaktipriya Shridhar 
Signed-off-by: Oleg Drokin 
---
I also do not see this patch in any of the mailing lists I am subscribed to.
I wonder if there's a way to subscribe to Greg's
"This is a note to let you know that I've just added the patch "
emails that concern Lustre to get them even if I am not on the CC list in
the patch itself?

 drivers/staging/lustre/lustre/osc/osc_page.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_page.c 
b/drivers/staging/lustre/lustre/osc/osc_page.c
index c5129d1..e356e4a 100644
--- a/drivers/staging/lustre/lustre/osc/osc_page.c
+++ b/drivers/staging/lustre/lustre/osc/osc_page.c
@@ -537,7 +537,6 @@ long osc_lru_shrink(const struct lu_env *env, struct 
client_obd *cli,
struct cl_object *clobj = NULL;
struct cl_page **pvec;
struct osc_page *opg;
-   struct osc_page *temp;
int maxscan = 0;
long count = 0;
int index = 0;
@@ -568,7 +567,7 @@ long osc_lru_shrink(const struct lu_env *env, struct 
client_obd *cli,
if (force)
cli->cl_lru_reclaim++;
maxscan = min(target << 1, atomic_long_read(&cli->cl_lru_in_list));
-   list_for_each_entry_safe(opg, temp, &cli->cl_lru_list, ops_lru) {
+   while (!list_empty(&cli->cl_lru_list)) {
struct cl_page *page;
bool will_free = false;
 
@@ -578,6 +577,8 @@ long osc_lru_shrink(const struct lu_env *env, struct 
client_obd *cli,
if (--maxscan < 0)
break;
 
+   opg = list_entry(cli->cl_lru_list.next, struct osc_page,
+ops_lru);
page = opg->ops_cl.cpl_page;
if (lru_page_busy(cli, page)) {
list_move_tail(&opg->ops_lru, &cli->cl_lru_list);
-- 
2.7.4
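
To illustrate why the restored loop shape tolerates list changes, here is a
minimal user-space sketch. It is not the osc code: struct lru_item, the helper
names and the busy flag are hypothetical stand-ins, and no real locking is
shown. The point is only that each pass re-reads the list head instead of
trusting a "next" pointer saved before the lock was dropped.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *h) { h->next = h->prev = h; }
static int list_is_empty(const struct list_node *h) { return h->next == h; }

static void list_unlink(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = n;
}

static void list_append(struct list_node *n, struct list_node *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Hypothetical stand-in for struct osc_page. */
struct lru_item {
	struct list_node link;	/* kept first so the cast below is valid */
	int busy;
};

/*
 * Scan at most maxscan entries. Every iteration takes whatever is at the
 * head right now, so nothing is cached across the point where real code
 * would drop and retake the list lock.
 */
static long lru_shrink(struct list_node *lru, int maxscan)
{
	long freed = 0;

	while (!list_is_empty(lru)) {
		struct lru_item *item;

		if (--maxscan < 0)
			break;

		item = (struct lru_item *)lru->next;
		list_unlink(&item->link);
		if (item->busy) {
			/* Busy entries go to the tail and are retried later. */
			list_append(&item->link, lru);
			continue;
		}
		/* Real code may drop the lock here; the list can change freely. */
		freed++;
	}
	return freed;
}
```

With list_for_each_entry_safe the saved "next" entry could be moved or freed
by someone else while the lock is dropped, which the loop above avoids by
construction.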

Re: [lustre-devel] [bug report] staging: add Lustre file system client support

2016-12-06 Thread Oleg Drokin


On Dec 6, 2016, at 1:37 PM, Dan Carpenter wrote:

> On Tue, Dec 06, 2016 at 10:44:54AM -0500, Oleg Drokin wrote:
>> I see, indeed, it all makes sense now.
>> So basically if we unconditionally check for the size to be > 0, we should be
>> fine then, I imagine.
> >> On the other hand there's probably no use for no param and nonzero param len,
>> so it's probably even better to enforce size as zero when no param.
> 
> Checking for > 0 is not enough, because it could also have an integer
> overflow on 32 bit systems.  We need to cap the upper bound as well.

How would it play out, though?
offsetof(struct lstcon_test, tes_param[large_positive_int]) would result in
some real "large" negative number.
So trying to allocate this many negative bytes would fail, right?
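
A small user-space sketch of the arithmetic in question, modelling a 32-bit
size_t with uint32_t. HDR_SIZE is a hypothetical stand-in for
offsetof(struct lstcon_test, tes_param[0]); the real value is not given in
this thread. It suggests that a huge positive length yields a huge request
(which would presumably fail), while a small negative length wraps to a size
below the fixed header.

```c
#include <stdint.h>

/* Hypothetical header size; the real offsetof() value is an assumption. */
#define HDR_SIZE 64u

/* Allocation size as 32-bit unsigned arithmetic would compute it. */
static uint32_t alloc_size32(int32_t paramlen)
{
	return HDR_SIZE + (uint32_t)paramlen;
}

/* Returns 1 when the computed size cannot even hold the fixed header. */
static int undersized(int32_t paramlen)
{
	return alloc_size32(paramlen) < HDR_SIZE;
}
```

Under these assumptions a length of -8 gives a 56-byte request, so an
allocation could succeed and still be too small for the fields written
afterwards.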


Re: [lustre-devel] [bug report] staging: add Lustre file system client support

2016-12-06 Thread Oleg Drokin


On Dec 6, 2016, at 6:02 AM, Dan Carpenter wrote:

> On Mon, Dec 05, 2016 at 06:43:37PM -0500, Oleg Drokin wrote:
>> 
>> On Nov 23, 2016, at 7:29 AM, Dan Carpenter wrote:
>> 
>>> Hi Lustre Devs,
>>> 
>>> The patch d7e09d0397e8: "staging: add Lustre file system client
>>> support" from May 2, 2013, leads to the following static checker
>>> warning:
>>> 
>>> drivers/staging/lustre/lnet/selftest/console.c:1336 lstcon_test_add()
>>> error: 'paramlen' from user is not capped properly
>>> 
>>> The story here, is that "paramlen" is capped but only if "param" is
>>> non-NULL.  This causes a problem.
>>> 
>>> drivers/staging/lustre/lnet/selftest/console.c
>>> 1311  
>>> 1312  LIBCFS_ALLOC(test, offsetof(struct lstcon_test, 
>>> tes_param[paramlen]));
>>> 
>>> We don't know that paramlen is non-NULL here.  Because of integer
>>> overflows we could end up allocating less than intended.
>> 
>> I think this must be a false positive in this case?
>> 
>> Before calling this function we do:
>>LIBCFS_ALLOC(param, args->lstio_tes_param_len);
>> 
>> in lst_test_add_ioctl(), so it's not any bigger than 128k (or kmalloc will 
>> fail).
>> Even if kmalloc would allow more than 128k allocations,
>> offsetof(struct lstcon_test, tes_param[0]) is bound to be a lot smaller than
>> the baseline allocation address for kmalloc, and therefore integer overflow
>> cannot happen at all.
>> 
> 
> I explained badly, and I typed the wrong variable names by mistake...
> Here is the relevant code from the caller:
> 
> drivers/staging/lustre/lnet/selftest/conctl.c
>   710  static int lst_test_add_ioctl(lstio_test_args_t *args)
>   711  {
>   712  char *batch_name;
>   713  char *src_name = NULL;
>   714  char *dst_name = NULL;
>   715  void *param = NULL;
>   716  int ret = 0;
>   717  int rc = -ENOMEM;
>   718  
>   719  if (!args->lstio_tes_resultp ||
>   720  !args->lstio_tes_retp ||
>   721  !args->lstio_tes_bat_name ||/* no specified batch 
> */
>   722  args->lstio_tes_bat_nmlen <= 0 ||
>   723  args->lstio_tes_bat_nmlen > LST_NAME_SIZE ||
>   724  !args->lstio_tes_sgrp_name ||   /* no source group */
>   725  args->lstio_tes_sgrp_nmlen <= 0 ||
>   726  args->lstio_tes_sgrp_nmlen > LST_NAME_SIZE ||
>   727  !args->lstio_tes_dgrp_name ||   /* no target group */
>   728  args->lstio_tes_dgrp_nmlen <= 0 ||
>   729  args->lstio_tes_dgrp_nmlen > LST_NAME_SIZE)
>   730  return -EINVAL;
>   731  
>   732  if (!args->lstio_tes_loop ||/* negative is 
> infinite */
>   733  args->lstio_tes_concur <= 0 ||
>   734  args->lstio_tes_dist <= 0 ||
>   735  args->lstio_tes_span <= 0)
>   736  return -EINVAL;
>   737  
>   738  /* have parameter, check if parameter length is valid */
>   739  if (args->lstio_tes_param &&
>   740  (args->lstio_tes_param_len <= 0 ||
>   741   args->lstio_tes_param_len >
>   742   PAGE_SIZE - sizeof(struct lstcon_test)))
>   743  return -EINVAL;
> 
> If we don't have a parameter then we don't check ->lstio_tes_param_len.

I see, indeed, it all makes sense now.
So basically if we unconditionally check for the size to be > 0, we should be
fine then, I imagine.
On the other hand there's probably no use for no param and nonzero param len,
so it's probably even better to enforce size as zero when no param.
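
A minimal sketch of that stricter check, in plain user-space C. The function
name check_param and the max_len argument are hypothetical; in the quoted
code the upper bound is PAGE_SIZE - sizeof(struct lstcon_test).

```c
#include <errno.h>
#include <stddef.h>

/*
 * Accept either (no param, zero length) or (param, 0 < length <= max_len).
 * Everything else is rejected, so the length is bounded on every path.
 */
static int check_param(const void *param, int param_len, int max_len)
{
	if (!param)
		return param_len == 0 ? 0 : -EINVAL;

	if (param_len <= 0 || param_len > max_len)
		return -EINVAL;

	return 0;
}
```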

Thank you.

Re: [lustre-devel] [bug report] staging: add Lustre file system client support

2016-12-05 Thread Oleg Drokin


On Nov 23, 2016, at 7:29 AM, Dan Carpenter wrote:

> Hi Lustre Devs,
> 
> The patch d7e09d0397e8: "staging: add Lustre file system client
> support" from May 2, 2013, leads to the following static checker
> warning:
> 
>   drivers/staging/lustre/lnet/selftest/console.c:1336 lstcon_test_add()
>   error: 'paramlen' from user is not capped properly
> 
> The story here, is that "paramlen" is capped but only if "param" is
> non-NULL.  This causes a problem.
> 
> drivers/staging/lustre/lnet/selftest/console.c
>  1311  
>  1312  LIBCFS_ALLOC(test, offsetof(struct lstcon_test, 
> tes_param[paramlen]));
> 
> We don't know that paramlen is non-NULL here.  Because of integer
> overflows we could end up allocating less than intended.

I think this must be a false positive in this case?

Before calling this function we do:
LIBCFS_ALLOC(param, args->lstio_tes_param_len);

in lst_test_add_ioctl(), so it's not any bigger than 128k (or kmalloc will 
fail).
Even if kmalloc would allow more than 128k allocations,
offsetof(struct lstcon_test, tes_param[0]) is bound to be a lot smaller than
the baseline allocation address for kmalloc, and therefore integer overflow
cannot happen at all.

> 
>  1313  if (!test) {
>  1314  CERROR("Can't allocate test descriptor\n");
>  1315  rc = -ENOMEM;
>  1316  
>  1317  goto out;
>  1318  }
>  1319  
>  1320  test->tes_hdr.tsb_id = batch->bat_hdr.tsb_id;
> 
> Which will lead to memory corruption when we use "test".
> 
>  1321  test->tes_batch = batch;
>  1322  test->tes_type = type;
>  1323  test->tes_oneside = 0; /* TODO */
>  1324  test->tes_loop = loop;
>  1325  test->tes_concur = concur;
>  1326  test->tes_stop_onerr = 1; /* TODO */
>  1327  test->tes_span = span;
>  1328  test->tes_dist = dist;
>  1329  test->tes_cliidx = 0; /* just used for creating RPC */
>  1330  test->tes_src_grp = src_grp;
>  1331  test->tes_dst_grp = dst_grp;
>  1332  INIT_LIST_HEAD(&test->tes_trans_list);
>  1333  
>  1334  if (param) {
> 
> Smatch is not smart enough to trace the implication that "'param' is
> non-NULL, means that 'paramlen' has been verified" across a function
> boundary.  Storing that sort of information would really increase the
> hardware requirements for running Smatch so it's not something I have
> planned currently.
> 
>  1335  test->tes_paramlen = paramlen;
>  1336  memcpy(&test->tes_param[0], param, paramlen);
>  1337  }
>  1338  
> 
> regards,
> dan carpenter
> ___
> lustre-devel mailing list
> lustre-de...@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

Re: [PATCH 04/22] staging: lustre: osc: handle osc eviction correctly

2016-12-05 Thread Oleg Drokin

On Dec 5, 2016, at 3:55 PM, Dan Carpenter wrote:

> On Fri, Dec 02, 2016 at 07:53:11PM -0500, James Simmons wrote:
>> @@ -3183,8 +3182,10 @@ static int discard_cb(const struct lu_env *env, 
>> struct cl_io *io,
>>  /* page is top page. */
>>  info->oti_next_index = osc_index(ops) + 1;
>>  if (cl_page_own(env, io, page) == 0) {
>> -KLASSERT(ergo(page->cp_type == CPT_CACHEABLE,
>> -  !PageDirty(cl_page_vmpage(page))));
>> +if (!ergo(page->cp_type == CPT_CACHEABLE,
>> +  !PageDirty(cl_page_vmpage(page))))
>> +CL_PAGE_DEBUG(D_ERROR, env, page,
>> +  "discard dirty page?\n");
> 
> 
> I don't understand the point of the ergo macro.  There are way too many
> double negatives (some of them hidden for my small brain).  How is that
> simpler than just writing it out:
> 
>   if (page->cp_type == CPT_CACHEABLE &&
>   PageDirty(cl_page_vmpage(page)))
>CL_PAGE_DEBUG(D_ERROR, env, page, "discard dirty page?\n");

I guess it makes it sound chic or something?
I am not a huge fan of it either, esp. in a case like this, though
it might be somewhat more convenient in assertions (where this is converted 
from).
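
For reference, a small sketch of the equivalence being discussed. It assumes
ergo(a, b) is plain logical implication, (!(a) || (b)); that definition is
not quoted in this thread, so treat it as an assumption. The function names
are hypothetical.

```c
/* Assumed definition: "a implies b". */
#define ergo(a, b) (!(a) || (b))

/* The condition as written in the patch: !ergo(cacheable, !dirty). */
static int warn_with_ergo(int cacheable, int dirty)
{
	return !ergo(cacheable, !dirty);
}

/* The condition written out directly. */
static int warn_plain(int cacheable, int dirty)
{
	return cacheable && dirty;
}

/* Compare both forms over all four input combinations. */
static int forms_agree(void)
{
	int cacheable, dirty;

	for (cacheable = 0; cacheable <= 1; cacheable++)
		for (dirty = 0; dirty <= 1; dirty++)
			if (warn_with_ergo(cacheable, dirty) !=
			    warn_plain(cacheable, dirty))
				return 0;
	return 1;
}
```

Under that assumption the two forms agree for every input, so the written-out
version loses nothing.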

Re: [lustre-devel] [PATCH] staging: lustre: Fix a spatch warning due to an assignment from kernel to user space

2016-12-05 Thread Oleg Drokin


On Dec 2, 2016, at 12:33 PM, Quentin Lambert wrote:

> lnet_ipif_enumerate was assigning a pointer from kernel space to user
> space. This patch uses copy_to_user to properly do that assignment.

I guess it's a false positive?

While lnet_sock_ioctl()->kernel_sock_unlocked_ioctl() does call into the
f_op->unlocked_ioctl() with a userspace argument, note that we have
set_fs(KERNEL_DS); in there, therefore allowing copy_from_user
and friends to work on kernel data too as if it was userspace.
(I know it's ugly and we need to find a better way of getting this data,
but at least it's not incorrect).

> 
> Signed-off-by: Quentin Lambert 
> ---
> shouldn't we be using ifc_req instead of ifc_buf?
> 
> drivers/staging/lustre/lnet/lnet/lib-socket.c |8 +++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
> 
> --- a/drivers/staging/lustre/lnet/lnet/lib-socket.c
> +++ b/drivers/staging/lustre/lnet/lnet/lib-socket.c
> @@ -181,7 +181,13 @@ lnet_ipif_enumerate(char ***namesp)
>   goto out0;
>   }
> 
> - ifc.ifc_buf = (char *)ifr;
> + rc = copy_to_user(ifc.ifc_buf, (char *)ifr,
> +   nalloc * sizeof(*ifr));
> + if (rc) {
> + rc = -ENOMEM;
> + goto out1;
> + }
> +
>   ifc.ifc_len = nalloc * sizeof(*ifr);
> 
>   rc = lnet_sock_ioctl(SIOCGIFCONF, (unsigned long)&ifc);
> ___
> lustre-devel mailing list
> lustre-de...@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org

Re: [PATCH] staging: lustre: Fix function declaration/definition mismatch

2016-12-05 Thread Oleg Drokin


On Dec 4, 2016, at 10:06 PM,  
 wrote:

> From: Sandeep Jain 
> 
> Fixes following Sparse errors.
> lprocfs_status.c:1568:5: error: symbol 'lprocfs_wr_root_squash'
> redeclared with different type...
> lprocfs_status.c:1632:5: error: symbol 'lprocfs_wr_nosquash_nids'
> redeclared with different type...
> 
> Signed-off-by: Sandeep Jain 

Acked-by: Oleg Drokin 

> ---
> drivers/staging/lustre/lustre/include/lprocfs_status.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/include/lprocfs_status.h 
> b/drivers/staging/lustre/lustre/include/lprocfs_status.h
> index cc0713e..b5c24ca 100644
> --- a/drivers/staging/lustre/lustre/include/lprocfs_status.h
> +++ b/drivers/staging/lustre/lustre/include/lprocfs_status.h
> @@ -701,9 +701,9 @@ static struct lustre_attr lustre_attr_##name = 
> __ATTR(name, mode, show, store)
> extern const struct sysfs_ops lustre_sysfs_ops;
> 
> struct root_squash_info;
> -int lprocfs_wr_root_squash(const char *buffer, unsigned long count,
> +int lprocfs_wr_root_squash(const char __user *buffer, unsigned long count,
>  struct root_squash_info *squash, char *name);
> -int lprocfs_wr_nosquash_nids(const char *buffer, unsigned long count,
> +int lprocfs_wr_nosquash_nids(const char __user *buffer, unsigned long count,
>struct root_squash_info *squash, char *name);
> 
> /* all quota proc functions */
> -- 
> 2.7.4

Re: [PATCH] staging: lustre: mgc: make llog_process_lock static

2016-12-05 Thread Oleg Drokin


On Dec 4, 2016, at 9:21 PM,  
 wrote:

> From: Sandeep Jain 
> 
> Fix following sparse warning.
> mgc_request.c:376:1:
> warning: symbol 'llog_process_lock' was not declared. Should it be static?
> 
> Signed-off-by: Sandeep Jain 

Acked-by: Oleg Drokin 

> ---
> drivers/staging/lustre/lustre/mgc/mgc_request.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c 
> b/drivers/staging/lustre/lustre/mgc/mgc_request.c
> index 23600fb..e98a2ce 100644
> --- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
> +++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
> @@ -373,7 +373,7 @@ static int config_log_add(struct obd_device *obd, char 
> *logname,
>   return rc;
> }
> 
> -DEFINE_MUTEX(llog_process_lock);
> +static DEFINE_MUTEX(llog_process_lock);
> 
> /** Stop watching for updates on this log.
>  */
> -- 
> 2.7.4

Re: [PATCH 1/1] staging: lustre: lnet: fix improper return value

2016-12-05 Thread Oleg Drokin


On Dec 3, 2016, at 7:52 AM, Pan Bian wrote:

> From: Pan Bian 
> 
> At the end of function lstcon_group_info(), "return 0" seems improper.
> It may be better to return the value of rc.
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188811
> 
> Signed-off-by: Pan Bian 

Acked-by: Oleg Drokin 

> ---
> drivers/staging/lustre/lnet/selftest/console.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/lustre/lnet/selftest/console.c 
> b/drivers/staging/lustre/lnet/selftest/console.c
> index a0fcbf3..9a7c41a 100644
> --- a/drivers/staging/lustre/lnet/selftest/console.c
> +++ b/drivers/staging/lustre/lnet/selftest/console.c
> @@ -820,7 +820,7 @@
> 
>   lstcon_group_decref(grp);
> 
> - return 0;
> + return rc;
> }
> 
> static int
> -- 
> 1.9.1
> 

[PATCH] staging/lustre: Use proper number of bytes in copy_from_user

2016-11-20 Thread Oleg Drokin

From: Jian Yu 

This patch removes the usage of MAX_STRING_SIZE from
copy_from_user() and just copies enough bytes to cover
count passed in.

Signed-off-by: Jian Yu 
Reviewed-on: http://review.whamcloud.com/23462
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8774
Reviewed-by: John L. Hammond 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/obdclass/lprocfs_status.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c 
b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
index 8a2f02f3..db49992 100644
--- a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
+++ b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
@@ -400,10 +400,17 @@ int lprocfs_wr_uint(struct file *file, const char __user 
*buffer,
char dummy[MAX_STRING_SIZE + 1], *end;
unsigned long tmp;
 
-   dummy[MAX_STRING_SIZE] = '\0';
-   if (copy_from_user(dummy, buffer, MAX_STRING_SIZE))
+   if (count >= sizeof(dummy))
+   return -EINVAL;
+
+   if (count == 0)
+   return 0;
+
+   if (copy_from_user(dummy, buffer, count))
return -EFAULT;
 
+   dummy[count] = '\0';
+
tmp = simple_strtoul(dummy, &end, 0);
if (dummy == end)
return -EINVAL;
-- 
2.7.4
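
The same bounded-copy pattern can be sketched in user space, with memcpy()
standing in for copy_from_user(). The function name parse_uint, the value 128
for MAX_STRING_SIZE, and returning count on success are assumptions made for
this illustration only.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_STRING_SIZE 128	/* assumed value, for illustration */

/*
 * Copy exactly count bytes, terminate the copy, then parse it.
 * The source buffer does not need to be NUL-terminated.
 */
static long parse_uint(const char *buffer, size_t count, unsigned long *out)
{
	char dummy[MAX_STRING_SIZE + 1], *end;

	if (count >= sizeof(dummy))
		return -EINVAL;

	if (count == 0)
		return 0;

	memcpy(dummy, buffer, count);	/* stand-in for copy_from_user() */
	dummy[count] = '\0';

	*out = strtoul(dummy, &end, 0);
	if (end == dummy)
		return -EINVAL;

	return (long)count;
}
```

Copying a fixed MAX_STRING_SIZE instead would read past the end of a shorter
caller buffer, which is what the patch avoids.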

[PATCH 2/2] staging/lustre/ptlrpc: update imp_known_replied_xid on resend-replay

2016-11-16 Thread Oleg Drokin

From: Niu Yawei 

The imp_known_replied_xid should be updated when trying to resend
an already replied replay request, because the xid of this replay
request could be less than the current imp_known_replied_xid.

Signed-off-by: Niu Yawei 
Reviewed-on: http://review.whamcloud.com/22776
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8645
Reviewed-by: Alex Zhuravlev 
Reviewed-by: Fan Yong 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/ptlrpc/recover.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/recover.c 
b/drivers/staging/lustre/lustre/ptlrpc/recover.c
index 344aedd..c004490 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/recover.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/recover.c
@@ -161,8 +161,10 @@ int ptlrpc_replay_next(struct obd_import *imp, int 
*inflight)
 * unreplied list.
 */
if (req && imp->imp_resend_replay &&
-   list_empty(&req->rq_unreplied_list))
+   list_empty(&req->rq_unreplied_list)) {
ptlrpc_add_unreplied(req);
+   imp->imp_known_replied_xid = ptlrpc_known_replied_xid(imp);
+   }
 
imp->imp_resend_replay = 0;
spin_unlock(&imp->imp_lock);
-- 
2.7.4

[PATCH 0/2] Lustre fixes

2016-11-16 Thread Oleg Drokin

With the multiple metadata RPCs code now in, these two fixes
become important.

Please consider.

Niu Yawei (2):
  staging/lustre/ptlrpc: track unreplied requests
  staging/lustre/ptlrpc: update imp_known_replied_xid on resend-replay

 .../staging/lustre/lustre/include/lustre_import.h  |   5 +
 drivers/staging/lustre/lustre/include/lustre_net.h |   3 +
 drivers/staging/lustre/lustre/obdclass/genops.c|   2 +
 drivers/staging/lustre/lustre/ptlrpc/client.c  | 112 +++--
 drivers/staging/lustre/lustre/ptlrpc/import.c  |  34 +++
 drivers/staging/lustre/lustre/ptlrpc/niobuf.c  |  29 +-
 .../staging/lustre/lustre/ptlrpc/ptlrpc_internal.h |  24 +
 drivers/staging/lustre/lustre/ptlrpc/recover.c |  14 +++
 8 files changed, 187 insertions(+), 36 deletions(-)

-- 
2.7.4

[PATCH 1/2] staging/lustre/ptlrpc: track unreplied requests

2016-11-16 Thread Oleg Drokin

From: Niu Yawei 

The request xid was used to make sure the OST object timestamps
are updated properly by out-of-order setattr/punch/write requests.
However, this mechanism was broken by the multiple rcvd
slot feature, where we deferred the xid assignment from request
packing to request sending.

This patch moves the xid assignment back to request packing, and
the manner of finding the lowest unreplied xid is changed from scanning
the sending & delayed lists to scanning an unreplied requests list.

This patch also skips packing the known replied XID in connect
and disconnect requests, so that we can make sure the known replied
XID only ever increases on both the server & client sides.
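
As a rough user-space illustration of the bookkeeping described above, the
sketch below keeps outstanding XIDs in ascending order and derives a "known
replied" value from the lowest one. It uses a fixed array rather than a
list_head, all names are hypothetical, and the rule "known replied = lowest
unreplied - 1, or 0 when nothing is outstanding" is an assumption for the
example, not taken from the patch text.

```c
#include <stdint.h>
#include <string.h>

#define MAX_UNREPLIED 16	/* arbitrary capacity for the sketch */

struct unreplied {
	uint64_t xid[MAX_UNREPLIED];	/* ascending order */
	int count;
};

/* Insert in sorted position, scanning from the tail. -1 if full or duplicate. */
static int unreplied_add(struct unreplied *u, uint64_t xid)
{
	int i = u->count;

	if (u->count == MAX_UNREPLIED)
		return -1;
	while (i > 0 && u->xid[i - 1] > xid)
		i--;
	if (i > 0 && u->xid[i - 1] == xid)
		return -1;
	memmove(&u->xid[i + 1], &u->xid[i],
		(size_t)(u->count - i) * sizeof(u->xid[0]));
	u->xid[i] = xid;
	u->count++;
	return 0;
}

/* Drop an XID once its reply has arrived. */
static void unreplied_del(struct unreplied *u, uint64_t xid)
{
	int i;

	for (i = 0; i < u->count; i++) {
		if (u->xid[i] == xid) {
			memmove(&u->xid[i], &u->xid[i + 1],
				(size_t)(u->count - i - 1) * sizeof(u->xid[0]));
			u->count--;
			return;
		}
	}
}

/* Assumed rule: everything below the lowest outstanding XID is replied. */
static uint64_t known_replied_xid(const struct unreplied *u)
{
	return u->count ? u->xid[0] - 1 : 0;
}
```

Scanning from the tail is cheap when new XIDs are usually the largest, which
is the common case if XIDs are assigned in increasing order at packing time.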

Signed-off-by: Niu Yawei 
Reviewed-on: http://review.whamcloud.com/16759
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5951
Reviewed-by: Gregoire Pichon 
Reviewed-by: Alex Zhuravlev 
Signed-off-by: Oleg Drokin 
---
 .../staging/lustre/lustre/include/lustre_import.h  |   5 +
 drivers/staging/lustre/lustre/include/lustre_net.h |   3 +
 drivers/staging/lustre/lustre/obdclass/genops.c|   2 +
 drivers/staging/lustre/lustre/ptlrpc/client.c  | 112 +++--
 drivers/staging/lustre/lustre/ptlrpc/import.c  |  34 +++
 drivers/staging/lustre/lustre/ptlrpc/niobuf.c  |  29 +-
 .../staging/lustre/lustre/ptlrpc/ptlrpc_internal.h |  24 +
 drivers/staging/lustre/lustre/ptlrpc/recover.c |  12 +++
 8 files changed, 185 insertions(+), 36 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_import.h 
b/drivers/staging/lustre/lustre/include/lustre_import.h
index 5461ba3..4499c69 100644
--- a/drivers/staging/lustre/lustre/include/lustre_import.h
+++ b/drivers/staging/lustre/lustre/include/lustre_import.h
@@ -185,6 +185,11 @@ struct obd_import {
struct list_head   *imp_replay_cursor;
/** @} */
 
+   /** List of not replied requests */
+   struct list_headimp_unreplied_list;
+   /** Known maximal replied XID */
+   __u64   imp_known_replied_xid;
+
/** obd device for this import */
struct obd_device   *imp_obd;
 
diff --git a/drivers/staging/lustre/lustre/include/lustre_net.h 
b/drivers/staging/lustre/lustre/include/lustre_net.h
index d2cbec3..2be135d 100644
--- a/drivers/staging/lustre/lustre/include/lustre_net.h
+++ b/drivers/staging/lustre/lustre/include/lustre_net.h
@@ -596,6 +596,8 @@ struct ptlrpc_cli_req {
union ptlrpc_async_args  cr_async_args;
/** Opaq data for replay and commit callbacks. */
void*cr_cb_data;
+   /** Link to the imp->imp_unreplied_list */
+   struct list_head cr_unreplied_list;
/**
 * Commit callback, called when request is committed and about to be
 * freed.
@@ -635,6 +637,7 @@ struct ptlrpc_cli_req {
 #define rq_interpret_reply rq_cli.cr_reply_interp
 #define rq_async_args  rq_cli.cr_async_args
 #define rq_cb_data rq_cli.cr_cb_data
+#define rq_unreplied_list  rq_cli.cr_unreplied_list
 #define rq_commit_cb   rq_cli.cr_commit_cb
 #define rq_replay_cb   rq_cli.cr_replay_cb
 
diff --git a/drivers/staging/lustre/lustre/obdclass/genops.c 
b/drivers/staging/lustre/lustre/obdclass/genops.c
index 438d619..fa0d38d 100644
--- a/drivers/staging/lustre/lustre/obdclass/genops.c
+++ b/drivers/staging/lustre/lustre/obdclass/genops.c
@@ -907,6 +907,8 @@ struct obd_import *class_new_import(struct obd_device *obd)
INIT_LIST_HEAD(&imp->imp_sending_list);
INIT_LIST_HEAD(&imp->imp_delayed_list);
INIT_LIST_HEAD(&imp->imp_committed_list);
+   INIT_LIST_HEAD(&imp->imp_unreplied_list);
+   imp->imp_known_replied_xid = 0;
imp->imp_replay_cursor = &imp->imp_committed_list;
spin_lock_init(&imp->imp_lock);
imp->imp_last_success_conn = 0;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c 
b/drivers/staging/lustre/lustre/ptlrpc/client.c
index d2f4cd5..ac959ef 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -652,6 +652,42 @@ static void __ptlrpc_free_req_to_pool(struct 
ptlrpc_request *request)
spin_unlock(&pool->prp_lock);
 }
 
+void ptlrpc_add_unreplied(struct ptlrpc_request *req)
+{
+   struct obd_import   *imp = req->rq_import;
+   struct list_head*tmp;
+   struct ptlrpc_request   *iter;
+
+   assert_spin_locked(&imp->imp_lock);
+   LASSERT(list_empty(&req->rq_unreplied_list));
+
+   /* unreplied list is sorted by xid in ascending order */
+   list_for_each_prev(tmp, &imp->imp_unreplied_list) {
+   iter = list_entry(tmp, struct ptlrpc_request,
+ rq_unreplied_list);
+
+   LASSERT(req->rq_xid != iter->rq_xid);
+   if (req->rq_

Re: [PATCH 2/2] staging: lustre: obdclass: Add handling of error returned by lustre_cfg_new

2016-11-07 Thread Oleg Drokin


On Nov 7, 2016, at 4:33 PM, Dilger, Andreas wrote:

> On Nov 6, 2016, at 10:26, Drokin, Oleg  wrote:
>> 
>> Hello!
>> 
>> On Nov 6, 2016, at 12:11 PM, Christophe JAILLET wrote:
>> 
>>> 'lustre_cfg_new()' can return ERR_PTR(-ENOMEM).
>>> Handle these errors and propagate the error code to the callers.
>>> 
>>> Error handling has been rearranged in 'lustre_process_log()' with the
>>> addition of a label in order to free some resources.
>> 
>> I wonder if we should just make it return NULL on allocation failure,
>> and then at least the other error handling that is there (i.e. in your other 
>> patch)
>> would become correct.
>> This would make handling in mgc_apply_recover_logs incorrect, but it's 
>> already
>> geared towards this sort of handling anyway, as it discards the passed error
>> and sets ENOMEM unconditionally (just need to revert 3092c34a in a way).
> 
> I'd agree with Oleg that returning NULL is the preferable solution here. 
> 
> There are also callers of lustre_cfg_new() in class_config_llog_handler(),
> do_lcfg(), and lustre_end_log() that do not check error returns at all that
> should be fixed at the same time.

This patch was actually doing it.
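
To make the NULL versus ERR_PTR distinction concrete, here is a user-space
sketch. ERR_PTR(), PTR_ERR() and IS_ERR() are re-implemented locally in the
spirit of the kernel helpers; cfg_new() and the two checker functions are
hypothetical and only model a constructor that reports failure through an
encoded pointer.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Local re-implementations for illustration, not the kernel headers. */
static void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int dummy_cfg;

/* Hypothetical constructor that fails with an encoded error pointer. */
static void *cfg_new(int simulate_oom)
{
	if (simulate_oom)
		return ERR_PTR(-ENOMEM);
	return &dummy_cfg;
}

/* A caller that only tests for NULL does not notice the failure. */
static long use_cfg_null_check(void *cfg)
{
	if (!cfg)
		return -ENOMEM;
	return 0;
}

/* A caller that tests IS_ERR() propagates the encoded error. */
static long use_cfg_err_check(void *cfg)
{
	if (IS_ERR(cfg))
		return PTR_ERR(cfg);
	return 0;
}
```

Either convention works as long as the constructor and all of its callers
agree on it; mixing them is what lets a failure slip through.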

> 
> Cheers, Andreas
> 
>>> 
>>> Signed-off-by: Christophe JAILLET 
>>> ---
>>> drivers/staging/lustre/lustre/obdclass/obd_mount.c | 16 ++--
>>> 1 file changed, 14 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/drivers/staging/lustre/lustre/obdclass/obd_mount.c 
>>> b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
>>> index 59fbc29aae94..5473615cd338 100644
>>> --- a/drivers/staging/lustre/lustre/obdclass/obd_mount.c
>>> +++ b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
>>> @@ -89,11 +89,14 @@ int lustre_process_log(struct super_block *sb, char 
>>> *logname,
>>> lustre_cfg_bufs_set(bufs, 2, cfg, sizeof(*cfg));
>>> lustre_cfg_bufs_set(bufs, 3, &sb, sizeof(sb));
>>> lcfg = lustre_cfg_new(LCFG_LOG_START, bufs);
>>> +   if (IS_ERR(lcfg)) {
>>> +   rc = PTR_ERR(lcfg);
>>> +   goto out_free;
>>> +   }
>>> +
>>> rc = obd_process_config(mgc, sizeof(*lcfg), lcfg);
>>> lustre_cfg_free(lcfg);
>>> 
>>> -   kfree(bufs);
>>> -
>>> if (rc == -EINVAL)
>>> LCONSOLE_ERROR_MSG(0x15b, "%s: The configuration from log '%s' 
>>> failed from the MGS (%d).  Make sure this client and the MGS are running 
>>> compatible versions of Lustre.\n",
>>>mgc->obd_name, logname, rc);
>>> @@ -104,6 +107,9 @@ int lustre_process_log(struct super_block *sb, char 
>>> *logname,
>>>rc);
>>> 
>>> /* class_obd_list(); */
>>> +
>>> +out_free:
>>> +   kfree(bufs);
>>> return rc;
>>> }
>>> EXPORT_SYMBOL(lustre_process_log);
>>> @@ -127,6 +133,9 @@ int lustre_end_log(struct super_block *sb, char 
>>> *logname,
>>> if (cfg)
>>> lustre_cfg_bufs_set(&bufs, 2, cfg, sizeof(*cfg));
>>> lcfg = lustre_cfg_new(LCFG_LOG_END, &bufs);
>>> +   if (IS_ERR(lcfg))
>>> +   return PTR_ERR(lcfg);
>>> +
>>> rc = obd_process_config(mgc, sizeof(*lcfg), lcfg);
>>> lustre_cfg_free(lcfg);
>>> return rc;
>>> @@ -159,6 +168,9 @@ static int do_lcfg(char *cfgname, lnet_nid_t nid, int 
>>> cmd,
>>> lustre_cfg_bufs_set_string(&bufs, 4, s4);
>>> 
>>> lcfg = lustre_cfg_new(cmd, &bufs);
>>> +   if (IS_ERR(lcfg))
>>> +   return PTR_ERR(lcfg);
>>> +
>>> lcfg->lcfg_nid = nid;
>>> rc = class_process_config(lcfg);
>>> lustre_cfg_free(lcfg);
>>> -- 
>>> 2.9.3
>> 

Re: [lustre-devel] [PATCH 1/2] staging: lustre: replace uses of class_devno_max by MAX_OBD_DEVICES

2016-11-06 Thread Oleg Drokin


On Nov 4, 2016, at 4:37 AM, Aya Mahfouz wrote:

> 
> On Thu, Nov 3, 2016 at 1:05 AM, Dilger, Andreas  
> wrote:
> On Oct 25, 2016, at 10:47, Aya Mahfouz  wrote:
> >
> > On Mon, Oct 17, 2016 at 10:38:31PM +, Dilger, Andreas wrote:
> >> On Oct 17, 2016, at 15:46, Aya Mahfouz  
> >> wrote:
> >>>
> >>> class_devno_max is an inline function that returns
> >>> MAX_OBD_DEVICES. Replace all calls to the function
> >>> by MAX_OBD_DEVICES.
> >>
> >> Thanks for your patch, but unfortunately it can't be accepted.
> >>
> >> This function was added in preparation of being able to tune the maximum
> >> number of storage devices dynamically, rather than having to hard code it
> >> to the maximum possible number of servers that a client can possibly
> >> connect to.
> >>
> >> While the current maximum of 8192 servers has been enough for current
> >> filesystems, I'd rather move in the direction of dynamically handling this
> >> limit rather than re-introducing a hard-coded constant throughout the code.
> >>
> > Hello,
> >
> > I would like to proceed with implementing the function if possible.
> > Kindly direct me to some starting pointers.
> 
> Hi Aya,
> thanks for offering to look into this.
> 
> There are several ways to approach this problem to make the allocation
> of the obd_devs[] array dynamic.  In most cases, there isn't any value
> to dynamically shrink this array, since the filesystem(s) will typically
> be mounted until the node is rebooted, and it is only in the tens of KB
> size range, so this will not affect ongoing operations, and that simplifies
> the implementation.
> 
> The easiest way would be to have a dynamically-sized obd_devs[] array that
> is reallocated in class_newdev() in PAGE_SIZE chunks whenever the current
> array has no more free slots and copied to the new array, using obd_dev_lock
> to protect the array while it is being reallocated and copied.  In most
> cases, this would save memory over the static array (not many filesystems
> have so many servers), but for the few sites that have 1+ servers they
> don't need to change the source to handle this.  Using libcfs_kvzalloc()
> would avoid issues with allocating large chunks of memory.
> 
> There are a few places where obd_devs[] is accessed outside obd_dev_lock
> that would need to be fixed now that this array may be changed at runtime.
> 
> A second approach that may scale better is to change obd_devs from an array
> to a doubly linked list (using standard list_head helpers).  In many cases
> the whole list is searched linearly, and most of the uses of class_num2obd()
> are just used to walk that list in order, which could be replaced with
> list_for_each_entry() list traversal.  The class_name2dev() function should
> be changed to return the pointer to the obd_device structure, and a new
> helper class_dev2num() would just return the obd_minor number from the
> obd_device struct for the one use in class_resolve_dev_name().  Using a
> linked list has the advantage that there is no need to search for free slots
> in the array, since devices would be removed from the list when it is freed.
> 
> Cheers, Andreas
> 
> Thanks Andreas! Will start looking into it.
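
For readers following the thread, the first approach Andreas describes (grow
the device array in chunks once no free slot is left) can be sketched in plain
userspace C roughly as below. This is only an illustration under stated
assumptions: struct dev_table, dev_table_add() and DEV_CHUNK are hypothetical
names, realloc() stands in for the kernel allocator, and the locking that
obd_dev_lock would provide is left out entirely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growth step; the thread suggests PAGE_SIZE-sized chunks. */
#define DEV_CHUNK 64

/* Hypothetical stand-in for the obd_devs[] array; a NULL entry is a free slot. */
struct dev_table {
	void **slots;
	size_t nslots;
};

/*
 * Store dev in the first free slot, growing the table by one chunk when it
 * is full. Returns the slot index, or -1 if the allocation fails.
 * A real implementation would hold a lock around the scan and the regrow.
 */
static int dev_table_add(struct dev_table *t, void *dev)
{
	void **bigger;
	size_t i;

	assert(dev != NULL); /* NULL is reserved to mean "free slot" */

	for (i = 0; i < t->nslots; i++) {
		if (!t->slots[i]) {
			t->slots[i] = dev;
			return (int)i;
		}
	}

	/* No free slot: reallocate with one more chunk and zero the new tail. */
	bigger = realloc(t->slots, (t->nslots + DEV_CHUNK) * sizeof(*bigger));
	if (!bigger)
		return -1;
	memset(bigger + t->nslots, 0, DEV_CHUNK * sizeof(*bigger));

	t->slots = bigger;
	i = t->nslots;
	t->nslots += DEV_CHUNK;
	t->slots[i] = dev;
	return (int)i;
}
```

Because the array can move when it is regrown, any reader that keeps a pointer
into it across the regrow would break, which is why the thread points out that
accesses outside the lock would need fixing first.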

I also would like to point out that Alexey Lyashkov had an implementation
of this in http://review.whamcloud.com/347 but it needed to be reverted
as it was way too race-prone in the end.
I don't know if Alexey ever improved the patch to actually work (at least
there was some talk about it), but even if so, the end result was never
contributed back to us.

Also please be advised that this is the kind of change that needs a
fully functional Lustre setup to verify that it works, so
please let me know if you have any problems setting this up.

Thanks!

> 
> --
> Kind Regards,
> Aya Saif El-yazal Mahfouz
>  
> >> One comment inline below, if you still want to submit a patch.
> >>
> >>> Signed-off-by: Aya Mahfouz 
> >>> ---
> >>> drivers/staging/lustre/lustre/obdclass/class_obd.c |  6 +++---
> >>> drivers/staging/lustre/lustre/obdclass/genops.c| 22 
> >>> +++---
> >>> .../lustre/lustre/obdclass/linux/linux-module.c|  6 +++---
> >>> 3 files changed, 17 insertions(+), 17 deletions(-)
> >>>
> >>> diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c 
> >>> b/drivers/staging/lustre/lustre/obdclass/class_obd.c
> >>> index 2b21675..b775c74 100644
> >>> --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
> >>> +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
> >>> @@ -345,7 +345,7 @@ int class_handle_ioctl(unsigned int cmd, unsigned 
> >>> long arg)
> >>> goto out;
> >>> }
> >>> obd = class_name2obd(data->ioc_inlbuf4);
> >>> -   } else if (data->ioc_dev < class_devno_max()) {
> >>> +   } else if (data->ioc_dev < MAX_OBD_DEVICES) {
> >>> obd = class_num2obd(data->ioc_dev);
> >>> } else {
> >>> CERROR("OBD ioctl: No device\n");
> >>> @@ -498,7 +498,7 @@ static int __init ob

Re: [PATCH 2/2] staging: lustre: obdclass: Add handling of error returned by lustre_cfg_new

2016-11-06 Thread Oleg Drokin

Hello!

On Nov 6, 2016, at 12:11 PM, Christophe JAILLET wrote:

> 'lustre_cfg_new()' can return ERR_PTR(-ENOMEM).
> Handle these errors and propagate the error code to the callers.
> 
> Error handling has been rearranged in 'lustre_process_log()' with the
> addition of a label in order to free some resources.

I wonder if we should just make it return NULL on allocation failure,
and then at least the other error handling that is there (i.e. in your
other patch) would become correct.
This would make handling in mgc_apply_recover_logs incorrect, but it's already
geared towards this sort of handling anyway, as it discards the passed error
and sets ENOMEM unconditionally (just need to revert 3092c34a in a way).

Thanks!
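
To make the trade-off above concrete, here is a small userspace sketch of why
the two conventions cannot be mixed: a caller that only tests for NULL treats
an error pointer as success, and a caller that only tests with IS_ERR() treats
NULL as success. The sketch_* helpers are assumptions written for this
example; they imitate the kernel's ERR_PTR(), PTR_ERR() and IS_ERR() but are
not the kernel's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Largest errno encoded in a pointer, mirroring the kernel's convention. */
#define SKETCH_MAX_ERRNO 4095

/* Userspace imitation of ERR_PTR(): encode a negative errno as a pointer. */
static inline void *sketch_err_ptr(long err)
{
	assert(err < 0 && err >= -SKETCH_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* Userspace imitation of PTR_ERR(): recover the errno from the pointer. */
static inline long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Userspace imitation of IS_ERR(): true only for the top 4095 addresses. */
static inline int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Caller written for an allocator that returns NULL on failure. */
static int check_null_only(const void *p)
{
	return p ? 0 : -ENOMEM;
}

/* Caller written for an allocator that returns an error pointer on failure. */
static int check_is_err(const void *p)
{
	return sketch_is_err(p) ? (int)sketch_ptr_err(p) : 0;
}
```

Passing sketch_err_ptr(-ENOMEM) to check_null_only() reports success, which is
the situation the unpatched callers are in; passing NULL to check_is_err()
reports success as well, so switching the allocator to return NULL would
require the IS_ERR()-style callers to change in the same step.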

> 
> Signed-off-by: Christophe JAILLET 
> ---
> drivers/staging/lustre/lustre/obdclass/obd_mount.c | 16 ++--
> 1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/obdclass/obd_mount.c 
> b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
> index 59fbc29aae94..5473615cd338 100644
> --- a/drivers/staging/lustre/lustre/obdclass/obd_mount.c
> +++ b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
> @@ -89,11 +89,14 @@ int lustre_process_log(struct super_block *sb, char 
> *logname,
>   lustre_cfg_bufs_set(bufs, 2, cfg, sizeof(*cfg));
>   lustre_cfg_bufs_set(bufs, 3, &sb, sizeof(sb));
>   lcfg = lustre_cfg_new(LCFG_LOG_START, bufs);
> + if (IS_ERR(lcfg)) {
> + rc = PTR_ERR(lcfg);
> + goto out_free;
> + }
> +
>   rc = obd_process_config(mgc, sizeof(*lcfg), lcfg);
>   lustre_cfg_free(lcfg);
> 
> - kfree(bufs);
> -
>   if (rc == -EINVAL)
>   LCONSOLE_ERROR_MSG(0x15b, "%s: The configuration from log '%s' 
> failed from the MGS (%d).  Make sure this client and the MGS are running 
> compatible versions of Lustre.\n",
>  mgc->obd_name, logname, rc);
> @@ -104,6 +107,9 @@ int lustre_process_log(struct super_block *sb, char 
> *logname,
>  rc);
> 
>   /* class_obd_list(); */
> +
> +out_free:
> + kfree(bufs);
>   return rc;
> }
> EXPORT_SYMBOL(lustre_process_log);
> @@ -127,6 +133,9 @@ int lustre_end_log(struct super_block *sb, char *logname,
>   if (cfg)
>   lustre_cfg_bufs_set(&bufs, 2, cfg, sizeof(*cfg));
>   lcfg = lustre_cfg_new(LCFG_LOG_END, &bufs);
> + if (IS_ERR(lcfg))
> + return PTR_ERR(lcfg);
> +
>   rc = obd_process_config(mgc, sizeof(*lcfg), lcfg);
>   lustre_cfg_free(lcfg);
>   return rc;
> @@ -159,6 +168,9 @@ static int do_lcfg(char *cfgname, lnet_nid_t nid, int cmd,
>   lustre_cfg_bufs_set_string(&bufs, 4, s4);
> 
>   lcfg = lustre_cfg_new(cmd, &bufs);
> + if (IS_ERR(lcfg))
> + return PTR_ERR(lcfg);
> +
>   lcfg->lcfg_nid = nid;
>   rc = class_process_config(lcfg);
>   lustre_cfg_free(lcfg);
> -- 
> 2.9.3

[PATCH 09/14] staging/lustre/ptlrpc: Suppress error for flock requests

2016-11-02 Thread Oleg Drokin

From: Patrick Farrell 

-EAGAIN is a normal return when requesting POSIX flocks.
We can't recognize exactly that case here, but it's the
only case that should result in -EAGAIN on LDLM_ENQUEUE, so
don't print to console in that case.

Signed-off-by: Patrick Farrell 
Reviewed-on: http://review.whamcloud.com/22856
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8658
Reviewed-by: Andreas Dilger 
Reviewed-by: Bob Glossman 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/ptlrpc/client.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c 
b/drivers/staging/lustre/lustre/ptlrpc/client.c
index 7cbfb4c..bb7ae4e 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -1191,7 +1191,9 @@ static int ptlrpc_check_status(struct ptlrpc_request *req)
lnet_nid_t nid = imp->imp_connection->c_peer.nid;
__u32 opc = lustre_msg_get_opc(req->rq_reqmsg);
 
-   if (ptlrpc_console_allow(req))
+   /* -EAGAIN is normal when using POSIX flocks */
+   if (ptlrpc_console_allow(req) &&
+   !(opc == LDLM_ENQUEUE && err == -EAGAIN))
LCONSOLE_ERROR_MSG(0x011, "%s: operation %s to node %s 
failed: rc = %d\n",
   imp->imp_obd->obd_name,
   ll_opcode2str(opc),
-- 
2.7.4
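
The condition added by the patch above can be read as a small predicate. The
following userspace sketch restates it under assumptions: should_log_failure()
is a hypothetical helper, and SKETCH_LDLM_ENQUEUE is a placeholder opcode
value chosen for illustration, not taken from the Lustre headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Placeholder opcode for illustration only; not the real Lustre constant. */
#define SKETCH_LDLM_ENQUEUE 101

/*
 * Decide whether a failed request should be reported on the console.
 * console_allowed stands in for the result of ptlrpc_console_allow().
 */
static bool should_log_failure(bool console_allowed, unsigned int opc, int err)
{
	assert(err < 0); /* the sketch is only consulted for failures */

	if (!console_allowed)
		return false;

	/* -EAGAIN on an enqueue is the normal "lock is busy" answer. */
	return !(opc == SKETCH_LDLM_ENQUEUE && err == -EAGAIN);
}
```

Any other opcode returning -EAGAIN, and any other error on an enqueue, is
still reported, which matches the narrow intent stated in the commit message.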

[PATCH 13/14] staging/lustre/llite: do not clear uptodate bit in page delete

2016-11-02 Thread Oleg Drokin

From: Jinshan Xiong 

Otherwise, if the race between page fault and truncate occurs, it
will cause the page fault routine to return an EIO error.

In filemap_fault() {
page_not_uptodate:
	...
	ClearPageError(page);
	error = mapping->a_ops->readpage(file, page);
	if (!error) {
		wait_on_page_locked(page);
		if (!PageUptodate(page))
			error = -EIO;
	}
	...
}

However, I tend to think this is a defect in the kernel implementation,
because it assumes PageUptodate shouldn't be cleared, but the file read
routine doesn't make the same assumption.

Signed-off-by: Jinshan Xiong 
Reviewed-on: http://review.whamcloud.com/22827
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8633
Reviewed-by: Li Dongyang 
Reviewed-by: Bobi Jam 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/vvp_page.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/llite/vvp_page.c 
b/drivers/staging/lustre/lustre/llite/vvp_page.c
index 25490a5..23d6630 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_page.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_page.c
@@ -166,7 +166,6 @@ static void vvp_page_delete(const struct lu_env *env,
refc = atomic_dec_return(&page->cp_ref);
LASSERTF(refc >= 1, "page = %p, refc = %d\n", page, refc);
 
-   ClearPageUptodate(vmpage);
ClearPagePrivate(vmpage);
vmpage->private = 0;
/*
-- 
2.7.4

[PATCH 14/14] staging/lustre: Get rid of LIBLUSTRE_CLIENT and its users

2016-11-02 Thread Oleg Drokin

This define only made sense in a userspace library client, not in the kernel.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/lustre_lib.h |  2 --
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  | 15 +--
 2 files changed, 1 insertion(+), 16 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_lib.h 
b/drivers/staging/lustre/lustre/include/lustre_lib.h
index 6b23191..27f3148 100644
--- a/drivers/staging/lustre/lustre/include/lustre_lib.h
+++ b/drivers/staging/lustre/lustre/include/lustre_lib.h
@@ -350,8 +350,6 @@ do {
   \
l_wait_event_exclusive_head(wq, condition, &lwi);   \
 })
 
-#define LIBLUSTRE_CLIENT (0)
-
 /** @} lib */
 
 #endif /* _LUSTRE_LIB_H */
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c 
b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
index c5d00d1..6a96f2c 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
@@ -475,12 +475,7 @@ int ldlm_cli_enqueue_fini(struct obd_export *exp, struct 
ptlrpc_request *req,
   "client-side enqueue, new policy data");
}
 
-   if ((*flags) & LDLM_FL_AST_SENT ||
-   /* Cancel extent locks as soon as possible on a liblustre client,
-* because it cannot handle asynchronous ASTs robustly (see
-* bug 7311).
-*/
-   (LIBLUSTRE_CLIENT && type == LDLM_EXTENT)) {
+   if ((*flags) & LDLM_FL_AST_SENT) {
lock_res_and_lock(lock);
lock->l_flags |= LDLM_FL_CBPENDING |  LDLM_FL_BL_AST;
unlock_res_and_lock(lock);
@@ -775,14 +770,6 @@ int ldlm_cli_enqueue(struct obd_export *exp, struct 
ptlrpc_request **reqp,
body->lock_flags = ldlm_flags_to_wire(*flags);
body->lock_handle[0] = *lockh;
 
-   /*
-* Liblustre client doesn't get extent locks, except for O_APPEND case
-* where [0, OBD_OBJECT_EOF] lock is taken, or truncate, where
-* [i_size, OBD_OBJECT_EOF] lock is taken.
-*/
-   LASSERT(ergo(LIBLUSTRE_CLIENT, einfo->ei_type != LDLM_EXTENT ||
-policy->l_extent.end == OBD_OBJECT_EOF));
-
if (async) {
LASSERT(reqp);
return 0;
-- 
2.7.4

[PATCH 11/14] staging/lustre/ptlrpc: Correctly calculate hrp->hrp_nthrs

2016-11-02 Thread Oleg Drokin

From: Amir Shehata 

cpu_pattern can specify exactly 1 cpu in a partition:
"0[0]". That means CPT0 will have CPU 0. CPU 0 can have
hyperthreading enabled. This combination would result in

weight = cfs_cpu_ht_nsiblings(0);
hrp->hrp_nthrs = cfs_cpt_weight(ptlrpc_hr.hr_cpt_table, i);
hrp->hrp_nthrs /= weight;

evaluating to 0. Where
cfs_cpt_weight(ptlrpc_hr.hr_cpt_table, i) == 1
weight == 2

Therefore, if hrp_nthrs becomes zero, just set it to 1.

Signed-off-by: Amir Shehata 
Reviewed-on: http://review.whamcloud.com/19106
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8492
Reviewed-by: Liang Zhen 
Reviewed-by: Doug Oucharek 
Reviewed-by: James Simmons 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/ptlrpc/service.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/service.c 
b/drivers/staging/lustre/lustre/ptlrpc/service.c
index 72f3930..fc754e7 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/service.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/service.c
@@ -2541,8 +2541,9 @@ int ptlrpc_hr_init(void)
 
hrp->hrp_nthrs = cfs_cpt_weight(ptlrpc_hr.hr_cpt_table, i);
hrp->hrp_nthrs /= weight;
+   if (hrp->hrp_nthrs == 0)
+   hrp->hrp_nthrs = 1;
 
-   LASSERT(hrp->hrp_nthrs > 0);
hrp->hrp_thrs =
kzalloc_node(hrp->hrp_nthrs * sizeof(*hrt), GFP_NOFS,
 cfs_cpt_spread_node(ptlrpc_hr.hr_cpt_table,
-- 
2.7.4
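
The arithmetic behind the patch above is plain integer division: a partition
weight of 1 divided by 2 hyperthread siblings truncates to 0. A minimal
userspace sketch of the corrected calculation, assuming a hypothetical helper
name hr_threads_per_cpt() that does not exist in the Lustre tree:

```c
#include <assert.h>

/*
 * Number of threads for one CPU partition: one per physical core,
 * but never fewer than one.
 * cpt_weight:  CPUs in the partition (cfs_cpt_weight() in the patch)
 * ht_siblings: hyperthreads per core (cfs_cpu_ht_nsiblings() in the patch)
 */
static unsigned int hr_threads_per_cpt(unsigned int cpt_weight,
				       unsigned int ht_siblings)
{
	unsigned int nthrs;

	assert(ht_siblings > 0); /* avoid dividing by zero in the sketch */

	nthrs = cpt_weight / ht_siblings; /* 1 / 2 truncates to 0 */
	return nthrs ? nthrs : 1;
}
```

With the "0[0]" pattern from the commit message, hr_threads_per_cpt(1, 2)
yields 1 instead of 0, while a partition of 8 CPUs with 2 siblings each still
yields 4.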

[PATCH 12/14] staging/lustre/llite: update ras window correctly

2016-11-02 Thread Oleg Drokin

From: Bobi Jam 

When stride-RA hits a cache miss, we only reset the normal sequential
read-ahead window, but do not reset the stride IO, to avoid the overhead
of re-detecting stride IO. However, when the normal RA window is set so
that it does not intersect with the stride-RA window, the presumption no
longer holds when we later try to increase the stride-RA window length.

This patch resets the stride IO as well in this case.

Signed-off-by: Bobi Jam 
Reviewed-on: http://review.whamcloud.com/23032
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8683
Reviewed-by: wangdi 
Reviewed-by: Jinshan Xiong 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/rw.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/rw.c 
b/drivers/staging/lustre/lustre/llite/rw.c
index d2515a8..e34017d 100644
--- a/drivers/staging/lustre/lustre/llite/rw.c
+++ b/drivers/staging/lustre/lustre/llite/rw.c
@@ -809,13 +809,20 @@ static void ras_update(struct ll_sb_info *sbi, struct 
inode *inode,
if (ra_miss) {
if (index_in_stride_window(ras, index) &&
stride_io_mode(ras)) {
-   /*If stride-RA hit cache miss, the stride dector
-*will not be reset to avoid the overhead of
-*redetecting read-ahead mode
-*/
if (index != ras->ras_last_readpage + 1)
ras->ras_consecutive_pages = 0;
ras_reset(inode, ras, index);
+
+   /* If stride-RA hit cache miss, the stride
+* detector will not be reset to avoid the
+* overhead of redetecting read-ahead mode,
+* but on the condition that the stride window
+* is still intersect with normal sequential
+* read-ahead window.
+*/
+   if (ras->ras_window_start <
+   ras->ras_stride_offset)
+   ras_stride_reset(ras);
RAS_CDEBUG(ras);
} else {
/* Reset both stride window and normal RA
-- 
2.7.4

[PATCH 10/14] staging/lustre/llite: protect from accessing NULL lli_clob

2016-11-02 Thread Oleg Drokin

From: Bobi Jam 

Need to check file's lli_clob object before calling
lov_read_and_clear_async_rc().

Signed-off-by: Bobi Jam 
Reviewed-by: Jinshan Xiong 
Reviewed-by: Oleg Drokin 
Reviewed-on: http://review.whamcloud.com/23031
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8682
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/file.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c 
b/drivers/staging/lustre/lustre/llite/file.c
index c1c7551..7886840 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -2328,9 +2328,11 @@ int ll_fsync(struct file *file, loff_t start, loff_t 
end, int datasync)
lli->lli_async_rc = 0;
if (rc == 0)
rc = err;
-   err = lov_read_and_clear_async_rc(lli->lli_clob);
-   if (rc == 0)
-   rc = err;
+   if (lli->lli_clob) {
+   err = lov_read_and_clear_async_rc(lli->lli_clob);
+   if (rc == 0)
+   rc = err;
+   }
}
 
err = md_sync(ll_i2sbi(inode)->ll_md_exp, ll_inode2fid(inode), &req);
-- 
2.7.4

[PATCH 08/14] staging/lustre/ldlm: engage ELC for all ldlm enqueue req

2016-11-02 Thread Oleg Drokin

From: Hongchao Zhang 

If there is no request passed into ldlm_cli_enqueue, the enqueue
request will not engage ELC to drop unneeded locks. Currently,
this kind of request is mainly related to EXTENT lock enqueue
requests (except for the glimpse EXTENT lock, since it has an intent).

Signed-off-by: Hongchao Zhang 
Reviewed-on: http://review.whamcloud.com/21739
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8209
Reviewed-by: Andreas Dilger 
Reviewed-by: Vitaly Fertman 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c | 21 -
 1 file changed, 4 insertions(+), 17 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c 
b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
index 1b9ae77..c5d00d1 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
@@ -748,17 +748,14 @@ int ldlm_cli_enqueue(struct obd_export *exp, struct 
ptlrpc_request **reqp,
lock->l_last_activity = ktime_get_real_seconds();
 
/* lock not sent to server yet */
-
if (!reqp || !*reqp) {
-   req = ptlrpc_request_alloc_pack(class_exp2cliimp(exp),
-   &RQF_LDLM_ENQUEUE,
-   LUSTRE_DLM_VERSION,
-   LDLM_ENQUEUE);
-   if (!req) {
+   req = ldlm_enqueue_pack(exp, lvb_len);
+   if (IS_ERR(req)) {
failed_lock_cleanup(ns, lock, einfo->ei_mode);
LDLM_LOCK_RELEASE(lock);
-   return -ENOMEM;
+   return PTR_ERR(req);
}
+
req_passed_in = 0;
if (reqp)
*reqp = req;
@@ -778,16 +775,6 @@ int ldlm_cli_enqueue(struct obd_export *exp, struct 
ptlrpc_request **reqp,
body->lock_flags = ldlm_flags_to_wire(*flags);
body->lock_handle[0] = *lockh;
 
-   /* Continue as normal. */
-   if (!req_passed_in) {
-   if (lvb_len > 0)
-   req_capsule_extend(&req->rq_pill,
-  &RQF_LDLM_ENQUEUE_LVB);
-   req_capsule_set_size(&req->rq_pill, &RMF_DLM_LVB, RCL_SERVER,
-lvb_len);
-   ptlrpc_request_set_replen(req);
-   }
-
/*
 * Liblustre client doesn't get extent locks, except for O_APPEND case
 * where [0, OBD_OBJECT_EOF] lock is taken, or truncate, where
-- 
2.7.4

[PATCH 07/14] staging/lustre/ldlm: Reinstate ldlm_enqueue_pack()

2016-11-02 Thread Oleg Drokin

The function is used again by the next patch, so bring it back
from the dead, only this time make it static.

Reverts: bf2a033360f7 ("staging/lustre/ldlm: Remove unused ldlm_enqueue_pack()")
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c 
b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
index 6e704c7..1b9ae77 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
@@ -657,6 +657,27 @@ int ldlm_prep_enqueue_req(struct obd_export *exp, struct 
ptlrpc_request *req,
 }
 EXPORT_SYMBOL(ldlm_prep_enqueue_req);
 
+static struct ptlrpc_request *ldlm_enqueue_pack(struct obd_export *exp,
+   int lvb_len)
+{
+   struct ptlrpc_request *req;
+   int rc;
+
+   req = ptlrpc_request_alloc(class_exp2cliimp(exp), &RQF_LDLM_ENQUEUE);
+   if (!req)
+   return ERR_PTR(-ENOMEM);
+
+   rc = ldlm_prep_enqueue_req(exp, req, NULL, 0);
+   if (rc) {
+   ptlrpc_request_free(req);
+   return ERR_PTR(rc);
+   }
+
+   req_capsule_set_size(&req->rq_pill, &RMF_DLM_LVB, RCL_SERVER, lvb_len);
+   ptlrpc_request_set_replen(req);
+   return req;
+}
+
 /**
  * Client-side lock enqueue.
  *
-- 
2.7.4

[PATCH 05/14] staging/lustre: Get rid of cl_env hash table

2016-11-02 Thread Oleg Drokin

From: Jinshan Xiong 

cl_env hash table is under heavy contention when there are lots of
processes doing IO at the same time;
reduce lock contention by replacing cl_env cache with percpu array;
remove cl_env_nested_get() and cl_env_nested_put();
remove cl_env_reenter() and cl_env_reexit();

Signed-off-by: Jinshan Xiong 
Reviewed-on: http://review.whamcloud.com/20254
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4257
Reviewed-by: Andreas Dilger 
Reviewed-by: Bobi Jam 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/cl_object.h  |  22 --
 drivers/staging/lustre/lustre/ldlm/ldlm_pool.c |   9 -
 drivers/staging/lustre/lustre/llite/file.c |  18 +-
 drivers/staging/lustre/lustre/llite/lcommon_cl.c   |  11 +-
 drivers/staging/lustre/lustre/llite/lcommon_misc.c |   8 +-
 drivers/staging/lustre/lustre/llite/llite_mmap.c   |  59 ++--
 drivers/staging/lustre/lustre/llite/rw.c   |   6 +-
 drivers/staging/lustre/lustre/llite/rw26.c |  16 +-
 drivers/staging/lustre/lustre/lov/lov_io.c |   5 -
 drivers/staging/lustre/lustre/lov/lov_object.c |   7 +-
 .../staging/lustre/lustre/obdclass/cl_internal.h   |  23 --
 drivers/staging/lustre/lustre/obdclass/cl_io.c |   1 -
 drivers/staging/lustre/lustre/obdclass/cl_object.c | 389 -
 drivers/staging/lustre/lustre/obdclass/lu_object.c |   4 -
 .../staging/lustre/lustre/obdecho/echo_client.c|  14 +-
 drivers/staging/lustre/lustre/osc/osc_cache.c  |   6 +-
 drivers/staging/lustre/lustre/osc/osc_lock.c   | 116 +++---
 drivers/staging/lustre/lustre/osc/osc_page.c   |   6 +-
 18 files changed, 183 insertions(+), 537 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h 
b/drivers/staging/lustre/lustre/include/cl_object.h
index 514d650..3fe26e7 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -2640,35 +2640,13 @@ void cl_sync_io_end(const struct lu_env *env, struct 
cl_sync_io *anchor);
  * - allocation and destruction of environment is amortized by caching no
  * longer used environments instead of destroying them;
  *
- * - there is a notion of "current" environment, attached to the kernel
- * data structure representing current thread Top-level lustre code
- * allocates an environment and makes it current, then calls into
- * non-lustre code, that in turn calls lustre back. Low-level lustre
- * code thus called can fetch environment created by the top-level code
- * and reuse it, avoiding additional environment allocation.
- *   Right now, three interfaces can attach the cl_env to running thread:
- *   - cl_env_get
- *   - cl_env_implant
- *   - cl_env_reexit(cl_env_reenter had to be called priorly)
- *
  * \see lu_env, lu_context, lu_context_key
  * @{
  */
 
-struct cl_env_nest {
-   int   cen_refcheck;
-   void *cen_cookie;
-};
-
 struct lu_env *cl_env_get(int *refcheck);
 struct lu_env *cl_env_alloc(int *refcheck, __u32 tags);
-struct lu_env *cl_env_nested_get(struct cl_env_nest *nest);
 void cl_env_put(struct lu_env *env, int *refcheck);
-void cl_env_nested_put(struct cl_env_nest *nest, struct lu_env *env);
-void *cl_env_reenter(void);
-void cl_env_reexit(void *cookie);
-void cl_env_implant(struct lu_env *env, int *refcheck);
-void cl_env_unplant(struct lu_env *env, int *refcheck);
 unsigned int cl_env_cache_purge(unsigned int nr);
 struct lu_env *cl_env_percpu_get(void);
 void cl_env_percpu_put(struct lu_env *env);
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c 
b/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c
index b29c9561..19831c5 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c
@@ -794,7 +794,6 @@ static unsigned long ldlm_pools_count(ldlm_side_t client, 
gfp_t gfp_mask)
int nr_ns;
struct ldlm_namespace *ns;
struct ldlm_namespace *ns_old = NULL; /* loop detection */
-   void *cookie;
 
if (client == LDLM_NAMESPACE_CLIENT && !(gfp_mask & __GFP_FS))
return 0;
@@ -802,8 +801,6 @@ static unsigned long ldlm_pools_count(ldlm_side_t client, 
gfp_t gfp_mask)
CDEBUG(D_DLMTRACE, "Request to count %s locks from all pools\n",
   client == LDLM_NAMESPACE_CLIENT ? "client" : "server");
 
-   cookie = cl_env_reenter();
-
/*
 * Find out how many resources we may release.
 */
@@ -812,7 +809,6 @@ static unsigned long ldlm_pools_count(ldlm_side_t client, 
gfp_t gfp_mask)
mutex_lock(ldlm_namespace_lock(client));
if (list_empty(ldlm_namespace_list(client))) {
mutex_unlock(ldlm_namespace_lock(client));
-   cl_env_reexit(cookie);
return 0;
}
ns = ldlm_namespace_first_lock

[PATCH 06/14] staging/lustre/llite: drop_caches hangs in cl_inode_fini()

2016-11-02 Thread Oleg Drokin

From: Andrew Perepechko 

This patch releases cl_pages on error in ll_write_begin()
to avoid memory and object reference leaks. Also, it
reuses per-cpu lu_env in ll_invalidatepage() in the same
way as done in ll_releasepage().

Signed-off-by: Andrew Perepechko 
Seagate-bug-id: MRP-3504
Reviewed-on: http://review.whamcloud.com/22745
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8509
Reviewed-by: Jinshan Xiong 
Reviewed-by: Bobi Jam 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/rw26.c | 36 --
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/rw26.c 
b/drivers/staging/lustre/lustre/llite/rw26.c
index 1a08a9d..ca45b44 100644
--- a/drivers/staging/lustre/lustre/llite/rw26.c
+++ b/drivers/staging/lustre/lustre/llite/rw26.c
@@ -71,8 +71,6 @@ static void ll_invalidatepage(struct page *vmpage, unsigned 
int offset,
struct cl_page   *page;
struct cl_object *obj;
 
-   int refcheck;
-
LASSERT(PageLocked(vmpage));
LASSERT(!PageWriteback(vmpage));
 
@@ -82,21 +80,21 @@ static void ll_invalidatepage(struct page *vmpage, unsigned 
int offset,
 * happening with locked page too
 */
if (offset == 0 && length == PAGE_SIZE) {
-   env = cl_env_get(&refcheck);
-   if (!IS_ERR(env)) {
-   inode = vmpage->mapping->host;
-   obj = ll_i2info(inode)->lli_clob;
-   if (obj) {
-   page = cl_vmpage_page(vmpage, obj);
-   if (page) {
-   cl_page_delete(env, page);
-   cl_page_put(env, page);
-   }
-   } else {
-   LASSERT(vmpage->private == 0);
+   /* See the comment in ll_releasepage() */
+   env = cl_env_percpu_get();
+   LASSERT(!IS_ERR(env));
+   inode = vmpage->mapping->host;
+   obj = ll_i2info(inode)->lli_clob;
+   if (obj) {
+   page = cl_vmpage_page(vmpage, obj);
+   if (page) {
+   cl_page_delete(env, page);
+   cl_page_put(env, page);
}
-   cl_env_put(env, &refcheck);
+   } else {
+   LASSERT(vmpage->private == 0);
}
+   cl_env_percpu_put(env);
}
 }
 
@@ -466,9 +464,9 @@ static int ll_write_begin(struct file *file, struct 
address_space *mapping,
  struct page **pagep, void **fsdata)
 {
struct ll_cl_context *lcc;
-   const struct lu_env  *env;
+   const struct lu_env *env = NULL;
struct cl_io   *io;
-   struct cl_page *page;
+   struct cl_page *page = NULL;
struct cl_object *clob = ll_i2info(mapping->host)->lli_clob;
pgoff_t index = pos >> PAGE_SHIFT;
struct page *vmpage = NULL;
@@ -556,6 +554,10 @@ static int ll_write_begin(struct file *file, struct 
address_space *mapping,
unlock_page(vmpage);
put_page(vmpage);
}
+   if (!IS_ERR_OR_NULL(page)) {
+   lu_ref_del(&page->cp_reference, "cl_io", io);
+   cl_page_put(env, page);
+   }
} else {
*pagep = vmpage;
*fsdata = lcc;
-- 
2.7.4

[PATCH 03/14] staging/lustre: conflicting PW & PR extent locks on a client

2016-11-02 Thread Oleg Drokin

From: Andriy Skulysh 

A PW lock isn't replayed once it is marked
LDLM_FL_CANCELING, and a glimpse lock doesn't wait for
conflicting locks on the client. So the server will
grant a PR lock in response to the glimpse lock request,
which conflicts with the PW lock in LDLM_FL_CANCELING
state on the client.

A lock in LDLM_FL_CANCELING state may still have pending IO,
so it should be replayed until LDLM_FL_BL_DONE is set, to
prevent the server from granting a conflicting lock.

Seagate-bug-id: MRP-3311
Signed-off-by: Andriy Skulysh 
Reviewed-on: http://review.whamcloud.com/20345
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8175
Reviewed-by: Jinshan Xiong 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/obd_support.h |  3 +++
 drivers/staging/lustre/lustre/ldlm/ldlm_extent.c| 20 
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c   |  4 ++--
 drivers/staging/lustre/lustre/osc/osc_request.c |  1 +
 4 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd_support.h 
b/drivers/staging/lustre/lustre/include/obd_support.h
index 7f3f8cd..aaedec7 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -321,6 +321,8 @@ extern char obd_jobid_var[];
 #define OBD_FAIL_LDLM_CP_CB_WAIT4   0x322
 #define OBD_FAIL_LDLM_CP_CB_WAIT5   0x323
 
+#define OBD_FAIL_LDLM_GRANT_CHECK0x32a
+
 /* LOCKLESS IO */
 #define OBD_FAIL_LDLM_SET_CONTENTION 0x385
 
@@ -343,6 +345,7 @@ extern char obd_jobid_var[];
 #define OBD_FAIL_OSC_CP_ENQ_RACE0x410
 #define OBD_FAIL_OSC_NO_GRANT  0x411
 #define OBD_FAIL_OSC_DELAY_SETTIME  0x412
+#define OBD_FAIL_OSC_DELAY_IO   0x414
 
 #define OBD_FAIL_PTLRPC  0x500
 #define OBD_FAIL_PTLRPC_ACK  0x501
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c 
b/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c
index ecf472e..a7b34e4 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c
@@ -193,6 +193,26 @@ void ldlm_extent_add_lock(struct ldlm_resource *res,
 * add the locks into grant list, for debug purpose, ..
 */
ldlm_resource_add_lock(res, &res->lr_granted, lock);
+
+   if (OBD_FAIL_CHECK(OBD_FAIL_LDLM_GRANT_CHECK)) {
+   struct ldlm_lock *lck;
+
+   list_for_each_entry_reverse(lck, &res->lr_granted,
+   l_res_link) {
+   if (lck == lock)
+   continue;
+   if (lockmode_compat(lck->l_granted_mode,
+   lock->l_granted_mode))
+   continue;
+   if (ldlm_extent_overlap(&lck->l_req_extent,
+   &lock->l_req_extent)) {
+   CDEBUG(D_ERROR, "granting conflicting lock %p 
%p\n",
+  lck, lock);
+   ldlm_resource_dump(D_ERROR, res);
+   LBUG();
+   }
+   }
+   }
 }
 
 /** Remove cancelled lock from resource interval tree. */
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c 
b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
index 43856ff..6e704c7 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
@@ -1846,7 +1846,7 @@ static int ldlm_chain_lock_for_replay(struct ldlm_lock *lock, void *closure)
 * bug 17614: locks being actively cancelled. Get a reference
 * on a lock so that it does not disappear under us (e.g. due to cancel)
 */
-   if (!(lock->l_flags & (LDLM_FL_FAILED | LDLM_FL_CANCELING))) {
+   if (!(lock->l_flags & (LDLM_FL_FAILED | LDLM_FL_BL_DONE))) {
list_add(&lock->l_pending_chain, list);
LDLM_LOCK_GET(lock);
}
@@ -1915,7 +1915,7 @@ static int replay_one_lock(struct obd_import *imp, struct ldlm_lock *lock)
int flags;
 
/* Bug 11974: Do not replay a lock which is actively being canceled */
-   if (ldlm_is_canceling(lock)) {
+   if (ldlm_is_bl_done(lock)) {
LDLM_DEBUG(lock, "Not replaying canceled lock:");
return 0;
}
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 091558e..8023561 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -1823,6 +1823,7 @@ int osc_build_rpc(const struct lu_env *env, struct client_obd *cli,
DEBUG_REQ(D_INODE, req, "%d pages, aa %p. now %ur/%dw in flight",
  page_c

[PATCH 02/14] staging/lustre/ldlm: fix export reference problem

2016-11-02 Thread Oleg Drokin

From: Hongchao Zhang 

1. In client_import_del_conn, the export returned from
   class_conn2export is not released after it is used.

2. In ptlrpc_connect_interpret, the export is not released
   if the connect_flags aren't compatible.
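
The rule behind both points (every successful lookup takes a reference that the caller must drop on every exit path) can be sketched in standalone C. This is only an illustration with a toy refcount: toy_export, toy_lookup() and toy_put() are hypothetical stand-ins, not the Lustre class_conn2export()/class_export_put() implementations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for an export. */
struct toy_export {
	int refcount;
};

/* Lookup hands back the object with an extra reference held for the caller. */
static struct toy_export *toy_lookup(struct toy_export *entry)
{
	if (entry)
		entry->refcount++;
	return entry;
}

/* Drop one reference previously taken by toy_lookup(). */
static void toy_put(struct toy_export *exp)
{
	assert(exp && exp->refcount > 0);
	exp->refcount--;
}

/* Balanced user: the reference is released once the object was used. */
static int toy_use_export(struct toy_export *entry)
{
	struct toy_export *exp = toy_lookup(entry);

	if (!exp)
		return -1;
	/* ... work with exp here ... */
	toy_put(exp);
	return 0;
}
```

If the toy_put() call is left out, the count grows by one on every call, which is the kind of leak the commit message describes.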

Signed-off-by: Hongchao Zhang 
Reviewed-on: http://review.whamcloud.com/22031
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8500
Reviewed-by: James Simmons 
Reviewed-by: Bobi Jam 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c |  3 +++
 drivers/staging/lustre/lustre/ptlrpc/import.c | 19 ++-
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
index 4f9480e..06d3cc6 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
@@ -170,6 +170,9 @@ int client_import_del_conn(struct obd_import *imp, struct obd_uuid *uuid)
ptlrpc_connection_put(dlmexp->exp_connection);
dlmexp->exp_connection = NULL;
}
+
+   if (dlmexp)
+   class_export_put(dlmexp);
}
 
list_del(&imp_conn->oic_item);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c b/drivers/staging/lustre/lustre/ptlrpc/import.c
index 46ba5a4..05fd92d 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/import.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
@@ -972,6 +972,16 @@ static int ptlrpc_connect_interpret(const struct lu_env *env,
 
spin_unlock(&imp->imp_lock);
 
+   if (!exp) {
+   /* This could happen if export is cleaned during the
+* connect attempt
+*/
+   CERROR("%s: missing export after connect\n",
+  imp->imp_obd->obd_name);
+   rc = -ENODEV;
+   goto out;
+   }
+
/* check that server granted subset of flags we asked for. */
if ((ocd->ocd_connect_flags & imp->imp_connect_flags_orig) !=
ocd->ocd_connect_flags) {
@@ -982,15 +992,6 @@ static int ptlrpc_connect_interpret(const struct lu_env *env,
goto out;
}
 
-   if (!exp) {
-   /* This could happen if export is cleaned during the
-* connect attempt
-*/
-   CERROR("%s: missing export after connect\n",
-  imp->imp_obd->obd_name);
-   rc = -ENODEV;
-   goto out;
-   }
old_connect_flags = exp_connect_flags(exp);
exp->exp_connect_data = *ocd;
imp->imp_obd->obd_self_export->exp_connect_data = *ocd;
-- 
2.7.4

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

[PATCH 04/14] staging/lustre/llite: clear inode timestamps after losing UPDATE lock

2016-11-02 Thread Oleg Drokin

From: Niu Yawei 

Otherwise, those leftovers would interfere with new timestamps,
especially when the timestamps are set back in time on other
clients.
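
A rough sketch of why stale values matter. It assumes (this is not stated in the patch) that the client only moves a cached timestamp forward when it merges a value received from the server; all names below are hypothetical.

```c
#include <assert.h>

/* Hypothetical cached inode times, in seconds. */
struct toy_times {
	long mtime;
	long atime;
	long ctime;
};

/* Assumed merge rule: keep whichever of cached and server value is newer. */
static long toy_merge(long cached, long from_server)
{
	return from_server > cached ? from_server : cached;
}

/* Idea of the patch: forget the cached values once the lock is lost. */
static void toy_lock_lost(struct toy_times *t)
{
	assert(t);
	t->mtime = 0;
	t->atime = 0;
	t->ctime = 0;
}
```

Under that assumption, a cached mtime of 1000 would hide a server value of 500 that another client set back in time, while a cleared cache lets the value 500 through.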

Signed-off-by: Jinshan Xiong 
Signed-off-by: Niu Yawei 
Reviewed-on: http://review.whamcloud.com/22623
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8446
Reviewed-by: Bobi Jam 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/namei.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index 74d9b73..c268f32 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -251,6 +251,16 @@ int ll_md_blocking_ast(struct ldlm_lock *lock, struct ldlm_lock_desc *desc,
   PFID(ll_inode2fid(inode)), rc);
}
 
+   if (bits & MDS_INODELOCK_UPDATE) {
+   struct ll_inode_info *lli = ll_i2info(inode);
+
+   spin_lock(&lli->lli_lock);
+   LTIME_S(inode->i_mtime) = 0;
+   LTIME_S(inode->i_atime) = 0;
+   LTIME_S(inode->i_ctime) = 0;
+   spin_unlock(&lli->lli_lock);
+   }
+
if ((bits & MDS_INODELOCK_UPDATE) && S_ISDIR(inode->i_mode)) {
struct ll_inode_info *lli = ll_i2info(inode);
 
-- 
2.7.4

[PATCH 01/14] staging/lustre/ldlm: Drop unused blocking_refs flock field

2016-11-02 Thread Oleg Drokin

blocking_refs is only used on the server, so drop it on the client.

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/lustre_dlm.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_dlm.h b/drivers/staging/lustre/lustre/include/lustre_dlm.h
index 1c6b7b8..f770b86 100644
--- a/drivers/staging/lustre/lustre/include/lustre_dlm.h
+++ b/drivers/staging/lustre/lustre/include/lustre_dlm.h
@@ -550,8 +550,6 @@ struct ldlm_flock {
__u64 owner;
__u64 blocking_owner;
struct obd_export *blocking_export;
-   /* Protected by the hash lock */
-   __u32 blocking_refs;
__u32 pid;
 };
 
-- 
2.7.4

[PATCH 00/14] Lustre fixes

2016-11-02 Thread Oleg Drokin

This batch of patches consists mostly of recent fixes, plus
a couple of cleanups and a couple of changes that the fixes depend on.

Amir Shehata (1):
  staging/lustre/ptlrpc: Correctly calculate hrp->hrp_nthrs

Andrew Perepechko (1):
  staging/lustre/llite: drop_caches hangs in cl_inode_fini()

Andriy Skulysh (1):
  staging/lustre: conflicting PW & PR extent locks on a client

Bobi Jam (2):
  staging/lustre/llite: protect from accessing NULL lli_clob
  staging/lustre/llite: update ras window correctly

Hongchao Zhang (2):
  staging/lustre/ldlm: fix export reference problem
  staging/lustre/ldlm: engage ELC for all ldlm enqueue req

Jinshan Xiong (2):
  staging/lustre: Get rid of cl_env hash table
  staging/lustre/llite: do not clear uptodate bit in page delete

Niu Yawei (1):
  staging/lustre/llite: clear inode timestamps after losing UPDATE lock

Oleg Drokin (3):
  staging/lustre/ldlm: Drop unused blocking_refs flock field
  staging/lustre/ldlm: Reinstate ldlm_enqueue_pack()
  staging/lustre: Get rid of LIBLUSTRE_CLIENT and its users

Patrick Farrell (1):
  staging/lustre/ptlrpc: Suppress error for flock requests

 drivers/staging/lustre/lustre/include/cl_object.h  |  22 --
 drivers/staging/lustre/lustre/include/lustre_dlm.h |   2 -
 drivers/staging/lustre/lustre/include/lustre_lib.h |   2 -
 .../staging/lustre/lustre/include/obd_support.h|   3 +
 drivers/staging/lustre/lustre/ldlm/ldlm_extent.c   |  20 ++
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c  |   3 +
 drivers/staging/lustre/lustre/ldlm/ldlm_pool.c |   9 -
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |  61 ++--
 drivers/staging/lustre/lustre/llite/file.c |  26 +-
 drivers/staging/lustre/lustre/llite/lcommon_cl.c   |  11 +-
 drivers/staging/lustre/lustre/llite/lcommon_misc.c |   8 +-
 drivers/staging/lustre/lustre/llite/llite_mmap.c   |  59 ++--
 drivers/staging/lustre/lustre/llite/namei.c|  10 +
 drivers/staging/lustre/lustre/llite/rw.c   |  21 +-
 drivers/staging/lustre/lustre/llite/rw26.c |  52 +--
 drivers/staging/lustre/lustre/llite/vvp_page.c |   1 -
 drivers/staging/lustre/lustre/lov/lov_io.c |   5 -
 drivers/staging/lustre/lustre/lov/lov_object.c |   7 +-
 .../staging/lustre/lustre/obdclass/cl_internal.h   |  23 --
 drivers/staging/lustre/lustre/obdclass/cl_io.c |   1 -
 drivers/staging/lustre/lustre/obdclass/cl_object.c | 389 -
 drivers/staging/lustre/lustre/obdclass/lu_object.c |   4 -
 .../staging/lustre/lustre/obdecho/echo_client.c|  14 +-
 drivers/staging/lustre/lustre/osc/osc_cache.c  |   6 +-
 drivers/staging/lustre/lustre/osc/osc_lock.c   | 116 +++---
 drivers/staging/lustre/lustre/osc/osc_page.c   |   6 +-
 drivers/staging/lustre/lustre/osc/osc_request.c|   1 +
 drivers/staging/lustre/lustre/ptlrpc/client.c  |   4 +-
 drivers/staging/lustre/lustre/ptlrpc/import.c  |  19 +-
 drivers/staging/lustre/lustre/ptlrpc/service.c |   3 +-
 30 files changed, 298 insertions(+), 610 deletions(-)

-- 
2.7.4

Re: [PATCH 57/80] staging: lustre: osc: revise unstable pages accounting

2016-10-16 Thread Oleg Drokin


On Oct 16, 2016, at 11:14 AM, Greg Kroah-Hartman wrote:

> Digging up an old email...
> 
> On Tue, Aug 16, 2016 at 04:19:10PM -0400, James Simmons wrote:
>> From: Jinshan Xiong 
>> 
>> A few changes are made in this patch for unstable pages tracking:
>> 
>> 1. Remove kernel NFS unstable pages tracking because it killed
>>   performance
>> 2. Track unstable pages as part of LRU cache. Otherwise Lustre
>>   can use much more memory than max_cached_mb
>> 3. Remove obd_unstable_pages tracking to avoid using global
>>   atomic counter
>> 4. Make unstable pages track optional. Tracking unstable pages is
>>   turned off by default, and can be controlled by
>>   llite.*.unstable_stats.
>> 
>> Signed-off-by: Jinshan Xiong 
>> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4841
>> Reviewed-on: http://review.whamcloud.com/10003
>> Reviewed-by: Andreas Dilger 
>> Reviewed-by: Lai Siyao 
>> Reviewed-by: Oleg Drokin 
>> Signed-off-by: James Simmons 
>> ---
>> drivers/staging/lustre/lustre/include/cl_object.h  |   35 +++-
>> .../staging/lustre/lustre/include/obd_support.h|1 -
>> drivers/staging/lustre/lustre/llite/lproc_llite.c  |   41 -
>> drivers/staging/lustre/lustre/obdclass/class_obd.c |2 -
>> drivers/staging/lustre/lustre/osc/osc_cache.c  |   96 +-
>> drivers/staging/lustre/lustre/osc/osc_internal.h   |2 +-
>> drivers/staging/lustre/lustre/osc/osc_page.c   |  208 +---
>> drivers/staging/lustre/lustre/osc/osc_request.c|   13 +-
>> 8 files changed, 253 insertions(+), 145 deletions(-)
>> 
>> diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
>> index d269b32..ec6cf7c 100644
>> --- a/drivers/staging/lustre/lustre/include/cl_object.h
>> +++ b/drivers/staging/lustre/lustre/include/cl_object.h
>> @@ -1039,23 +1039,32 @@ do { \
>>  }\
>> } while (0)
>> 
>> -static inline int __page_in_use(const struct cl_page *page, int refc)
>> -{
>> -if (page->cp_type == CPT_CACHEABLE)
>> -++refc;
>> -LASSERT(atomic_read(&page->cp_ref) > 0);
>> -return (atomic_read(&page->cp_ref) > refc);
>> -}
>> -
>> -#define cl_page_in_use(pg)   __page_in_use(pg, 1)
>> -#define cl_page_in_use_noref(pg) __page_in_use(pg, 0)
>> -
>> static inline struct page *cl_page_vmpage(struct cl_page *page)
>> {
>>  LASSERT(page->cp_vmpage);
>>  return page->cp_vmpage;
>> }
>> 
>> +/**
>> + * Check if a cl_page is in use.
>> + *
>> + * Client cache holds a refcount, this refcount will be dropped when
>> + * the page is taken out of cache, see vvp_page_delete().
>> + */
>> +static inline bool __page_in_use(const struct cl_page *page, int refc)
>> +{
>> +return (atomic_read(&page->cp_ref) > refc + 1);
>> +}
>> +
>> +/**
>> + * Caller itself holds a refcount of cl_page.
>> + */
>> +#define cl_page_in_use(pg)   __page_in_use(pg, 1)
>> +/**
>> + * Caller doesn't hold a refcount.
>> + */
>> +#define cl_page_in_use_noref(pg) __page_in_use(pg, 0)
>> +
>> /** @} cl_page */
>> 
>> /** \addtogroup cl_lock cl_lock
>> @@ -2331,6 +2340,10 @@ struct cl_client_cache {
>>   */
>>  spinlock_t  ccc_lru_lock;
>>  /**
>> + * Set if unstable check is enabled
>> + */
>> +unsigned int ccc_unstable_check:1;
>> +/**
>>   * # of unstable pages for this mount point
>>   */
>>  atomic_t ccc_unstable_nr;
>> diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
>> index 26fdff6..a11fff1 100644
>> --- a/drivers/staging/lustre/lustre/include/obd_support.h
>> +++ b/drivers/staging/lustre/lustre/include/obd_support.h
>> @@ -54,7 +54,6 @@ extern int at_early_margin;
>> extern int at_extra;
>> extern unsigned int obd_sync_filter;
>> extern unsigned int obd_max_dirty_pages;
>> -extern atomic_t obd_unstable_pages;
>> extern atomic_t obd_dirty_pages;
>> extern atomic_t obd_dirty_transit_pages;
>> extern char obd_jobid_var[];
>> diff --git a/drivers/staging/lustre/lustre/llite/lproc_llite.c b/drivers/staging/lustre/lustre/llite/lproc_llite.c
>>

[PATCH] staging/lustre/llite: Move unstable_stats from sysfs to debugfs

2016-10-16 Thread Oleg Drokin

It has multiple values per file, so it has no business being in sysfs;
besides, it was assuming seqfile anyway.

Introduced by
commit d806f30e639b ("staging: lustre: osc: revise unstable pages accounting")

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/lproc_llite.c | 34 +++
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/lproc_llite.c b/drivers/staging/lustre/lustre/llite/lproc_llite.c
index 6eae605..23fda9d 100644
--- a/drivers/staging/lustre/lustre/llite/lproc_llite.c
+++ b/drivers/staging/lustre/lustre/llite/lproc_llite.c
@@ -871,12 +871,10 @@ static ssize_t xattr_cache_store(struct kobject *kobj,
 }
 LUSTRE_RW_ATTR(xattr_cache);
 
-static ssize_t unstable_stats_show(struct kobject *kobj,
-  struct attribute *attr,
-  char *buf)
+static int ll_unstable_stats_seq_show(struct seq_file *m, void *v)
 {
-   struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info,
- ll_kobj);
+   struct super_block *sb= m->private;
+   struct ll_sb_info  *sbi   = ll_s2sbi(sb);
struct cl_client_cache *cache = sbi->ll_cache;
long pages;
int mb;
@@ -884,19 +882,21 @@ static ssize_t unstable_stats_show(struct kobject *kobj,
pages = atomic_long_read(&cache->ccc_unstable_nr);
mb = (pages * PAGE_SIZE) >> 20;
 
-   return sprintf(buf, "unstable_check: %8d\n"
-   "unstable_pages: %12ld\n"
-   "unstable_mb:%8d\n",
-   cache->ccc_unstable_check, pages, mb);
+   seq_printf(m,
+  "unstable_check: %8d\n"
+  "unstable_pages: %12ld\n"
+  "unstable_mb:%8d\n",
+  cache->ccc_unstable_check, pages, mb);
+
+   return 0;
 }
 
-static ssize_t unstable_stats_store(struct kobject *kobj,
-   struct attribute *attr,
-   const char *buffer,
-   size_t count)
+static ssize_t ll_unstable_stats_seq_write(struct file *file,
+  const char __user *buffer,
+  size_t count, loff_t *off)
 {
-   struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info,
- ll_kobj);
+   struct super_block *sb = ((struct seq_file *)file->private_data)->private;
+   struct ll_sb_info *sbi = ll_s2sbi(sb);
char kernbuf[128];
int val, rc;
 
@@ -922,7 +922,7 @@ static ssize_t unstable_stats_store(struct kobject *kobj,
 
return count;
 }
-LUSTRE_RW_ATTR(unstable_stats);
+LPROC_SEQ_FOPS(ll_unstable_stats);
 
 static ssize_t root_squash_show(struct kobject *kobj, struct attribute *attr,
char *buf)
@@ -995,6 +995,7 @@ static struct lprocfs_vars lprocfs_llite_obd_vars[] = {
/* { "filegroups",   lprocfs_rd_filegroups,  0, 0 }, */
{ "max_cached_mb",&ll_max_cached_mb_fops, NULL },
{ "statahead_stats",  &ll_statahead_stats_fops, NULL, 0 },
+   { "unstable_stats",   &ll_unstable_stats_fops, NULL },
{ "sbi_flags",&ll_sbi_flags_fops, NULL, 0 },
{ .name =   "nosquash_nids",
  .fops =   &ll_nosquash_nids_fops  },
@@ -1026,7 +1027,6 @@ static struct attribute *llite_attrs[] = {
&lustre_attr_max_easize.attr,
&lustre_attr_default_easize.attr,
&lustre_attr_xattr_cache.attr,
-   &lustre_attr_unstable_stats.attr,
&lustre_attr_root_squash.attr,
NULL,
 };
-- 
2.7.4

Re: [PATCH] staging/lustre: avoid zero buf for the first time

2016-08-26 Thread Oleg Drokin

Hello!

On Aug 22, 2016, at 6:04 AM, Greg Kroah-Hartman wrote:

> On Mon, Aug 22, 2016 at 04:46:04PM +0800, Shawn Lin wrote:
>> We only need to zero it when repeating in order to
>> avoid old garbage. Let's improve it by moving this
>> before we repeat the calculation to save some cpu
>> cycle.
>> 
>> Signed-off-by: Shawn Lin 
> 
> Have you noticed a change with this in a benchmark?
> 
> If not, is it really worth it?

The other problem is that we would need to remember to memset it
should more paths start jumping to the repeat label,
which would be easy to miss,
so we are probably better off without this patch.
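
To illustrate the concern with a hypothetical standalone function (the code the patch touches is not shown in this thread, so toy_fill() and its retry logic are made up): zeroing directly at the label covers the first pass and every later jump, so a new goto cannot forget it.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical retry loop. The buffer is zeroed right at the label,
 * so any path that jumps back starts from a clean buffer.
 */
static int toy_fill(char *buf, size_t len, int attempts_needed)
{
	int attempt = 0;

	assert(buf && len > 1);
repeat:
	memset(buf, 0, len);		/* runs on the first pass and on each retry */
	buf[0] = (char)('a' + attempt);
	if (++attempt < attempts_needed) {
		buf[1] = 'x';		/* leftover from the failed attempt */
		goto repeat;
	}
	return attempt;
}
```

Moving the memset() next to the goto would skip one clear on the first pass, but each additional jump to the label would then need its own memset().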

> I need an ack from the lustre developers before taking patches like
> this...
> 
> thanks,
> 
> greg k-h

[PATCH] staging/lustre: Fix max_dirty_mb output in sysfs

2016-08-25 Thread Oleg Drokin

%ul was definitely supposed to be %lu in the format string,
so that we print an unsigned long value, not just an unsigned int
with a letter l added at the end.
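
The difference is easy to reproduce in userspace. The sketch below uses snprintf() and hypothetical helper names (toy_pages_to_mb(), toy_show_buggy(), toy_show_fixed()); only the two format strings and the pages-to-MB division are taken from the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: convert a page count to MB for a given page shift. */
static unsigned long toy_pages_to_mb(unsigned long pages,
				     unsigned int page_shift)
{
	assert(page_shift <= 20);
	return pages / (1UL << (20 - page_shift));
}

/* Old format string: "%u" consumes the argument, "l" is printed literally. */
static int toy_show_buggy(char *buf, size_t len, unsigned int pages,
			  unsigned int page_shift)
{
	assert(buf);
	return snprintf(buf, len, "%ul\n",
			(unsigned int)toy_pages_to_mb(pages, page_shift));
}

/* Fixed format string: "%lu" prints an unsigned long and nothing else. */
static int toy_show_fixed(char *buf, size_t len, unsigned int pages,
			  unsigned int page_shift)
{
	assert(buf);
	return snprintf(buf, len, "%lu\n", toy_pages_to_mb(pages, page_shift));
}
```

With 4K pages (shift 12), a limit of 8192 pages comes out as "32l" with the old format string and as "32" with the fixed one.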

Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/obdclass/linux/linux-sysctl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/linux/linux-sysctl.c b/drivers/staging/lustre/lustre/obdclass/linux/linux-sysctl.c
index 8f70dd2..bcf005d 100644
--- a/drivers/staging/lustre/lustre/obdclass/linux/linux-sysctl.c
+++ b/drivers/staging/lustre/lustre/obdclass/linux/linux-sysctl.c
@@ -95,8 +95,9 @@ LUSTRE_STATIC_UINT_ATTR(timeout, &obd_timeout);
 static ssize_t max_dirty_mb_show(struct kobject *kobj, struct attribute *attr,
 char *buf)
 {
-   return sprintf(buf, "%ul\n",
-   obd_max_dirty_pages / (1 << (20 - PAGE_SHIFT)));
+   return sprintf(buf, "%lu\n",
+  (unsigned long)obd_max_dirty_pages /
+  (1 << (20 - PAGE_SHIFT)));
 }
 
 static ssize_t max_dirty_mb_store(struct kobject *kobj, struct attribute *attr,
-- 
2.7.4

[PATCH v2 8/8] staging/lustre/o2iblnd: handle mixed page size configurations.

2016-08-24 Thread Oleg Drokin

From: James Simmons 

Currently it is not possible to send LNet traffic between
two nodes using infiniband hardware that have different
page sizes for the case when RDMA fragments are used.
When two nodes establish a connection they tell the other
node the maximum number of RDMA fragments they support.
The issue is that the units are pages, and 256 64K pages
corresponds to 16MB of data, whereas a 4K page system is
limited to messages with 1MB of data. The solution is to
report over the wire the maximum number of fragments in
4K units regardless of the native page size. The recipient
then uses its native page size to translate into the
maximum number of page-sized fragments it can send to
the other node.
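
The unit conversion described above can be sketched as follows. The helper names and TOY_WIRE_SHIFT are hypothetical; only the 4K wire unit and the 256-fragment, 64K and 4K figures come from the commit message.

```c
#include <assert.h>

#define TOY_WIRE_SHIFT 12	/* fragments are counted on the wire in 4K units */

/* Bytes covered by nfrags fragments of one native page each. */
static unsigned long toy_frag_bytes(unsigned int nfrags,
				    unsigned int page_shift)
{
	return (unsigned long)nfrags << page_shift;
}

/* Sender side: native page-sized fragments -> 4K wire units. */
static unsigned int toy_frags_to_wire(unsigned int native_frags,
				      unsigned int page_shift)
{
	assert(page_shift >= TOY_WIRE_SHIFT);
	return native_frags << (page_shift - TOY_WIRE_SHIFT);
}

/* Receiver side: 4K wire units -> fragments of its own page size. */
static unsigned int toy_frags_from_wire(unsigned int wire_frags,
					unsigned int page_shift)
{
	assert(page_shift >= TOY_WIRE_SHIFT);
	return wire_frags >> (page_shift - TOY_WIRE_SHIFT);
}
```

For example, 256 wire units describe 1MB; a 4K-page node reads that as 256 fragments, while a 64K-page node (shift 16) reads it as 16 fragments of its own page size.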

Signed-off-by: James Simmons 
Reviewed-on: http://review.whamcloud.com/21304
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7650
Reviewed-by: Doug Oucharek 
Reviewed-by: Olaf Weber 
Signed-off-by: Oleg Drokin 
---
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c| 14 +++---
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h| 13 ++---
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 55 ++
 3 files changed, 41 insertions(+), 41 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
index e93dbeb..c7a5d49 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
@@ -128,6 +128,7 @@ static int kiblnd_msgtype2size(int type)
 static int kiblnd_unpack_rd(struct kib_msg *msg, int flip)
 {
struct kib_rdma_desc *rd;
+   int msg_size;
int nob;
int n;
int i;
@@ -146,12 +147,6 @@ static int kiblnd_unpack_rd(struct kib_msg *msg, int flip)
 
n = rd->rd_nfrags;
 
-   if (n <= 0 || n > IBLND_MAX_RDMA_FRAGS) {
-   CERROR("Bad nfrags: %d, should be 0 < n <= %d\n",
-  n, IBLND_MAX_RDMA_FRAGS);
-   return 1;
-   }
-
nob = offsetof(struct kib_msg, ibm_u) +
  kiblnd_rd_msg_size(rd, msg->ibm_type, n);
 
@@ -161,6 +156,13 @@ static int kiblnd_unpack_rd(struct kib_msg *msg, int flip)
return 1;
}
 
+   msg_size = kiblnd_rd_size(rd);
+   if (msg_size <= 0 || msg_size > LNET_MAX_PAYLOAD) {
+   CERROR("Bad msg_size: %d, should be 0 < n <= %d\n",
+  msg_size, LNET_MAX_PAYLOAD);
+   return 1;
+   }
+
if (!flip)
return 0;
 
diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h
index 3cf8942..1457697 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h
@@ -113,8 +113,9 @@ extern struct kib_tunables  kiblnd_tunables;
 #define IBLND_OOB_CAPABLE(v)   ((v) != IBLND_MSG_VERSION_1)
 #define IBLND_OOB_MSGS(v) (IBLND_OOB_CAPABLE(v) ? 2 : 0)
 
-#define IBLND_MSG_SIZE (4 << 10) /* max size of queued messages (inc hdr) */
-#define IBLND_MAX_RDMA_FRAGS LNET_MAX_IOV /* max # of fragments supported */
+#define IBLND_FRAG_SHIFT   (PAGE_SHIFT - 12)   /* frag size on wire is in 4K units */
+#define IBLND_MSG_SIZE (4 << 10)   /* max size of queued messages (inc hdr) */
+#define IBLND_MAX_RDMA_FRAGS   (LNET_MAX_PAYLOAD >> 12) /* max # of fragments supported in 4K size */
 
 //
 /* derived constants... */
@@ -133,8 +134,8 @@ extern struct kib_tunables  kiblnd_tunables;
 /* WRs and CQEs (per connection) */
 #define IBLND_RECV_WRS(c)  IBLND_RX_MSGS(c)
 #define IBLND_SEND_WRS(c)  \
-   ((c->ibc_max_frags + 1) * kiblnd_concurrent_sends(c->ibc_version, \
- c->ibc_peer->ibp_ni))
+   (((c->ibc_max_frags + 1) << IBLND_FRAG_SHIFT) * \
+ kiblnd_concurrent_sends(c->ibc_version, c->ibc_peer->ibp_ni))
 #define IBLND_CQ_ENTRIES(c)(IBLND_RECV_WRS(c) + IBLND_SEND_WRS(c))
 
 struct kib_hca_dev;
@@ -609,14 +610,14 @@ kiblnd_cfg_rdma_frags(struct lnet_ni *ni)
 
tunables = &ni->ni_lnd_tunables->lt_tun_u.lt_o2ib;
mod = tunables->lnd_map_on_demand;
-   return mod ? mod : IBLND_MAX_RDMA_FRAGS;
+   return mod ? mod : IBLND_MAX_RDMA_FRAGS >> IBLND_FRAG_SHIFT;
 }
 
 static inline int
 kiblnd_rdma_frags(int version, struct lnet_ni *ni)
 {
return version == IBLND_MSG_VERSION_1 ?
- IBLND_MAX_RDMA_FRAGS :
+ (IBLND_MAX_RDMA_FRAGS >> IBLND_FRAG_SHIFT) :
  kiblnd_cfg_rdma_frags(ni);
 }
 
diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 1563428..3a86879 100644
--- a/drivers/s

[PATCH v2 6/8] staging/lustre/llite: changes to avoid cache corruption

2016-08-24 Thread Oleg Drokin

From: Lokesh Nagappa Jaliminche 

ll_find_alias is responsible for finding an alias for the inode
that can be reused. Directories are assumed to have a unique
alias, whereas non-directories can have multiple
aliases. In Lustre there can be two types of aliases,
i.e. discon_alias and invalid_alias. Using discon_alias for
non-directories may corrupt the dcache and lead to a kernel
crash. Change the code to avoid using discon_alias for
non-directories.
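
Reduced to a predicate, the new rule looks roughly like this. The struct and function names are hypothetical simplifications; the real check, shown in the diff below, tests DCACHE_DISCONNECTED together with S_ISDIR(inode->i_mode).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a dentry alias. */
struct toy_alias {
	bool disconnected;
};

/*
 * A disconnected alias may be remembered for reuse only when the
 * inode is a directory; for other inodes it is skipped.
 */
static bool toy_may_use_discon(const struct toy_alias *alias,
			       bool inode_is_dir)
{
	assert(alias);
	return alias->disconnected && inode_is_dir;
}
```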

Seagate-bug-id: MRP-2739, MRP-3601
Signed-off-by: Lokesh Nagappa Jaliminche 
Reviewed-by: Ujjwal Lanjewar 
Reviewed-by: Ashish Purkar 
Reviewed-by: Andrew Perepechko 
Tested-by: Parinay Vijayprakash Kondekar 
Reviewed-on: http://review.whamcloud.com/17732
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7613
Reviewed-by: Niu Yawei 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/namei.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index 788a3f0..b7d448f 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -363,7 +363,8 @@ static struct dentry *ll_find_alias(struct inode *inode, struct dentry *dentry)
LASSERT(alias != dentry);
 
spin_lock(&alias->d_lock);
-   if (alias->d_flags & DCACHE_DISCONNECTED)
+   if ((alias->d_flags & DCACHE_DISCONNECTED) &&
+   S_ISDIR(inode->i_mode))
/* LASSERT(last_discon == NULL); LU-405, bz 20055 */
discon_alias = alias;
else if (alias->d_parent == dentry->d_parent &&
-- 
2.7.4

[PATCH v2 1/8] staging/lustre: const correct set_lock_data()

2016-08-24 Thread Oleg Drokin

From: "John L. Hammond" 

Change the __u64 *cookie parameter of md_ops->set_lock_data() to
const struct lustre_handle *lockh.
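
A small sketch of the interface change, using a toy handle type. struct toy_handle and both functions are hypothetical; the real prototypes are in the diff below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct lustre_handle. */
struct toy_handle {
	uint64_t cookie;
};

/* Old style: a bare, writable pointer to the cookie. */
static uint64_t toy_set_lock_data_old(uint64_t *cookie)
{
	assert(cookie);
	return *cookie;
}

/* New style: the whole handle, read-only for the callee. */
static uint64_t toy_set_lock_data(const struct toy_handle *lockh)
{
	assert(lockh);
	return lockh->cookie;
}
```

Callers then pass &handle instead of &handle.cookie, and the compiler rejects any attempt by the callee to modify the handle.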

Signed-off-by: John L. Hammond 
Reviewed-on: http://review.whamcloud.com/17072
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7403
Reviewed-by: Frank Zago 
Reviewed-by: James Simmons 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/obd.h  | 3 ++-
 drivers/staging/lustre/lustre/include/obd_class.h| 3 ++-
 drivers/staging/lustre/lustre/llite/file.c   | 2 +-
 drivers/staging/lustre/lustre/llite/llite_internal.h | 5 ++---
 drivers/staging/lustre/lustre/lmv/lmv_intent.c   | 2 +-
 drivers/staging/lustre/lustre/lmv/lmv_obd.c  | 5 +++--
 drivers/staging/lustre/lustre/mdc/mdc_internal.h | 3 ++-
 drivers/staging/lustre/lustre/mdc/mdc_locks.c| 8 
 drivers/staging/lustre/lustre/mdc/mdc_request.c  | 5 ++---
 9 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index ac620fd..ed0fd41 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -1023,7 +1023,8 @@ struct md_ops {
struct lookup_intent *);
int (*clear_open_replay_data)(struct obd_export *,
  struct obd_client_handle *);
-   int (*set_lock_data)(struct obd_export *, __u64 *, void *, __u64 *);
+   int (*set_lock_data)(struct obd_export *, const struct lustre_handle *,
+void *, __u64 *);
 
enum ldlm_mode (*lock_match)(struct obd_export *, __u64,
 const struct lu_fid *, enum ldlm_type,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 79fc041..4f48968 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -1610,7 +1610,8 @@ static inline int md_clear_open_replay_data(struct obd_export *exp,
 }
 
 static inline int md_set_lock_data(struct obd_export *exp,
-  __u64 *lockh, void *data, __u64 *bits)
+  const struct lustre_handle *lockh,
+  void *data, __u64 *bits)
 {
EXP_CHECK_MD_OP(exp, set_lock_data);
EXP_MD_COUNTER_INCREMENT(exp, set_lock_data);
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 55ccd84..13ff212 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -3629,7 +3629,7 @@ static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
   PFID(&lli->lli_fid), inode, reconf);
 
/* in case this is a caching lock and reinstate with new inode */
-   md_set_lock_data(sbi->ll_md_exp, &lockh->cookie, inode, NULL);
+   md_set_lock_data(sbi->ll_md_exp, lockh, inode, NULL);
 
lock_res_and_lock(lock);
lvb_ready = ldlm_is_lvb_ready(lock);
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index a5a3023..cbd5bc5 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -1243,7 +1243,7 @@ static inline void ll_set_lock_data(struct obd_export *exp, struct inode *inode,
CDEBUG(D_DLMTRACE, "setting l_data to inode "DFID"%p for remote lock %#llx\n",
   PFID(ll_inode2fid(inode)), inode,
   handle.cookie);
-   md_set_lock_data(exp, &handle.cookie, inode, NULL);
+   md_set_lock_data(exp, &handle, inode, NULL);
}
 
handle.cookie = it->it_lock_handle;
@@ -1251,8 +1251,7 @@ static inline void ll_set_lock_data(struct obd_export *exp, struct inode *inode,
CDEBUG(D_DLMTRACE, "setting l_data to inode "DFID"%p for lock %#llx\n",
   PFID(ll_inode2fid(inode)), inode, handle.cookie);
 
-   md_set_lock_data(exp, &handle.cookie, inode,
-&it->it_lock_bits);
+   md_set_lock_data(exp, &handle, inode, &it->it_lock_bits);
it->it_lock_set = 1;
}
 
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_intent.c b/drivers/staging/lustre/lustre/lmv/lmv_intent.c
index 62f6bd0..85cc5cb 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_intent.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_intent.c
@@ -250,7 +250,7 @@ int lmv_revalidate_slaves(struct obd_export *exp, struct mdt_body *mbody,
ptlrpc_req_

[PATCH v2 5/8] staging/lustre/llite: Fix suspicious dereference of pointer 'vma->vm_file'

2016-08-24 Thread Oleg Drokin

From: Dmitry Eremin 

Remove the useless LASSERT(vma->vm_file): if it is NULL, the code
will already have crashed earlier in file_inode(vma->vm_file).

Signed-off-by: Dmitry Eremin 
Reviewed-on: http://review.whamcloud.com/21171
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8372
Reviewed-by: John L. Hammond 
Reviewed-by: Bob Glossman 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/llite_mmap.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
index 9d03e79..37f82ed 100644
--- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
+++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
@@ -429,7 +429,6 @@ static void ll_vm_open(struct vm_area_struct *vma)
struct inode *inode= file_inode(vma->vm_file);
struct vvp_object *vob = cl_inode2vvp(inode);
 
-   LASSERT(vma->vm_file);
LASSERT(atomic_read(&vob->vob_mmap_cnt) >= 0);
atomic_inc(&vob->vob_mmap_cnt);
 }
@@ -442,7 +441,6 @@ static void ll_vm_close(struct vm_area_struct *vma)
struct inode  *inode = file_inode(vma->vm_file);
struct vvp_object *vob   = cl_inode2vvp(inode);
 
-   LASSERT(vma->vm_file);
atomic_dec(&vob->vob_mmap_cnt);
LASSERT(atomic_read(&vob->vob_mmap_cnt) >= 0);
 }
-- 
2.7.4

[PATCH v2 2/8] staging/lustre/mdc: fix panic at mdc_free_open()

2016-08-24 Thread Oleg Drokin

From: Alexander Boyko 

An assertion was hit for an open request when rq_replay is set
to 1:
ASSERTION(mod->mod_open_req->rq_replay == 0)
But this situation is not fatal for the client, and can happen
when mdc_close() fails.
The fix allows such requests to be freed. If mdc_close fails, the MDS doesn't
receive a close request from the client, and in the worst case the client would
be evicted.

The test recreates the issue where mdc_close fails and the
client asserts:
   ASSERTION( mod->mod_open_req->rq_replay == 0 ) failed

Signed-off-by: Alexander Boyko 
Seagate-bug-id: MRP-3156
Reviewed-on: http://review.whamcloud.com/17495
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5282
Reviewed-by: Alex Zhuravlev 
Reviewed-by: Andreas Dilger 
Signed-off-by: Oleg Drokin 
---
 .../staging/lustre/lustre/include/obd_support.h|  1 +
 drivers/staging/lustre/lustre/mdc/mdc_request.c| 56 ++
 2 files changed, 38 insertions(+), 19 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 0c29a33..4a9fe88 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -402,6 +402,7 @@ extern char obd_jobid_var[];
 #define OBD_FAIL_MDC_GETATTR_ENQUEUE 0x803
 #define OBD_FAIL_MDC_RPCS_SEM   0x804
 #define OBD_FAIL_MDC_LIGHTWEIGHT0x805
+#define OBD_FAIL_MDC_CLOSE  0x806
 
 #define OBD_FAIL_MGS0x900
 #define OBD_FAIL_MGS_ALL_REQUEST_NET 0x901
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c 
b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 91c0b45..313889a 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -677,9 +677,15 @@ static void mdc_free_open(struct md_open_data *mod)
imp_connect_disp_stripe(mod->mod_open_req->rq_import))
committed = 1;
 
-   LASSERT(mod->mod_open_req->rq_replay == 0);
-
-   DEBUG_REQ(D_RPCTRACE, mod->mod_open_req, "free open request\n");
+   /*
+* No reason to asssert here if the open request has
+* rq_replay == 1. It means that mdc_close failed, and
+* close request wasn`t sent. It is not fatal to client.
+* The worst thing is eviction if the client gets open lock
+*/
+   DEBUG_REQ(D_RPCTRACE, mod->mod_open_req,
+ "free open request rq_replay = %d\n",
+  mod->mod_open_req->rq_replay);
 
ptlrpc_request_committed(mod->mod_open_req, committed);
if (mod->mod_close_req)
@@ -749,22 +755,10 @@ static int mdc_close(struct obd_export *exp, struct 
md_op_data *op_data,
}
 
*request = NULL;
-   req = ptlrpc_request_alloc(class_exp2cliimp(exp), req_fmt);
-   if (!req)
-   return -ENOMEM;
-
-   rc = ptlrpc_request_pack(req, LUSTRE_MDS_VERSION, MDS_CLOSE);
-   if (rc) {
-   ptlrpc_request_free(req);
-   return rc;
-   }
-
-   /* To avoid a livelock (bug 7034), we need to send CLOSE RPCs to a
-* portal whose threads are not taking any DLM locks and are therefore
-* always progressing
-*/
-   req->rq_request_portal = MDS_READPAGE_PORTAL;
-   ptlrpc_at_set_req_timeout(req);
+   if (OBD_FAIL_CHECK(OBD_FAIL_MDC_CLOSE))
+   req = NULL;
+   else
+   req = ptlrpc_request_alloc(class_exp2cliimp(exp), req_fmt);
 
/* Ensure that this close's handle is fixed up during replay. */
if (likely(mod)) {
@@ -785,6 +779,29 @@ static int mdc_close(struct obd_export *exp, struct 
md_op_data *op_data,
 CDEBUG(D_HA,
"couldn't find open req; expecting close error\n");
}
+   if (!req) {
+   /*
+* TODO: repeat close after errors
+*/
+   CWARN("%s: close of FID "DFID" failed, file reference will be 
dropped when this client unmounts or is evicted\n",
+ obd->obd_name, PFID(&op_data->op_fid1));
+   rc = -ENOMEM;
+   goto out;
+   }
+
+   rc = ptlrpc_request_pack(req, LUSTRE_MDS_VERSION, MDS_CLOSE);
+   if (rc) {
+   ptlrpc_request_free(req);
+   goto out;
+   }
+
+   /*
+* To avoid a livelock (bug 7034), we need to send CLOSE RPCs to a
+* portal whose threads are not taking any DLM locks and are therefore
+* always progressing
+*/
+   req->rq_request_portal = MDS_READPAGE_PORTAL;
+   ptlrpc_at_set_req_timeout(req);
 
mdc_close_pack(req, op_data);
 
@@ -830,6 +847,7 @@ static int mdc_close(struct obd_export *exp, struct 
md_op_data *op_data,
}
}
 
+out:
if (mod) {
if (rc !=

[PATCH v2 3/8] staging/lustre: avoid clearing i_nlink for inodes in use

2016-08-24 Thread Oleg Drokin

From: Andrew Perepechko 

The patch removes find_cbdata callbacks and clear_nlink
from dentry_iput path, since this piece of code makes
a few races possible.

The test case reproduces one of the possible races
described in LU-7925:

1) two hard links are created for the same file
2) the test calls stat(2) for link #1
3) in the middle of 2) the test opens and closes link #2
4) in the middle of 2) the test drops the ldlm locks and
   forces dentry reclaim via vm.drop_caches=2
5) in the middle of 2) ll_d_iput() clears i_nlink for
   the inode
6) the initial stat(2) continues and copies the wrong
   i_nlink value into st_nlink
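
For illustration only (not part of the patch): the invariant that step 6 violates is that stat(2) on one name of a file with two hard links reports st_nlink == 2. The user-space sketch below checks that invariant on a temporary file. It does not reproduce the race itself (no lock dropping or dentry reclaim), and the helper name nlink_after_hardlink and the /tmp path are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a temporary file, add a second hard link, and return the
 * st_nlink value seen through the first name. Returns -1 on error.
 */
static long nlink_after_hardlink(void)
{
	char path1[] = "/tmp/nlink_demo_XXXXXX";
	char path2[sizeof(path1) + 8];
	struct stat st;
	long nlink = -1;
	int fd, n;

	fd = mkstemp(path1);
	if (fd < 0)
		return -1;
	close(fd);

	/* Second name for the same inode: "<path1>.link". */
	n = snprintf(path2, sizeof(path2), "%s.link", path1);
	assert(n > 0 && (size_t)n < sizeof(path2));

	if (link(path1, path2) == 0) {
		if (stat(path1, &st) == 0)
			nlink = (long)st.st_nlink;
		unlink(path2);
	}
	unlink(path1);
	return nlink;
}
```

A test for the race would run this kind of check while another thread opens and closes the second link and forces dentry reclaim, and would fail if the reported count ever dropped to 0.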

Signed-off-by: Andrew Perepechko 
Seagate-bug-id: MRP-3271
Reviewed-on: http://review.whamcloud.com/19164
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7925
Reviewed-by: Wally Wang 
Reviewed-by: Lai Siyao 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/obd.h|  4 --
 drivers/staging/lustre/lustre/include/obd_class.h  | 25 --
 .../staging/lustre/lustre/include/obd_support.h|  1 +
 drivers/staging/lustre/lustre/llite/dcache.c   | 55 --
 drivers/staging/lustre/lustre/llite/file.c |  2 +
 drivers/staging/lustre/lustre/lmv/lmv_obd.c| 42 -
 drivers/staging/lustre/lustre/lov/lov_obd.c| 41 
 drivers/staging/lustre/lustre/mdc/mdc_internal.h   |  3 --
 drivers/staging/lustre/lustre/mdc/mdc_locks.c  | 22 -
 drivers/staging/lustre/lustre/mdc/mdc_request.c|  1 -
 drivers/staging/lustre/lustre/osc/osc_request.c| 22 -
 11 files changed, 3 insertions(+), 215 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h 
b/drivers/staging/lustre/lustre/include/obd.h
index ed0fd41..f3d141b 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -896,8 +896,6 @@ struct obd_ops {
struct niobuf_remote *remote, int pages,
struct niobuf_local *local,
struct obd_trans_info *oti, int rc);
-   int (*find_cbdata)(struct obd_export *, struct lov_stripe_md *,
-  ldlm_iterator_t it, void *data);
int (*init_export)(struct obd_export *exp);
int (*destroy_export)(struct obd_export *exp);
 
@@ -958,8 +956,6 @@ struct cl_attr;
 struct md_ops {
int (*getstatus)(struct obd_export *, struct lu_fid *);
int (*null_inode)(struct obd_export *, const struct lu_fid *);
-   int (*find_cbdata)(struct obd_export *, const struct lu_fid *,
-  ldlm_iterator_t, void *);
int (*close)(struct obd_export *, struct md_op_data *,
 struct md_open_data *, struct ptlrpc_request **);
int (*create)(struct obd_export *, struct md_op_data *,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h 
b/drivers/staging/lustre/lustre/include/obd_class.h
index 4f48968..9702ad4 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -1177,19 +1177,6 @@ static inline int obd_iocontrol(unsigned int cmd, struct 
obd_export *exp,
return rc;
 }
 
-static inline int obd_find_cbdata(struct obd_export *exp,
- struct lov_stripe_md *lsm,
- ldlm_iterator_t it, void *data)
-{
-   int rc;
-
-   EXP_CHECK_DT_OP(exp, find_cbdata);
-   EXP_COUNTER_INCREMENT(exp, find_cbdata);
-
-   rc = OBP(exp->exp_obd, find_cbdata)(exp, lsm, it, data);
-   return rc;
-}
-
 static inline void obd_import_event(struct obd_device *obd,
struct obd_import *imp,
enum obd_import_event event)
@@ -1358,18 +1345,6 @@ static inline int md_null_inode(struct obd_export *exp,
return rc;
 }
 
-static inline int md_find_cbdata(struct obd_export *exp,
-const struct lu_fid *fid,
-ldlm_iterator_t it, void *data)
-{
-   int rc;
-
-   EXP_CHECK_MD_OP(exp, find_cbdata);
-   EXP_MD_COUNTER_INCREMENT(exp, find_cbdata);
-   rc = MDP(exp->exp_obd, find_cbdata)(exp, fid, it, data);
-   return rc;
-}
-
 static inline int md_close(struct obd_export *exp, struct md_op_data *op_data,
   struct md_open_data *mod,
   struct ptlrpc_request **request)
diff --git a/drivers/staging/lustre/lustre/include/obd_support.h 
b/drivers/staging/lustre/lustre/include/obd_support.h
index 4a9fe88..4d7a5c8 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -458,6 +458,7 @@ extern char obd_jobid_var[];
 #define OBD_FAIL_LOV_INIT  0x1403
 #define OBD_FAIL_GLIMPSE_DELAY 0x1404
 #define OBD_FAIL_LLITE_XATTR_ENOMEM0x1405
+#

[PATCH v2 7/8] staging/lustre: release MGC device if connect fails

2016-08-24 Thread Oleg Drokin

From: "John L. Hammond" 

In lustre_fill_super() if lustre_start_mgc() fails then call
lustre_common_put_super() to release a reference on the MGC device
attached to the LSI.

Signed-off-by: John L. Hammond 
Reviewed-on: http://review.whamcloud.com/20851
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8297
Reviewed-by: Andreas Dilger 
Reviewed-by: Mike Pershin 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/obdclass/obd_mount.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/obd_mount.c 
b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
index ae702ce..0273768 100644
--- a/drivers/staging/lustre/lustre/obdclass/obd_mount.c
+++ b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
@@ -1144,7 +1144,7 @@ static int lustre_fill_super(struct super_block *sb, void 
*data, int silent)
} else {
rc = lustre_start_mgc(sb);
if (rc) {
-   lustre_put_lsi(sb);
+   lustre_common_put_super(sb);
goto out;
}
/* Connect and start */
-- 
2.7.4


[PATCH v2 4/8] staging/lustre/llite: check return value for obd_set_info_async

2016-08-24 Thread Oleg Drokin

From: Yang Sheng 

The return value of obd_set_info_async() is ignored in
client_common_fill_super(). Restore the check and error out on failure.

Signed-off-by: Yang Sheng 
Reviewed-on: http://review.whamcloud.com/21125
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8360
Reviewed-by: Emoly Liu 
Reviewed-by: Bob Glossman 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/llite_lib.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c 
b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 72ff7c4..1ff788e 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -498,11 +498,21 @@ static int client_common_fill_super(struct super_block 
*sb, char *md, char *dt,
err = obd_set_info_async(NULL, sbi->ll_dt_exp, sizeof(KEY_CHECKSUM),
 KEY_CHECKSUM, sizeof(checksum), &checksum,
 NULL);
+   if (err) {
+   CERROR("%s: Set checksum failed: rc = %d\n",
+  sbi->ll_dt_exp->exp_obd->obd_name, err);
+   goto out_root;
+   }
cl_sb_init(sb);
 
err = obd_set_info_async(NULL, sbi->ll_dt_exp, sizeof(KEY_CACHE_SET),
 KEY_CACHE_SET, sizeof(*sbi->ll_cache),
 sbi->ll_cache, NULL);
+   if (err) {
+   CERROR("%s: Set cache_set failed: rc = %d\n",
+  sbi->ll_dt_exp->exp_obd->obd_name, err);
+   goto out_root;
+   }
 
sb->s_root = d_make_root(root);
if (!sb->s_root) {
-- 
2.7.4


[PATCH v2 0/8] Lustre fixes

2016-08-24 Thread Oleg Drokin

Here are some more recent Lustre fixes and a cleanup.

This resend fixes the "fix panic at mdc_free_open()" patch to actually work.

Please consider.

Alexander Boyko (1):
  staging/lustre/mdc: fix panic at mdc_free_open()

Andrew Perepechko (1):
  staging/lustre: avoid clearing i_nlink for inodes in use

Dmitry Eremin (1):
  staging/lustre/llite: Fix suspicious dereference of pointer
'vma->vm_file'

James Simmons (1):
  staging/lustre/o2iblnd: handle mixed page size configurations.

John L. Hammond (2):
  staging/lustre: const correct set_lock_data()
  staging/lustre: release MGC device if connect fails

Lokesh Nagappa Jaliminche (1):
  staging/lustre/llite: changes to avoid cache corruption

Yang Sheng (1):
  staging/lustre/llite: check return value for obd_set_info_async

 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c| 14 ++---
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h| 13 ++---
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 55 +--
 drivers/staging/lustre/lustre/include/obd.h|  7 +--
 drivers/staging/lustre/lustre/include/obd_class.h  | 28 +-
 .../staging/lustre/lustre/include/obd_support.h|  2 +
 drivers/staging/lustre/lustre/llite/dcache.c   | 55 ---
 drivers/staging/lustre/lustre/llite/file.c |  4 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |  5 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c| 10 
 drivers/staging/lustre/lustre/llite/llite_mmap.c   |  2 -
 drivers/staging/lustre/lustre/llite/namei.c|  3 +-
 drivers/staging/lustre/lustre/lmv/lmv_intent.c |  2 +-
 drivers/staging/lustre/lustre/lmv/lmv_obd.c| 47 ++--
 drivers/staging/lustre/lustre/lov/lov_obd.c| 41 --
 drivers/staging/lustre/lustre/mdc/mdc_internal.h   |  6 +--
 drivers/staging/lustre/lustre/mdc/mdc_locks.c  | 30 ++-
 drivers/staging/lustre/lustre/mdc/mdc_request.c| 62 ++
 drivers/staging/lustre/lustre/obdclass/obd_mount.c |  2 +-
 drivers/staging/lustre/lustre/osc/osc_request.c| 22 
 20 files changed, 114 insertions(+), 296 deletions(-)

-- 
2.7.4


Re: [PATCH 2/8] staging/lustre/mdc: fix panic at mdc_free_open()

2016-08-23 Thread Oleg Drokin

Actually, please do not apply this one. A testing error kept me
from noticing that there's a bug in this one that insta-crashes
everything on access.

I tested the rest, and the rest are good without this one too.

Sorry about this.

On Aug 23, 2016, at 5:11 PM, Oleg Drokin wrote:

> From: Alexander Boyko 
> 
> Assertion was happened for open request when rq_replay is set
> to 1.
>ASSERTION(mod->mod_open_req->rq_replay == 0)
> But this situation is not fatal for client, and could happened
> when mdc_close() failed.
> The fix allow to free such requests. If mdc_close fail, MDS doesn`t
> receive close request from client. And in a worst case client would
> be evicted.
> 
> The test recreates issue when mdc_close failed and
> client asserts:
>   ASSERTION( mod->mod_open_req->rq_replay == 0 ) failed
> 
> Signed-off-by: Alexander Boyko 
> Seagate-bug-id: MRP-3156
> Reviewed-on: http://review.whamcloud.com/17495
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5282
> Reviewed-by: Alex Zhuravlev 
> Reviewed-by: Andreas Dilger 
> Signed-off-by: Oleg Drokin 
> ---
> .../staging/lustre/lustre/include/obd_support.h|  1 +
> drivers/staging/lustre/lustre/mdc/mdc_request.c| 50 ++
> 2 files changed, 32 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/include/obd_support.h 
> b/drivers/staging/lustre/lustre/include/obd_support.h
> index 0c29a33..4a9fe88 100644
> --- a/drivers/staging/lustre/lustre/include/obd_support.h
> +++ b/drivers/staging/lustre/lustre/include/obd_support.h
> @@ -402,6 +402,7 @@ extern char obd_jobid_var[];
> #define OBD_FAIL_MDC_GETATTR_ENQUEUE 0x803
> #define OBD_FAIL_MDC_RPCS_SEM  0x804
> #define OBD_FAIL_MDC_LIGHTWEIGHT   0x805
> +#define OBD_FAIL_MDC_CLOSE0x806
> 
> #define OBD_FAIL_MGS   0x900
> #define OBD_FAIL_MGS_ALL_REQUEST_NET 0x901
> diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c 
> b/drivers/staging/lustre/lustre/mdc/mdc_request.c
> index 91c0b45..8369afd 100644
> --- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
> +++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
> @@ -677,9 +677,15 @@ static void mdc_free_open(struct md_open_data *mod)
>   imp_connect_disp_stripe(mod->mod_open_req->rq_import))
>   committed = 1;
> 
> - LASSERT(mod->mod_open_req->rq_replay == 0);
> -
> - DEBUG_REQ(D_RPCTRACE, mod->mod_open_req, "free open request\n");
> + /*
> +  * No reason to asssert here if the open request has
> +  * rq_replay == 1. It means that mdc_close failed, and
> +  * close request wasn`t sent. It is not fatal to client.
> +  * The worst thing is eviction if the client gets open lock
> +  */
> + DEBUG_REQ(D_RPCTRACE, mod->mod_open_req,
> +   "free open request rq_replay = %d\n",
> +mod->mod_open_req->rq_replay);
> 
>   ptlrpc_request_committed(mod->mod_open_req, committed);
>   if (mod->mod_close_req)
> @@ -749,22 +755,10 @@ static int mdc_close(struct obd_export *exp, struct 
> md_op_data *op_data,
>   }
> 
>   *request = NULL;
> - req = ptlrpc_request_alloc(class_exp2cliimp(exp), req_fmt);
> - if (!req)
> - return -ENOMEM;
> -
> - rc = ptlrpc_request_pack(req, LUSTRE_MDS_VERSION, MDS_CLOSE);
> - if (rc) {
> - ptlrpc_request_free(req);
> - return rc;
> - }
> -
> - /* To avoid a livelock (bug 7034), we need to send CLOSE RPCs to a
> -  * portal whose threads are not taking any DLM locks and are therefore
> -  * always progressing
> -  */
> - req->rq_request_portal = MDS_READPAGE_PORTAL;
> - ptlrpc_at_set_req_timeout(req);
> + if (OBD_FAIL_CHECK(OBD_FAIL_MDC_CLOSE))
> + req = NULL;
> + else
> + req = ptlrpc_request_alloc(class_exp2cliimp(exp), req_fmt);
> 
>   /* Ensure that this close's handle is fixed up during replay. */
>   if (likely(mod)) {
> @@ -785,6 +779,23 @@ static int mdc_close(struct obd_export *exp, struct 
> md_op_data *op_data,
>CDEBUG(D_HA,
>   "couldn't find open req; expecting close error\n");
>   }
> + if (!req) {
> + /*
> +  * TODO: repeat close after errors
> +  */
> + CWARN("%s: close of FID "DFID" failed, file reference will be 
> dropped when this client unmounts or is evicted\n",
> +   obd->obd_name, PFID(&op_data->op_fid1));
> + rc = -ENOMEM;
> + goto

[PATCH 8/8] staging/lustre/o2iblnd: handle mixed page size configurations.

2016-08-23 Thread Oleg Drokin

From: James Simmons 

Currently it is not possible to send LNet traffic between
two nodes using infiniband hardware that have different
page sizes for the case when RDMA fragments are used.
When two nodes establish a connection they tell the other
node the maximum number of RDMA fragments they support.
The issue is that the units are pages, and 256 64K pages
corresponds to 16MB of data, whereas a 4K page system is
limited to messages with 1MB of data. The solution is to
report over the wire the maximum number of fragments in
4K units regardless of the native page size. The recipient
then uses its native page size to translate into the
maximum number of page-sized fragments it can send to
the other node.
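
For illustration only (not part of the patch): a minimal sketch of the unit conversion described above, with the page size passed in as a shift so both cases can be checked on one machine. The helper names are hypothetical. The patch itself expresses the same idea through IBLND_FRAG_SHIFT, defined as (PAGE_SHIFT - 12).

```c
#include <assert.h>
#include <stdint.h>

/* Fragment counts on the wire are always in 4K (1 << 12) units. */
#define WIRE_FRAG_SHIFT 12

/* Convert a count of native pages into 4K wire units. */
static inline uint32_t native_to_wire_frags(uint32_t native_frags,
					    unsigned int page_shift)
{
	assert(page_shift >= WIRE_FRAG_SHIFT);
	return native_frags << (page_shift - WIRE_FRAG_SHIFT);
}

/* Convert a count of 4K wire units into native page-sized fragments. */
static inline uint32_t wire_to_native_frags(uint32_t wire_frags,
					    unsigned int page_shift)
{
	assert(page_shift >= WIRE_FRAG_SHIFT);
	return wire_frags >> (page_shift - WIRE_FRAG_SHIFT);
}

/* Total payload in bytes covered by a number of fragments of one size. */
static inline uint64_t frags_to_bytes(uint32_t frags, unsigned int shift)
{
	return (uint64_t)frags << shift;
}
```

With these helpers, 256 fragments of 64K (shift 16) cover 16MB while 256 fragments of 4K (shift 12) cover 1MB, which is the mismatch the commit message describes. A 64K-page node that receives a limit of 256 wire units translates it to 16 native pages, so both sides agree on a 1MB maximum.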

Signed-off-by: James Simmons 
Reviewed-on: http://review.whamcloud.com/21304
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7650
Reviewed-by: Doug Oucharek 
Reviewed-by: Olaf Weber 
Signed-off-by: Oleg Drokin 
---
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c| 14 +++---
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h| 13 ++---
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 55 ++
 3 files changed, 41 insertions(+), 41 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
index e93dbeb..c7a5d49 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
@@ -128,6 +128,7 @@ static int kiblnd_msgtype2size(int type)
 static int kiblnd_unpack_rd(struct kib_msg *msg, int flip)
 {
struct kib_rdma_desc *rd;
+   int msg_size;
int nob;
int n;
int i;
@@ -146,12 +147,6 @@ static int kiblnd_unpack_rd(struct kib_msg *msg, int flip)
 
n = rd->rd_nfrags;
 
-   if (n <= 0 || n > IBLND_MAX_RDMA_FRAGS) {
-   CERROR("Bad nfrags: %d, should be 0 < n <= %d\n",
-  n, IBLND_MAX_RDMA_FRAGS);
-   return 1;
-   }
-
nob = offsetof(struct kib_msg, ibm_u) +
  kiblnd_rd_msg_size(rd, msg->ibm_type, n);
 
@@ -161,6 +156,13 @@ static int kiblnd_unpack_rd(struct kib_msg *msg, int flip)
return 1;
}
 
+   msg_size = kiblnd_rd_size(rd);
+   if (msg_size <= 0 || msg_size > LNET_MAX_PAYLOAD) {
+   CERROR("Bad msg_size: %d, should be 0 < n <= %d\n",
+  msg_size, LNET_MAX_PAYLOAD);
+   return 1;
+   }
+
if (!flip)
return 0;
 
diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h
index 3cf8942..1457697 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h
@@ -113,8 +113,9 @@ extern struct kib_tunables  kiblnd_tunables;
 #define IBLND_OOB_CAPABLE(v)   ((v) != IBLND_MSG_VERSION_1)
 #define IBLND_OOB_MSGS(v) (IBLND_OOB_CAPABLE(v) ? 2 : 0)
 
-#define IBLND_MSG_SIZE (4 << 10)/* max size of queued messages 
(inc hdr) */
-#define IBLND_MAX_RDMA_FRAGSLNET_MAX_IOV  /* max # of fragments 
supported */
+#define IBLND_FRAG_SHIFT   (PAGE_SHIFT - 12)   /* frag size on wire is 
in 4K units */
+#define IBLND_MSG_SIZE (4 << 10)   /* max size of queued 
messages (inc hdr) */
+#define IBLND_MAX_RDMA_FRAGS   (LNET_MAX_PAYLOAD >> 12)/* max # of fragments 
supported in 4K size */
 
 //
 /* derived constants... */
@@ -133,8 +134,8 @@ extern struct kib_tunables  kiblnd_tunables;
 /* WRs and CQEs (per connection) */
 #define IBLND_RECV_WRS(c)  IBLND_RX_MSGS(c)
 #define IBLND_SEND_WRS(c)  \
-   ((c->ibc_max_frags + 1) * kiblnd_concurrent_sends(c->ibc_version, \
- c->ibc_peer->ibp_ni))
+   (((c->ibc_max_frags + 1) << IBLND_FRAG_SHIFT) * \
+ kiblnd_concurrent_sends(c->ibc_version, c->ibc_peer->ibp_ni))
 #define IBLND_CQ_ENTRIES(c)(IBLND_RECV_WRS(c) + IBLND_SEND_WRS(c))
 
 struct kib_hca_dev;
@@ -609,14 +610,14 @@ kiblnd_cfg_rdma_frags(struct lnet_ni *ni)
 
tunables = &ni->ni_lnd_tunables->lt_tun_u.lt_o2ib;
mod = tunables->lnd_map_on_demand;
-   return mod ? mod : IBLND_MAX_RDMA_FRAGS;
+   return mod ? mod : IBLND_MAX_RDMA_FRAGS >> IBLND_FRAG_SHIFT;
 }
 
 static inline int
 kiblnd_rdma_frags(int version, struct lnet_ni *ni)
 {
return version == IBLND_MSG_VERSION_1 ?
- IBLND_MAX_RDMA_FRAGS :
+ (IBLND_MAX_RDMA_FRAGS >> IBLND_FRAG_SHIFT) :
  kiblnd_cfg_rdma_frags(ni);
 }
 
diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 1563428..3a86879 100644
--- a/drivers/s

[PATCH 2/8] staging/lustre/mdc: fix panic at mdc_free_open()

2016-08-23 Thread Oleg Drokin

From: Alexander Boyko 

An assertion was hit for an open request when rq_replay is set
to 1:
ASSERTION(mod->mod_open_req->rq_replay == 0)
This situation is not fatal for the client, and can happen
when mdc_close() fails.
The fix allows such requests to be freed. If mdc_close() fails, the MDS
doesn't receive a close request from the client, and in the worst case
the client would be evicted.

The test recreates the issue where mdc_close() fails and the
client asserts:
   ASSERTION( mod->mod_open_req->rq_replay == 0 ) failed

Signed-off-by: Alexander Boyko 
Seagate-bug-id: MRP-3156
Reviewed-on: http://review.whamcloud.com/17495
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5282
Reviewed-by: Alex Zhuravlev 
Reviewed-by: Andreas Dilger 
Signed-off-by: Oleg Drokin 
---
 .../staging/lustre/lustre/include/obd_support.h|  1 +
 drivers/staging/lustre/lustre/mdc/mdc_request.c| 50 ++
 2 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd_support.h 
b/drivers/staging/lustre/lustre/include/obd_support.h
index 0c29a33..4a9fe88 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -402,6 +402,7 @@ extern char obd_jobid_var[];
 #define OBD_FAIL_MDC_GETATTR_ENQUEUE 0x803
 #define OBD_FAIL_MDC_RPCS_SEM   0x804
 #define OBD_FAIL_MDC_LIGHTWEIGHT0x805
+#define OBD_FAIL_MDC_CLOSE  0x806
 
 #define OBD_FAIL_MGS0x900
 #define OBD_FAIL_MGS_ALL_REQUEST_NET 0x901
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c 
b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 91c0b45..8369afd 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -677,9 +677,15 @@ static void mdc_free_open(struct md_open_data *mod)
imp_connect_disp_stripe(mod->mod_open_req->rq_import))
committed = 1;
 
-   LASSERT(mod->mod_open_req->rq_replay == 0);
-
-   DEBUG_REQ(D_RPCTRACE, mod->mod_open_req, "free open request\n");
+   /*
+* No reason to asssert here if the open request has
+* rq_replay == 1. It means that mdc_close failed, and
+* close request wasn`t sent. It is not fatal to client.
+* The worst thing is eviction if the client gets open lock
+*/
+   DEBUG_REQ(D_RPCTRACE, mod->mod_open_req,
+ "free open request rq_replay = %d\n",
+  mod->mod_open_req->rq_replay);
 
ptlrpc_request_committed(mod->mod_open_req, committed);
if (mod->mod_close_req)
@@ -749,22 +755,10 @@ static int mdc_close(struct obd_export *exp, struct 
md_op_data *op_data,
}
 
*request = NULL;
-   req = ptlrpc_request_alloc(class_exp2cliimp(exp), req_fmt);
-   if (!req)
-   return -ENOMEM;
-
-   rc = ptlrpc_request_pack(req, LUSTRE_MDS_VERSION, MDS_CLOSE);
-   if (rc) {
-   ptlrpc_request_free(req);
-   return rc;
-   }
-
-   /* To avoid a livelock (bug 7034), we need to send CLOSE RPCs to a
-* portal whose threads are not taking any DLM locks and are therefore
-* always progressing
-*/
-   req->rq_request_portal = MDS_READPAGE_PORTAL;
-   ptlrpc_at_set_req_timeout(req);
+   if (OBD_FAIL_CHECK(OBD_FAIL_MDC_CLOSE))
+   req = NULL;
+   else
+   req = ptlrpc_request_alloc(class_exp2cliimp(exp), req_fmt);
 
/* Ensure that this close's handle is fixed up during replay. */
if (likely(mod)) {
@@ -785,6 +779,23 @@ static int mdc_close(struct obd_export *exp, struct 
md_op_data *op_data,
 CDEBUG(D_HA,
"couldn't find open req; expecting close error\n");
}
+   if (!req) {
+   /*
+* TODO: repeat close after errors
+*/
+   CWARN("%s: close of FID "DFID" failed, file reference will be 
dropped when this client unmounts or is evicted\n",
+ obd->obd_name, PFID(&op_data->op_fid1));
+   rc = -ENOMEM;
+   goto out;
+   }
+
+   /*
+* To avoid a livelock (bug 7034), we need to send CLOSE RPCs to a
+* portal whose threads are not taking any DLM locks and are therefore
+* always progressing
+*/
+   req->rq_request_portal = MDS_READPAGE_PORTAL;
+   ptlrpc_at_set_req_timeout(req);
 
mdc_close_pack(req, op_data);
 
@@ -830,6 +841,7 @@ static int mdc_close(struct obd_export *exp, struct 
md_op_data *op_data,
}
}
 
+out:
if (mod) {
if (rc != 0)
mod->mod_close_req = NULL;
-- 
2.7.4


[PATCH 7/8] staging/lustre: release MGC device if connect fails

2016-08-23 Thread Oleg Drokin

From: "John L. Hammond" 

In lustre_fill_super() if lustre_start_mgc() fails then call
lustre_common_put_super() to release a reference on the MGC device
attached to the LSI.

Signed-off-by: John L. Hammond 
Reviewed-on: http://review.whamcloud.com/20851
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8297
Reviewed-by: Andreas Dilger 
Reviewed-by: Mike Pershin 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/obdclass/obd_mount.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/obd_mount.c 
b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
index ae702ce..0273768 100644
--- a/drivers/staging/lustre/lustre/obdclass/obd_mount.c
+++ b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
@@ -1144,7 +1144,7 @@ static int lustre_fill_super(struct super_block *sb, void 
*data, int silent)
} else {
rc = lustre_start_mgc(sb);
if (rc) {
-   lustre_put_lsi(sb);
+   lustre_common_put_super(sb);
goto out;
}
/* Connect and start */
-- 
2.7.4


[PATCH 1/8] staging/lustre: const correct set_lock_data()

2016-08-23 Thread Oleg Drokin

From: "John L. Hammond" 

Change the __u64 *cookie parameter of md_ops->set_lock_data() to
const struct lustre_handle *lockh.

Signed-off-by: John L. Hammond 
Reviewed-on: http://review.whamcloud.com/17072
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7403
Reviewed-by: Frank Zago 
Reviewed-by: James Simmons 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/obd.h  | 3 ++-
 drivers/staging/lustre/lustre/include/obd_class.h| 3 ++-
 drivers/staging/lustre/lustre/llite/file.c   | 2 +-
 drivers/staging/lustre/lustre/llite/llite_internal.h | 5 ++---
 drivers/staging/lustre/lustre/lmv/lmv_intent.c   | 2 +-
 drivers/staging/lustre/lustre/lmv/lmv_obd.c  | 5 +++--
 drivers/staging/lustre/lustre/mdc/mdc_internal.h | 3 ++-
 drivers/staging/lustre/lustre/mdc/mdc_locks.c| 8 
 drivers/staging/lustre/lustre/mdc/mdc_request.c  | 5 ++---
 9 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h 
b/drivers/staging/lustre/lustre/include/obd.h
index ac620fd..ed0fd41 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -1023,7 +1023,8 @@ struct md_ops {
struct lookup_intent *);
int (*clear_open_replay_data)(struct obd_export *,
  struct obd_client_handle *);
-   int (*set_lock_data)(struct obd_export *, __u64 *, void *, __u64 *);
+   int (*set_lock_data)(struct obd_export *, const struct lustre_handle *,
+void *, __u64 *);
 
enum ldlm_mode (*lock_match)(struct obd_export *, __u64,
 const struct lu_fid *, enum ldlm_type,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h 
b/drivers/staging/lustre/lustre/include/obd_class.h
index 79fc041..4f48968 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -1610,7 +1610,8 @@ static inline int md_clear_open_replay_data(struct 
obd_export *exp,
 }
 
 static inline int md_set_lock_data(struct obd_export *exp,
-  __u64 *lockh, void *data, __u64 *bits)
+  const struct lustre_handle *lockh,
+  void *data, __u64 *bits)
 {
EXP_CHECK_MD_OP(exp, set_lock_data);
EXP_MD_COUNTER_INCREMENT(exp, set_lock_data);
diff --git a/drivers/staging/lustre/lustre/llite/file.c 
b/drivers/staging/lustre/lustre/llite/file.c
index 55ccd84..13ff212 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -3629,7 +3629,7 @@ static int ll_layout_lock_set(struct lustre_handle 
*lockh, enum ldlm_mode mode,
   PFID(&lli->lli_fid), inode, reconf);
 
/* in case this is a caching lock and reinstate with new inode */
-   md_set_lock_data(sbi->ll_md_exp, &lockh->cookie, inode, NULL);
+   md_set_lock_data(sbi->ll_md_exp, lockh, inode, NULL);
 
lock_res_and_lock(lock);
lvb_ready = ldlm_is_lvb_ready(lock);
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h 
b/drivers/staging/lustre/lustre/llite/llite_internal.h
index a5a3023..cbd5bc5 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -1243,7 +1243,7 @@ static inline void ll_set_lock_data(struct obd_export 
*exp, struct inode *inode,
CDEBUG(D_DLMTRACE, "setting l_data to inode "DFID"%p 
for remote lock %#llx\n",
   PFID(ll_inode2fid(inode)), inode,
   handle.cookie);
-   md_set_lock_data(exp, &handle.cookie, inode, NULL);
+   md_set_lock_data(exp, &handle, inode, NULL);
}
 
handle.cookie = it->it_lock_handle;
@@ -1251,8 +1251,7 @@ static inline void ll_set_lock_data(struct obd_export 
*exp, struct inode *inode,
CDEBUG(D_DLMTRACE, "setting l_data to inode "DFID"%p for lock 
%#llx\n",
   PFID(ll_inode2fid(inode)), inode, handle.cookie);
 
-   md_set_lock_data(exp, &handle.cookie, inode,
-&it->it_lock_bits);
+   md_set_lock_data(exp, &handle, inode, &it->it_lock_bits);
it->it_lock_set = 1;
}
 
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_intent.c 
b/drivers/staging/lustre/lustre/lmv/lmv_intent.c
index 62f6bd0..85cc5cb 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_intent.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_intent.c
@@ -250,7 +250,7 @@ int lmv_revalidate_slaves(struct obd_export *exp, struct 
mdt_body *mbody,
ptlrpc_req_

[PATCH 3/8] staging/lustre: avoid clearing i_nlink for inodes in use

2016-08-23 Thread Oleg Drokin

From: Andrew Perepechko 

The patch removes find_cbdata callbacks and clear_nlink
from dentry_iput path, since this piece of code makes
a few races possible.

The test case reproduces one of the possible races
described in LU-7925:

1) two hard links are created for the same file
2) the test calls stat(2) for link #1
3) in the middle of 2) the test opens and closes link #2
4) in the middle of 2) the test drops the ldlm locks and
   forces dentry reclaim via vm.drop_caches=2
5) in the middle of 2) ll_d_iput() clears i_nlink for
   the inode
6) the initial stat(2) continues and copies the wrong
   i_nlink value into st_nlink

Signed-off-by: Andrew Perepechko 
Seagate-bug-id: MRP-3271
Reviewed-on: http://review.whamcloud.com/19164
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7925
Reviewed-by: Wally Wang 
Reviewed-by: Lai Siyao 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/include/obd.h|  4 --
 drivers/staging/lustre/lustre/include/obd_class.h  | 25 --
 .../staging/lustre/lustre/include/obd_support.h|  1 +
 drivers/staging/lustre/lustre/llite/dcache.c   | 55 --
 drivers/staging/lustre/lustre/llite/file.c |  2 +
 drivers/staging/lustre/lustre/lmv/lmv_obd.c| 42 -
 drivers/staging/lustre/lustre/lov/lov_obd.c| 41 
 drivers/staging/lustre/lustre/mdc/mdc_internal.h   |  3 --
 drivers/staging/lustre/lustre/mdc/mdc_locks.c  | 22 -
 drivers/staging/lustre/lustre/mdc/mdc_request.c|  1 -
 drivers/staging/lustre/lustre/osc/osc_request.c| 22 -
 11 files changed, 3 insertions(+), 215 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index ed0fd41..f3d141b 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -896,8 +896,6 @@ struct obd_ops {
struct niobuf_remote *remote, int pages,
struct niobuf_local *local,
struct obd_trans_info *oti, int rc);
-   int (*find_cbdata)(struct obd_export *, struct lov_stripe_md *,
-  ldlm_iterator_t it, void *data);
int (*init_export)(struct obd_export *exp);
int (*destroy_export)(struct obd_export *exp);
 
@@ -958,8 +956,6 @@ struct cl_attr;
 struct md_ops {
int (*getstatus)(struct obd_export *, struct lu_fid *);
int (*null_inode)(struct obd_export *, const struct lu_fid *);
-   int (*find_cbdata)(struct obd_export *, const struct lu_fid *,
-  ldlm_iterator_t, void *);
int (*close)(struct obd_export *, struct md_op_data *,
 struct md_open_data *, struct ptlrpc_request **);
int (*create)(struct obd_export *, struct md_op_data *,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 4f48968..9702ad4 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -1177,19 +1177,6 @@ static inline int obd_iocontrol(unsigned int cmd, struct obd_export *exp,
return rc;
 }
 
-static inline int obd_find_cbdata(struct obd_export *exp,
- struct lov_stripe_md *lsm,
- ldlm_iterator_t it, void *data)
-{
-   int rc;
-
-   EXP_CHECK_DT_OP(exp, find_cbdata);
-   EXP_COUNTER_INCREMENT(exp, find_cbdata);
-
-   rc = OBP(exp->exp_obd, find_cbdata)(exp, lsm, it, data);
-   return rc;
-}
-
 static inline void obd_import_event(struct obd_device *obd,
struct obd_import *imp,
enum obd_import_event event)
@@ -1358,18 +1345,6 @@ static inline int md_null_inode(struct obd_export *exp,
return rc;
 }
 
-static inline int md_find_cbdata(struct obd_export *exp,
-const struct lu_fid *fid,
-ldlm_iterator_t it, void *data)
-{
-   int rc;
-
-   EXP_CHECK_MD_OP(exp, find_cbdata);
-   EXP_MD_COUNTER_INCREMENT(exp, find_cbdata);
-   rc = MDP(exp->exp_obd, find_cbdata)(exp, fid, it, data);
-   return rc;
-}
-
 static inline int md_close(struct obd_export *exp, struct md_op_data *op_data,
   struct md_open_data *mod,
   struct ptlrpc_request **request)
diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 4a9fe88..4d7a5c8 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -458,6 +458,7 @@ extern char obd_jobid_var[];
 #define OBD_FAIL_LOV_INIT  0x1403
 #define OBD_FAIL_GLIMPSE_DELAY 0x1404
 #define OBD_FAIL_LLITE_XATTR_ENOMEM0x1405
+#

[PATCH 6/8] staging/lustre/llite: changes to avoid cache corruption

2016-08-23 Thread Oleg Drokin

From: Lokesh Nagappa Jaliminche 

ll_find_alias is responsible for finding an alias of the inode
that can be reused. Directories are assumed to have a unique
alias, whereas non-directories can have multiple aliases.
In Lustre there are two types of aliases: discon_alias and
invalid_alias. Using discon_alias for non-directories may
corrupt the dcache and lead to a kernel crash. Avoid using
discon_alias for non-directories.
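
The selection rule can be sketched outside the kernel roughly as follows. This is a simplified model, not the llite code: struct demo_alias and demo_pick_alias are hypothetical names, the two booleans stand in for the DCACHE_DISCONNECTED flag and the parent/name comparison, and the preference for an invalid alias over a disconnected one is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a dentry alias (not the kernel type). */
struct demo_alias {
	bool disconnected;      /* models DCACHE_DISCONNECTED being set */
	bool same_parent_name;  /* models the parent and name matching */
};

/*
 * Return the index of the alias to reuse, or -1 if none qualifies.
 * A disconnected alias is only remembered when the inode is a directory.
 */
static int demo_pick_alias(const struct demo_alias *aliases, size_t n,
			   bool inode_is_dir)
{
	int discon = -1, invalid = -1;
	size_t i;

	assert(aliases || n == 0);

	for (i = 0; i < n; i++) {
		if (aliases[i].disconnected && inode_is_dir)
			discon = (int)i;
		else if (aliases[i].same_parent_name)
			invalid = (int)i;
		if (invalid >= 0)
			break;
	}
	/* Assumed preference order: invalid alias first, then disconnected. */
	return invalid >= 0 ? invalid : discon;
}
```

With a single disconnected, non-matching alias, the sketch returns 0 for a directory and -1 for a regular file, which is the behavioural change the patch is after.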

Seagate-bug-id: MRP-2739, MRP-3601
Signed-off-by: Lokesh Nagappa Jaliminche 
Reviewed-by: Ujjwal Lanjewar 
Reviewed-by: Ashish Purkar 
Reviewed-by: Andrew Perepechko 
Tested-by: Parinay Vijayprakash Kondekar 
Reviewed-on: http://review.whamcloud.com/17732
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7613
Reviewed-by: Niu Yawei 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/namei.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index 788a3f0..b7d448f 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -363,7 +363,8 @@ static struct dentry *ll_find_alias(struct inode *inode, struct dentry *dentry)
LASSERT(alias != dentry);
 
spin_lock(&alias->d_lock);
-   if (alias->d_flags & DCACHE_DISCONNECTED)
+   if ((alias->d_flags & DCACHE_DISCONNECTED) &&
+   S_ISDIR(inode->i_mode))
/* LASSERT(last_discon == NULL); LU-405, bz 20055 */
discon_alias = alias;
else if (alias->d_parent == dentry->d_parent &&
-- 
2.7.4

___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

[PATCH 5/8] staging/lustre/llite: Fix suspicious dereference of pointer 'vma->vm_file'

2016-08-23 Thread Oleg Drokin

From: Dmitry Eremin 

Remove the useless LASSERT(vma->vm_file): if the pointer were NULL,
the code would already have crashed earlier in file_inode(vma->vm_file).

Signed-off-by: Dmitry Eremin 
Reviewed-on: http://review.whamcloud.com/21171
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8372
Reviewed-by: John L. Hammond 
Reviewed-by: Bob Glossman 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/llite_mmap.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_mmap.c b/drivers/staging/lustre/lustre/llite/llite_mmap.c
index 9d03e79..37f82ed 100644
--- a/drivers/staging/lustre/lustre/llite/llite_mmap.c
+++ b/drivers/staging/lustre/lustre/llite/llite_mmap.c
@@ -429,7 +429,6 @@ static void ll_vm_open(struct vm_area_struct *vma)
struct inode *inode= file_inode(vma->vm_file);
struct vvp_object *vob = cl_inode2vvp(inode);
 
-   LASSERT(vma->vm_file);
LASSERT(atomic_read(&vob->vob_mmap_cnt) >= 0);
atomic_inc(&vob->vob_mmap_cnt);
 }
@@ -442,7 +441,6 @@ static void ll_vm_close(struct vm_area_struct *vma)
struct inode  *inode = file_inode(vma->vm_file);
struct vvp_object *vob   = cl_inode2vvp(inode);
 
-   LASSERT(vma->vm_file);
atomic_dec(&vob->vob_mmap_cnt);
LASSERT(atomic_read(&vob->vob_mmap_cnt) >= 0);
 }
-- 
2.7.4

[PATCH 4/8] staging/lustre/llite: check return value for obd_set_info_async

2016-08-23 Thread Oleg Drokin

From: Yang Sheng 

The return value of obd_set_info_async is ignored in
client_common_fill_super. Restore the check and error out on failure.
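
The check-and-bail pattern being restored looks roughly like the userspace sketch below. Everything in it is illustrative: demo_fill_super, demo_set_info_fn and the two callbacks are hypothetical stand-ins, and the out_root label only mirrors the name used in the diff without performing the real cleanup.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a setter that returns 0 or a negative errno. */
typedef int (*demo_set_info_fn)(const char *key);

/* Example callback: every key is accepted. */
static int demo_set_ok(const char *key)
{
	(void)key;
	return 0;
}

/* Example callback: only the "cache_set" key fails. */
static int demo_set_fail_cache(const char *key)
{
	return strcmp(key, "cache_set") == 0 ? -EIO : 0;
}

/*
 * Run both setup steps, checking each return value. On failure, log the
 * error and jump to the shared cleanup label instead of carrying on.
 * *cleaned is set to 1 when the cleanup path ran, 0 otherwise.
 */
static int demo_fill_super(demo_set_info_fn set_info, int *cleaned)
{
	int err;

	assert(set_info && cleaned);
	*cleaned = 0;

	err = set_info("checksum");
	if (err) {
		fprintf(stderr, "demo: Set checksum failed: rc = %d\n", err);
		goto out_root;
	}

	err = set_info("cache_set");
	if (err) {
		fprintf(stderr, "demo: Set cache_set failed: rc = %d\n", err);
		goto out_root;
	}
	return 0;

out_root:
	/* Placeholder for releasing whatever was set up before the failure. */
	*cleaned = 1;
	return err;
}
```

Passing demo_set_ok returns 0 without touching the cleanup path, while demo_set_fail_cache returns -EIO after running it. Before the patch, the equivalent of the second failure would have been silently ignored.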

Signed-off-by: Yang Sheng 
Reviewed-on: http://review.whamcloud.com/21125
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8360
Reviewed-by: Emoly Liu 
Reviewed-by: Bob Glossman 
Signed-off-by: Oleg Drokin 
---
 drivers/staging/lustre/lustre/llite/llite_lib.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 72ff7c4..1ff788e 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -498,11 +498,21 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
err = obd_set_info_async(NULL, sbi->ll_dt_exp, sizeof(KEY_CHECKSUM),
 KEY_CHECKSUM, sizeof(checksum), &checksum,
 NULL);
+   if (err) {
+   CERROR("%s: Set checksum failed: rc = %d\n",
+  sbi->ll_dt_exp->exp_obd->obd_name, err);
+   goto out_root;
+   }
cl_sb_init(sb);
 
err = obd_set_info_async(NULL, sbi->ll_dt_exp, sizeof(KEY_CACHE_SET),
 KEY_CACHE_SET, sizeof(*sbi->ll_cache),
 sbi->ll_cache, NULL);
+   if (err) {
+   CERROR("%s: Set cache_set failed: rc = %d\n",
+  sbi->ll_dt_exp->exp_obd->obd_name, err);
+   goto out_root;
+   }
 
sb->s_root = d_make_root(root);
if (!sb->s_root) {
-- 
2.7.4

[PATCH 0/8] Lustre fixes

2016-08-23 Thread Oleg Drokin

Here are some more recent Lustre fixes and a cleanup.

Alexander Boyko (1):
  staging/lustre/mdc: fix panic at mdc_free_open()

Andrew Perepechko (1):
  staging/lustre: avoid clearing i_nlink for inodes in use

Dmitry Eremin (1):
  staging/lustre/llite: Fix suspicious dereference of pointer 'vma->vm_file'

James Simmons (1):
  staging/lustre/o2iblnd: handle mixed page size configurations.

John L. Hammond (2):
  staging/lustre: const correct set_lock_data()
  staging/lustre: release MGC device if connect fails

Lokesh Nagappa Jaliminche (1):
  staging/lustre/llite: changes to avoid cache corruption

Yang Sheng (1):
  staging/lustre/llite: check return value for obd_set_info_async

 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c| 14 +++---
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h| 13 ++---
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 55 ++---
 drivers/staging/lustre/lustre/include/obd.h|  7 +--
 drivers/staging/lustre/lustre/include/obd_class.h  | 28 +--
 .../staging/lustre/lustre/include/obd_support.h|  2 +
 drivers/staging/lustre/lustre/llite/dcache.c   | 55 -
 drivers/staging/lustre/lustre/llite/file.c |  4 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |  5 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c| 10 
 drivers/staging/lustre/lustre/llite/llite_mmap.c   |  2 -
 drivers/staging/lustre/lustre/llite/namei.c|  3 +-
 drivers/staging/lustre/lustre/lmv/lmv_intent.c |  2 +-
 drivers/staging/lustre/lustre/lmv/lmv_obd.c| 47 ++
 drivers/staging/lustre/lustre/lov/lov_obd.c| 41 
 drivers/staging/lustre/lustre/mdc/mdc_internal.h   |  6 +--
 drivers/staging/lustre/lustre/mdc/mdc_locks.c  | 30 ++--
 drivers/staging/lustre/lustre/mdc/mdc_request.c| 56 +-
 drivers/staging/lustre/lustre/obdclass/obd_mount.c |  2 +-
 drivers/staging/lustre/lustre/osc/osc_request.c| 22 -
 20 files changed, 108 insertions(+), 296 deletions(-)

-- 
2.7.4
