Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread 'Marcelo Ricardo Leitner'
On Fri, Dec 08, 2017 at 11:08:56AM -0500, Neil Horman wrote:
> On Fri, Dec 08, 2017 at 04:04:58PM +, David Laight wrote:
> > From: Marcelo Ricardo Leitner
> > > Sent: 08 December 2017 16:00
> > ...
> > > > Is it worth replacing the si struct with an index/enum value, and 
> > > > indexing an
> > > > array of method pointer structs?  That would save you at least one 
> > > > dereference.
> > > 
> > > Hmmm, maybe, yes. It would be like
> > > sctp_stream_interleave[asoc->stream.si].make_datafrag(...)
> > 
> > If you only expect 2 choices then an if () is likely
> > to produce better code that the above.
> > 
> > The actual implementation can be hidden inside a #define
> > or static inline function.
> > 
> Thats the real question though, will we expect more than two interleaving
> strategies?  Currently its a boolean operation so the answer seems like yes, 
> but
> is there a possiblity of a biased interleaving, or other worthwhile algorithm?

For the chunk format, I don't think so. It would require another RFC
update.
For other possibilities on having a 3rd choice in there, I also don't
think so. Stream scheduling is handled apart from it and rx buffer
stuff is being covered by it now, don't see how we could have a 3rd
option without another chunk format.

But that said, I don't think the macro or even inline wrappers are as
clear as the struct with function pointers here and in some cases it
even won't avoid the second deref. I wouldn't like to compromise code
readability and OO because of 1 fetch, specially considering that this
will be barely noticeable.

Neil's idea on using the array of structs and indexing on it is nice.
Saves a deref and keeps the abstractions without inserting too much
noise on it. Plus, it also allows replacing the struct pointer in
sctp_stream with a bit. If then placed before the union in there, we
can save up to the whole pointer. Looks like a good compromise to me.

  Marcelo


Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread Xin Long
On Sat, Dec 9, 2017 at 1:29 AM, David Laight  wrote:
> From: Xin Long ...
>> Hi, David, Sorry,  I'm not sure we're worrying about the cpu cost or
>> codes style now ?
>>
>> For cpu cost,  I think 0x848(%r13) operation must be better than the
>> generated code of if-else.
>
> Nope - the call xxx(%ryyy) is likely to be a data cache miss and a complete
> cpu pipeline stall.
>
> The conditional will be a data cache hit and the code (for one branch)
> will be prefetched and speculatively executed.
>
> Some very modern cpu might manage to predict indirect jumps, but for
> most it is a full pipeline stall.
Thanks for the CPU information.

The thing is with if-else can't avoid xxx(%ryyy) in this case, as Marcelo
said above. It seems if-else will just be a extra cost compare to this one.


RE: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread David Laight
From: Xin Long ...
> Hi, David, Sorry,  I'm not sure we're worrying about the cpu cost or
> codes style now ?
> 
> For cpu cost,  I think 0x848(%r13) operation must be better than the
> generated code of if-else.

Nope - the call xxx(%ryyy) is likely to be a data cache miss and a complete
cpu pipeline stall.

The conditional will be a data cache hit and the code (for one branch)
will be prefetched and speculatively executed.

Some very modern cpu might manage to predict indirect jumps, but for
most it is a full pipeline stall.

David



Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread Xin Long
On Sat, Dec 9, 2017 at 12:22 AM, David Laight  wrote:
> From: Xin Long
>> Sent: 08 December 2017 16:18
>>
> ...
>> >> Alternatively you could preform the dereference in two steps (i.e. 
>> >> declare an si
>> >> pointer on the stack and set it equal to asoc->stream.si, then deref
>> >> si->make_datafrag at call time.  That will at least give the compiler an
>> >> opportunity to preload the first pointer.
>
> You want to save the function pointer itself.
>
> ...
>> Another small difference:
>>   as you can see, comparing to (X), (Y) is using 0x28(%rsp) in the loop,
>>   instead of %r13.
>>
>> So that's what I can see from the related generated code.
>> If 0x848(%r13) is not worse than 0x28(%rsp) for cpu, I think
>> asoc->stream.si->make_datafrag() is even better. No ?
>
> That code must have far too many life local variables.
> Otherwise there's be a caller saved register available.
>
Hi, David, Sorry,  I'm not sure we're worrying about the cpu cost or
codes style now ?

For cpu cost,  I think 0x848(%r13) operation must be better than the
generated code of if-else.

For the codes style, comparing to the if-else, I think this one is
more readable. (ignore extendible stuff first, as probably no more
new type of data chunk).


RE: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread David Laight
From: Xin Long
> Sent: 08 December 2017 16:18
> 
...
> >> Alternatively you could preform the dereference in two steps (i.e. declare 
> >> an si
> >> pointer on the stack and set it equal to asoc->stream.si, then deref
> >> si->make_datafrag at call time.  That will at least give the compiler an
> >> opportunity to preload the first pointer.

You want to save the function pointer itself.

...
> Another small difference:
>   as you can see, comparing to (X), (Y) is using 0x28(%rsp) in the loop,
>   instead of %r13.
> 
> So that's what I can see from the related generated code.
> If 0x848(%r13) is not worse than 0x28(%rsp) for cpu, I think
> asoc->stream.si->make_datafrag() is even better. No ?

That code must have far too many life local variables.
Otherwise there's be a caller saved register available.

David



Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread Xin Long
On Sat, Dec 9, 2017 at 12:00 AM, Marcelo Ricardo Leitner
 wrote:
> On Fri, Dec 08, 2017 at 10:37:34AM -0500, Neil Horman wrote:
>> On Fri, Dec 08, 2017 at 12:56:30PM -0200, Marcelo Ricardo Leitner wrote:
>> > On Fri, Dec 08, 2017 at 02:06:04PM +, David Laight wrote:
>> > > From: Xin Long
>> > > > Sent: 08 December 2017 13:04
>> > > ...
>> > > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct 
>> > > > sctp_association *asoc,
>> > > > frag |= SCTP_DATA_SACK_IMM;
>> > > > }
>> > > >
>> > > > -   chunk = sctp_make_datafrag_empty(asoc, sinfo, len, 
>> > > > frag,
>> > > > -0, GFP_KERNEL);
>> > > > +   chunk = asoc->stream.si->make_datafrag(asoc, sinfo, 
>> > > > len, frag,
>> > > > +  GFP_KERNEL);
>> > >
>> > > I know that none of the sctp code is very optimised, but that indirect
>> > > call is going to be horrid.
>> >
>> > Yeah.. but there is no way to avoid the double derreference
>> > considering we only have the asoc pointer in there and we have to
>> > reach the contents of the data chunk operations struct, and the .si
>> > part is the same as 'stream' part as it's a constant offset.
>> >
>> > Due to the for() in there, we could add a variable to store
>> > asoc->stream.si outside the for and then we can do only a single deref
>> > inside it. Xin, can you please try and see if the generated code is
>> > different?
>> >
>> > Other suggestions?
>> >
>> Is it worth replacing the si struct with an index/enum value, and indexing an
>> array of method pointer structs?  That would save you at least one 
>> dereference.
>
> Hmmm, maybe, yes. It would be like
> sctp_stream_interleave[asoc->stream.si].make_datafrag(...)
>
> Then same goes for pf->af, probably.
>
>>
>> Alternatively you could preform the dereference in two steps (i.e. declare 
>> an si
>> pointer on the stack and set it equal to asoc->stream.si, then deref
>> si->make_datafrag at call time.  That will at least give the compiler an
>> opportunity to preload the first pointer.
>
> Yep, that was my 2nd paragraph above :-) but it only works for cases
> such as this one.

Now:
  for(N) {
...
chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
 0xfb58 <+360>: mov0x848(%r13),%rax  < [a]
 0xfb5f <+367>: movzbl %cl,%ecx
 0xfb62 <+370>: mov$0x14000c0,%r8d
 0xfb68 <+376>: mov%r12d,%edx
 0xfb6b <+379>: mov(%rsp),%rsi
 0xfb6f <+383>: mov%r13,%rdi  <=(X)
 0xfb72 <+386>: callq  *0x8(%rax)  < [b]
 0xfb78 <+392>: mov%rax,%r15
   }

   ret = N * ([a] + [b])


After using a variable:
  struct sctp_stream_interleave *si;
  ...
  si = asoc->stream.si;
 0xfb44 <+340>: mov0x848(%r14),%rax
 0xfb4e <+350>: mov%rax,0x20(%rsp) <- [1]

  for(N) {
...
chunk = si->make_datafrag(asoc, sinfo, len, frag, GFP_KERNEL);
 0xfb69 <+377>: mov0x20(%rsp),%rax <- [2]
 0xfb6e <+382>: movzbl %cl,%ecx
 0xfb71 <+385>: mov$0x14000c0,%r8d
 0xfb77 <+391>: mov%r12d,%edx
 0xfb7a <+394>: mov(%rsp),%rsi
 0xfb7e <+398>: mov0x28(%rsp),%rdi <=(Y)
 0xfb83 <+403>: callq  *0x8(%rax) <- [3]
 0xfb89 <+409>: mov%rax,%r14
   }

   ret = [1] + N * ([2] + [3])


Another small difference:
  as you can see, comparing to (X), (Y) is using 0x28(%rsp) in the loop,
  instead of %r13.

So that's what I can see from the related generated code.
If 0x848(%r13) is not worse than 0x28(%rsp) for cpu, I think
asoc->stream.si->make_datafrag() is even better. No ?


Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread Neil Horman
On Fri, Dec 08, 2017 at 04:04:58PM +, David Laight wrote:
> From: Marcelo Ricardo Leitner
> > Sent: 08 December 2017 16:00
> ...
> > > Is it worth replacing the si struct with an index/enum value, and 
> > > indexing an
> > > array of method pointer structs?  That would save you at least one 
> > > dereference.
> > 
> > Hmmm, maybe, yes. It would be like
> > sctp_stream_interleave[asoc->stream.si].make_datafrag(...)
> 
> If you only expect 2 choices then an if () is likely
> to produce better code that the above.
> 
> The actual implementation can be hidden inside a #define
> or static inline function.
> 
Thats the real question though, will we expect more than two interleaving
strategies?  Currently its a boolean operation so the answer seems like yes, but
is there a possiblity of a biased interleaving, or other worthwhile algorithm?

Neil

>   David
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


RE: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread David Laight
From: Marcelo Ricardo Leitner
> Sent: 08 December 2017 16:00
...
> > Is it worth replacing the si struct with an index/enum value, and indexing 
> > an
> > array of method pointer structs?  That would save you at least one 
> > dereference.
> 
> Hmmm, maybe, yes. It would be like
> sctp_stream_interleave[asoc->stream.si].make_datafrag(...)

If you only expect 2 choices then an if () is likely
to produce better code that the above.

The actual implementation can be hidden inside a #define
or static inline function.

David



Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread 'Marcelo Ricardo Leitner'
On Fri, Dec 08, 2017 at 03:32:54PM +, David Laight wrote:
> From: 'Marcelo Ricardo Leitner'
> > Sent: 08 December 2017 15:16
> > On Fri, Dec 08, 2017 at 03:01:31PM +, David Laight wrote:
> > > From: Marcelo Ricardo Leitner
> > > > Sent: 08 December 2017 14:57
> > > >
> > > > On Fri, Dec 08, 2017 at 02:06:04PM +, David Laight wrote:
> > > > > From: Xin Long
> > > > > > Sent: 08 December 2017 13:04
> > > > > ...
> > > > > > @@ -264,8 +264,8 @@ struct sctp_datamsg 
> > > > > > *sctp_datamsg_from_user(struct sctp_association *asoc,
> > > > > > frag |= SCTP_DATA_SACK_IMM;
> > > > > > }
> > > > > >
> > > > > > -   chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > > > > -0, GFP_KERNEL);
> > > > > > +   chunk = asoc->stream.si->make_datafrag(asoc, sinfo, 
> > > > > > len, frag,
> > > > > > +  GFP_KERNEL);
> > > > >
> > > > > I know that none of the sctp code is very optimised, but that indirect
> > > > > call is going to be horrid.
> > > >
> > > > Yeah.. but there is no way to avoid the double derreference
> > > > considering we only have the asoc pointer in there and we have to
> > > > reach the contents of the data chunk operations struct, and the .si
> > > > part is the same as 'stream' part as it's a constant offset.
> > > ...
> > >
> > > It isn't only the double indirect, the indirect call itself isn't 'fun'.
> > 
> > I meant in this context.
> > 
> > The indirect call is so we don't have to flood the stack with
> > if (old data chunk fmt) {
> > ...
> > } else {
> > ...
> > }
> > 
> > So instead of this, we now have some key operations identified and
> > wrapped up behind this struct, allowing us to abstract whatever data
> > chunk format it is.
> 
> Nothing wrong with:
> #define foo(asoc, ...) \
>   if (asoc->new_fmt) \

Not all function pointers in sctp_stream_interleave have asoc as
a parameter, so we would have to have something like:

#define foo_asoc(asoc, ...) \
if (asoc->new_fmt) \
...

#define foo_chunk(chunk, ...) \
if (chunk->asoc->new_fmt) \
...

#define foo_ulpq(ulpq, ...) \
if (ulpq->asoc->new_fmt) \
...

and we're pretty much back to double deref.
Maybe some reworking on the parameters could alleviate some of these,
but not all.

>   foo_new(asoc, __VA_LIST__); \
>   else \
>   foo_old(asoc, __VA_LIST__);
> 
> > > I think there are other hot paths where you've replaced a sizeof()
> > > with a ?: clause.
> > > Caching the result might be much better.
> > 
> > The only new ?: clause I could find this patchset is on patch 12 and
> > has nothing to do with sizeof().
> > 
> > The sizeof() results are indeed cached, as you can see in patch 4:
> > +static struct sctp_stream_interleave sctp_stream_interleave_0 = {
> > +   .data_chunk_len = sizeof(struct sctp_data_chunk),
> > and the two helpers on it at the begining of the patch.
> 
> I was getting two bits mixed up.
> But the code that reads data_chunk_len is following an awful lot of pointers.

>From path 4:
max_data = asoc->pathmtu -
   sctp_sk(asoc->base.sk)->pf->af->net_header_len -
-  sizeof(struct sctphdr) - sizeof(struct sctp_data_chunk);
+  sizeof(struct sctphdr) - sctp_datachk_len(&asoc->stream);

You're worried with the double deref in sctp_datachk_len() but on the
line right above it we have 4 derefs. There are several other cases
of double deref in sctp stack even on hot path. Not sure why you're
picking on this one, but ok.

  Marcelo


Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread Marcelo Ricardo Leitner
On Fri, Dec 08, 2017 at 10:37:34AM -0500, Neil Horman wrote:
> On Fri, Dec 08, 2017 at 12:56:30PM -0200, Marcelo Ricardo Leitner wrote:
> > On Fri, Dec 08, 2017 at 02:06:04PM +, David Laight wrote:
> > > From: Xin Long
> > > > Sent: 08 December 2017 13:04
> > > ...
> > > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct 
> > > > sctp_association *asoc,
> > > > frag |= SCTP_DATA_SACK_IMM;
> > > > }
> > > > 
> > > > -   chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > > -0, GFP_KERNEL);
> > > > +   chunk = asoc->stream.si->make_datafrag(asoc, sinfo, 
> > > > len, frag,
> > > > +  GFP_KERNEL);
> > > 
> > > I know that none of the sctp code is very optimised, but that indirect
> > > call is going to be horrid.
> > 
> > Yeah.. but there is no way to avoid the double derreference
> > considering we only have the asoc pointer in there and we have to
> > reach the contents of the data chunk operations struct, and the .si
> > part is the same as 'stream' part as it's a constant offset.
> > 
> > Due to the for() in there, we could add a variable to store
> > asoc->stream.si outside the for and then we can do only a single deref
> > inside it. Xin, can you please try and see if the generated code is
> > different?
> > 
> > Other suggestions?
> > 
> Is it worth replacing the si struct with an index/enum value, and indexing an
> array of method pointer structs?  That would save you at least one 
> dereference.

Hmmm, maybe, yes. It would be like
sctp_stream_interleave[asoc->stream.si].make_datafrag(...)

Then same goes for pf->af, probably.

> 
> Alternatively you could preform the dereference in two steps (i.e. declare an 
> si
> pointer on the stack and set it equal to asoc->stream.si, then deref
> si->make_datafrag at call time.  That will at least give the compiler an
> opportunity to preload the first pointer.

Yep, that was my 2nd paragraph above :-) but it only works for cases
such as this one.

  Marcelo

> 
> Neil
> 
> >   Marcelo
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> > the body of a message to majord...@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread Neil Horman
On Fri, Dec 08, 2017 at 12:56:30PM -0200, Marcelo Ricardo Leitner wrote:
> On Fri, Dec 08, 2017 at 02:06:04PM +, David Laight wrote:
> > From: Xin Long
> > > Sent: 08 December 2017 13:04
> > ...
> > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct 
> > > sctp_association *asoc,
> > >   frag |= SCTP_DATA_SACK_IMM;
> > >   }
> > > 
> > > - chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > -  0, GFP_KERNEL);
> > > + chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > > +GFP_KERNEL);
> > 
> > I know that none of the sctp code is very optimised, but that indirect
> > call is going to be horrid.
> 
> Yeah.. but there is no way to avoid the double derreference
> considering we only have the asoc pointer in there and we have to
> reach the contents of the data chunk operations struct, and the .si
> part is the same as 'stream' part as it's a constant offset.
> 
> Due to the for() in there, we could add a variable to store
> asoc->stream.si outside the for and then we can do only a single deref
> inside it. Xin, can you please try and see if the generated code is
> different?
> 
> Other suggestions?
> 
Is it worth replacing the si struct with an index/enum value, and indexing an
array of method pointer structs?  That would save you at least one dereference.

Alternatively you could preform the dereference in two steps (i.e. declare an si
pointer on the stack and set it equal to asoc->stream.si, then deref
si->make_datafrag at call time.  That will at least give the compiler an
opportunity to preload the first pointer.

Neil

>   Marcelo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


RE: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread David Laight
From: 'Marcelo Ricardo Leitner'
> Sent: 08 December 2017 15:16
> On Fri, Dec 08, 2017 at 03:01:31PM +, David Laight wrote:
> > From: Marcelo Ricardo Leitner
> > > Sent: 08 December 2017 14:57
> > >
> > > On Fri, Dec 08, 2017 at 02:06:04PM +, David Laight wrote:
> > > > From: Xin Long
> > > > > Sent: 08 December 2017 13:04
> > > > ...
> > > > > @@ -264,8 +264,8 @@ struct sctp_datamsg 
> > > > > *sctp_datamsg_from_user(struct sctp_association *asoc,
> > > > >   frag |= SCTP_DATA_SACK_IMM;
> > > > >   }
> > > > >
> > > > > - chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > > > -  0, GFP_KERNEL);
> > > > > + chunk = asoc->stream.si->make_datafrag(asoc, sinfo, 
> > > > > len, frag,
> > > > > +GFP_KERNEL);
> > > >
> > > > I know that none of the sctp code is very optimised, but that indirect
> > > > call is going to be horrid.
> > >
> > > Yeah.. but there is no way to avoid the double derreference
> > > considering we only have the asoc pointer in there and we have to
> > > reach the contents of the data chunk operations struct, and the .si
> > > part is the same as 'stream' part as it's a constant offset.
> > ...
> >
> > It isn't only the double indirect, the indirect call itself isn't 'fun'.
> 
> I meant in this context.
> 
> The indirect call is so we don't have to flood the stack with
> if (old data chunk fmt) {
>   ...
> } else {
>   ...
> }
> 
> So instead of this, we now have some key operations identified and
> wrapped up behind this struct, allowing us to abstract whatever data
> chunk format it is.

Nothing wrong with:
#define foo(asoc, ...) \
if (asoc->new_fmt) \
foo_new(asoc, __VA_LIST__); \
else \
foo_old(asoc, __VA_LIST__);

> > I think there are other hot paths where you've replaced a sizeof()
> > with a ?: clause.
> > Caching the result might be much better.
> 
> The only new ?: clause I could find this patchset is on patch 12 and
> has nothing to do with sizeof().
> 
> The sizeof() results are indeed cached, as you can see in patch 4:
> +static struct sctp_stream_interleave sctp_stream_interleave_0 = {
> +   .data_chunk_len = sizeof(struct sctp_data_chunk),
> and the two helpers on it at the begining of the patch.

I was getting two bits mixed up.
But the code that reads data_chunk_len is following an awful lot of pointers.

David




Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread 'Marcelo Ricardo Leitner'
On Fri, Dec 08, 2017 at 03:01:31PM +, David Laight wrote:
> From: Marcelo Ricardo Leitner
> > Sent: 08 December 2017 14:57
> > 
> > On Fri, Dec 08, 2017 at 02:06:04PM +, David Laight wrote:
> > > From: Xin Long
> > > > Sent: 08 December 2017 13:04
> > > ...
> > > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct 
> > > > sctp_association *asoc,
> > > > frag |= SCTP_DATA_SACK_IMM;
> > > > }
> > > >
> > > > -   chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > > -0, GFP_KERNEL);
> > > > +   chunk = asoc->stream.si->make_datafrag(asoc, sinfo, 
> > > > len, frag,
> > > > +  GFP_KERNEL);
> > >
> > > I know that none of the sctp code is very optimised, but that indirect
> > > call is going to be horrid.
> > 
> > Yeah.. but there is no way to avoid the double derreference
> > considering we only have the asoc pointer in there and we have to
> > reach the contents of the data chunk operations struct, and the .si
> > part is the same as 'stream' part as it's a constant offset.
> ...
> 
> It isn't only the double indirect, the indirect call itself isn't 'fun'.

I meant in this context.

The indirect call is so we don't have to flood the stack with
if (old data chunk fmt) {
...
} else {
...
}

So instead of this, we now have some key operations identified and
wrapped up behind this struct, allowing us to abstract whatever data
chunk format it is.

> 
> I think there are other hot paths where you've replaced a sizeof()
> with a ?: clause.
> Caching the result might be much better.

The only new ?: clause I could find this patchset is on patch 12 and
has nothing to do with sizeof().

The sizeof() results are indeed cached, as you can see in patch 4:
+static struct sctp_stream_interleave sctp_stream_interleave_0 = {
+   .data_chunk_len = sizeof(struct sctp_data_chunk),
and the two helpers on it at the begining of the patch.

  Marcelo


RE: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread David Laight
From: Marcelo Ricardo Leitner
> Sent: 08 December 2017 14:57
> 
> On Fri, Dec 08, 2017 at 02:06:04PM +, David Laight wrote:
> > From: Xin Long
> > > Sent: 08 December 2017 13:04
> > ...
> > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct 
> > > sctp_association *asoc,
> > >   frag |= SCTP_DATA_SACK_IMM;
> > >   }
> > >
> > > - chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > -  0, GFP_KERNEL);
> > > + chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > > +GFP_KERNEL);
> >
> > I know that none of the sctp code is very optimised, but that indirect
> > call is going to be horrid.
> 
> Yeah.. but there is no way to avoid the double derreference
> considering we only have the asoc pointer in there and we have to
> reach the contents of the data chunk operations struct, and the .si
> part is the same as 'stream' part as it's a constant offset.
...

It isn't only the double indirect, the indirect call itself isn't 'fun'.

I think there are other hot paths where you've replaced a sizeof()
with a ?: clause.
Caching the result might be much better.

David



Re: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread Marcelo Ricardo Leitner
On Fri, Dec 08, 2017 at 02:06:04PM +, David Laight wrote:
> From: Xin Long
> > Sent: 08 December 2017 13:04
> ...
> > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct 
> > sctp_association *asoc,
> > frag |= SCTP_DATA_SACK_IMM;
> > }
> > 
> > -   chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > -0, GFP_KERNEL);
> > +   chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > +  GFP_KERNEL);
> 
> I know that none of the sctp code is very optimised, but that indirect
> call is going to be horrid.

Yeah.. but there is no way to avoid the double derreference
considering we only have the asoc pointer in there and we have to
reach the contents of the data chunk operations struct, and the .si
part is the same as 'stream' part as it's a constant offset.

Due to the for() in there, we could add a variable to store
asoc->stream.si outside the for and then we can do only a single deref
inside it. Xin, can you please try and see if the generated code is
different?

Other suggestions?

  Marcelo


RE: [PATCHv2 net-next 04/12] sctp: implement make_datafrag for sctp_stream_interleave

2017-12-08 Thread David Laight
From: Xin Long
> Sent: 08 December 2017 13:04
...
> @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct 
> sctp_association *asoc,
>   frag |= SCTP_DATA_SACK_IMM;
>   }
> 
> - chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> -  0, GFP_KERNEL);
> + chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> +GFP_KERNEL);

I know that none of the sctp code is very optimised, but that indirect
call is going to be horrid.

David