Re: [PATCH v6 0/4] Broadcom SBA RAID support

2017-03-28 Thread Anup Patel
On Tue, Mar 21, 2017 at 2:48 PM, Vinod Koul  wrote:
> On Tue, Mar 21, 2017 at 02:17:21PM +0530, Anup Patel wrote:
>> On Tue, Mar 21, 2017 at 2:00 PM, Vinod Koul  wrote:
>> > On Mon, Mar 06, 2017 at 03:13:24PM +0530, Anup Patel wrote:
>> >> The Broadcom SBA RAID is a stream-based device which provides
>> >> RAID5/6 offload.
>> >>
>> >> It requires a SoC specific ring manager (such as Broadcom FlexRM
>> >> ring manager) to provide ring-based programming interface. Due to
>> >> this, the Broadcom SBA RAID driver (mailbox client) implements
>> >> DMA device having one DMA channel using a set of mailbox channels
>> >> provided by Broadcom SoC specific ring manager driver (mailbox
>> >> controller).
>> >>
>> >> The Broadcom SBA RAID hardware requires PQ disk position instead
>> >> of PQ disk coefficient. To address this, we have added raid_gflog
>> >> table which will help driver to convert PQ disk coefficient to PQ
>> >> disk position.
>> >>
>> >> This patchset is based on Linux-4.11-rc1 and depends on patchset
>> >> "[PATCH v5 0/2] Broadcom FlexRM ring manager support"
>> >
>> > Okay I applied and was about to push when I noticed this :(
>> >
>> > So what is the status of this..?
>>
>> PATCH2 is Acked but PATCH1 is under review. Currently, it's
>> at v6 of that patchset.
>>
>> The only dependency on that patchset is the changes in
>> brcm-message.h which are required by this BCM-SBA-RAID
>> driver.
>>
>> @Jassi,
>> Can you please have a look at PATCH v6?
>
> And I would need an immutable branch/tag once merged. I am going to keep
> this series pending till then.

The Broadcom FlexRM patchset has been picked up by Jassi and
can be found in the mailbox-for-next branch of
git://git.linaro.org/landing-teams/working/fujitsu/integration

Both patchsets (the Broadcom FlexRM patchset and this one) are
also available in the sba-raid-v7 branch of
https://github.com/Broadcom/arm64-linux.git

Regards,
Anup


Re: [PATCH v2] arm64: dts: ls1012a: add crypto node

2017-03-28 Thread Shawn Guo
On Tue, Mar 28, 2017 at 02:46:19PM +0300, Horia Geantă wrote:
> LS1012A has a SEC v5.4 security engine.
> 
> Signed-off-by: Horia Geantă 

Applied, thanks.


Re: [PATCH v2 1/5] dt-bindings: Document STM32 CRC bindings

2017-03-28 Thread Rob Herring
On Tue, Mar 21, 2017 at 04:13:27PM +0100, Fabien Dessenne wrote:
> Document device tree bindings for the STM32 CRC (crypto CRC32)
> 
> Signed-off-by: Fabien Dessenne 
> ---
>  .../devicetree/bindings/crypto/st,stm32-crc.txt | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/crypto/st,stm32-crc.txt

Acked-by: Rob Herring 


Re: [PATCH v3 1/3] clk: meson-gxbb: expose clock CLKID_RNG0

2017-03-28 Thread Michael Turquette
Herbert,

On Thu, Mar 23, 2017 at 12:56 AM, Herbert Xu
 wrote:
> On Wed, Mar 22, 2017 at 08:24:08AM -0700, Kevin Hilman wrote:
>>
>> Because this will be causing conflicts with both the platform (amlogic)
>> tree and the clk tree, could you provide an immutable branch where these are
>> applied to help us handle these conflicts?
>
> If you apply the same patches to your tree there should be no
> conflicts at all.  git is able to resolve this automatically.

The commits will have different SHA1 hashes (commit id's). Having
multiple "copies" of the same patch with separate id's is undesirable
and completely avoidable. Immutable, shared branches resolve this
issue.

Best regards,
Mike

>
> Cheers,
> --
> Email: Herbert Xu 
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [RFC PATCH v2 16/32] x86: kvm: Provide support to create Guest and HV shared per-CPU variables

2017-03-28 Thread Borislav Petkov
On Thu, Mar 02, 2017 at 10:15:36AM -0500, Brijesh Singh wrote:
> Some KVM-specific MSRs (steal-time, asyncpf, avic_eio) allocate per-CPU
> variables at compile time and share their physical addresses with the
> hypervisor. This presents a challenge when SEV is active in the guest OS.
> When SEV is active, guest memory is encrypted with the guest key and the
> hypervisor is no longer able to modify the guest memory. When SEV is
> active, we need to clear the encryption attribute of shared physical
> addresses so that both guest and hypervisor can access the data.
> 
> To solve this problem, I have tried these three options:
> 
> 1) Convert the static per-CPU to dynamic per-CPU allocation. When SEV is
> detected then clear the encryption attribute. But while doing so I found
> that per-CPU dynamic allocator was not ready when kvm_guest_cpu_init was
> called.
> 
> 2) Since the encryption attribute works at PAGE_SIZE granularity, add some
> extra padding to 'struct kvm_steal_time' to make it PAGE_SIZE and then at
> runtime clear the encryption attribute of the full page. The downside of
> this is that we now need to modify the structure, which may break
> compatibility.

From the SEV-ES whitepaper:

"To facilitate this communication, the SEV-ES architecture defines
a Guest Hypervisor Communication Block (GHCB). The GHCB resides in a
page of shared memory so it is accessible to both the guest VM and the
hypervisor."

So this is kinda begging to be implemented with a shared page between
guest and host. And then put steal-time, ... etc in there too. Provided
there's enough room in the single page for the GHCB *and* our stuff.

> 
> 3) Define a new per-CPU section (.data..percpu.hv_shared) which will be
> used to hold the compile time shared per-CPU variables. When SEV is
> detected we map this section with encryption attribute cleared.
> 
> This patch implements #3. It introduces a new DEFINE_PER_CPU_HV_SHARED
> macro to create a compile-time per-CPU variable. When SEV is detected we
> map the per-CPU variable as decrypted (i.e. with the encryption attribute
> cleared).
> 
> Signed-off-by: Brijesh Singh 
> ---
>  arch/x86/kernel/kvm.c |   43 +++--
>  include/asm-generic/vmlinux.lds.h |3 +++
>  include/linux/percpu-defs.h   |9 
>  3 files changed, 48 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 099fcba..706a08e 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -75,8 +75,8 @@ static int parse_no_kvmclock_vsyscall(char *arg)
>  
>  early_param("no-kvmclock-vsyscall", parse_no_kvmclock_vsyscall);
>  
> -static DEFINE_PER_CPU(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
> -static DEFINE_PER_CPU(struct kvm_steal_time, steal_time) __aligned(64);
> +static DEFINE_PER_CPU_HV_SHARED(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
> +static DEFINE_PER_CPU_HV_SHARED(struct kvm_steal_time, steal_time) __aligned(64);
>  static int has_steal_clock = 0;
>  
>  /*
> @@ -290,6 +290,22 @@ static void __init paravirt_ops_setup(void)
>  #endif
>  }
>  
> +static int kvm_map_percpu_hv_shared(void *addr, unsigned long size)
> +{
> + /* When SEV is active, the percpu static variables initialized
> +  * in data section will contain the encrypted data so we first
> +  * need to decrypt it and then map it as decrypted.
> +  */

Kernel comments style is:

/*
 * A sentence ending with a full-stop.
 * Another sentence. ...
 * More sentences. ...
 */

But you get the idea. Please check your whole patchset for this.

> + if (sev_active()) {
> + unsigned long pa = slow_virt_to_phys(addr);
> +
> + sme_early_decrypt(pa, size);
> + return early_set_memory_decrypted(addr, size);
> + }
> +
> + return 0;
> +}
> +
>  static void kvm_register_steal_time(void)
>  {
>   int cpu = smp_processor_id();
> @@ -298,12 +314,17 @@ static void kvm_register_steal_time(void)
>   if (!has_steal_clock)
>   return;
>  
> + if (kvm_map_percpu_hv_shared(st, sizeof(*st))) {
> + pr_err("kvm-stealtime: failed to map hv_shared percpu\n");
> + return;
> + }
> +
>   wrmsrl(MSR_KVM_STEAL_TIME, (slow_virt_to_phys(st) | KVM_MSR_ENABLED));
>   pr_info("kvm-stealtime: cpu %d, msr %llx\n",
>   cpu, (unsigned long long) slow_virt_to_phys(st));
>  }
>  
> -static DEFINE_PER_CPU(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
> +static DEFINE_PER_CPU_HV_SHARED(unsigned long, kvm_apic_eoi) = KVM_PV_EOI_DISABLED;
>  
>  static notrace void kvm_guest_apic_eoi_write(u32 reg, u32 val)
>  {
> @@ -327,25 +348,33 @@ static void kvm_guest_cpu_init(void)
>   if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF) && kvmapf) {
>   u64 pa = slow_virt_to_phys(this_cpu_ptr(&apf_reason));
>  
> + if (kvm_map_percpu_hv_shared(this_cpu_ptr(&apf_reason),
> +   

Re: [PATCH 0/7] crypto: aes - allow generic AES to be omitted

2017-03-28 Thread Eric Biggers
On Tue, Mar 28, 2017 at 09:51:54AM +0100, Ard Biesheuvel wrote:
> On 28 March 2017 at 06:43, Eric Biggers  wrote:
> >
> > Just a thought: how about renaming CRYPTO_AES to CRYPTO_AES_GENERIC, then
> > renaming what you called CRYPTO_NEED_AES to CRYPTO_AES?  Then all the 
> > 'select
> > CRYPTO_AES' can remain as-is, instead of replacing them with the (in my 
> > opinion
> > uglier) 'select CRYPTO_NEED_AES'.  And it should still work for people who 
> > have
> > CRYPTO_AES=y or CRYPTO_AES=m in their kernel config, since they'll still 
> > get at
> > least one AES implementation (though they may stop getting the generic one).
> >
> > Also, in general I think we need better Kconfig help text.  As proposed you 
> > can
> > now toggle simply "AES cipher algorithms", and nowhere in the help text is 
> > it
> > mentioned that that is only the generic implementation, which you don't 
> > need if
> > you have enabled some other implementation.  Similarly for "Fixed time AES
> > cipher"; it perhaps should be mentioned that it's only useful if a 
> > fixed-time
> > implementation using special CPU instructions like AES-NI or ARMv8-CE isn't
> > usable.
> >
> 
> Thanks for the feedback. I take it you are on board with the general idea 
> then?
> 
> Re name change, those are good points. I will experiment with that.
> 
> I was a bit on the fence about modifying the x86 code more than
> required, but actually, I think it makes sense for the AES-NI code to
> use fixed-time AES as a fallback rather than the table-based x86 code,
> given that the fallback is rarely used (only when executed in the
> context of an interrupt taken from kernel code that is already using
> the FPU) and falling back to a non-fixed time implementation loses
> some guarantees that the AES-NI code gives.

Definitely, I just feel it needs to be cleaned up a little so that the different
AES config options and modules aren't quite as confusing to those not as
familiar with them.

Did you also consider having 

crypto_aes_set_key_generic()
and
crypto_aes_expand_key_ti()
crypto_aes_set_key_ti()

instead of crypto_aes_set_key() and crypto_aes_expand_key()?  As-is, it isn't
immediately clear which function is part of which module.

- Eric


Re: [PATCH v3 1/3] crypto: hw_random - Add new Exynos RNG driver

2017-03-28 Thread Krzysztof Kozlowski
On Tue, Mar 28, 2017 at 07:41:47PM +0200, Stephan Müller wrote:
> On Tuesday, 28 March 2017, 18:48:24 CEST, Krzysztof Kozlowski wrote:
> 
> Hi Krzysztof,
> 
> > I tested a little bit and:
> > 1. Seeding with some value
> > 2. generating random,
> > 3. kcapi_rng_destroy+kcapi_rng_init, (I cannot do a hardware reset except
> >reboot of entire system)
> > 4. seeding with the same value as in (1) - different random numbers.
> > 
> > Doing a system reboot and repeating above - different random numbers
> > (all are different: step (2) and in (4)).
> > 
> > Your test case also produces different random values every time.
> 
> Then I would assume that simply adding an outer loop to your for() loop to
> inject a seed larger than the minimum required seed size should be fine.

Yes, makes sense. I'll send an updated version of patch.

Best regards,
Krzysztof



Re: [PATCH v3 1/3] crypto: hw_random - Add new Exynos RNG driver

2017-03-28 Thread Stephan Müller
On Tuesday, 28 March 2017, 18:48:24 CEST, Krzysztof Kozlowski wrote:

Hi Krzysztof,

> I tested a little bit and:
> 1. Seeding with some value
> 2. generating random,
> 3. kcapi_rng_destroy+kcapi_rng_init, (I cannot do a hardware reset except
>reboot of entire system)
> 4. seeding with the same value as in (1) - different random numbers.
> 
> Doing a system reboot and repeating above - different random numbers
> (all are different: step (2) and in (4)).
> 
> Your test case also produces different random values every time.

Then I would assume that simply adding an outer loop to your for() loop to
inject a seed larger than the minimum required seed size should be fine.
> 
> Best regards,
> Krzysztof



Ciao
Stephan


Re: [PATCH v3 1/3] crypto: hw_random - Add new Exynos RNG driver

2017-03-28 Thread Krzysztof Kozlowski
On Mon, Mar 27, 2017 at 03:53:03PM +0200, Stephan Müller wrote:
> On Monday, 27 March 2017, 06:23:11 CEST, PrasannaKumar Muralidharan wrote:
> 
> Hi PrasannaKumar,
> 
> > > Oh my, if you are right with your first guess, this is a bad DRNG design.
> > > 
> > > Just out of curiosity: what happens if a caller invokes the seed function
> > > twice or more times (each time with the sufficient amount of bits)? What
> > > is
> > > your guess here?
> > 
> > Should the second seed use the random data generated by the device?
> 
> A DRNG should be capable of processing an arbitrary amount of seed data. It 
> may be the case that the seed data must be processed in chunks though.
> 

As I said, I do not know the implementation details of the hardware. They
are simply not disclosed.

> That said, it may be the case that after injecting one chunk of seed the 
> currently discussed RNG simply needs to generate a random number to process 
> the input data before another seed can be added. But that is pure speculation.
> 
> But I guess that can be easily tested: inject a known seed into the DRNG, 
> generate a random number, inject the same seed again and again generate a 
> random number. If both are identical (which I do not hope), then the internal 
> state is simply overwritten (strange DRNG design).
> 
> A similar test can be made to see whether a larger set of seed simply 
> overwrites the state or is really processed.
> 
> 1. seed
> 2. generate random data
> 3. reset
> 4. seed with another seed
> 5. generate random data
> 6. reset
> 7. seed with same data from 1
> 8. seed with same data from 2
> 9. generate random data
> 
> If data from 9 is identical to 2, then additional seed data is discarded ->
> bad design. If data from 9 is identical to 5, then the additional data
> overwrites the initial data -> bad DRNG design. If data from 9 matches
> neither 2 nor 5, then all seed is taken -> good design.

I tested a little bit and:
1. Seeding with some value
2. generating random,
3. kcapi_rng_destroy+kcapi_rng_init, (I cannot do a hardware reset except
   reboot of entire system)
4. seeding with the same value as in (1) - different random numbers.

Doing a system reboot and repeating above - different random numbers
(all are different: step (2) and in (4)).

Your test case also produces different random values every time.

Best regards,
Krzysztof



[PATCH V2] crypto: ccp - Rearrange structure members to minimize size

2017-03-28 Thread Gary R Hook
The AES GCM function (in ccp-ops) requires a fair amount of
stack space, which elicits a complaint when KASAN is enabled.
Rearranging and packing a few structures eliminates the
warning.

Signed-off-by: Gary R Hook 
---
 drivers/crypto/ccp/ccp-dev.h |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
index 3a45c2a..191274d 100644
--- a/drivers/crypto/ccp/ccp-dev.h
+++ b/drivers/crypto/ccp/ccp-dev.h
@@ -427,33 +427,33 @@ enum ccp_memtype {
 };
 #define CCP_MEMTYPE_LSB CCP_MEMTYPE_KSB
 
+
 struct ccp_dma_info {
dma_addr_t address;
unsigned int offset;
unsigned int length;
enum dma_data_direction dir;
-};
+} __packed __aligned(4);
 
 struct ccp_dm_workarea {
struct device *dev;
struct dma_pool *dma_pool;
-   unsigned int length;
 
u8 *address;
struct ccp_dma_info dma;
+   unsigned int length;
 };
 
 struct ccp_sg_workarea {
struct scatterlist *sg;
int nents;
+   unsigned int sg_used;
 
struct scatterlist *dma_sg;
struct device *dma_dev;
unsigned int dma_count;
enum dma_data_direction dma_dir;
 
-   unsigned int sg_used;
-
u64 bytes_left;
 };
 



Re: [PATCH 1/2] crypto: ccp - Reduce stack frame size with KASAN

2017-03-28 Thread Gary R Hook

On 03/28/2017 10:10 AM, Arnd Bergmann wrote:
>>> -};
>>> +} __packed __aligned(4);
>>
>> My gcc 4.8 doesn't understand __aligned(). Shouldn't we use
>> #pragma(4) here?
>
> That is odd, the __aligned() macro gets defined for all compiler versions
> in linux/compiler.h, and the aligned attribute should work for all supported
> compilers (3.2 and higher), while #pragma pack() requires gcc-4.0 or
> higher.

Tried again in a couple of trees. Not sure what I did wrong, but the
compiler seems to be happy now. Huh.

Will submit a V2.


--
This is my day job. Follow me at:
IG/Twitter/Facebook: @grhookphoto
IG/Twitter/Facebook: @grhphotographer


Re: [PATCH 1/2] crypto: ccp - Reduce stack frame size with KASAN

2017-03-28 Thread Gary R Hook

On 03/28/2017 10:10 AM, Arnd Bergmann wrote:
> On Tue, Mar 28, 2017 at 4:15 PM, Gary R Hook  wrote:
>> On 03/28/2017 04:58 AM, Arnd Bergmann wrote:
>>> The newly added AES GCM implementation uses one of the largest stack frames
>>
>>> diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
>>> index 3a45c2af2fbd..c5ea0796a891 100644
>>> --- a/drivers/crypto/ccp/ccp-dev.h
>>> +++ b/drivers/crypto/ccp/ccp-dev.h
>>> @@ -432,24 +432,24 @@ struct ccp_dma_info {
>>>  unsigned int offset;
>>>  unsigned int length;
>>>  enum dma_data_direction dir;
>>> -};
>>> +} __packed __aligned(4);
>>
>> My gcc 4.8 doesn't understand __aligned(). Shouldn't we use
>> #pragma(4) here?
>
> That is odd, the __aligned() macro gets defined for all compiler versions
> in linux/compiler.h, and the aligned attribute should work for all supported
> compilers (3.2 and higher), while #pragma pack() requires gcc-4.0 or
> higher.
>
> We generally prefer attribute syntax in the kernel over pragmas, even
> when they are functionally the same.

Yes, it's extremely odd, and I understand this preference. Please ignore my
submitted alternate and let me track this down.




Re: [PATCH 2/2] crypto: ccp - Mark driver as little-endian only

2017-03-28 Thread Gary R Hook

On 03/28/2017 09:59 AM, Arnd Bergmann wrote:
> On Tue, Mar 28, 2017 at 4:08 PM, Gary R Hook  wrote:
>>> In fact, the use of bit fields in hardware defined data structures is
>>> not portable to start with, so until all these bit fields get replaced
>>> by something else, the driver cannot work on big-endian machines, and
>>> I'm adding an annotation here to prevent it from being selected.
>>
>> This is a driver that talks to hardware, a device which, AFAIK, has no
>> plan to be implemented in a big endian flavor. I clearly need to be more
>> diligent in building with various checkers enabled. I'd prefer my fix
>> over your suggested refusal to compile, if that's okay.
>
> It's hard to predict the future. If this device ever makes it into an
> ARM based chip, the chances are relatively high that someone
> will eventually run a big-endian kernel on it. As long as it's guaranteed
> to be x86-only, the risk of anyone running into the bug is close to
> zero, but we normally still try to write device drivers in portable C
> code to prevent it from getting copied incorrectly into another driver.

Understood, and I had surmised as such. Totally agree.

>>> The CCPv3 code seems to not suffer from this problem, only v5 uses
>>> bitfields.
>>
>> Yes, I took a different approach when I wrote the code. IMO (arguably)
>> more readable. Same result: words full of hardware-dependent bit patterns.
>>
>> Please help me understand what I could do better.
>
> The rule for portable drivers is that you must not use bitfields in
> structures that can be accessed by the hardware. I think you can do this
> in a more readable way by removing the CCP5_CMD_* macros etc completely
> and just accessing the members of the structure as __le32 words.
> The main advantage for readability here is that you can grep for the
> struct members and see where they are used without following the
> macros. If it helps, you can also encapsulate the generation of the
> word inside of an inline function, like:
> ...

Please see my follow-on patch.



Re: [PATCH 1/2] crypto: ccp - Reduce stack frame size with KASAN

2017-03-28 Thread Arnd Bergmann
On Tue, Mar 28, 2017 at 4:15 PM, Gary R Hook  wrote:
> On 03/28/2017 04:58 AM, Arnd Bergmann wrote:
>> The newly added AES GCM implementation uses one of the largest stack frames

>> diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
>> index 3a45c2af2fbd..c5ea0796a891 100644
>> --- a/drivers/crypto/ccp/ccp-dev.h
>> +++ b/drivers/crypto/ccp/ccp-dev.h
>> @@ -432,24 +432,24 @@ struct ccp_dma_info {
>>  unsigned int offset;
>>  unsigned int length;
>>  enum dma_data_direction dir;
>> -};
>> +} __packed __aligned(4);
>
>
> My gcc 4.8 doesn't understand __aligned(). Shouldn't we use
> #pragma(4) here?

That is odd, the __aligned() macro gets defined for all compiler versions
in linux/compiler.h, and the aligned attribute should work for all supported
compilers (3.2 and higher), while #pragma pack() requires gcc-4.0 or
higher.

We generally prefer attribute syntax in the kernel over pragmas, even
when they are functionally the same.

  Arnd


Re: [PATCH 2/2] crypto: ccp - Mark driver as little-endian only

2017-03-28 Thread Arnd Bergmann
On Tue, Mar 28, 2017 at 4:08 PM, Gary R Hook  wrote:

>> In fact, the use of bit fields in hardware defined data structures is
>> not portable to start with, so until all these bit fields get replaced
>> by something else, the driver cannot work on big-endian machines, and
>> I'm adding an annotation here to prevent it from being selected.
>
>
> This is a driver that talks to hardware, a device which, AFAIK, has no
> plan to be implemented in a big endian flavor. I clearly need to be more
> diligent in building with various checkers enabled. I'd prefer my fix
> over your suggested refusal to compile, if that's okay.

It's hard to predict the future. If this device ever makes it into an
ARM based chip, the chances are relatively high that someone
will eventually run a big-endian kernel on it. As long as it's guaranteed
to be x86-only, the risk of anyone running into the bug is close to
zero, but we normally still try to write device drivers in portable C
code to prevent it from getting copied incorrectly into another driver.

>> The CCPv3 code seems to not suffer from this problem, only v5 uses
>> bitfields.
>
>
> Yes, I took a different approach when I wrote the code. IMO (arguably)
> more readable. Same result: words full of hardware-dependent bit patterns.
>
> Please help me understand what I could do better.

The rule for portable drivers is that you must not use bitfields in structures
that can be accessed by the hardware. I think you can do this in a more
readable way by removing the CCP5_CMD_* macros etc completely
and just accessing the members of the structure as __le32 words.
The main advantage for readability here is that you can grep for the
struct members and see where they are used without following the
macros. If it helps, you can also encapsulate the generation of the
word inside of an inline function, like:

static inline __le32 ccp5_cmd_dw0(bool soc, bool ioc, bool init, bool
eom, u32 engine)
{
u32 dw0 = (soc  ? CCP5_WORD0_SOC  : 0)  |
  (ioc  ? CCP5_WORD0_IOC  : 0)  |
  (init ? CCP5_WORD0_INIT : 0)  |
  (eom  ? CCP5_WORD0_EOM  : 0)  |
CCP5_WORD0_ENGINE(engine);

return __cpu_to_le32(dw0);
}

...
desc->dw0 = ccp5_cmd_dw0(op->soc, 0, op->init, op->oem, op->engine);

   Arnd


Re: [PATCH 1/2] crypto: ccp - Reduce stack frame size with KASAN

2017-03-28 Thread Arnd Bergmann
On Tue, Mar 28, 2017 at 4:15 PM, Gary R Hook  wrote:

>> A more drastic refactoring of the driver might be needed to reduce the
>> stack usage more substantially, but this patch is fairly simple and
>> at least addresses the third one of the problems I mentioned, reducing the
>> stack size by about 150 bytes and bringing it below the warning limit
>> I picked.
>
>
> Again, I'll devote some time to this.
>
>> diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
>> index 3a45c2af2fbd..c5ea0796a891 100644
>> --- a/drivers/crypto/ccp/ccp-dev.h
>> +++ b/drivers/crypto/ccp/ccp-dev.h
>> @@ -432,24 +432,24 @@ struct ccp_dma_info {
>>  unsigned int offset;
>>  unsigned int length;
>>  enum dma_data_direction dir;
>> -};
>> +} __packed __aligned(4);
>
>
> My gcc 4.8 doesn't understand __aligned(). Shouldn't we use
> #pragma(4) here?



>>  struct ccp_dm_workarea {
>>  struct device *dev;
>>  struct dma_pool *dma_pool;
>> -   unsigned int length;
>>
>>  u8 *address;
>>  struct ccp_dma_info dma;
>> +   unsigned int length;
>>  };
>>
>>  struct ccp_sg_workarea {
>>  struct scatterlist *sg;
>>  int nents;
>> +   unsigned int dma_count;
>>
>>  struct scatterlist *dma_sg;
>>  struct device *dma_dev;
>> -   unsigned int dma_count;
>>  enum dma_data_direction dma_dir;
>>
>>  unsigned int sg_used;
>
>
> I'm okay with rearranging, but I'm going to submit an alternative patch.

Ok, thanks a lot!


Re: [RFC TLS Offload Support 05/15] tcp: Add TLS socket options for TCP sockets

2017-03-28 Thread Tom Herbert
On Tue, Mar 28, 2017 at 6:26 AM, Aviad Yehezkel  wrote:
> This patch adds TLS_TX and TLS_RX TCP socket options.
>
> Setting these socket options will change the sk->sk_prot
> operations of the TCP socket. The user is responsible for
> preventing races between calls to the previous operations
> and the new operations. After successful return, data
> sent on this socket will be encapsulated in TLS.
>
> Signed-off-by: Aviad Yehezkel 
> Signed-off-by: Boris Pismenny 
> Signed-off-by: Ilya Lesokhin 
> ---
>  include/uapi/linux/tcp.h |  2 ++
>  net/ipv4/tcp.c   | 32 
>  2 files changed, 34 insertions(+)
>
> diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
> index c53de26..f9f0e29 100644
> --- a/include/uapi/linux/tcp.h
> +++ b/include/uapi/linux/tcp.h
> @@ -116,6 +116,8 @@ enum {
>  #define TCP_SAVE_SYN   27  /* Record SYN headers for new connections */
>  #define TCP_SAVED_SYN  28  /* Get SYN headers recorded for connection */
>  #define TCP_REPAIR_WINDOW  29  /* Get/set window parameters */
> +#define TCP_TLS_TX 30
> +#define TCP_TLS_RX 31
>
>  struct tcp_repair_opt {
> __u32   opt_code;
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 302fee9..2d190e3 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -273,6 +273,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -2676,6 +2677,21 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
> tp->notsent_lowat = val;
> sk->sk_write_space(sk);
> break;
> +   case TCP_TLS_TX:
> +   case TCP_TLS_RX: {
> +   int (*fn)(struct sock *sk, int optname,
> + char __user *optval, unsigned int optlen);
> +
> +   fn = symbol_get(tls_sk_attach);
> +   if (!fn) {
> +   err = -EINVAL;
> +   break;
> +   }
> +
> +   err = fn(sk, optname, optval, optlen);
> +   symbol_put(tls_sk_attach);
> +   break;
> +   }
> default:
> err = -ENOPROTOOPT;
> break;
> @@ -3064,6 +3080,22 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
> }
> return 0;
> }
> +   case TCP_TLS_TX:
> +   case TCP_TLS_RX: {
> +   int err;
> +   int (*fn)(struct sock *sk, int optname,
> + char __user *optval, int __user *optlen);
> +
> +   fn = symbol_get(tls_sk_query);
> +   if (!fn) {
> +   err = -EINVAL;
> +   break;
> +   }
> +
> +   err = fn(sk, optname, optval, optlen);
> +   symbol_put(tls_sk_query);
> +   return err;
> +   }

This mechanism should be generalized. If we can do this with TLS then
there will likely be other ULPs that we might want to set on a TCP
socket. Maybe something like TCP_ULP_PUSH, TCP_ULP_POP (borrowing from
STREAMS ever so slightly :-) ). I'd also suggest that the ULPs are
indicated by a text string in the socket option argument, then have
each ULP perform a registration for their service.


> default:
> return -ENOPROTOOPT;
> }
> --
> 2.7.4
>


Re: [PATCH 1/2] crypto: ccp - Reduce stack frame size with KASAN

2017-03-28 Thread Gary R Hook
On 03/28/2017 04:58 AM, Arnd Bergmann wrote:
> The newly added AES GCM implementation uses one of the largest stack frames
> in the kernel, around 1KB on normal 64-bit kernels, and 1.6KB when
> CONFIG_KASAN is enabled:
>
> drivers/crypto/ccp/ccp-ops.c: In function 'ccp_run_aes_gcm_cmd':
> drivers/crypto/ccp/ccp-ops.c:851:1: error: the frame size of 1632 bytes
> is larger than 1536 bytes [-Werror=frame-larger-than=]
>
> This is problematic for multiple reasons:
>
>  - The crypto functions are often used in deep call chains, e.g. behind
>    mm, fs and dm layers, making it more likely to run into an actual stack
>    overflow
>
>  - Using this much stack space is an indicator that the code is not
>    written to be as efficient as it could be.

I'm not sure I agree that A -> B, but I will certainly look into this.

>  - While this goes unnoticed at the moment in mainline with the frame size
>    warning being disabled when KASAN is in use, I would like to enable
>    the warning again, and the current code is slightly above my arbitrary
>    pick for a limit of 1536 bytes (I already did patches for every other
>    driver exceeding this).

I've got my stack frame size (also) set to 1536, and would have paid more
attention had a warning occurred due to my code.

> A more drastic refactoring of the driver might be needed to reduce the
> stack usage more substantially, but this patch is fairly simple and
> at least addresses the third one of the problems I mentioned, reducing the
> stack size by about 150 bytes and bringing it below the warning limit
> I picked.

Again, I'll devote some time to this.

> diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
> index 3a45c2af2fbd..c5ea0796a891 100644
> --- a/drivers/crypto/ccp/ccp-dev.h
> +++ b/drivers/crypto/ccp/ccp-dev.h
> @@ -432,24 +432,24 @@ struct ccp_dma_info {
>  unsigned int offset;
>  unsigned int length;
>  enum dma_data_direction dir;
> -};
> +} __packed __aligned(4);

My gcc 4.8 doesn't understand __aligned(). Shouldn't we use
#pragma(4) here?

>  struct ccp_dm_workarea {
>  struct device *dev;
>  struct dma_pool *dma_pool;
> -   unsigned int length;
>
>  u8 *address;
>  struct ccp_dma_info dma;
> +   unsigned int length;
>  };
>
>  struct ccp_sg_workarea {
>  struct scatterlist *sg;
>  int nents;
> +   unsigned int dma_count;
>
>  struct scatterlist *dma_sg;
>  struct device *dma_dev;
> -   unsigned int dma_count;
>  enum dma_data_direction dma_dir;
>
>  unsigned int sg_used;

I'm okay with rearranging, but I'm going to submit an alternative patch.



Re: [PATCH 2/2] crypto: ccp - Mark driver as little-endian only

2017-03-28 Thread Gary R Hook

Ack. Didn't reply-all. Sorry, Arnd.

There was a kbuild robot warning about this and I submitted a patch just now.

(I thought) my mistake was (in this function) not handling the structure
elements in the same manner as other functions. My patch rectifies that.

On 03/28/2017 04:58 AM, Arnd Bergmann wrote:

The driver causes a warning when built as big-endian:

drivers/crypto/ccp/ccp-dev-v5.c: In function 'ccp5_perform_des3':
include/uapi/linux/byteorder/big_endian.h:32:26: error: large integer
implicitly truncated to unsigned type [-Werror=overflow]
 #define __cpu_to_le32(x) ((__force __le32)__swab32((x)))
  ^
include/linux/byteorder/generic.h:87:21: note: in expansion of macro
'__cpu_to_le32'
 #define cpu_to_le32 __cpu_to_le32
 ^
drivers/crypto/ccp/ccp-dev-v5.c:436:28: note: in expansion of macro
'cpu_to_le32'
  CCP5_CMD_KEY_MEM(&desc) = cpu_to_le32(CCP_MEMTYPE_SB);

The warning is correct: doing a 32-bit byte swap on a value that gets
assigned into a bit field cannot work, since we would only write zeroes
in that case, regardless of the input.
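A stand-alone sketch of the effect, using a made-up 3-bit field in place of the CCPv5 descriptor macros (not the driver's actual layout): byte-swapping a small constant moves its payload into the top byte, so assigning the result to a narrow bit field keeps only low-order zero bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 3-bit descriptor field; can only hold values 0..7. */
struct desc_word {
	uint32_t key_mem : 3;
	uint32_t rest    : 29;
};

static uint32_t assign_plain(uint32_t x)
{
	struct desc_word d = { 0 };

	d.key_mem = x;                    /* direct assignment works */
	return d.key_mem;
}

static uint32_t assign_swapped(uint32_t x)
{
	struct desc_word d = { 0 };

	/* models what cpu_to_le32() does on a big-endian CPU:
	 * bswap32(2) == 0x02000000, whose low 3 bits are all zero */
	d.key_mem = __builtin_bswap32(x);
	return d.key_mem;
}
```

So on big-endian the swapped assignment silently writes 0 for any small memtype constant, which is exactly what the -Werror=overflow diagnostic flagged.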


Yes, this was all wrong.


In fact, the use of bit fields in hardware defined data structures is
not portable to start with, so until all these bit fields get replaced
by something else, the driver cannot work on big-endian machines, and
I'm adding an annotation here to prevent it from being selected.


This is a driver that talks to hardware, a device which, AFAIK, has no
plan to be implemented in a big endian flavor. I clearly need to be more
diligent in building with various checkers enabled. I'd prefer my fix
over your suggested refusal to compile, if that's okay.


The CCPv3 code seems to not suffer from this problem, only v5 uses
bitfields.


Yes, I took a different approach when I wrote the code. IMO (arguably)
more readable. Same result: words full of hardware-dependent bit patterns.

Please help me understand what I could do better.

--
This is my day job. Follow me at:
IG/Twitter/Facebook: @grhookphoto
IG/Twitter/Facebook: @grhphotographer


[PATCH] crypto: ccp - Remove redundant cpu-to-le32 macros

2017-03-28 Thread Gary R Hook
Endianness is dealt with when the command descriptor is
copied into the command queue. Remove any occurrences of
cpu_to_le32() found elsewhere.

Signed-off-by: Gary R Hook 
---
 drivers/crypto/ccp/ccp-dev-v5.c |   22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-dev-v5.c b/drivers/crypto/ccp/ccp-dev-v5.c
index 5e08654..e03d06a 100644
--- a/drivers/crypto/ccp/ccp-dev-v5.c
+++ b/drivers/crypto/ccp/ccp-dev-v5.c
@@ -419,22 +419,22 @@ static int ccp5_perform_des3(struct ccp_op *op)
CCP_DES3_ENCRYPT(&function) = op->u.des3.action;
CCP_DES3_MODE(&function) = op->u.des3.mode;
CCP_DES3_TYPE(&function) = op->u.des3.type;
-   CCP5_CMD_FUNCTION(&desc) = cpu_to_le32(function.raw);
+   CCP5_CMD_FUNCTION(&desc) = function.raw;
 
-   CCP5_CMD_LEN(&desc) = cpu_to_le32(op->src.u.dma.length);
+   CCP5_CMD_LEN(&desc) = op->src.u.dma.length;
 
-   CCP5_CMD_SRC_LO(&desc) = cpu_to_le32(ccp_addr_lo(&op->src.u.dma));
-   CCP5_CMD_SRC_HI(&desc) = cpu_to_le32(ccp_addr_hi(&op->src.u.dma));
-   CCP5_CMD_SRC_MEM(&desc) = cpu_to_le32(CCP_MEMTYPE_SYSTEM);
+   CCP5_CMD_SRC_LO(&desc) = ccp_addr_lo(&op->src.u.dma);
+   CCP5_CMD_SRC_HI(&desc) = ccp_addr_hi(&op->src.u.dma);
+   CCP5_CMD_SRC_MEM(&desc) = CCP_MEMTYPE_SYSTEM;
 
-   CCP5_CMD_DST_LO(&desc) = cpu_to_le32(ccp_addr_lo(&op->dst.u.dma));
-   CCP5_CMD_DST_HI(&desc) = cpu_to_le32(ccp_addr_hi(&op->dst.u.dma));
-   CCP5_CMD_DST_MEM(&desc) = cpu_to_le32(CCP_MEMTYPE_SYSTEM);
+   CCP5_CMD_DST_LO(&desc) = ccp_addr_lo(&op->dst.u.dma);
+   CCP5_CMD_DST_HI(&desc) = ccp_addr_hi(&op->dst.u.dma);
+   CCP5_CMD_DST_MEM(&desc) = CCP_MEMTYPE_SYSTEM;
 
-   CCP5_CMD_KEY_LO(&desc) = cpu_to_le32(lower_32_bits(key_addr));
+   CCP5_CMD_KEY_LO(&desc) = lower_32_bits(key_addr);
CCP5_CMD_KEY_HI(&desc) = 0;
-   CCP5_CMD_KEY_MEM(&desc) = cpu_to_le32(CCP_MEMTYPE_SB);
-   CCP5_CMD_LSB_ID(&desc) = cpu_to_le32(op->sb_ctx);
+   CCP5_CMD_KEY_MEM(&desc) = CCP_MEMTYPE_SB;
+   CCP5_CMD_LSB_ID(&desc) = op->sb_ctx;
 
return ccp5_do_cmd(&desc, op->cmd_q);
 }



[RFC TLS Offload Support 10/15] mlx/tls: Add mlx_accel offload driver for TLS

2017-03-28 Thread Aviad Yehezkel
From: Ilya Lesokhin 

Implement the transmit and receive callbacks as well as the netdev
operations for adding and removing sockets.

Signed-off-by: Guy Shapiro 
Signed-off-by: Ilya Lesokhin 
Signed-off-by: Matan Barak 
Signed-off-by: Haggai Eran 
Signed-off-by: Aviad Yehezkel 
---
 .../net/ethernet/mellanox/accelerator/tls/tls.c| 652 +
 .../net/ethernet/mellanox/accelerator/tls/tls.h| 100 
 2 files changed, 752 insertions(+)
 create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls.c
 create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls.h

diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls.c 
b/drivers/net/ethernet/mellanox/accelerator/tls/tls.c
new file mode 100644
index 000..07a4b67
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls.c
@@ -0,0 +1,652 @@
+/*
+ * Copyright (c) 2015-2017 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+#include "tls.h"
+#include "tls_sysfs.h"
+#include "tls_hw.h"
+#include "tls_cmds.h"
+#include 
+#include 
+
+static LIST_HEAD(mlx_tls_devs);
+static DEFINE_MUTEX(mlx_tls_mutex);
+
+/* Start of context identifiers range (inclusive) */
+#define SWID_START 5
+/* End of context identifiers range (exclusive) */
+#define SWID_END   BIT(24)
+
+static netdev_features_t mlx_tls_feature_chk(struct sk_buff *skb,
+struct net_device *netdev,
+netdev_features_t features,
+bool *done)
+{
+   return features;
+}
+
+int mlx_tls_get_count(struct net_device *netdev)
+{
+   return 0;
+}
+
+int mlx_tls_get_strings(struct net_device *netdev, uint8_t *data)
+{
+   return 0;
+}
+
+int mlx_tls_get_stats(struct net_device *netdev, u64 *data)
+{
+   return 0;
+}
+
+/* must hold mlx_tls_mutex to call this function */
+static struct mlx_tls_dev *find_mlx_tls_dev_by_netdev(
+   struct net_device *netdev)
+{
+   struct mlx_tls_dev *dev;
+
+   list_for_each_entry(dev, &mlx_tls_devs, accel_dev_list) {
+   if (dev->netdev == netdev)
+   return dev;
+   }
+
+   return NULL;
+}
+
+struct mlx_tls_offload_context *get_tls_context(struct sock *sk)
+{
+   struct tls_context *tls_ctx = tls_get_ctx(sk);
+
+   return container_of(tls_offload_ctx(tls_ctx),
+   struct mlx_tls_offload_context,
+   context);
+}
+
+static int mlx_tls_add(struct net_device *netdev,
+  struct sock *sk,
+  enum tls_offload_ctx_dir direction,
+  struct tls_crypto_info *crypto_info,
+  struct tls_offload_context **ctx)
+{
+   struct tls_crypto_info_aes_gcm_128 *crypto_info_aes_gcm_128;
+   struct mlx_tls_offload_context *context;
+   struct mlx_tls_dev *dev;
+   int swid;
+   int ret;
+
+   pr_info("mlx_tls_add called\n");
+
+   if (direction == TLS_OFFLOAD_CTX_DIR_RX) {
+   pr_err("mlx_tls_add(): do not support recv\n");
+   ret = -EINVAL;
+   goto out;
+   }
+
+   if (!crypto_info ||
+   crypto_info->cipher_type != TLS_CIPHER_AES_GCM_128) {
+   pr_err("mlx_tls_add(): support only aes_gcm_128\n");
+   ret = -EINVAL;
+   goto out;
+   }
+   crypto_info_aes_gcm_128 =
+   (struct tls_crypto_info_aes_gcm_128 *)crypto_info;
+
+   dev = find_mlx_tls_dev_by_netdev(netdev);
+   if (!dev) {
+  

[RFC TLS Offload Support 07/15] mlx/mlx5_core: Allow sending multiple packets

2017-03-28 Thread Aviad Yehezkel
From: Ilya Lesokhin 

Modify mlx5e_xmit to transmit multiple packets chained
using skb->next.

Signed-off-by: Ilya Lesokhin 
Signed-off-by: Aviad Yehezkel 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index e6ce509..f2d0cc0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -35,7 +35,7 @@
 #include "en.h"
 
 #define MLX5E_SQ_NOPS_ROOM  MLX5_SEND_WQE_MAX_WQEBBS
-#define MLX5E_SQ_STOP_ROOM (MLX5_SEND_WQE_MAX_WQEBBS +\
+#define MLX5E_SQ_STOP_ROOM (2 * MLX5_SEND_WQE_MAX_WQEBBS +\
MLX5E_SQ_NOPS_ROOM)
 
 void mlx5e_send_nop(struct mlx5e_sq *sq, bool notify_hw)
@@ -405,6 +405,8 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct 
net_device *dev)
struct mlx5e_sq *sq = NULL;
struct mlx5_accel_ops *accel_ops;
struct mlx5_swp_info swp_info = {0};
+   struct sk_buff *next;
+   int rc;
 
rcu_read_lock();
accel_ops = mlx5_accel_get(priv->mdev);
@@ -417,7 +419,12 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct 
net_device *dev)
 
sq = priv->txq_to_sq_map[skb_get_queue_mapping(skb)];
 
-   return mlx5e_sq_xmit(sq, skb, &swp_info);
+   do {
+   next = skb->next;
+   rc = mlx5e_sq_xmit(sq, skb, &swp_info);
+   skb = next;
+   } while (next);
+   return rc;
 }
 
 bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
-- 
2.7.4
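The loop added above reads skb->next before handing the skb to the queue, because the queue owns (and may free or relink) the packet afterwards. A plain-C sketch of the same pattern, with a made-up struct standing in for sk_buff:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an sk_buff chained via ->next. */
struct pkt {
	struct pkt *next;
	int id;
};

/* Mirrors the mlx5e_xmit loop: save the next pointer *before*
 * transmitting, then advance; return the last call's rc. */
static int xmit_chain(struct pkt *p, int (*xmit_one)(struct pkt *))
{
	int rc = 0;
	struct pkt *next;

	do {
		next = p->next;
		rc = xmit_one(p);
		p = next;
	} while (next);
	return rc;
}

static int n_sent;

static int count_one(struct pkt *p)
{
	p->next = NULL;   /* the "queue" now owns the packet */
	return ++n_sent;
}

static int run_demo(void)
{
	struct pkt c = { NULL, 3 }, b = { &c, 2 }, a = { &b, 1 };

	return xmit_chain(&a, count_one);
}
```

Reading p->next after xmit_one() would be a use-after-free in the real driver; saving it first is the whole point of the two added lines in the patch.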



[PATCH 4.10 075/111] hwrng: amd - Revert managed API changes

2017-03-28 Thread Greg Kroah-Hartman
4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Prarit Bhargava 

commit 69db7009318758769d625b023402161c750f7876 upstream.

After commit 31b2a73c9c5f ("hwrng: amd - Migrate to managed API"), the
amd-rng driver uses devres with pci_dev->dev to keep track of resources,
but does not actually register a PCI driver.  This results in the
following issues:

1. The message

WARNING: CPU: 2 PID: 621 at drivers/base/dd.c:349 driver_probe_device+0x38c

is output when the i2c_amd756 driver loads and attempts to register a PCI
driver.  The PCI & device subsystems assume that no resources have been
registered for the device, and the WARN_ON() triggers since amd-rng has
already done so.

2.  The driver leaks memory because the driver does not attach to a
device.  The driver only uses the PCI device as a reference.   devm_*()
functions will release resources on driver detach, which the amd-rng
driver will never do.  As a result,

3.  The driver cannot be reloaded because there is always a use of the
ioport and region after the first load of the driver.

Revert the changes made by 31b2a73c9c5f ("hwrng: amd - Migrate to managed
API").

Signed-off-by: Prarit Bhargava 
Fixes: 31b2a73c9c5f ("hwrng: amd - Migrate to managed API").
Cc: Matt Mackall 
Cc: Corentin LABBE 
Cc: PrasannaKumar Muralidharan 
Cc: Wei Yongjun 
Cc: linux-crypto@vger.kernel.org
Cc: linux-ge...@lists.infradead.org
Signed-off-by: Herbert Xu 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/char/hw_random/amd-rng.c |   42 +++
 1 file changed, 34 insertions(+), 8 deletions(-)

--- a/drivers/char/hw_random/amd-rng.c
+++ b/drivers/char/hw_random/amd-rng.c
@@ -55,6 +55,7 @@ MODULE_DEVICE_TABLE(pci, pci_tbl);
 struct amd768_priv {
void __iomem *iobase;
struct pci_dev *pcidev;
+   u32 pmbase;
 };
 
 static int amd_rng_read(struct hwrng *rng, void *buf, size_t max, bool wait)
@@ -148,33 +149,58 @@ found:
if (pmbase == 0)
return -EIO;
 
-   priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+   priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
 
-   if (!devm_request_region(&pdev->dev, pmbase + PMBASE_OFFSET,
-   PMBASE_SIZE, DRV_NAME)) {
+   if (!request_region(pmbase + PMBASE_OFFSET, PMBASE_SIZE, DRV_NAME)) {
dev_err(&pdev->dev, DRV_NAME " region 0x%x already in use!\n",
pmbase + 0xF0);
-   return -EBUSY;
+   err = -EBUSY;
+   goto out;
}
 
-   priv->iobase = devm_ioport_map(&pdev->dev, pmbase + PMBASE_OFFSET,
-   PMBASE_SIZE);
+   priv->iobase = ioport_map(pmbase + PMBASE_OFFSET, PMBASE_SIZE);
if (!priv->iobase) {
pr_err(DRV_NAME "Cannot map ioport\n");
-   return -ENOMEM;
+   err = -EINVAL;
+   goto err_iomap;
}
 
amd_rng.priv = (unsigned long)priv;
+   priv->pmbase = pmbase;
priv->pcidev = pdev;
 
pr_info(DRV_NAME " detected\n");
-   return devm_hwrng_register(&pdev->dev, &amd_rng);
+   err = hwrng_register(&amd_rng);
+   if (err) {
+   pr_err(DRV_NAME " registering failed (%d)\n", err);
+   goto err_hwrng;
+   }
+   return 0;
+
+err_hwrng:
+   ioport_unmap(priv->iobase);
+err_iomap:
+   release_region(pmbase + PMBASE_OFFSET, PMBASE_SIZE);
+out:
+   kfree(priv);
+   return err;
 }
 
 static void __exit mod_exit(void)
 {
+   struct amd768_priv *priv;
+
+   priv = (struct amd768_priv *)amd_rng.priv;
+
+   hwrng_unregister(&amd_rng);
+
+   ioport_unmap(priv->iobase);
+
+   release_region(priv->pmbase + PMBASE_OFFSET, PMBASE_SIZE);
+
+   kfree(priv);
 }
 
 module_init(mod_init);




[PATCH 4.10 076/111] hwrng: geode - Revert managed API changes

2017-03-28 Thread Greg Kroah-Hartman
4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Prarit Bhargava 

commit 8c75704ebcac2ffa31ee7bcc359baf701b52bf00 upstream.

After commit e9afc746299d ("hwrng: geode - Use linux/io.h instead of
asm/io.h") the geode-rng driver uses devres with pci_dev->dev to keep
track of resources, but does not actually register a PCI driver.  This
results in the following issues:

1.  The driver leaks memory because the driver does not attach to a
device.  The driver only uses the PCI device as a reference.   devm_*()
functions will release resources on driver detach, which the geode-rng
driver will never do.  As a result,

2.  The driver cannot be reloaded because there is always a use of the
ioport and region after the first load of the driver.

Revert the changes made by  e9afc746299d ("hwrng: geode - Use linux/io.h
instead of asm/io.h").

Signed-off-by: Prarit Bhargava 
Fixes: 6e9b5e76882c ("hwrng: geode - Migrate to managed API")
Cc: Matt Mackall 
Cc: Corentin LABBE 
Cc: PrasannaKumar Muralidharan 
Cc: Wei Yongjun 
Cc: linux-crypto@vger.kernel.org
Cc: linux-ge...@lists.infradead.org
Signed-off-by: Herbert Xu 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/char/hw_random/geode-rng.c |   50 +
 1 file changed, 35 insertions(+), 15 deletions(-)

--- a/drivers/char/hw_random/geode-rng.c
+++ b/drivers/char/hw_random/geode-rng.c
@@ -31,6 +31,9 @@
 #include 
 #include 
 
+
+#define PFX KBUILD_MODNAME ": "
+
 #define GEODE_RNG_DATA_REG   0x50
 #define GEODE_RNG_STATUS_REG 0x54
 
@@ -82,6 +85,7 @@ static struct hwrng geode_rng = {
 
 static int __init mod_init(void)
 {
+   int err = -ENODEV;
struct pci_dev *pdev = NULL;
const struct pci_device_id *ent;
void __iomem *mem;
@@ -89,27 +93,43 @@ static int __init mod_init(void)
 
for_each_pci_dev(pdev) {
ent = pci_match_id(pci_tbl, pdev);
-   if (ent) {
-   rng_base = pci_resource_start(pdev, 0);
-   if (rng_base == 0)
-   return -ENODEV;
-
-   mem = devm_ioremap(&pdev->dev, rng_base, 0x58);
-   if (!mem)
-   return -ENOMEM;
-   geode_rng.priv = (unsigned long)mem;
-
-   pr_info("AMD Geode RNG detected\n");
-   return devm_hwrng_register(&pdev->dev, &geode_rng);
-   }
+   if (ent)
+   goto found;
}
-
/* Device not found. */
-   return -ENODEV;
+   goto out;
+
+found:
+   rng_base = pci_resource_start(pdev, 0);
+   if (rng_base == 0)
+   goto out;
+   err = -ENOMEM;
+   mem = ioremap(rng_base, 0x58);
+   if (!mem)
+   goto out;
+   geode_rng.priv = (unsigned long)mem;
+
+   pr_info("AMD Geode RNG detected\n");
+   err = hwrng_register(&geode_rng);
+   if (err) {
+   pr_err(PFX "RNG registering failed (%d)\n",
+  err);
+   goto err_unmap;
+   }
+out:
+   return err;
+
+err_unmap:
+   iounmap(mem);
+   goto out;
 }
 
 static void __exit mod_exit(void)
 {
+   void __iomem *mem = (void __iomem *)geode_rng.priv;
+
+   hwrng_unregister(&geode_rng);
+   iounmap(mem);
 }
 
 module_init(mod_init);




[RFC TLS Offload Support 08/15] mlx/tls: Hardware interface

2017-03-28 Thread Aviad Yehezkel
From: Ilya Lesokhin 

Implement the hardware interface to set up TLS offload.

Signed-off-by: Guy Shapiro 
Signed-off-by: Ilya Lesokhin 
Signed-off-by: Matan Barak 
Signed-off-by: Haggai Eran 
Signed-off-by: Aviad Yehezkel 
---
 .../ethernet/mellanox/accelerator/tls/tls_cmds.h   | 112 ++
 .../net/ethernet/mellanox/accelerator/tls/tls_hw.c | 429 +
 .../net/ethernet/mellanox/accelerator/tls/tls_hw.h |  49 +++
 3 files changed, 590 insertions(+)
 create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_cmds.h
 create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_hw.c
 create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_hw.h

diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls_cmds.h 
b/drivers/net/ethernet/mellanox/accelerator/tls/tls_cmds.h
new file mode 100644
index 000..8916f00
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls_cmds.h
@@ -0,0 +1,112 @@
+/*
+ * Copyright (c) 2015-2017 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#ifndef MLX_TLS_CMDS_H
+#define MLX_TLS_CMDS_H
+
+#define MLX_TLS_SADB_RDMA
+
+enum fpga_cmds {
+   CMD_SETUP_STREAM= 1,
+   CMD_TEARDOWN_STREAM = 2,
+};
+
+enum fpga_response {
+   EVENT_SETUP_STREAM_RESPONSE = 0x81,
+};
+
+#define TLS_TCP_IP_PROTO   BIT(3)  /* 0 - UDP; 1 - TCP */
+#define TLS_TCP_INIT   BIT(2)  /* 1 - Initialized */
+#define TLS_TCP_VALID  BIT(1)  /* 1 - Valid */
+#define TLS_TCP_IPV6   BIT(0)  /* 0 - IPv4;1 - IPv6 */
+
+struct tls_cntx_tcp {
+   __be32 ip_da[4];
+   __be32 flags;
+   __be16 src_port;
+   __be16 dst_port;
+   u32 pad;
+   __be32 tcp_sn;
+   __be32 ip_sa[4];
+} __packed;
+
+struct tls_cntx_crypto {
+   u8 enc_state[16];
+   u8 enc_key[32];
+} __packed;
+
+struct tls_cntx_record {
+   u8 rcd_sn[8];
+   u16 pad;
+   u8 flags;
+   u8 rcd_type_ver;
+   __be32 rcd_tcp_sn_nxt;
+   __be32 rcd_implicit_iv;
+   u8 rcd_residue[32];
+} __packed;
+
+#define TLS_RCD_ENC_AES_GCM128 (0)
+#define TLS_RCD_ENC_AES_GCM256 (BIT(4))
+#define TLS_RCD_AUTH_AES_GCM128(0)
+#define TLS_RCD_AUTH_AES_GCM256(1)
+
+#define TLS_RCD_VER_1_2(3)
+
+struct tls_cntx {
+   struct tls_cntx_tcp tcp;
+   struct tls_cntx_record  rcd;
+   struct tls_cntx_crypto  crypto;
+} __packed;
+
+struct setup_stream_cmd {
+   u8 cmd;
+   __be32 stream_id;
+   struct tls_cntx tls;
+} __packed;
+
+struct teardown_stream_cmd {
+   u8 cmd;
+   __be32 stream_id;
+} __packed;
+
+struct generic_event {
+   __be32 opcode;
+   __be32 stream_id;
+};
+
+struct setup_stream_response {
+   __be32 opcode;
+   __be32 stream_id;
+};
+
+#endif /* MLX_TLS_CMDS_H */
diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls_hw.c 
b/drivers/net/ethernet/mellanox/accelerator/tls/tls_hw.c
new file mode 100644
index 000..3a02f1e
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls_hw.c
@@ -0,0 +1,429 @@
+/*
+ * Copyright (c) 2015-2017 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the 

[RFC TLS Offload Support 04/15] net: Add TLS offload netdevice and socket support

2017-03-28 Thread Aviad Yehezkel
From: Ilya Lesokhin 

This patch adds a new NDO to add and delete TLS contexts on netdevices.

Signed-off-by: Boris Pismenny 
Signed-off-by: Ilya Lesokhin 
Signed-off-by: Aviad Yehezkel 
---
 include/linux/netdevice.h | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 51f9336..ce4760c 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -844,6 +844,25 @@ struct xfrmdev_ops {
 };
 #endif
 
+#if IS_ENABLED(CONFIG_TLS)
+enum tls_offload_ctx_dir {
+   TLS_OFFLOAD_CTX_DIR_RX,
+   TLS_OFFLOAD_CTX_DIR_TX,
+};
+
+struct tls_crypto_info;
+struct tls_offload_context;
+
+struct tlsdev_ops {
+   int (*tls_dev_add)(struct net_device *netdev, struct sock *sk,
+   enum tls_offload_ctx_dir direction,
+   struct tls_crypto_info *crypto_info,
+   struct tls_offload_context **ctx);
+   void (*tls_dev_del)(struct net_device *netdev, struct sock *sk,
+   enum tls_offload_ctx_dir direction);
+};
+#endif
+
 /*
  * This structure defines the management hooks for network devices.
  * The following hooks can be defined; unless noted otherwise, they are
@@ -1722,6 +1741,10 @@ struct net_device {
const struct xfrmdev_ops *xfrmdev_ops;
 #endif
 
+#if IS_ENABLED(CONFIG_TLS)
+   const struct tlsdev_ops *tlsdev_ops;
+#endif
+
const struct header_ops *header_ops;
 
unsigned intflags;
-- 
2.7.4



[RFC TLS Offload Support 13/15] crypto: Add gcm template for rfc5288

2017-03-28 Thread Aviad Yehezkel
From: Dave Watson 

AAD data length is 13 bytes, tag is 16.

Signed-off-by: Dave Watson 
---
 crypto/gcm.c | 122 +++
 crypto/tcrypt.c  |  14 ---
 crypto/testmgr.c |  16 
 crypto/testmgr.h |  47 +
 4 files changed, 194 insertions(+), 5 deletions(-)

diff --git a/crypto/gcm.c b/crypto/gcm.c
index f624ac9..07c2805 100644
--- a/crypto/gcm.c
+++ b/crypto/gcm.c
@@ -1016,6 +1016,120 @@ static struct crypto_template crypto_rfc4106_tmpl = {
.module = THIS_MODULE,
 };
 
+static int crypto_rfc5288_encrypt(struct aead_request *req)
+{
+   if (req->assoclen != 21)
+   return -EINVAL;
+
+   req = crypto_rfc4106_crypt(req);
+
+   return crypto_aead_encrypt(req);
+}
+
+static int crypto_rfc5288_decrypt(struct aead_request *req)
+{
+   if (req->assoclen != 21)
+   return -EINVAL;
+
+   req = crypto_rfc4106_crypt(req);
+
+   return crypto_aead_decrypt(req);
+}
+
+static int crypto_rfc5288_create(struct crypto_template *tmpl,
+struct rtattr **tb)
+{
+   struct crypto_attr_type *algt;
+   struct aead_instance *inst;
+   struct crypto_aead_spawn *spawn;
+   struct aead_alg *alg;
+   const char *ccm_name;
+   int err;
+
+   algt = crypto_get_attr_type(tb);
+   if (IS_ERR(algt))
+   return PTR_ERR(algt);
+
+   if ((algt->type ^ CRYPTO_ALG_TYPE_AEAD) & algt->mask)
+   return -EINVAL;
+
+   ccm_name = crypto_attr_alg_name(tb[1]);
+   if (IS_ERR(ccm_name))
+   return PTR_ERR(ccm_name);
+
+   inst = kzalloc(sizeof(*inst) + sizeof(*spawn), GFP_KERNEL);
+   if (!inst)
+   return -ENOMEM;
+
+   spawn = aead_instance_ctx(inst);
+   crypto_set_aead_spawn(spawn, aead_crypto_instance(inst));
+   err = crypto_grab_aead(spawn, ccm_name, 0,
+  crypto_requires_sync(algt->type, algt->mask));
+   if (err)
+   goto out_free_inst;
+
+   alg = crypto_spawn_aead_alg(spawn);
+
+   err = -EINVAL;
+
+   /* Underlying IV size must be 12. */
+   if (crypto_aead_alg_ivsize(alg) != 12)
+   goto out_drop_alg;
+
+   /* Not a stream cipher? */
+   if (alg->base.cra_blocksize != 1)
+   goto out_drop_alg;
+
+   err = -ENAMETOOLONG;
+   if (snprintf(inst->alg.base.cra_name, CRYPTO_MAX_ALG_NAME,
+"rfc5288(%s)", alg->base.cra_name) >=
+   CRYPTO_MAX_ALG_NAME ||
+   snprintf(inst->alg.base.cra_driver_name, CRYPTO_MAX_ALG_NAME,
+"rfc5288(%s)", alg->base.cra_driver_name) >=
+   CRYPTO_MAX_ALG_NAME)
+   goto out_drop_alg;
+
+   inst->alg.base.cra_flags = alg->base.cra_flags & CRYPTO_ALG_ASYNC;
+   inst->alg.base.cra_priority = alg->base.cra_priority;
+   inst->alg.base.cra_blocksize = 1;
+   inst->alg.base.cra_alignmask = alg->base.cra_alignmask;
+
+   inst->alg.base.cra_ctxsize = sizeof(struct crypto_rfc4106_ctx);
+
+   inst->alg.ivsize = 8;
+   inst->alg.chunksize = crypto_aead_alg_chunksize(alg);
+   inst->alg.maxauthsize = crypto_aead_alg_maxauthsize(alg);
+
+   inst->alg.init = crypto_rfc4106_init_tfm;
+   inst->alg.exit = crypto_rfc4106_exit_tfm;
+
+   inst->alg.setkey = crypto_rfc4106_setkey;
+   inst->alg.setauthsize = crypto_rfc4106_setauthsize;
+   inst->alg.encrypt = crypto_rfc5288_encrypt;
+   inst->alg.decrypt = crypto_rfc5288_decrypt;
+
+   inst->free = crypto_rfc4106_free;
+
+   err = aead_register_instance(tmpl, inst);
+   if (err)
+   goto out_drop_alg;
+
+out:
+   return err;
+
+out_drop_alg:
+   crypto_drop_aead(spawn);
+out_free_inst:
+   kfree(inst);
+   goto out;
+}
+
+static struct crypto_template crypto_rfc5288_tmpl = {
+   .name = "rfc5288",
+   .create = crypto_rfc5288_create,
+   .module = THIS_MODULE,
+};
+
 static int crypto_rfc4543_setkey(struct crypto_aead *parent, const u8 *key,
 unsigned int keylen)
 {
@@ -1284,8 +1398,14 @@ static int __init crypto_gcm_module_init(void)
if (err)
goto out_undo_rfc4106;
 
+   err = crypto_register_template(&crypto_rfc5288_tmpl);
+   if (err)
+   goto out_undo_rfc4543;
+
return 0;
 
+out_undo_rfc4543:
+   crypto_unregister_template(&crypto_rfc4543_tmpl);
 out_undo_rfc4106:
crypto_unregister_template(&crypto_rfc4106_tmpl);
 out_undo_gcm:
@@ -1302,6 +1422,7 @@ static void __exit crypto_gcm_module_exit(void)
kfree(gcm_zeroes);
crypto_unregister_template(&crypto_rfc4543_tmpl);
crypto_unregister_template(&crypto_rfc4106_tmpl);
+   crypto_unregister_template(&crypto_rfc5288_tmpl);
crypto_unregister_template(&crypto_gcm_tmpl);
crypto_unregister_template(&crypto_gcm_base_tmpl);
 }
@@ -1315,4 +1436,5 @@ MODULE_AUTH
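The 13-byte AAD mentioned in the commit message is the TLS 1.2 additional data: an 8-byte sequence number, 1-byte record type, 2-byte version, and 2-byte plaintext length. Counting the 8-byte explicit nonce as well gives the assoclen == 21 that crypto_rfc5288_encrypt/decrypt check above. A sketch of that layout (struct and function names are illustrative — the kernel passes this via scatterlists, not a struct, and the nonce position here follows the rfc4106 convention):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* TLS 1.2 AES-GCM additional data, RFC 5246 section 6.2.3.3 /
 * RFC 5288: seq(8) + type(1) + version(2) + length(2) = 13 bytes. */
struct tls12_aad {
	uint8_t seq[8];
	uint8_t type;
	uint8_t version[2];
	uint8_t length[2];
} __attribute__((packed));

_Static_assert(sizeof(struct tls12_aad) == 13, "AAD is 13 bytes");

static size_t build_assoc(uint8_t *buf, uint64_t seq, uint16_t len)
{
	struct tls12_aad aad;
	int i;

	for (i = 0; i < 8; i++)          /* big-endian sequence number */
		aad.seq[i] = seq >> (56 - 8 * i);
	aad.type = 0x17;                 /* application_data */
	aad.version[0] = 0x03;           /* TLS 1.2 = { 3, 3 } */
	aad.version[1] = 0x03;
	aad.length[0] = len >> 8;
	aad.length[1] = len & 0xff;

	memcpy(buf, &aad, sizeof(aad));
	memset(buf + sizeof(aad), 0, 8); /* 8-byte explicit nonce slot */
	return sizeof(aad) + 8;          /* 21: the assoclen checked */
}

static size_t demo_assoclen(void)
{
	uint8_t buf[21];

	return build_assoc(buf, 1, 256);
}
```

This is why the template rejects any other assoclen: with a fixed 13-byte AAD plus the 8-byte nonce there is exactly one valid associated-data size.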

[RFC TLS Offload Support 06/15] tls: tls offload support

2017-03-28 Thread Aviad Yehezkel
This patch introduces TX HW offload.

tls_main: contains generic logic that will be shared by both
SW and HW implementations.
tls_device: contains generic HW logic that is shared by all
HW offload implementations.

Signed-off-by: Boris Pismenny 
Signed-off-by: Ilya Lesokhin 
Signed-off-by: Aviad Yehezkel 
---
 MAINTAINERS   |  13 +
 include/net/tls.h | 184 ++
 include/uapi/linux/Kbuild |   1 +
 include/uapi/linux/tls.h  |  84 +++
 net/Kconfig   |   1 +
 net/Makefile  |   1 +
 net/tls/Kconfig   |  12 +
 net/tls/Makefile  |   7 +
 net/tls/tls_device.c  | 594 ++
 net/tls/tls_main.c| 348 +++
 10 files changed, 1245 insertions(+)
 create mode 100644 include/net/tls.h
 create mode 100644 include/uapi/linux/tls.h
 create mode 100644 net/tls/Kconfig
 create mode 100644 net/tls/Makefile
 create mode 100644 net/tls/tls_device.c
 create mode 100644 net/tls/tls_main.c

diff --git a/MAINTAINERS b/MAINTAINERS
index b340ef6..e3b70c3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8486,6 +8486,19 @@ F:   net/ipv6/
 F: include/net/ip*
 F: arch/x86/net/*
 
+NETWORKING [TLS]
+M: Ilya Lesokhin 
+M: Aviad Yehezkel 
+M: Boris Pismenny 
+M: Haggai Eran 
+L: net...@vger.kernel.org
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
+S: Maintained
+F: net/tls/*
+F: include/uapi/linux/tls.h
+F: include/net/tls.h
+
 NETWORKING [IPSEC]
 M: Steffen Klassert 
 M: Herbert Xu 
diff --git a/include/net/tls.h b/include/net/tls.h
new file mode 100644
index 000..f7f0cde
--- /dev/null
+++ b/include/net/tls.h
@@ -0,0 +1,184 @@
+/* Copyright (c) 2016-2017, Mellanox Technologies All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ *  - Neither the name of the Mellanox Technologies nor the
+ *names of its contributors may be used to endorse or promote
+ *products derived from this software without specific prior written
+ *permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
+ * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE
+ */
+
+#ifndef _TLS_OFFLOAD_H
+#define _TLS_OFFLOAD_H
+
+#include 
+
+#include 
+
+
+/* Maximum data size carried in a TLS record */
+#define TLS_MAX_PAYLOAD_SIZE   ((size_t)1 << 14)
+
+#define TLS_HEADER_SIZE5
+#define TLS_NONCE_OFFSET   TLS_HEADER_SIZE
+
+#define TLS_CRYPTO_INFO_READY(info)((info)->cipher_type)
+#define TLS_IS_STATE_HW(info)  ((info)->state == TLS_STATE_HW)
+
+#define TLS_RECORD_TYPE_DATA   0x17
+
+
+struct tls_record_info {
+   struct list_head list;
+   u32 end_seq;
+   int len;
+   int num_frags;
+   skb_frag_t frags[MAX_SKB_FRAGS];
+};
+
+struct tls_offload_context {
+   struct list_head records_list;
+   struct tls_record_info *open_record;
+   struct tls_record_info *retransmit_hint;
+   u32 expectedSN;
+   spinlock_t lock;/* protects records list */
+};
+
+struct tls_context {
+   union {
+   struct tls_crypto_info crypto_send;
+   struct tls_crypto_info_aes_gcm_128 crypto_send_aes_gcm_128;
+   };
+
+   void *priv_ctx;
+
+   u16 prepand_size;
+   u16 tag_size;
+   u16 iv_size;
+   char *iv;
+
+   /* TODO: change sw code to use below fields and push_frags function */
+   skb_frag_t *pending_frags;
+   u16 num_pending_frags;
+   u16 pending_offset;
+
+   void (*sk_write_space)(struct sock *sk);
+   void (*sk_destruct)(struct sock *sk);

[RFC TLS Offload Support 09/15] mlx/tls: Sysfs configuration interface

Configure the driver/hardware interface via sysfs.

2017-03-28 Thread Aviad Yehezkel
From: Ilya Lesokhin 

Signed-off-by: Guy Shapiro 
Signed-off-by: Ilya Lesokhin 
Signed-off-by: Matan Barak 
Signed-off-by: Aviad Yehezkel 
---
 .../ethernet/mellanox/accelerator/tls/tls_sysfs.c  | 194 +
 .../ethernet/mellanox/accelerator/tls/tls_sysfs.h  |  45 +
 2 files changed, 239 insertions(+)
 create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c
 create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h

diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c 
b/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c
new file mode 100644
index 000..2860fc3
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c
@@ -0,0 +1,194 @@
+/*
+ * Copyright (c) 2015-2017 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#include 
+
+#include "tls_sysfs.h"
+#include "tls_cmds.h"
+
+#ifdef MLX_TLS_SADB_RDMA
+struct mlx_tls_attribute {
+   struct attribute attr;
+   ssize_t (*show)(struct mlx_tls_dev *dev, char *buf);
+   ssize_t (*store)(struct mlx_tls_dev *dev, const char *buf,
+size_t count);
+};
+
+#define MLX_TLS_ATTR(_name, _mode, _show, _store) \
+   struct mlx_tls_attribute mlx_tls_attr_##_name = { \
+   .attr = {.name = __stringify(_name), .mode = _mode}, \
+   .show = _show, \
+   .store = _store, \
+   }
+#define to_mlx_tls_dev(obj)\
+   container_of(kobj, struct mlx_tls_dev, kobj)
+#define to_mlx_tls_attr(_attr) \
+   container_of(attr, struct mlx_tls_attribute, attr)
+
+static ssize_t mlx_tls_attr_show(struct kobject *kobj, struct attribute *attr,
+char *buf)
+{
+   struct mlx_tls_dev *dev = to_mlx_tls_dev(kobj);
+   struct mlx_tls_attribute *mlx_tls_attr = to_mlx_tls_attr(attr);
+   ssize_t ret = -EIO;
+
+   if (mlx_tls_attr->show)
+   ret = mlx_tls_attr->show(dev, buf);
+
+   return ret;
+}
+
+static ssize_t mlx_tls_attr_store(struct kobject *kobj, struct attribute *attr,
+ const char *buf, size_t count)
+{
+   struct mlx_tls_dev *dev = to_mlx_tls_dev(kobj);
+   struct mlx_tls_attribute *mlx_tls_attr = to_mlx_tls_attr(attr);
+   ssize_t ret = -EIO;
+
+   if (mlx_tls_attr->store)
+   ret = mlx_tls_attr->store(dev, buf, count);
+
+   return ret;
+}
+
+static ssize_t mlx_tls_sqpn_read(struct mlx_tls_dev *dev, char *buf)
+{
+   return sprintf(buf, "%u\n", dev->conn->qp->qp_num);
+}
+
+static ssize_t mlx_tls_sgid_read(struct mlx_tls_dev *dev, char *buf)
+{
+   union ib_gid *sgid = (union ib_gid *)&dev->conn->fpga_qpc.remote_ip;
+
+   return sprintf(buf, "%04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n",
+  be16_to_cpu(((__be16 *)sgid->raw)[0]),
+  be16_to_cpu(((__be16 *)sgid->raw)[1]),
+  be16_to_cpu(((__be16 *)sgid->raw)[2]),
+  be16_to_cpu(((__be16 *)sgid->raw)[3]),
+  be16_to_cpu(((__be16 *)sgid->raw)[4]),
+  be16_to_cpu(((__be16 *)sgid->raw)[5]),
+  be16_to_cpu(((__be16 *)sgid->raw)[6]),
+  be16_to_cpu(((__be16 *)sgid->raw)[7]));
+}
+
+static ssize_t mlx_tls_dqpn_read(struct mlx_tls_dev *dev, char *buf)
+{
+   return sprintf(buf, "%u\n", dev->conn->fpga_qpn);
+}
+
+static ssize_t mlx_tls_dqpn_write(struct mlx_tls_dev *dev, const char *buf,
+ size_t count)
+{
+   int tmp

[RFC TLS Offload Support 11/15] mlx/tls: TLS offload driver

2017-03-28 Thread Aviad Yehezkel
From: Ilya Lesokhin 

Add the main module entrypoints and tie the module into the build system.

Signed-off-by: Guy Shapiro 
Signed-off-by: Ilya Lesokhin 
Signed-off-by: Matan Barak 
Signed-off-by: Haggai Eran 
Signed-off-by: Aviad Yehezkel 
---
 drivers/net/ethernet/mellanox/Kconfig  |  1 +
 drivers/net/ethernet/mellanox/Makefile |  1 +
 .../net/ethernet/mellanox/accelerator/tls/Kconfig  | 11 
 .../net/ethernet/mellanox/accelerator/tls/Makefile |  4 ++
 .../ethernet/mellanox/accelerator/tls/tls_main.c   | 77 ++
 5 files changed, 94 insertions(+)
 create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/Kconfig
 create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/Makefile
 create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_main.c

diff --git a/drivers/net/ethernet/mellanox/Kconfig 
b/drivers/net/ethernet/mellanox/Kconfig
index 1b3ca6a..f270b76 100644
--- a/drivers/net/ethernet/mellanox/Kconfig
+++ b/drivers/net/ethernet/mellanox/Kconfig
@@ -21,6 +21,7 @@ source "drivers/net/ethernet/mellanox/mlx5/core/Kconfig"
 source "drivers/net/ethernet/mellanox/mlxsw/Kconfig"
 source "drivers/net/ethernet/mellanox/accelerator/core/Kconfig"
 source "drivers/net/ethernet/mellanox/accelerator/ipsec/Kconfig"
+source "drivers/net/ethernet/mellanox/accelerator/tls/Kconfig"
 source "drivers/net/ethernet/mellanox/accelerator/tools/Kconfig"
 
 endif # NET_VENDOR_MELLANOX
diff --git a/drivers/net/ethernet/mellanox/Makefile 
b/drivers/net/ethernet/mellanox/Makefile
index 96a5856..fd8afc0 100644
--- a/drivers/net/ethernet/mellanox/Makefile
+++ b/drivers/net/ethernet/mellanox/Makefile
@@ -7,4 +7,5 @@ obj-$(CONFIG_MLX5_CORE) += mlx5/core/
 obj-$(CONFIG_MLXSW_CORE) += mlxsw/
 obj-$(CONFIG_MLX_ACCEL_CORE) += accelerator/core/
 obj-$(CONFIG_MLX_ACCEL_IPSEC) += accelerator/ipsec/
+obj-$(CONFIG_MLX_ACCEL_TLS) += accelerator/tls/
 obj-$(CONFIG_MLX_ACCEL_TOOLS) += accelerator/tools/
diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/Kconfig 
b/drivers/net/ethernet/mellanox/accelerator/tls/Kconfig
new file mode 100644
index 000..d9c0733
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/Kconfig
@@ -0,0 +1,11 @@
+#
+# Mellanox tls accelerator driver configuration
+#
+
+config MLX_ACCEL_TLS
+   tristate "Mellanox Technologies TLS accelerator driver"
+   depends on MLX_ACCEL_CORE
+   default n
+   ---help---
+ TLS accelerator driver by Mellanox Technologies.
+
diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/Makefile 
b/drivers/net/ethernet/mellanox/accelerator/tls/Makefile
new file mode 100644
index 000..93a7733
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/Makefile
@@ -0,0 +1,4 @@
+obj-$(CONFIG_MLX_ACCEL_TLS)+= mlx_tls.o
+
+ccflags-y := -I$(srctree)/
+mlx_tls-y :=  tls_main.o tls_sysfs.o tls_hw.o tls.o
diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls_main.c 
b/drivers/net/ethernet/mellanox/accelerator/tls/tls_main.c
new file mode 100644
index 000..85078f5
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls_main.c
@@ -0,0 +1,77 @@
+/*
+ * Copyright (c) 2015-2017 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+#include 
+
+#include "tls.h"
+
+MODULE_AUTHOR("Mellanox Technologies Advance Develop Team 
");
+MODULE_DESCRIPTION("Mellanox Innova TLS Driver");
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_VERSION(DRIVER_VERSION);
+
+static struct mlx_accel_core_client mlx_tls_client = {
+   .name   = "mlx_tls",
+   .add= mlx_tls_add_one,
+   .remove = mlx_tls_remove_one,
+};
+
+static struct notifier_block mlx_tls_netdev_notifier

[RFC TLS Offload Support 02/15] tcp: export do_tcp_sendpages function

2017-03-28 Thread Aviad Yehezkel
This will be used by the new TLS code.

Signed-off-by: Aviad Yehezkel 
Signed-off-by: Ilya Lesokhin 
Signed-off-by: Boris Pismenny 
---
 include/net/tcp.h | 2 ++
 net/ipv4/tcp.c| 5 +++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 207147b..3a72d4c 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -348,6 +348,8 @@ int tcp_v4_tw_remember_stamp(struct inet_timewait_sock *tw);
 int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
 int tcp_sendpage(struct sock *sk, struct page *page, int offset, size_t size,
 int flags);
+ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
+size_t size, int flags);
 void tcp_release_cb(struct sock *sk);
 void tcp_wfree(struct sk_buff *skb);
 void tcp_write_timer_handler(struct sock *sk);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 1149b48..302fee9 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -873,8 +873,8 @@ static int tcp_send_mss(struct sock *sk, int *size_goal, 
int flags)
return mss_now;
 }
 
-static ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
-   size_t size, int flags)
+ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
+size_t size, int flags)
 {
struct tcp_sock *tp = tcp_sk(sk);
int mss_now, size_goal;
@@ -1003,6 +1003,7 @@ static ssize_t do_tcp_sendpages(struct sock *sk, struct 
page *page, int offset,
}
return sk_stream_error(sk, flags, err);
 }
+EXPORT_SYMBOL(do_tcp_sendpages);
 
 int tcp_sendpage(struct sock *sk, struct page *page, int offset,
 size_t size, int flags)
-- 
2.7.4



[RFC TLS Offload Support 14/15] crypto: rfc5288 aesni optimized intel routines

2017-03-28 Thread Aviad Yehezkel
From: Dave Watson 

The assembly routines require the AAD data to be padded out
to the nearest 4 bytes.
Copy the 13-byte AAD to a spare assoc data area when necessary.

Signed-off-by: Dave Watson 
---
 arch/x86/crypto/aesni-intel_asm.S|   6 ++
 arch/x86/crypto/aesni-intel_avx-x86_64.S |   4 ++
 arch/x86/crypto/aesni-intel_glue.c   | 105 ++-
 3 files changed, 99 insertions(+), 16 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_asm.S 
b/arch/x86/crypto/aesni-intel_asm.S
index 383a6f8..4e80bb8 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -229,6 +229,9 @@ XMM2 XMM3 XMM4 XMMDst TMP6 TMP7 i i_seq operation
 MOVADQ SHUF_MASK(%rip), %xmm14
movarg7, %r10   # %r10 = AAD
movarg8, %r12   # %r12 = aadLen
+   add$3, %r12
+   and$~3, %r12
+
mov%r12, %r11
pxor   %xmm\i, %xmm\i
 
@@ -454,6 +457,9 @@ XMM2 XMM3 XMM4 XMMDst TMP6 TMP7 i i_seq operation
 MOVADQ SHUF_MASK(%rip), %xmm14
movarg7, %r10   # %r10 = AAD
movarg8, %r12   # %r12 = aadLen
+   add$3, %r12
+   and$~3, %r12
+
mov%r12, %r11
pxor   %xmm\i, %xmm\i
 _get_AAD_loop\num_initial_blocks\operation:
diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S 
b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index 522ab68..0756e4a 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -360,6 +360,8 @@ VARIABLE_OFFSET = 16*8
 
 mov arg6, %r10  # r10 = AAD
 mov arg7, %r12  # r12 = aadLen
+add $3, %r12
+and $~3, %r12
 
 
 mov %r12, %r11
@@ -1619,6 +1621,8 @@ ENDPROC(aesni_gcm_dec_avx_gen2)
 
 mov arg6, %r10   # r10 = AAD
 mov arg7, %r12   # r12 = aadLen
+add $3, %r12
+and $~3, %r12
 
 
 mov %r12, %r11
diff --git a/arch/x86/crypto/aesni-intel_glue.c 
b/arch/x86/crypto/aesni-intel_glue.c
index 4ff90a7..dcada94 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -957,6 +957,8 @@ static int helper_rfc4106_encrypt(struct aead_request *req)
 {
u8 one_entry_in_sg = 0;
u8 *src, *dst, *assoc;
+   u8 *assocmem = NULL;
+
__be32 counter = cpu_to_be32(1);
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct aesni_rfc4106_gcm_ctx *ctx = aesni_rfc4106_gcm_ctx_get(tfm);
@@ -966,12 +968,8 @@ static int helper_rfc4106_encrypt(struct aead_request *req)
struct scatter_walk src_sg_walk;
struct scatter_walk dst_sg_walk = {};
unsigned int i;
-
-   /* Assuming we are supporting rfc4106 64-bit extended */
-   /* sequence numbers We need to have the AAD length equal */
-   /* to 16 or 20 bytes */
-   if (unlikely(req->assoclen != 16 && req->assoclen != 20))
-   return -EINVAL;
+   unsigned int padded_assoclen = (req->assoclen + 3) & ~3;
+   u8 assocbuf[24];
 
/* IV below built */
for (i = 0; i < 4; i++)
@@ -996,7 +994,8 @@ static int helper_rfc4106_encrypt(struct aead_request *req)
} else {
/* Allocate memory for src, dst, assoc */
assoc = kmalloc(req->cryptlen + auth_tag_len + req->assoclen,
-   GFP_ATOMIC);
+   GFP_ATOMIC);
+   assocmem = assoc;
if (unlikely(!assoc))
return -ENOMEM;
scatterwalk_map_and_copy(assoc, req->src, 0,
@@ -1005,6 +1004,14 @@ static int helper_rfc4106_encrypt(struct aead_request 
*req)
dst = src;
}
 
+   if (req->assoclen != padded_assoclen) {
+   scatterwalk_map_and_copy(assocbuf, req->src, 0,
+req->assoclen, 0);
+   memset(assocbuf + req->assoclen, 0,
+  padded_assoclen - req->assoclen);
+   assoc = assocbuf;
+   }
+
kernel_fpu_begin();
aesni_gcm_enc_tfm(aes_ctx, dst, src, req->cryptlen, iv,
  ctx->hash_subkey, assoc, req->assoclen - 8,
@@ -1025,7 +1032,7 @@ static int helper_rfc4106_encrypt(struct aead_request 
*req)
} else {
scatterwalk_map_and_copy(dst, req->dst, req->assoclen,
 req->cryptlen + auth_tag_len, 1);
-   kfree(assoc);
+   kfree(assocmem);
}
return 0;
 }
@@ -1034,6 +1041,7 @@ static int helper_rfc4106_decrypt(struct aead_request 
*req)
 {
u8 one_entry_in_sg = 0;
u8 *src, *dst, *assoc;
+   u8 *assocmem = NULL;
unsigned long tempCipherLen = 0;
__be32 counter = cpu_to_be32(1);
int retval = 0;
@

[RFC TLS Offload Support 12/15] mlx/tls: Enable MLX5_CORE_QP_SIM mode for tls

2017-03-28 Thread Aviad Yehezkel
Signed-off-by: Aviad Yehezkel 
Signed-off-by: Ilya Lesokhin 
---
 drivers/net/ethernet/mellanox/accelerator/tls/tls.c   | 6 ++
 drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c | 2 ++
 drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h | 2 ++
 3 files changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls.c 
b/drivers/net/ethernet/mellanox/accelerator/tls/tls.c
index 07a4b67..3560f784 100644
--- a/drivers/net/ethernet/mellanox/accelerator/tls/tls.c
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls.c
@@ -494,9 +494,11 @@ static struct sk_buff *mlx_tls_rx_handler(struct sk_buff 
*skb, u8 *rawpet,
 static void mlx_tls_free(struct mlx_tls_dev *dev)
 {
list_del(&dev->accel_dev_list);
+#if IS_ENABLED(CONFIG_MLX5_CORE_FPGA_QP_SIM)
 #ifdef MLX_TLS_SADB_RDMA
kobject_put(&dev->kobj);
 #endif
+#endif
dev_put(dev->netdev);
kfree(dev);
 }
@@ -592,6 +594,7 @@ int mlx_tls_add_one(struct mlx_accel_core_device 
*accel_device)
goto err_netdev;
}
 
+#if IS_ENABLED(CONFIG_MLX5_CORE_FPGA_QP_SIM)
 #ifdef MLX_TLS_SADB_RDMA
ret = tls_sysfs_init_and_add(&dev->kobj,
 mlx_accel_core_kobj(dev->accel_device),
@@ -603,6 +606,7 @@ int mlx_tls_add_one(struct mlx_accel_core_device 
*accel_device)
goto err_ops_register;
}
 #endif
+#endif
 
mutex_lock(&mlx_tls_mutex);
list_add(&dev->accel_dev_list, &mlx_tls_devs);
@@ -611,10 +615,12 @@ int mlx_tls_add_one(struct mlx_accel_core_device 
*accel_device)
dev->netdev->tlsdev_ops = &mlx_tls_ops;
goto out;
 
+#if IS_ENABLED(CONFIG_MLX5_CORE_FPGA_QP_SIM)
 #ifdef MLX_TLS_SADB_RDMA
 err_ops_register:
mlx_accel_core_client_ops_unregister(accel_device);
 #endif
+#endif
 err_netdev:
dev_put(netdev);
 err_conn:
diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c 
b/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c
index 2860fc3..76ba784 100644
--- a/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c
@@ -36,6 +36,7 @@
 #include "tls_sysfs.h"
 #include "tls_cmds.h"
 
+#if IS_ENABLED(CONFIG_MLX5_CORE_FPGA_QP_SIM)
 #ifdef MLX_TLS_SADB_RDMA
 struct mlx_tls_attribute {
struct attribute attr;
@@ -192,3 +193,4 @@ int tls_sysfs_init_and_add(struct kobject *kobj, struct 
kobject *parent,
fmt, arg);
 }
 #endif
+#endif
diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h 
b/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h
index bfaa857..d7c3185 100644
--- a/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h
@@ -37,9 +37,11 @@
 
 #include "tls.h"
 
+#if IS_ENABLED(CONFIG_MLX5_CORE_FPGA_QP_SIM)
 #ifdef MLX_TLS_SADB_RDMA
 int tls_sysfs_init_and_add(struct kobject *kobj, struct kobject *parent,
   const char *fmt, char *arg);
 #endif
+#endif
 
 #endif /* __TLS_SYSFS_H__ */
-- 
2.7.4



[RFC TLS Offload Support 00/15] cover letter

2017-03-28 Thread Aviad Yehezkel
Overview
========
A kernel TLS Tx-only socket option for TCP sockets.
As in the kernel TLS socket patches (https://lwn.net/Articles/665602),
only symmetric crypto is done in the kernel, as well as TLS record framing.
The handshake remains in userspace, and the negotiated cipher keys/IV are
provided to the TCP socket.

Today, userspace TLS must perform two passes over the data: first it
encrypts the data, then the data is copied to the TCP socket in the kernel.
Kernel TLS avoids one of these passes by encrypting the data from userspace
pages directly into kernelspace buffers.

Non-application-data TLS records must be encrypted using the latest crypto
state available in the kernel. It is possible to get the crypto context from
the kernel and encrypt such records in userspace, but we choose to encrypt
them in the kernel by setting the MSG_OOB flag and providing the record type
with the data.

TLS Tx crypto offload is a new feature of network devices. It enables the 
kernel TLS socket to skip encryption and authentication operations on the 
transmit side of the data path, delegating those to the NIC. In turn, the NIC 
encrypts packets that belong to an offloaded TLS socket on the fly. The NIC 
does not modify any packet headers. It expects to receive fully framed TCP 
packets with TLS records as payload. The NIC replaces plaintext with ciphertext 
and fills the authentication tag. The NIC does not hold any state beyond the 
context needed to encrypt the next expected packet, i.e. expected TCP sequence 
number and crypto state.

There are two flows for TLS Tx offload: a fast path and a slow path.
Fast path: the packet matches the expected TCP sequence number in the context.
Slow path: the packet does not match the expected TCP sequence number in the
context, for example on TCP retransmissions. For a packet in the slow path, we
need to resynchronize the NIC's crypto context by providing the TLS record
data for that packet before it can be encrypted and transmitted by the NIC.

Motivation
==========
1) Performance: The CPU overhead of encryption in the data path is high, at
least 4x for netperf over TLS between two machines connected back-to-back.
Our single-stream performance tests show that using crypto offload for TLS
sockets achieves the same throughput as plain TCP traffic while increasing CPU
utilization by only 1.4x.

2) Flexibility: The protocol stack is implemented entirely on the host CPU.
Compared to solutions based on TCP offload, this approach offloads only
encryption, keeping memory management, congestion control, etc. on the host CPU.

Notes
=====
1) New paths:
o net/tls - TLS layer in kernel
o drivers/net/ethernet/mellanox/accelerator/* - NIC driver support,
currently implemented as separate modules.
  In the future this code will go into the mlx5 driver. Only the module that
integrates with the TLS layer is attached to this patchset.
  The complete NIC sample driver is available at
https://github.com/Mellanox/tls-offload/tree/tx_rfc_v5

2) We implemented support for this API in OpenSSL 1.1.0, the code is available 
at https://github.com/Mellanox/tls-openssl/tree/master

3) TLS crypto offload was presented during netdevconf 1.2; more details can be
found in the presentation and paper:
   https://netdevconf.org/1.2/session.html?boris-pismenny

4) These RFC patches are based on kernel 4.9-rc7.

Aviad Yehezkel (5):
  tcp: export do_tcp_sendpages function
  tcp: export tcp_rate_check_app_limited function
  tcp: Add TLS socket options for TCP sockets
  tls: tls offload support
  mlx/tls: Enable MLX5_CORE_QP_SIM mode for tls

Dave Watson (2):
  crypto: Add gcm template for rfc5288
  crypto: rfc5288 aesni optimized intel routines

Ilya Lesokhin (8):
  tcp: Add clean acked data hook
  net: Add TLS offload netdevice and socket support
  mlx/mlx5_core: Allow sending multiple packets
  mlx/tls: Hardware interface
  mlx/tls: Sysfs configuration interface Configure the driver/hardware
interface via sysfs.
  mlx/tls: Add mlx_accel offload driver for TLS
  mlx/tls: TLS offload driver Add the main module entrypoints and tie
the module into the build system
  net/tls: Add software offload

 MAINTAINERS|  14 +
 arch/x86/crypto/aesni-intel_asm.S  |   6 +
 arch/x86/crypto/aesni-intel_avx-x86_64.S   |   4 +
 arch/x86/crypto/aesni-intel_glue.c | 105 ++-
 crypto/gcm.c   | 122 
 crypto/tcrypt.c|  14 +-
 crypto/testmgr.c   |  16 +
 crypto/testmgr.h   |  47 ++
 drivers/net/ethernet/mellanox/Kconfig  |   1 +
 drivers/net/ethernet/mellanox/Makefile |   1 +
 .../net/ethernet/mellanox/accelerator/tls/Kconfig  |  11 +
 .../net/ethernet/mellanox/accelerator/tls/Makefile |   4 +
 .../net/ethernet/mellanox/accelerator/tls/tls.c| 658 +++

[RFC TLS Offload Support 15/15] net/tls: Add software offload

2017-03-28 Thread Aviad Yehezkel
From: Ilya Lesokhin 

Signed-off-by: Dave Watson 
Signed-off-by: Ilya Lesokhin 
Signed-off-by: Aviad Yehezkel 
---
 MAINTAINERS|   1 +
 include/net/tls.h  |  44 
 net/tls/Makefile   |   2 +-
 net/tls/tls_main.c |  34 +--
 net/tls/tls_sw.c   | 729 +
 5 files changed, 794 insertions(+), 16 deletions(-)
 create mode 100644 net/tls/tls_sw.c

diff --git a/MAINTAINERS b/MAINTAINERS
index e3b70c3..413c1d9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8491,6 +8491,7 @@ M:Ilya Lesokhin 
 M: Aviad Yehezkel 
 M: Boris Pismenny 
 M: Haggai Eran 
+M: Dave Watson 
 L: net...@vger.kernel.org
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
diff --git a/include/net/tls.h b/include/net/tls.h
index f7f0cde..bb1f41e 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -48,6 +48,7 @@
 
 #define TLS_CRYPTO_INFO_READY(info)((info)->cipher_type)
 #define TLS_IS_STATE_HW(info)  ((info)->state == TLS_STATE_HW)
+#define TLS_IS_STATE_SW(info)  ((info)->state == TLS_STATE_SW)
 
 #define TLS_RECORD_TYPE_DATA   0x17
 
@@ -68,6 +69,37 @@ struct tls_offload_context {
spinlock_t lock;/* protects records list */
 };
 
+#define TLS_DATA_PAGES (TLS_MAX_PAYLOAD_SIZE / PAGE_SIZE)
+/* +1 for aad, +1 for tag, +1 for chaining */
+#define TLS_SG_DATA_SIZE   (TLS_DATA_PAGES + 3)
+#define ALG_MAX_PAGES 16 /* for skb_to_sgvec */
+#define TLS_AAD_SPACE_SIZE 21
+#define TLS_AAD_SIZE   13
+#define TLS_TAG_SIZE   16
+
+#define TLS_NONCE_SIZE 8
+#define TLS_PREPEND_SIZE   (TLS_HEADER_SIZE + TLS_NONCE_SIZE)
+#define TLS_OVERHEAD   (TLS_PREPEND_SIZE + TLS_TAG_SIZE)
+
+struct tls_sw_context {
+   struct sock *sk;
+   void (*sk_write_space)(struct sock *sk);
+   struct crypto_aead *aead_send;
+
+   /* Sending context */
+   struct scatterlist sg_tx_data[TLS_SG_DATA_SIZE];
+   struct scatterlist sg_tx_data2[ALG_MAX_PAGES + 1];
+   char aad_send[TLS_AAD_SPACE_SIZE];
+   char tag_send[TLS_TAG_SIZE];
+   skb_frag_t tx_frag;
+   int wmem_len;
+   int order_npages;
+   struct scatterlist sgaad_send[2];
+   struct scatterlist sgtag_send[2];
+   struct sk_buff_head tx_queue;
+   int unsent;
+};
+
 struct tls_context {
union {
struct tls_crypto_info crypto_send;
@@ -102,6 +134,12 @@ int tls_device_sendmsg(struct sock *sk, struct msghdr 
*msg, size_t size);
 int tls_device_sendpage(struct sock *sk, struct page *page,
int offset, size_t size, int flags);
 
+int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx);
+void tls_clear_sw_offload(struct sock *sk);
+int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int tls_sw_sendpage(struct sock *sk, struct page *page,
+   int offset, size_t size, int flags);
+
 struct tls_record_info *tls_get_record(struct tls_offload_context *context,
   u32 seq);
 
@@ -174,6 +212,12 @@ static inline struct tls_context *tls_get_ctx(const struct 
sock *sk)
return sk->sk_user_data;
 }
 
+static inline struct tls_sw_context *tls_sw_ctx(
+   const struct tls_context *tls_ctx)
+{
+   return (struct tls_sw_context *)tls_ctx->priv_ctx;
+}
+
 static inline struct tls_offload_context *tls_offload_ctx(
const struct tls_context *tls_ctx)
 {
diff --git a/net/tls/Makefile b/net/tls/Makefile
index 65e5677..61457e0 100644
--- a/net/tls/Makefile
+++ b/net/tls/Makefile
@@ -4,4 +4,4 @@
 
 obj-$(CONFIG_TLS) += tls.o
 
-tls-y := tls_main.o tls_device.o
+tls-y := tls_main.o tls_device.o tls_sw.o
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 6a3df25..a4efd02 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -46,6 +46,7 @@ MODULE_DESCRIPTION("Transport Layer Security Support");
 MODULE_LICENSE("Dual BSD/GPL");
 
 static struct proto tls_device_prot;
+static struct proto tls_sw_prot;
 
 int tls_push_frags(struct sock *sk,
   struct tls_context *ctx,
@@ -188,13 +189,10 @@ int tls_sk_query(struct sock *sk, int optname, char 
__user *optval,
rc = -EINVAL;
goto out;
}
-   if (TLS_IS_STATE_HW(crypto_info)) {
-   lock_sock(sk);
-   memcpy(crypto_info_aes_gcm_128->iv,
-  ctx->iv,
-  TLS_CIPHER_AES_GCM_128_IV_SIZE);
-   release_sock(sk);
-   }
+   lock_sock(sk);
+   memcpy(crypto_info_aes_gcm_128->iv, ctx->iv,
+  TLS_CIPHER_AES_GCM_128_IV_SIZE);
+   release_sock(sk);
rc = copy_to_user(optval,
  

[RFC TLS Offload Support 03/15] tcp: export tcp_rate_check_app_limited function

2017-03-28 Thread Aviad Yehezkel
This will be used by the new TLS code.

Signed-off-by: Aviad Yehezkel 
Signed-off-by: Ilya Lesokhin 
Signed-off-by: Boris Pismenny 
---
 net/ipv4/tcp_rate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/tcp_rate.c b/net/ipv4/tcp_rate.c
index 9be1581..a226f76 100644
--- a/net/ipv4/tcp_rate.c
+++ b/net/ipv4/tcp_rate.c
@@ -184,3 +184,4 @@ void tcp_rate_check_app_limited(struct sock *sk)
tp->app_limited =
(tp->delivered + tcp_packets_in_flight(tp)) ? : 1;
 }
+EXPORT_SYMBOL(tcp_rate_check_app_limited);
-- 
2.7.4



[RFC TLS Offload Support 01/15] tcp: Add clean acked data hook

2017-03-28 Thread Aviad Yehezkel
From: Ilya Lesokhin 

Called when a TCP segment is acknowledged.
Can be used by application protocols that hold additional
metadata associated with the stream data.
This is required for TLS offloads to release metadata
for acknowledged TLS records.

Signed-off-by: Boris Pismenny 
Signed-off-by: Ilya Lesokhin 
Signed-off-by: Aviad Yehezkel 
---
 include/net/inet_connection_sock.h | 2 ++
 net/ipv4/tcp_input.c   | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/include/net/inet_connection_sock.h 
b/include/net/inet_connection_sock.h
index 146054c..0b0aceb 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -77,6 +77,7 @@ struct inet_connection_sock_af_ops {
  * @icsk_pmtu_cookie  Last pmtu seen by socket
  * @icsk_ca_ops   Pluggable congestion control hook
  * @icsk_af_ops   Operations which are AF_INET{4,6} specific
+ * @icsk_clean_acked  Clean acked data hook
  * @icsk_ca_state:Congestion control state
  * @icsk_retransmits: Number of unrecovered [RTO] timeouts
  * @icsk_pending: Scheduled timer event
@@ -99,6 +100,7 @@ struct inet_connection_sock {
__u32 icsk_pmtu_cookie;
const struct tcp_congestion_ops *icsk_ca_ops;
const struct inet_connection_sock_af_ops *icsk_af_ops;
+   void  (*icsk_clean_acked)(struct sock *sk);
unsigned int  (*icsk_sync_mss)(struct sock *sk, u32 pmtu);
__u8  icsk_ca_state:6,
  icsk_ca_setsockopt:1,
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index fe668c1..c158bec 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3667,6 +3667,9 @@ static int tcp_ack(struct sock *sk, const struct sk_buff 
*skb, int flag)
if (!prior_packets)
goto no_queue;
 
+   if (icsk->icsk_clean_acked)
+   icsk->icsk_clean_acked(sk);
+
/* See if we can take anything off of the retransmit queue. */
flag |= tcp_clean_rtx_queue(sk, prior_fackets, prior_snd_una, &acked,
&sack_state, &now);
-- 
2.7.4



[RFC TLS Offload Support 05/15] tcp: Add TLS socket options for TCP sockets

2017-03-28 Thread Aviad Yehezkel
This patch adds TLS_TX and TLS_RX TCP socket options.

Setting these socket options will change the sk->sk_prot
operations of the TCP socket. The user is responsible for
preventing races between calls to the previous operations
and the new ones. After a successful return, data
sent on this socket will be encapsulated in TLS.

Signed-off-by: Aviad Yehezkel 
Signed-off-by: Boris Pismenny 
Signed-off-by: Ilya Lesokhin 
---
 include/uapi/linux/tcp.h |  2 ++
 net/ipv4/tcp.c   | 32 
 2 files changed, 34 insertions(+)

diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index c53de26..f9f0e29 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -116,6 +116,8 @@ enum {
 #define TCP_SAVE_SYN   27  /* Record SYN headers for new 
connections */
 #define TCP_SAVED_SYN  28  /* Get SYN headers recorded for 
connection */
 #define TCP_REPAIR_WINDOW  29  /* Get/set window parameters */
+#define TCP_TLS_TX 30
+#define TCP_TLS_RX 31
 
 struct tcp_repair_opt {
__u32   opt_code;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 302fee9..2d190e3 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -273,6 +273,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -2676,6 +2677,21 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
tp->notsent_lowat = val;
sk->sk_write_space(sk);
break;
+   case TCP_TLS_TX:
+   case TCP_TLS_RX: {
+   int (*fn)(struct sock *sk, int optname,
+ char __user *optval, unsigned int optlen);
+
+   fn = symbol_get(tls_sk_attach);
+   if (!fn) {
+   err = -EINVAL;
+   break;
+   }
+
+   err = fn(sk, optname, optval, optlen);
+   symbol_put(tls_sk_attach);
+   break;
+   }
default:
err = -ENOPROTOOPT;
break;
@@ -3064,6 +3080,22 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
}
return 0;
}
+   case TCP_TLS_TX:
+   case TCP_TLS_RX: {
+   int err;
+   int (*fn)(struct sock *sk, int optname,
+ char __user *optval, int __user *optlen);
+
+   fn = symbol_get(tls_sk_query);
+   if (!fn) {
+   err = -EINVAL;
+   break;
+   }
+
+   err = fn(sk, optname, optval, optlen);
+   symbol_put(tls_sk_query);
+   return err;
+   }
default:
return -ENOPROTOOPT;
}
-- 
2.7.4



[PATCH 4.9 61/88] hwrng: amd - Revert managed API changes

2017-03-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Prarit Bhargava 

commit 69db7009318758769d625b023402161c750f7876 upstream.

After commit 31b2a73c9c5f ("hwrng: amd - Migrate to managed API"), the
amd-rng driver uses devres with pci_dev->dev to keep track of resources,
but does not actually register a PCI driver.  This results in the
following issues:

1. The message

WARNING: CPU: 2 PID: 621 at drivers/base/dd.c:349 driver_probe_device+0x38c

is output when the i2c_amd756 driver loads and attempts to register a PCI
driver.  The PCI & device subsystems assume that no resources have been
registered for the device, and the WARN_ON() triggers since amd-rng has
already done so.

2.  The driver leaks memory because the driver does not attach to a
device.  The driver only uses the PCI device as a reference.   devm_*()
functions will release resources on driver detach, which the amd-rng
driver will never do.  As a result,

3.  The driver cannot be reloaded because there is always a use of the
ioport and region after the first load of the driver.

Revert the changes made by 31b2a73c9c5f ("hwrng: amd - Migrate to managed
API").

Signed-off-by: Prarit Bhargava 
Fixes: 31b2a73c9c5f ("hwrng: amd - Migrate to managed API").
Cc: Matt Mackall 
Cc: Corentin LABBE 
Cc: PrasannaKumar Muralidharan 
Cc: Wei Yongjun 
Cc: linux-crypto@vger.kernel.org
Cc: linux-ge...@lists.infradead.org
Signed-off-by: Herbert Xu 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/char/hw_random/amd-rng.c |   42 +++
 1 file changed, 34 insertions(+), 8 deletions(-)

--- a/drivers/char/hw_random/amd-rng.c
+++ b/drivers/char/hw_random/amd-rng.c
@@ -55,6 +55,7 @@ MODULE_DEVICE_TABLE(pci, pci_tbl);
 struct amd768_priv {
void __iomem *iobase;
struct pci_dev *pcidev;
+   u32 pmbase;
 };
 
 static int amd_rng_read(struct hwrng *rng, void *buf, size_t max, bool wait)
@@ -148,33 +149,58 @@ found:
if (pmbase == 0)
return -EIO;
 
-   priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+   priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
 
-   if (!devm_request_region(&pdev->dev, pmbase + PMBASE_OFFSET,
-   PMBASE_SIZE, DRV_NAME)) {
+   if (!request_region(pmbase + PMBASE_OFFSET, PMBASE_SIZE, DRV_NAME)) {
dev_err(&pdev->dev, DRV_NAME " region 0x%x already in use!\n",
pmbase + 0xF0);
-   return -EBUSY;
+   err = -EBUSY;
+   goto out;
}
 
-   priv->iobase = devm_ioport_map(&pdev->dev, pmbase + PMBASE_OFFSET,
-   PMBASE_SIZE);
+   priv->iobase = ioport_map(pmbase + PMBASE_OFFSET, PMBASE_SIZE);
if (!priv->iobase) {
pr_err(DRV_NAME "Cannot map ioport\n");
-   return -ENOMEM;
+   err = -EINVAL;
+   goto err_iomap;
}
 
amd_rng.priv = (unsigned long)priv;
+   priv->pmbase = pmbase;
priv->pcidev = pdev;
 
pr_info(DRV_NAME " detected\n");
-   return devm_hwrng_register(&pdev->dev, &amd_rng);
+   err = hwrng_register(&amd_rng);
+   if (err) {
+   pr_err(DRV_NAME " registering failed (%d)\n", err);
+   goto err_hwrng;
+   }
+   return 0;
+
+err_hwrng:
+   ioport_unmap(priv->iobase);
+err_iomap:
+   release_region(pmbase + PMBASE_OFFSET, PMBASE_SIZE);
+out:
+   kfree(priv);
+   return err;
 }
 
 static void __exit mod_exit(void)
 {
+   struct amd768_priv *priv;
+
+   priv = (struct amd768_priv *)amd_rng.priv;
+
+   hwrng_unregister(&amd_rng);
+
+   ioport_unmap(priv->iobase);
+
+   release_region(priv->pmbase + PMBASE_OFFSET, PMBASE_SIZE);
+
+   kfree(priv);
 }
 
 module_init(mod_init);




[PATCH 4.9 62/88] hwrng: geode - Revert managed API changes

2017-03-28 Thread Greg Kroah-Hartman
4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Prarit Bhargava 

commit 8c75704ebcac2ffa31ee7bcc359baf701b52bf00 upstream.

After commit e9afc746299d ("hwrng: geode - Use linux/io.h instead of
asm/io.h") the geode-rng driver uses devres with pci_dev->dev to keep
track of resources, but does not actually register a PCI driver.  This
results in the following issues:

1.  The driver leaks memory because the driver does not attach to a
device.  The driver only uses the PCI device as a reference.   devm_*()
functions will release resources on driver detach, which the geode-rng
driver will never do.  As a result,

2.  The driver cannot be reloaded because there is always a use of the
ioport and region after the first load of the driver.

Revert the changes made by  e9afc746299d ("hwrng: geode - Use linux/io.h
instead of asm/io.h").

Signed-off-by: Prarit Bhargava 
Fixes: 6e9b5e76882c ("hwrng: geode - Migrate to managed API")
Cc: Matt Mackall 
Cc: Corentin LABBE 
Cc: PrasannaKumar Muralidharan 
Cc: Wei Yongjun 
Cc: linux-crypto@vger.kernel.org
Cc: linux-ge...@lists.infradead.org
Signed-off-by: Herbert Xu 
Signed-off-by: Greg Kroah-Hartman 

---
 drivers/char/hw_random/geode-rng.c |   50 +
 1 file changed, 35 insertions(+), 15 deletions(-)

--- a/drivers/char/hw_random/geode-rng.c
+++ b/drivers/char/hw_random/geode-rng.c
@@ -31,6 +31,9 @@
 #include 
 #include 
 
+
+#define PFX KBUILD_MODNAME ": "
+
 #define GEODE_RNG_DATA_REG   0x50
 #define GEODE_RNG_STATUS_REG 0x54
 
@@ -82,6 +85,7 @@ static struct hwrng geode_rng = {
 
 static int __init mod_init(void)
 {
+   int err = -ENODEV;
struct pci_dev *pdev = NULL;
const struct pci_device_id *ent;
void __iomem *mem;
@@ -89,27 +93,43 @@ static int __init mod_init(void)
 
for_each_pci_dev(pdev) {
ent = pci_match_id(pci_tbl, pdev);
-   if (ent) {
-   rng_base = pci_resource_start(pdev, 0);
-   if (rng_base == 0)
-   return -ENODEV;
-
-   mem = devm_ioremap(&pdev->dev, rng_base, 0x58);
-   if (!mem)
-   return -ENOMEM;
-   geode_rng.priv = (unsigned long)mem;
-
-   pr_info("AMD Geode RNG detected\n");
-   return devm_hwrng_register(&pdev->dev, &geode_rng);
-   }
+   if (ent)
+   goto found;
}
-
/* Device not found. */
-   return -ENODEV;
+   goto out;
+
+found:
+   rng_base = pci_resource_start(pdev, 0);
+   if (rng_base == 0)
+   goto out;
+   err = -ENOMEM;
+   mem = ioremap(rng_base, 0x58);
+   if (!mem)
+   goto out;
+   geode_rng.priv = (unsigned long)mem;
+
+   pr_info("AMD Geode RNG detected\n");
+   err = hwrng_register(&geode_rng);
+   if (err) {
+   pr_err(PFX "RNG registering failed (%d)\n",
+  err);
+   goto err_unmap;
+   }
+out:
+   return err;
+
+err_unmap:
+   iounmap(mem);
+   goto out;
 }
 
 static void __exit mod_exit(void)
 {
+   void __iomem *mem = (void __iomem *)geode_rng.priv;
+
+   hwrng_unregister(&geode_rng);
+   iounmap(mem);
 }
 
 module_init(mod_init);




[PATCH v2] arm64: dts: ls1012a: add crypto node

2017-03-28 Thread Horia Geantă
LS1012A has a SEC v5.4 security engine.

Signed-off-by: Horia Geantă 
---
v2: move aliases from board specific files into the shared dtsi.

 arch/arm64/boot/dts/freescale/fsl-ls1012a.dtsi | 100 -
 1 file changed, 99 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1012a.dtsi 
b/arch/arm64/boot/dts/freescale/fsl-ls1012a.dtsi
index cffebb4b3df1..1c3493606cca 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1012a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1012a.dtsi
@@ -42,7 +42,7 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-#include 
+#include 
 
 / {
compatible = "fsl,ls1012a";
@@ -50,6 +50,15 @@
#address-cells = <2>;
#size-cells = <2>;
 
+   aliases {
+   crypto = &crypto;
+   rtic_a = &rtic_a;
+   rtic_b = &rtic_b;
+   rtic_c = &rtic_c;
+   rtic_d = &rtic_d;
+   sec_mon = &sec_mon;
+   };
+
cpus {
#address-cells = <1>;
#size-cells = <0>;
@@ -113,6 +122,95 @@
big-endian;
};
 
+   crypto: crypto@170 {
+   compatible = "fsl,sec-v5.4", "fsl,sec-v5.0",
+"fsl,sec-v4.0";
+   fsl,sec-era = <8>;
+   #address-cells = <1>;
+   #size-cells = <1>;
+   ranges = <0x0 0x00 0x170 0x10>;
+   reg = <0x00 0x170 0x0 0x10>;
+   interrupts = ;
+
+   sec_jr0: jr@1 {
+   compatible = "fsl,sec-v5.4-job-ring",
+"fsl,sec-v5.0-job-ring",
+"fsl,sec-v4.0-job-ring";
+   reg= <0x1 0x1>;
+   interrupts = ;
+   };
+
+   sec_jr1: jr@2 {
+   compatible = "fsl,sec-v5.4-job-ring",
+"fsl,sec-v5.0-job-ring",
+"fsl,sec-v4.0-job-ring";
+   reg= <0x2 0x1>;
+   interrupts = ;
+   };
+
+   sec_jr2: jr@3 {
+   compatible = "fsl,sec-v5.4-job-ring",
+"fsl,sec-v5.0-job-ring",
+"fsl,sec-v4.0-job-ring";
+   reg= <0x3 0x1>;
+   interrupts = ;
+   };
+
+   sec_jr3: jr@4 {
+   compatible = "fsl,sec-v5.4-job-ring",
+"fsl,sec-v5.0-job-ring",
+"fsl,sec-v4.0-job-ring";
+   reg= <0x4 0x1>;
+   interrupts = ;
+   };
+
+   rtic@6 {
+   compatible = "fsl,sec-v5.4-rtic",
+"fsl,sec-v5.0-rtic",
+"fsl,sec-v4.0-rtic";
+   #address-cells = <1>;
+   #size-cells = <1>;
+   reg = <0x6 0x100 0x60e00 0x18>;
+   ranges = <0x0 0x60100 0x500>;
+
+   rtic_a: rtic-a@0 {
+   compatible = "fsl,sec-v5.4-rtic-memory",
+"fsl,sec-v5.0-rtic-memory",
+"fsl,sec-v4.0-rtic-memory";
+   reg = <0x00 0x20 0x100 0x100>;
+   };
+
+   rtic_b: rtic-b@20 {
+   compatible = "fsl,sec-v5.4-rtic-memory",
+"fsl,sec-v5.0-rtic-memory",
+"fsl,sec-v4.0-rtic-memory";
+   reg = <0x20 0x20 0x200 0x100>;
+   };
+
+   rtic_c: rtic-c@40 {
+   compatible = "fsl,sec-v5.4-rtic-memory",
+"fsl,sec-v5.0-rtic-memory",
+"fsl,sec-v4.0-rtic-memory";
+   reg = <0x40 0x20 0x300 0x100>;
+   };
+
+   rtic_d: rtic-d@60 {
+   compatible = "fsl,sec-v5.4-rtic-memory",
+   

Re: [PATCH] arm64: dts: ls1012a: add crypto node

2017-03-28 Thread Shawn Guo
On Tue, Mar 28, 2017 at 07:19:43AM +, Horia Geantă wrote:
> For the sake of current patch, please clarify whether a v2 is needed.
> IIUC:
> -sec_mon node name could stay the same (existing binding)
> -label names are ok, since underline is the only option allowed by DTC
> -alias names are out-of-spec but accepted by DTC; if changing underline
> to hyphen is requested, I will push out v2

All these are fine.  But we agreed that the alias definitions can be
shared and should be moved to fsl-ls1012a.dtsi, right?

Shawn


[PATCH 2/2] crypto: ccp - Mark driver as little-endian only

2017-03-28 Thread Arnd Bergmann
The driver causes a warning when built as big-endian:

drivers/crypto/ccp/ccp-dev-v5.c: In function 'ccp5_perform_des3':
include/uapi/linux/byteorder/big_endian.h:32:26: error: large integer 
implicitly truncated to unsigned type [-Werror=overflow]
 #define __cpu_to_le32(x) ((__force __le32)__swab32((x)))
  ^
include/linux/byteorder/generic.h:87:21: note: in expansion of macro 
'__cpu_to_le32'
 #define cpu_to_le32 __cpu_to_le32
 ^
drivers/crypto/ccp/ccp-dev-v5.c:436:28: note: in expansion of macro 
'cpu_to_le32'
  CCP5_CMD_KEY_MEM(&desc) = cpu_to_le32(CCP_MEMTYPE_SB);

The warning is correct, doing a 32-bit byte swap on a value that gets
assigned into a bit field cannot work, since we would only write zeroes
in this case, regardless of the input.

In fact, the use of bit fields in hardware defined data structures is
not portable to start with, so until all these bit fields get replaced
by something else, the driver cannot work on big-endian machines, and
I'm adding an annotation here to prevent it from being selected.

The CCPv3 code seems to not suffer from this problem, only v5 uses
bitfields.

Fixes: 4b394a232df7 ("crypto: ccp - Let a v5 CCP provide the same function as 
v3")
Signed-off-by: Arnd Bergmann 
---
 drivers/crypto/ccp/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/crypto/ccp/Kconfig b/drivers/crypto/ccp/Kconfig
index 2238f77aa248..07af9ece84f9 100644
--- a/drivers/crypto/ccp/Kconfig
+++ b/drivers/crypto/ccp/Kconfig
@@ -1,6 +1,7 @@
 config CRYPTO_DEV_CCP_DD
tristate "Cryptographic Coprocessor device driver"
depends on CRYPTO_DEV_CCP
+   depends on !CPU_BIG_ENDIAN || BROKEN
default m
select HW_RANDOM
select DMA_ENGINE
-- 
2.9.0



[PATCH 1/2] crypto: ccp - Reduce stack frame size with KASAN

2017-03-28 Thread Arnd Bergmann
The newly added AES GCM implementation uses one of the largest stack frames
in the kernel, around 1KB on normal 64-bit kernels, and 1.6KB when CONFIG_KASAN
is enabled:

drivers/crypto/ccp/ccp-ops.c: In function 'ccp_run_aes_gcm_cmd':
drivers/crypto/ccp/ccp-ops.c:851:1: error: the frame size of 1632 bytes is 
larger than 1536 bytes [-Werror=frame-larger-than=]

This is problematic for multiple reasons:

 - The crypto functions are often used in deep call chains, e.g. behind
   mm, fs and dm layers, making it more likely to run into an actual stack
   overflow

 - Using this much stack space is an indicator that the code is not
   written to be as efficient as it could be.

 - While this goes unnoticed at the moment in mainline with the frame size
   warning being disabled when KASAN is in use, I would like to enable
   the warning again, and the current code is slightly above my arbitrary
   pick for a limit of 1536 bytes (I already did patches for every other
   driver exceeding this).

A more drastic refactoring of the driver might be needed to reduce the
stack usage more substantially, but this patch is fairly simple and
at least addresses the third one of the problems I mentioned, reducing the
stack size by about 150 bytes and bringing it below the warning limit
I picked.

Fixes: 36cf515b9bbe ("crypto: ccp - Enable support for AES GCM on v5 CCPs")
Signed-off-by: Arnd Bergmann 
---
 drivers/crypto/ccp/ccp-dev.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
index 3a45c2af2fbd..c5ea0796a891 100644
--- a/drivers/crypto/ccp/ccp-dev.h
+++ b/drivers/crypto/ccp/ccp-dev.h
@@ -432,24 +432,24 @@ struct ccp_dma_info {
unsigned int offset;
unsigned int length;
enum dma_data_direction dir;
-};
+} __packed __aligned(4);
 
 struct ccp_dm_workarea {
struct device *dev;
struct dma_pool *dma_pool;
-   unsigned int length;
 
u8 *address;
struct ccp_dma_info dma;
+   unsigned int length;
 };
 
 struct ccp_sg_workarea {
struct scatterlist *sg;
int nents;
+   unsigned int dma_count;
 
struct scatterlist *dma_sg;
struct device *dma_dev;
-   unsigned int dma_count;
enum dma_data_direction dma_dir;
 
unsigned int sg_used;
-- 
2.9.0



Re: [PATCH 0/7] crypto: aes - allow generic AES to be omitted

2017-03-28 Thread Ard Biesheuvel
On 28 March 2017 at 06:43, Eric Biggers  wrote:
> Hi Ard,
>
> On Sun, Mar 26, 2017 at 07:49:01PM +0100, Ard Biesheuvel wrote:
>> The generic AES driver uses 16 lookup tables of 1 KB each, and has
>> encryption and decryption routines that are fully unrolled. Given how
>> the dependencies between this code and other drivers are declared in
>> Kconfig files, this code is always pulled into the core kernel, even
>> if it is usually superseded at runtime by accelerated drivers that
>> exist for many architectures.
>>
>> This leaves us with 25 KB of dead code in the kernel, which is negligible
>> in typical environments, but which is actually a big deal for the IoT
>> domain, where every kilobyte counts.
>>
>> For this reason, this series refactors the way the various AES
>> implementations are wired up, to allow the generic version in
>> crypto/aes_generic.c to be omitted from the build entirely.
>>
>> Patch #1 removes some bogus 'select CRYPTO_AES' statement.
>>
>> Patch #2 introduces CRYPTO_NEED_AES which can be selected by driver that
>> require an AES cipher to be available, but don't care how it is implemented.
>>
>> Patches #3 and #4 make some preparatory changes that allow dependencies on
>> crypto_aes_expand_key to be fulfilled by the new (and much smaller) fixed
>> time AES driver. (#5)
>>
>> Patch #6 splits the generic AES driver into a core containing the precomputed
>> sub/shift/mix tables and the key expansion routines on the one hand, and the
>> encryption/decryption routines and the crypto API registration on the other.
>>
>> Patch #7 introduces the CRYPTO_HAVE_AES Kconfig symbol, and adds statements 
>> to
>> various AES implementations that can fulfil the CRYPTO_NEED_AES dependencies
>> added in patch #2. The introduced Kconfig logic allows CRYPTO_AES to be
>> deselected even if AES dependencies exist, as long as one of these 
>> alternatives
>> is selected.
>
> Just a thought: how about renaming CRYPTO_AES to CRYPTO_AES_GENERIC, then
> renaming what you called CRYPTO_NEED_AES to CRYPTO_AES?  Then all the 'select
> CRYPTO_AES' can remain as-is, instead of replacing them with the (in my 
> opinion
> uglier) 'select CRYPTO_NEED_AES'.  And it should still work for people who 
> have
> CRYPTO_AES=y or CRYPTO_AES=m in their kernel config, since they'll still get 
> at
> least one AES implementation (though they may stop getting the generic one).
>
> Also, in general I think we need better Kconfig help text.  As proposed you 
> can
> now toggle simply "AES cipher algorithms", and nowhere in the help text is it
> mentioned that that is only the generic implementation, which you don't need 
> if
> you have enabled some other implementation.  Similarly for "Fixed time AES
> cipher"; it perhaps should be mentioned that it's only useful if a fixed-time
> implementation using special CPU instructions like AES-NI or ARMv8-CE isn't
> usable.
>

Thanks for the feedback. I take it you are on board with the general idea then?

Re name change, those are good points. I will experiment with that.

I was a bit on the fence about modifying the x86 code more than
required, but actually, I think it makes sense for the AES-NI code to
use fixed-time AES as a fallback rather than the table-based x86 code,
given that the fallback is rarely used (only when executed in the
context of an interrupt taken from kernel code that is already using
the FPU) and falling back to a non-fixed time implementation loses
some guarantees that the AES-NI code gives.
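The dependency scheme under discussion might be sketched in Kconfig terms roughly as follows. This is an illustration of the idea only, not the actual patch contents; symbol names follow the cover letter, and `CRYPTO_AES_TI` stands in for whichever fixed-time/accelerated implementations end up advertising themselves:

```kconfig
# Drivers that merely consume an AES cipher select this ...
config CRYPTO_NEED_AES
	bool

# ... and every implementation able to satisfy that need selects this.
config CRYPTO_HAVE_AES
	bool

config CRYPTO_AES
	tristate "AES cipher algorithms (generic, table-based)"
	help
	  Generic, fully unrolled table-based AES. With the new scheme this
	  can be deselected even when CRYPTO_NEED_AES is set, provided some
	  other option selecting CRYPTO_HAVE_AES is enabled instead.

config CRYPTO_AES_TI
	tristate "Fixed time AES cipher"
	select CRYPTO_HAVE_AES
```

Eric's renaming suggestion would keep existing `select CRYPTO_AES` statements working by making `CRYPTO_AES` play the `CRYPTO_NEED_AES` role and moving the generic tables behind `CRYPTO_AES_GENERIC`.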


Re: [PATCH] arm64: dts: ls1012a: add crypto node

2017-03-28 Thread Horia Geantă
On 3/24/2017 4:04 PM, Shawn Guo wrote:
> On Fri, Mar 24, 2017 at 08:29:17AM +, Horia Geantă wrote:
>> On 3/24/2017 9:35 AM, Shawn Guo wrote:
>>> On Fri, Mar 24, 2017 at 07:17:50AM +, Horia Geantă wrote:
>> +sec_mon: sec_mon@1e9 {
>
> Hyphen is more preferred to be used in node name than underscore.
>
 This would imply changing the
 Documentation/devicetree/bindings/crypto/fsl-sec4.txt binding and
 dealing with all the consequences, which IIUC is probably not worth.
>>>
>>> I do not care the bindings doc that much, since I'm not the maintainer
>>> of it.  What are the consequences specifically, if we use a better node
>>> name in dts than bindings example?
>>>
>> Users relying on finding the sec_mon node will obviously stop working.
>> I don't see any in-kernel users, however there could be others I am not
>> aware of and DT bindings should provide for backwards compatibility.
> 
> Okay, point taken.  You can keep the node name as it is.
> 
>> I could deprecate "sec_mon" in the bindings and suggest "sec-mon"
>> instead, while leaving all existing dts files as-is.
>> The risk is breaking LS1012A users relying on "sec_mon".
> 
> For existing bindings, I do not care that much.  But for new ones, I do
> hope that we recommend to use hyphen, as that's more idiomatic at least
> for Linux kernel.
> 
>> I see that ePAPR:
>> -allows both for hyphen and underline in case of node names
>> -allows only for hyphen (i.e. forbids underline) in case of alias nodes
>>
>> In the first case, I understand there's an (undocumented?) agreement to
>> prefer hyphen over underline.
> 
> Both are valid, but hyphen is more idiomatic for Linux kernel.
> 
>> For the 2nd one, does this mean I should change alias names?
> 
> This is something I see difference between specification and DTC.
> 
>   aliases {
>   alias-name = &label_name;
>   };
> 
>   label_name: node-name {
>   ...
>   };
> 
> The spec says that only hyphen is valid for alias name, but DTC works
> happily with underscore too.  From my experience with DTC playing, both
> hyphen and underscore are valid for alias and node name.  But for label
> name, only underscore is valid.  Using hyphen in label name will cause
> DTC to report syntax error.
> 
Yes indeed, thanks for pointing it out.

For the sake of current patch, please clarify whether a v2 is needed.
IIUC:
-sec_mon node name could stay the same (existing binding)
-label names are ok, since underline is the only option allowed by DTC
-alias names are out-of-spec but accepted by DTC; if changing underline
to hyphen is requested, I will push out v2
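The three rules above can be condensed into one fragment (the unit address and property values here are placeholders, not the real LS1012A values):

```dts
aliases {
	/* alias property names: hyphen only, per the spec
	 * (though DTC also accepts underscore) */
	sec-mon = &sec_mon;
};

/* label: underscore only -- DTC rejects hyphens in labels;
 * node name: both are valid, hyphen is preferred */
sec_mon: sec-mon@1e90000 {
	compatible = "fsl,sec-v4.0-mon";
	reg = <0x0 0x1e90000 0x0 0x10000>;
};
```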

Thanks,
Horia