> algorithm to use aligned loads.
>
> Given that the performance benefit of using aligned loads appears
> to be limited (~0.25% for 1k blocks using tcrypt on a Corei7-8650U),
> and the fact that this hack has leaked into generic ChaCha code,
> let's just remove it.
Reviewed-by: Martin Willi
Thanks,
Martin
> > Also, I wonder if we shouldn't simply change the chacha code to use
> > unaligned loads for the state array, as it likely makes very little
> > difference in practice (the state is not accessed from inside the
> > round processing loop)
>
> I am seeing a 0.25% slowdown on 1k blocks in the SS
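The thread above suggests switching the ChaCha state accesses to unaligned loads. As a hedged illustration (hypothetical helper names, not the kernel's `get_unaligned` API), a portable memcpy-based load that compilers lower to a plain MOV on x86:

```c
#include <stdint.h>
#include <string.h>

/* Portable unaligned little-endian 32-bit load: the memcpy is lowered
 * to a single MOV on x86, so a misaligned state buffer costs little. */
static uint32_t load_le32(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	v = __builtin_bswap32(v);
#endif
	return v;
}

/* Read a 16-word ChaCha state from an arbitrarily aligned byte buffer. */
static void load_state(uint32_t state[16], const unsigned char *buf)
{
	for (int i = 0; i < 16; i++)
		state[i] = load_le32(buf + 4 * i);
}
```

Since the state is not touched inside the round loop, the per-load cost shows up only once per block, which matches the small (~0.25%) difference measured above.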
Hi Ard,
> Since turning the FPU on and off is cheap these days, simplify the
> SIMD routine by dropping the per-page yield, which makes for a
> cleaner switch to the library API as well.
In my measurements, the lazy FPU restore works as intended, and I could
not identify any slowdown by this chan
> If the rfc7539 template is instantiated with specific
> implementations, e.g. "rfc7539(chacha20-generic,poly1305-generic)"
> rather than "rfc7539(chacha20,poly1305)", then the implementation
> names end up included in the instance's cra_name. This is i
> [...] This bug was originally detected by my patches that improve
> testmgr to fuzz algorithms against their generic implementation.
Thanks Eric. This shows how valuable your continued work on the crypto
testing code is, and how useful such a (common) testing infrastructure
can be.
Reviewed-by: Martin Willi
> To improve responsiveness, disable preemption for each step of the
> walk (which is at most PAGE_SIZE) rather than for the entire
> encryption/decryption operation.
It seems that it is not that uncommon for IPsec to get small inputs
scattered over multiple blocks. Doing FPU context saving for
y
> Adiantum.
>
> Signed-off-by: Eric Biggers
Reviewed-by: Martin Willi
> In preparation for adding XChaCha12 support, rename/refactor the
> x86_64 SIMD implementations of ChaCha20 to support different numbers
> of rounds.
>
> Signed-off-by: Eric Biggers
Reviewed-by: Martin Willi
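The rename/refactor quoted above parameterizes the x86_64 code over the round count. A minimal C sketch of such a round-count-generic permutation (function name hypothetical; the quarter round follows RFC 8439):

```c
#include <stdint.h>

#define ROTL32(v, n) (((v) << (n)) | ((v) >> (32 - (n))))

/* RFC 8439 quarter round on four state words. */
#define QR(a, b, c, d) do {				\
		a += b; d ^= a; d = ROTL32(d, 16);	\
		c += d; b ^= c; b = ROTL32(b, 12);	\
		a += b; d ^= a; d = ROTL32(d, 8);	\
		c += d; b ^= c; b = ROTL32(b, 7);	\
	} while (0)

/* Round-count-generic ChaCha permutation: nrounds is 20, 12 or 8,
 * consumed two rounds (one column + one diagonal pass) at a time. */
static void chacha_permute(uint32_t x[16], int nrounds)
{
	for (int i = 0; i < nrounds; i += 2) {
		/* column round */
		QR(x[0], x[4], x[8], x[12]);
		QR(x[1], x[5], x[9], x[13]);
		QR(x[2], x[6], x[10], x[14]);
		QR(x[3], x[7], x[11], x[15]);
		/* diagonal round */
		QR(x[0], x[5], x[10], x[15]);
		QR(x[1], x[6], x[11], x[12]);
		QR(x[2], x[7], x[8], x[13]);
		QR(x[3], x[4], x[9], x[14]);
	}
}
```

With this shape, XChaCha12 support only needs to pass a different `nrounds`, which is what the refactoring enables for the SIMD code as well.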
ermute
AFAIK, the general convention is to create proper stack frames using
FRAME_BEGIN/END for non-leaf functions. Should chacha20_permute()
callers do so?
For the other parts:
Reviewed-by: Martin Willi
ion of ~20%.
Signed-off-by: Martin Willi
---
arch/x86/crypto/chacha20-avx512vl-x86_64.S | 272 +
arch/x86/crypto/chacha20_glue.c | 7 +
2 files changed, 279 insertions(+)
diff --git a/arch/x86/crypto/chacha20-avx512vl-x86_64.S
b/arch/x86/crypto/chacha20-avx512vl-x86_
Martin Willi (3):
crypto: x86/chacha20 - Add a 8-block AVX-512VL variant
crypto: x86/chacha20 - Add a 2-block AVX-512VL variant
crypto: x86/chacha20 - Add a 4-block AVX-512VL variant
arch/x86/crypto/Makefile | 5 +
arch
process a single block. Hence we engage that function for (partial)
single block lengths as well.
Signed-off-by: Martin Willi
---
arch/x86/crypto/chacha20-avx512vl-x86_64.S | 171 +
arch/x86/crypto/chacha20_glue.c | 7 +
2 files changed, 178 insertions(+)
diff --git
namic masks is not part of
the AVX-512VL instruction set, hence we depend on AVX-512BW as well. Given
that the major AVX-512VL architectures provide AVX-512BW and this extension
does not affect core clocking, this seems not to be a problem, at least for
now.
Signed-off-by: Martin Willi
Hi Jason,
> [...] I have a massive Xeon Gold 5120 machine that I can give you
> access to if you'd like to do some testing and benching.
Thanks for the offer, no need at this time. But I certainly would
welcome if you could do some (Wireguard) benching with that code to see
if it works for you.
Hi Jason,
> I'd be inclined to roll with your implementation if it can eventually
> become competitive with Andy Polyakov's, [...]
I think for the SSSE3/AVX2 code paths it is competitive; especially for
small sizes it is faster, which matters when implementing layer 3 VPNs.
> the
place.
The partial XORing function trailer is very similar to the AVX2 2-block
variant. While it could be shared, that code segment is rather short;
profiling is also easier with the trailer integrated, so we keep it per
function.
Signed-off-by: Martin Willi
---
arch/x86/crypto/chacha20-avx2
.
Signed-off-by: Martin Willi
---
arch/x86/crypto/chacha20-avx2-x86_64.S | 189 +
arch/x86/crypto/chacha20_glue.c | 5 +-
2 files changed, 133 insertions(+), 61 deletions(-)
diff --git a/arch/x86/crypto/chacha20-avx2-x86_64.S
b/arch/x86/crypto/chacha20-avx2-x86_64.S
Now that all block functions support partial lengths, engage the wider
block sizes more aggressively. This prevents using smaller block
functions multiple times, where the next larger block function would
have been faster.
Signed-off-by: Martin Willi
---
arch/x86/crypto/chacha20_glue.c | 39
s probably not worth it.
Signed-off-by: Martin Willi
---
arch/x86/crypto/chacha20-ssse3-x86_64.S | 74 -
arch/x86/crypto/chacha20_glue.c | 11 ++--
2 files changed, 63 insertions(+), 22 deletions(-)
diff --git a/arch/x86/crypto/chacha20-ssse3-x86_64.S
b/arch/
function.
Signed-off-by: Martin Willi
---
arch/x86/crypto/chacha20-ssse3-x86_64.S | 163 ++--
arch/x86/crypto/chacha20_glue.c | 5 +-
2 files changed, 128 insertions(+), 40 deletions(-)
diff --git a/arch/x86/crypto/chacha20-ssse3-x86_64.S
b/arch/x86/crypto/chacha20
[truncated tcrypt benchmark table: per-variant cycle counts for 1440-1496 byte blocks]
Martin Willi (6):
crypto: x86/chacha20 - Support partial lengths in 1-block SSSE3
variant
crypto: x86/chacha20
require a 4-block function.
Signed-off-by: Martin Willi
---
arch/x86/crypto/chacha20-avx2-x86_64.S | 197 +
arch/x86/crypto/chacha20_glue.c | 7 +
2 files changed, 204 insertions(+)
diff --git a/arch/x86/crypto/chacha20-avx2-x86_64.S
b/arch/x86/crypto/chacha20-avx2
> skcipher template.
Nice work. I did a quick review only, but you may add my
Acked-by: Martin Willi
for patches 1-5, 10 and 11.
Thanks,
Martin
Hi Jason,
> Now that ChaCha20 is in Zinc, we can have the crypto API code simply
> call into it.
> delete mode 100644 arch/x86/crypto/chacha20-avx2-x86_64.S
> delete mode 100644 arch/x86/crypto/chacha20-ssse3-x86_64.S
I did some more testing with that new Zinc ChaCha20 code on x64, and
I'm sti
Hi Jason,
> Now that ChaCha20 is in Zinc, we can have the crypto API code simply
> call into it.
> delete mode 100644 arch/x86/crypto/chacha20-avx2-x86_64.S
> delete mode 100644 arch/x86/crypto/chacha20-ssse3-x86_64.S
I did some trivial benchmarking with tcrypt for the ChaCha20Poly1305
AEAD as
Hi,
> Anyway, I actually thought it was intentional that the ChaCha
> implementations in the Linux kernel allowed specifying the block
> counter, and therefore allowed seeking to any point in the keystream,
> exposing the full functionality of the cipher.
If I remember correctly, it was indeed in
> By using the unaligned access helpers, we drastically improve
> performance on small MIPS routers that have to go through the
> exception fix-up handler for these unaligned accesses.
I couldn't measure any slowdown here, so:
Acked-by: Martin Willi
> - dctx->s[0]
ersion seems to be ok, so is Poly1305.
Acked-by: Martin Willi
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
.
Signed-off-by: Martin Willi
---
arch/x86/crypto/poly1305-sse2-x86_64.S | 306 +
arch/x86/crypto/poly1305_glue.c | 54 +-
2 files changed, 355 insertions(+), 5 deletions(-)
diff --git a/arch/x86/crypto/poly1305-sse2-x86_64.S
b/arch/x86/crypto/poly1305-sse2
): 684405
opers/sec, 2825226316 bytes/sec
test 11 ( 8224 byte blocks, 8224 bytes per update, 1 updates): 367101
opers/sec, 3019039446 bytes/sec
Benchmark results from a Core i5-4670T.
Signed-off-by: Martin Willi
---
arch/x86/crypto/Makefile | 1 +
arch/x86/crypto/poly1305
operations in 10 seconds
(18672197632 bytes)
Benchmark results from a Core i5-4670T.
Signed-off-by: Martin Willi
---
arch/x86/crypto/Makefile | 1 +
arch/x86/crypto/chacha20-avx2-x86_64.S | 443 +
arch/x86/crypto/chacha20_glue.c | 19 ++
crypto
10 seconds
(11846409216 bytes)
test 4 (256 bit key, 8192 byte blocks): 1448761 operations in 10 seconds
(11868250112 bytes)
Benchmark results from a Core i5-4670T.
Signed-off-by: Martin Willi
---
arch/x86/crypto/chacha20-ssse3-x86_64.S | 483
arch/x86/crypto
test 11 ( 8224 byte blocks, 8224 bytes per update, 1 updates): 153075
opers/sec, 1258896201 bytes/sec
Benchmark results from a Core i5-4670T.
Signed-off-by: Martin Willi
---
arch/x86/crypto/Makefile | 2 +
arch/x86/crypto/poly1305-sse2-x86_64.S | 276
As architecture specific drivers need a software fallback, export Poly1305
init/update/final functions together with some helpers in a header file.
Signed-off-by: Martin Willi
---
crypto/chacha20poly1305.c | 4 +--
crypto/poly1305_generic.c | 73
As architecture specific drivers need a software fallback, export a
ChaCha20 en-/decryption function together with some helpers in a header
file.
Signed-off-by: Martin Willi
---
crypto/chacha20_generic.c | 28
crypto/chacha20poly1305.c | 3 +--
include/crypto
for typical
IPsec MTUs. On Ivy Bridge using SSE2/SSSE3 the numbers compared to AES-GCM
are very similar due to the less efficient CLMUL instructions.
Changes in v2:
- No code changes
- Use sec=10 for more reliable benchmark results
Martin Willi (10):
crypto: tcrypt - Add ChaCha20/Poly1305 speed
The AVX2 variant of ChaCha20 is used only for messages with a length of
>= 512 bytes. With the existing test vectors, the implementation could not
be tested. Due to the lack of such a long official test vector, this one is
self-generated using chacha20-generic.
Signed-off-by: Martin Willi
---
cry
Add individual ChaCha20 and Poly1305 speed tests and a combined rfc7539esp
AEAD speed test using mode numbers 214, 321 and 213. For Poly1305 we add a
specific speed template, as it expects the key prepended to the input data.
Signed-off-by: Martin Willi
---
crypto/tcrypt.c | 15 +++
crypto
): 5360533 operations in 10 seconds
(5489185792 bytes)
test 4 (256 bit key, 8192 byte blocks): 692846 operations in 10 seconds
(5675794432 bytes)
Benchmark results from a Core i5-4670T.
Signed-off-by: Martin Willi
---
arch/x86/crypto/Makefile | 2 +
arch/x86/crypto/chacha20
ad, so you may
add my:
Tested-by: Martin Willi
Regards
Martin
> If you're going to use sec you need to use at least 10 in order
> for it to be meaningful as shorter values often result in bogus
> numbers.
Ok, I'll use sec=10 in v2. There is no fundamental difference compared
to sec=1 (except for very short blocks):
testing speed of rfc7539esp(chacha20,poly
Herbert,
> Running the speed test with sec=1 makes no sense because it's
> too short. Please use sec=0 and count cycles instead.
I get less constant numbers between different runs when using sec=0,
hence I've used sec=1. Below are the numbers of "average" runs for the
AEAD measuring cycles; I'll
Add individual ChaCha20 and Poly1305 speed tests and a combined rfc7539esp
AEAD speed test using mode numbers 214, 321 and 213. For Poly1305 we add a
specific speed template, as it expects the key prepended to the input data.
Signed-off-by: Martin Willi
---
crypto/tcrypt.c | 15 +++
crypto
As architecture specific drivers need a software fallback, export a
ChaCha20 en-/decryption function together with some helpers in a header
file.
Signed-off-by: Martin Willi
---
crypto/chacha20_generic.c | 28
crypto/chacha20poly1305.c | 3 +--
include/crypto
test 11 ( 8224 byte blocks, 8224 bytes per update, 1 updates): 153136
opers/sec, 1259390464 bytes/sec
Benchmark results from a Core i5-4670T.
Signed-off-by: Martin Willi
---
arch/x86/crypto/Makefile | 2 +
arch/x86/crypto/poly1305-sse2-x86_64.S | 276
CLMUL instructions.
Martin Willi (10):
crypto: tcrypt - Add ChaCha20/Poly1305 speed tests
crypto: chacha20 - Export common ChaCha20 helpers
crypto: chacha20 - Add a SSSE3 SIMD variant for x86_64
crypto: chacha20 - Add a four block SSSE3 variant for x86_64
crypto: chacha20 - Add an eight
1 seconds
(532198400 bytes)
test 4 (256 bit key, 8192 byte blocks): 67132 operations in 1 seconds
(549945344 bytes)
Benchmark results from a Core i5-4670T.
Signed-off-by: Martin Willi
---
arch/x86/crypto/Makefile | 2 +
arch/x86/crypto/chacha20-ssse3-x86_64.S | 142
.
Signed-off-by: Martin Willi
---
arch/x86/crypto/poly1305-sse2-x86_64.S | 306 +
arch/x86/crypto/poly1305_glue.c | 54 +-
2 files changed, 355 insertions(+), 5 deletions(-)
diff --git a/arch/x86/crypto/poly1305-sse2-x86_64.S
b/arch/x86/crypto/poly1305-sse2
bytes)
test 4 (256 bit key, 8192 byte blocks): 140107 operations in 1 seconds
(1147756544 bytes)
Benchmark results from a Core i5-4670T.
Signed-off-by: Martin Willi
---
arch/x86/crypto/chacha20-ssse3-x86_64.S | 483
arch/x86/crypto/chacha20_glue.c | 8
): 677578
opers/sec, 2797041984 bytes/sec
test 11 ( 8224 byte blocks, 8224 bytes per update, 1 updates): 364094
opers/sec, 2994309056 bytes/sec
Benchmark results from a Core i5-4670T.
Signed-off-by: Martin Willi
---
arch/x86/crypto/Makefile | 1 +
arch/x86/crypto/poly1305
The AVX2 variant of ChaCha20 is used only for messages with a length of
>= 512 bytes. With the existing test vectors, the implementation could not
be tested. Due to the lack of such a long official test vector, this one is
self-generated using chacha20-generic.
Signed-off-by: Martin Willi
---
cry
As architecture specific drivers need a software fallback, export Poly1305
init/update/final functions together with some helpers in a header file.
Signed-off-by: Martin Willi
---
crypto/chacha20poly1305.c | 4 +--
crypto/poly1305_generic.c | 73
bytes)
Benchmark results from a Core i5-4670T.
Signed-off-by: Martin Willi
---
arch/x86/crypto/Makefile | 1 +
arch/x86/crypto/chacha20-avx2-x86_64.S | 443 +
arch/x86/crypto/chacha20_glue.c | 19 ++
crypto/Kconfig
The Poly1305 authenticator requires a unique key for each generated tag. This
implies that we can't set the key per tfm, as multiple users set individual
keys. Instead we pass a desc specific key as the first two blocks of the
message to authenticate in update().
Signed-off-by: Martin
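The scheme described above can be sketched in plain C: a per-desc context diverts the leading 32 bytes of the update() stream into a descriptor-local key, since setkey() state would be shared by all users of the tfm. All names here are hypothetical, not the kernel's actual structures:

```c
#include <stdint.h>
#include <string.h>

#define POLY1305_KEY_SIZE 32

/* Hypothetical per-desc context: the one-time key arrives as the first
 * two 16-byte blocks passed to update(), not via a shared setkey(). */
struct poly_desc {
	unsigned char key[POLY1305_KEY_SIZE];
	unsigned int keyed;	/* key bytes consumed so far */
	size_t msg_bytes;	/* message bytes seen after the key */
};

static void poly_update(struct poly_desc *d,
			const unsigned char *src, size_t len)
{
	/* Divert the leading 32 bytes of the stream into the desc key. */
	if (d->keyed < POLY1305_KEY_SIZE) {
		size_t take = POLY1305_KEY_SIZE - d->keyed;

		if (take > len)
			take = len;
		memcpy(d->key + d->keyed, src, take);
		d->keyed += take;
		src += take;
		len -= take;
	}
	/* The real code would now run Poly1305 blocks over src/len. */
	d->msg_bytes += len;
}
```

The key capture works even when the caller splits the first 32 bytes across several update() calls, which a scatterlist walk will routinely do.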
Herbert,
> I just realised that this doesn't quite work. The key is shared
> by all users of the tfm, yet in your case you need it to be local
I agree, as Poly1305 uses a different key for each tag the current
approach doesn't work.
> I think the simplest solution is to make the key the beginni
. It uses a 16-byte IV, which includes the 12-byte ChaCha20 nonce
prepended with the initial block counter. Some algorithms require an explicit
counter value, for example the mentioned AEAD construction.
Signed-off-by: Martin Willi
---
crypto/Kconfig | 13 +++
crypto/Makefile
We explicitly set the initial block counter by prepending it to the nonce in
little-endian byte order. The same test vector is used for both encryption and
decryption, as ChaCha20 is a cipher XORing a keystream.
Signed-off-by: Martin Willi
---
crypto/testmgr.c | 15 +
crypto/testmgr.h | 177
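The point above, that one vector covers both directions, holds because en- and decryption are the identical XOR operation. A trivial sketch (toy keystream, hypothetical helper name):

```c
#include <stddef.h>

/* ChaCha20 encrypts by XORing a keystream into the data; applying the
 * same operation again restores the plaintext, so encryption and
 * decryption share one code path (and one test vector). */
static void xor_keystream(unsigned char *data, const unsigned char *ks,
			  size_t len)
{
	for (size_t i = 0; i < len; i++)
		data[i] ^= ks[i];
}
```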
Signed-off-by: Martin Willi
---
net/xfrm/xfrm_algo.c | 12
1 file changed, 12 insertions(+)
diff --git a/net/xfrm/xfrm_algo.c b/net/xfrm/xfrm_algo.c
index 67266b7..42f7c76 100644
--- a/net/xfrm/xfrm_algo.c
+++ b/net/xfrm/xfrm_algo.c
@@ -159,6 +159,18 @@ static struct xfrm_algo_desc
Signed-off-by: Martin Willi
---
crypto/testmgr.c | 9 ++
crypto/testmgr.h | 259 +++
2 files changed, 268 insertions(+)
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index abd09c2..faf93a6 100644
--- a/crypto/testmgr.c
+++ b/crypto
Signed-off-by: Martin Willi
---
crypto/testmgr.c | 15 +
crypto/testmgr.h | 179 +++
2 files changed, 194 insertions(+)
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 915a9ef..ccd19cf 100644
--- a/crypto/testmgr.c
+++ b/crypto
draft-ietf-ipsecme-chacha20-poly1305 defines the use of ChaCha20/Poly1305 in
ESP. It uses an additional four bytes of key material as a salt, which is then
used with an 8-byte IV to form the ChaCha20 nonce as defined in RFC 7539.
Signed-off-by: Martin Willi
---
crypto/chacha20poly1305.c | 26
Signed-off-by: Martin Willi
---
crypto/testmgr.c | 15
crypto/testmgr.h | 269 +++
2 files changed, 284 insertions(+)
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index faf93a6..915a9ef 100644
--- a/crypto/testmgr.c
+++ b/crypto
This AEAD uses a chacha20 ablkcipher and a poly1305 ahash to construct the
ChaCha20-Poly1305 AEAD as defined in RFC7539. It supports both synchronous and
asynchronous operations, even if we currently have no async chacha20 or poly1305
drivers.
Signed-off-by: Martin Willi
---
crypto/Kconfig
public domain code by Daniel J. Bernstein and
Andrew Moon.
Signed-off-by: Martin Willi
---
crypto/Kconfig | 9 ++
crypto/Makefile | 1 +
crypto/poly1305_generic.c | 300 ++
3 files changed, 310 insertions(+)
create mode 100644
test setup the IPsec throughput is ~700 Mbit/s with these portable
drivers. Architecture-specific drivers, subject of a future patchset, can
improve performance; with SSE, for example, doubling the throughput is feasible.
Martin Willi (9):
crypto: Add a generic ChaCha20 stream cipher implementation
Hi Steffen,
> > It looks like our IPsec implementations of CCM and GCM are buggy
> > in that they don't include the IV in the authentication calculation.
>
> Seems like crypto_rfc4106_crypt() passes the associated data it
> got from ESP directly to gcm, without chaining with the IV.
Do you have
Hi Herbert,
> > Does this mean that even the test vectors (crypto/testmgr.h) are broken?
>
> Indeed. The test vectors appear to be generated either through
> our implementation or by one that is identical to us.
I'm not sure about that. RFC4106 refers to [1] for test vectors, which
is still ava
Add TFC padding to all packets smaller than the boundary configured
on the xfrm state. If the boundary is larger than the PMTU, limit
padding to the PMTU.
Signed-off-by: Martin Willi
---
net/ipv4/esp4.c | 32
1 files changed, 24 insertions(+), 8 deletions
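The padding rule in the commit message above, pad to the configured boundary but never beyond the PMTU, reduces to one length computation. A sketch (function and parameter names hypothetical):

```c
#include <stddef.h>

/* Pad the payload up to the TFC boundary configured on the xfrm state,
 * but never beyond what fits into the path MTU. */
static size_t tfc_padded_len(size_t len, size_t boundary, size_t pmtu)
{
	size_t target = boundary < pmtu ? boundary : pmtu;

	return len < target ? target : len;
}
```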
.
Changes from v2:
- Remove unused flag field in attribute, use a plain u32 as attribute payload
- Reject installation of TFC padding on non-tunnel SAs
Martin Willi (3):
xfrm: Add Traffic Flow Confidentiality padding XFRM attribute
xfrm: Traffic Flow Confidentiality for IPv4 ESP
The XFRMA_TFCPAD attribute for XFRM state installation configures
Traffic Flow Confidentiality by padding ESP packets to a specified
length.
Signed-off-by: Martin Willi
---
include/linux/xfrm.h | 1 +
include/net/xfrm.h | 1 +
net/xfrm/xfrm_user.c | 19 +--
3 files
Add TFC padding to all packets smaller than the boundary configured
on the xfrm state. If the boundary is larger than the PMTU, limit
padding to the PMTU.
Signed-off-by: Martin Willi
---
net/ipv6/esp6.c | 32
1 files changed, 24 insertions(+), 8 deletions
> In particular, why would we need a boundary at all? Setting it to
> anything other than the PMTU would seem to defeat the purpose of
> TFC for packets between the boundary and the PMTU.
I don't agree; this highly depends on the traffic on the SA. For a
general-purpose tunnel with TCP flows, PMT
e kept the currently unused flags in the XFRM attribute to implement
ESPv2 fallback or other extensions in the future without changing the ABI.
Martin Willi (3):
xfrm: Add Traffic Flow Confidentiality padding XFRM attribute
xfrm: Traffic Flow Confidentiality for IPv4 ESP
xfrm: Tr
The XFRMA_TFC attribute for XFRM state installation configures
Traffic Flow Confidentiality by padding ESP packets to a specified
length.
Signed-off-by: Martin Willi
---
include/linux/xfrm.h | 6 ++
include/net/xfrm.h | 1 +
net/xfrm/xfrm_user.c | 16 ++--
3 files
Add TFC padding to all packets smaller than the boundary configured
on the xfrm state. If the boundary is larger than the PMTU, limit
padding to the PMTU.
Signed-off-by: Martin Willi
---
net/ipv4/esp4.c | 33 +
1 files changed, 25 insertions(+), 8 deletions
Add TFC padding to all packets smaller than the boundary configured
on the xfrm state. If the boundary is larger than the PMTU, limit
padding to the PMTU.
Signed-off-by: Martin Willi
---
net/ipv6/esp6.c | 33 +
1 files changed, 25 insertions(+), 8 deletions
Hi Herbert,
> I know why you want to do this, what I'm asking is do you have any
> research behind this with regards to security
>
> Has this scheme been discussed on a public forum somewhere?
No, sorry, I haven't found much valuable discussion about TFC padding.
Nothing at all how to overcome
> What is the basis of this random length padding?
Let's assume a peer does not support ESPv3 padding, but we have to pad a
small packet with more than 255 bytes. We can't; the ESP padding length
field is limited to 255 bytes.
We could add 255 fixed bytes, but an eavesdropper could just subtract
the 255
, but I'm not sure if my PMTU lookup works in
all cases (nested transforms?). Any pointer would be appreciated.
Martin Willi (5):
xfrm: Add Traffic Flow Confidentiality padding XFRM attribute
xfrm: Remove unused ESP padlen field
xfrm: Traffic Flow Confidentiality for IPv4 ESP
Traffic Flow Confidentiality padding is most effective if all packets
have exactly the same size. For SAs with mixed traffic, the largest
packet size is usually the PMTU. Instead of calculating the PMTU
manually, the XFRM_TFC_PMTU flag automatically pads to the PMTU.
Signed-off-by: Martin Willi
The XFRMA_TFCPAD attribute for XFRM state installation configures
Traffic Flow Confidentiality by padding ESP packets to a specified
length. To use RFC4303 TFC padding and overcome the 255 byte ESP
padding field limit, the XFRM_TFC_ESPV3 flag must be set.
Signed-off-by: Martin Willi
---
include
The padlen field in IPv4/6 ESP is used to align the ESP padding length
to a value larger than the AEAD block size. There is, however, no
option to set this field, hence it is removed.
Signed-off-by: Martin Willi
---
include/net/esp.h | 3 ---
net/ipv4/esp4.c | 11 ++-
net/ipv6/esp6
If configured on the xfrm state, increase the length of all packets to
a given boundary using TFC padding as specified in RFC4303. For
transport mode, or if the XFRM_TFC_ESPV3 flag is not set, grow the ESP
padding field instead.
Signed-off-by: Martin Willi
---
net/ipv6/esp6.c | 42
If configured on the xfrm state, increase the length of all packets to
a given boundary using TFC padding as specified in RFC4303. For
transport mode, or if the XFRM_TFC_ESPV3 flag is not set, grow the ESP
padding field instead.
Signed-off-by: Martin Willi
---
net/ipv4/esp4.c | 42
> This patch adds the af_alg plugin for hash, corresponding to
> the ahash kernel operation type.
Tested-by: Martin Willi
> This patch creates the backbone of the user-space interface for
> the Crypto API, through a new socket family AF_ALG.
Tested-by: Martin Willi
> This patch adds the af_alg plugin for symmetric key ciphers,
> corresponding to the ablkcipher kernel operation type.
I can confirm that the newest patch fixes the page leak.
Tested-by: Martin Willi
> Hmm, can you show me your test program and how you determined
> that it was leaking pages?
The test program below runs 1000 encryptions:
# grep nr_free /proc/vmstat
nr_free_pages 11031
# ./test
...
# grep nr_free /proc/vmstat
nr_free_pages 10026
# ./test
...
# grep nr_free /proc/vmstat
nr_f
Hi Herbert,
I did a proof-of-concept implementation for our crypto library, the
interface looks good so far. All our hash, hmac, xcbc and cipher test
vectors matched.
> + sg_assign_page(sg + i, alloc_page(GFP_KERNEL));
Every skcipher operation leaks memory on my box (this pag
-off-by: Martin Willi
---
net/key/af_key.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/net/key/af_key.c b/net/key/af_key.c
index 84209fb..76fa6fe 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -1193,6 +1193,7 @@ static struct xfrm_state * pfkey_msg2xfrm_state
These algorithms use a truncation of 192/256 bits, as specified
in RFC4868.
Signed-off-by: Martin Willi
---
net/xfrm/xfrm_algo.c | 34 ++
1 files changed, 34 insertions(+), 0 deletions(-)
diff --git a/net/xfrm/xfrm_algo.c b/net/xfrm/xfrm_algo.c
index faf54c6
The new XFRMA_ALG_AUTH_TRUNC attribute taking an xfrm_algo_auth as
argument allows the installation of authentication algorithms with
a truncation length specified in userspace, e.g. SHA256 with 128-bit
instead of 96-bit truncation.
Signed-off-by: Martin Willi
---
include/linux/xfrm.h | 8
specified, or the authentication algorithm
is specified using xfrm_algo, the truncation length from the algorithm
description in the kernel is used.
Signed-off-by: Martin Willi
---
include/net/xfrm.h | 12 -
net/xfrm/xfrm_state.c | 2 +-
net/xfrm/xfrm_user.c | 129
The following patchset adds support for defining truncation lengths
for authentication algorithms in userspace. The main purpose for this
is to support SHA256 in IPsec using the standardized 128-bit truncation
instead of the currently used 96-bit truncation.
Martin Willi (3):
xfrm: Define new XFRM netlink
Instead of using the hardcoded truncation for authentication
algorithms, use the truncation length specified on xfrm_state.
Signed-off-by: Martin Willi
---
net/ipv4/ah4.c | 2 +-
net/ipv4/esp4.c | 2 +-
net/ipv6/ah6.c | 2 +-
net/ipv6/esp6.c | 2 +-
4 files changed, 4 insertions
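With the truncation length taken from the xfrm_state rather than hardcoded, the ICV is simply the leading truncbits/8 bytes of the full digest. A sketch of that step (names hypothetical, not the kernel's helpers):

```c
#include <stddef.h>
#include <string.h>

/* Truncate an authentication digest to the length configured on the
 * xfrm state (e.g. 128 of SHA-256's 256 bits, per RFC 4868), instead
 * of the truncation hardcoded in the algorithm description. */
static size_t auth_trunc_icv(unsigned char *icv,
			     const unsigned char *digest,
			     unsigned int trunc_bits)
{
	size_t icv_len = trunc_bits / 8;

	memcpy(icv, digest, icv_len);
	return icv_len;
}
```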
> You must getting an sg entry that crosses a page boundary, rather than
> two sg entries that both stay within a page.
Yes.
> These things are very rare, and usually occurs as
> a result of SLAB debugging causing kmalloc to return memory that
> crosses page boundaries.
Indeed, SLAB_DEBUG was en
> > Switching the hash implementations to the new shash API introduced a
> > regression. HMACs are created incorrectly if the data is scattered over
> > multiple pages, resulting in very unreliable IPsec tunnels.
>
> What are the symptoms?
After doing further tests, it seems that this is addition
Hi,
Switching the hash implementations to the new shash API introduced a
regression. HMACs are created incorrectly if the data is scattered over
multiple pages, resulting in very unreliable IPsec tunnels.
The appended patch adds a silly hmac(sha1) test vector larger than a 4KB
page and fails on c