Re: V3 [PATCH] x86: Add X86_ENDBR and CET marker to config.m4.in

2020-03-09 Thread Simo Sorce
On Mon, 2020-03-09 at 14:59 -0700, H.J. Lu wrote:
> On Mon, Mar 9, 2020 at 2:42 PM Simo Sorce  wrote:
> > On Mon, 2020-03-09 at 14:31 -0700, H.J. Lu wrote:
> > > On Mon, Mar 9, 2020 at 2:15 PM Simo Sorce  wrote:
> > > > On Mon, 2020-03-09 at 12:46 -0700, H.J. Lu wrote:
> > > > > On Mon, Mar 9, 2020 at 12:22 PM Simo Sorce  wrote:
> > > > > > On Mon, 2020-03-09 at 15:19 -0400, Simo Sorce wrote:
> > > > > > > On Mon, 2020-03-09 at 11:56 -0700, H.J. Lu wrote:
> > > > > > > > On Mon, Mar 9, 2020 at 11:19 AM Simo Sorce  
> > > > > > > > wrote:
> > > > > > > > > On Mon, 2020-03-09 at 19:03 +0100, Niels Möller wrote:
> > > > > > > > > > Simo Sorce  writes:
> > > > > > > > > > 
> > > > > > > > > > > The patchset is older than I remembered, April 2019
> > > > > > > > > > > But I recall running at least one version of it on our 
> > > > > > > > > > > CET emulator @
> > > > > > > > > > > Red Hat.
> > > > > > > > > > 
> > > > > > > > > > Sorry I forgot to followup on that. It seems only the first 
> > > > > > > > > > easy cleanup
> > > > > > > > > > patch, "Add missing EPILOGUEs in assembly files", was 
> > > > > > > > > > applied back then.
> > > > > > > > > > 
> > > > > > > > > > Do you remember why you used GNU_CET_SECTION() explicitly 
> > > > > > > > > > in .asm files,
> > > > > > > > > > rather than using an m4 divert?
> > > > > > > > > 
> > > > > > > > > Not really I do not recall anymore, but I think there was a 
> > > > > > > > > reason, as
> > > > > > > > > I recall you made that comment back then and it "didn't work 
> > > > > > > > > out" when
> > > > > > > > > I tried is the memory I have of it.
> > > > > > > > > Might have to do with differences in how it lays out the code 
> > > > > > > > > when done
> > > > > > > > > via m4 divert, but not 100% sure.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > m4 divert  requires much less changes.   Here is the updated 
> > > > > > > > patch with
> > > > > > > > ASM_X86_ENDBR, ASM_X86_MARK_CET_ALIGN and ASM_X86_MARK_CET.
> > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > > Two comments on your patch.
> > > > > > > 
> > > > > > > 1. It is an error to align based on architecture. All GNU Notes 
> > > > > > > MUST be
> > > > > > > aligned 8 bytes. Since 2018 GNU Libc ignores misaligned notes.
> > > > > > 
> > > > > > Ah nevermind this point, misunderstanding with my libc expert, the 4
> > > > > > bytes alignment is ok on 32 bit code.
> > > > > > 
> > > > > > > 2. It is better to use .pushsection .popsection pairs around the 
> > > > > > > note
> > > > > > > instead of .section because of the side effects of using .section
> > > > > 
> > > > > Done.
> > > > > 
> > > > > > > The m4 divert looks smaller impact, feel free to lift the Gnu Note
> > > > > > > section in my patch #3 and place it into your patch if you want. 
> > > > > > > My
> > > > > > > code also made it more explicit what all the sections values 
> > > > > > > actually
> > > > > > > mean which will help in long term maintenance if someone else 
> > > > > > > need to
> > > > > > > change anything (like for example changing to enable only 
> > > > > > > ShadowStack
> > > > > > > vs IBT).
> > > > > > > 
> > > > > 
> > > > > Since CET support requires all objects are marked for CET,  CET 
> > > > > marker on
> > > > > assembly sources is controlled by compiler options, not by configure 
> > > > > option.
> > > > > Also linker can merge multiple .note.gnu.property sections in a single
> > > > > input file:
> > > > > 
> > > > > [hjl@gnu-cfl-1 tmp]$ cat p.s
> > > > > .pushsection ".note.gnu.property", "a"
> > > > > .p2align 3
> > > > > .long 1f - 0f
> > > > > .long 4f - 1f
> > > > > .long 5
> > > > > 0:
> > > > > .asciz "GNU"
> > > > > 1:
> > > > > .p2align 3
> > > > > .long 0xc0000002
> > > > > .long 3f - 2f
> > > > > 2:
> > > > > .long 1
> > > > > 3:
> > > > > .p2align 3
> > > > > 4:
> > > > > .popsection
> > > > > .pushsection ".note.gnu.property", "a"
> > > > > .p2align 3
> > > > > .long 1f - 0f
> > > > > .long 4f - 1f
> > > > > .long 5
> > > > > 0:
> > > > > .asciz "GNU"
> > > > > 1:
> > > > > .p2align 3
> > > > > .long 0xc0000002
> > > > > .long 3f - 2f
> > > > > 2:
> > > > > .long 2
> > > > > 3:
> > > > > .p2align 3
> > > > > 4:
> > > > > .popsection
> > > > > [hjl@gnu-cfl-1 tmp]$ as -o p.o p.s -mx86-used-note=no
> > > > > [hjl@gnu-cfl-1 tmp]$ readelf -n p.o
> > > > > 
> > > > > Displaying notes found in: .note.gnu.property
> > > > >   Owner                Data size        Description
> > > > >   GNU                  0x00000010       NT_GNU_PROPERTY_TYPE_0
> > > > >       Properties: x86 feature: IBT
> > > > >   GNU                  0x00000010       NT_GNU_PROPERTY_TYPE_0
> > > > >       Properties: x86 feature: SHSTK
> > > > > [hjl@gnu-cfl-1 tmp]$ ld -r p.o
> > > > > [hjl@gnu-cfl-1 tmp]$ readelf -n a.out
> > > > >
> > > > > Displaying notes found in: .note.gnu.property
> > > > >   Owner                Data size        Description
> > > > >   GNU                  0x00000010       NT_GNU_PROPERTY_TYPE_0
> > > > >   

Re: V3 [PATCH] x86: Add X86_ENDBR and CET marker to config.m4.in

2020-03-09 Thread Simo Sorce
On Mon, 2020-03-09 at 14:31 -0700, H.J. Lu wrote:
> On Mon, Mar 9, 2020 at 2:15 PM Simo Sorce  wrote:
> > On Mon, 2020-03-09 at 12:46 -0700, H.J. Lu wrote:
> > > On Mon, Mar 9, 2020 at 12:22 PM Simo Sorce  wrote:
> > > > On Mon, 2020-03-09 at 15:19 -0400, Simo Sorce wrote:
> > > > > On Mon, 2020-03-09 at 11:56 -0700, H.J. Lu wrote:
> > > > > > On Mon, Mar 9, 2020 at 11:19 AM Simo Sorce  wrote:
> > > > > > > On Mon, 2020-03-09 at 19:03 +0100, Niels Möller wrote:
> > > > > > > > Simo Sorce  writes:
> > > > > > > > 
> > > > > > > > > The patchset is older than I remembered, April 2019
> > > > > > > > > But I recall running at least one version of it on our CET 
> > > > > > > > > emulator @
> > > > > > > > > Red Hat.
> > > > > > > > 
> > > > > > > > Sorry I forgot to followup on that. It seems only the first 
> > > > > > > > easy cleanup
> > > > > > > > patch, "Add missing EPILOGUEs in assembly files", was applied 
> > > > > > > > back then.
> > > > > > > > 
> > > > > > > > Do you remember why you used GNU_CET_SECTION() explicitly in 
> > > > > > > > .asm files,
> > > > > > > > rather than using an m4 divert?
> > > > > > > 
> > > > > > > Not really I do not recall anymore, but I think there was a 
> > > > > > > reason, as
> > > > > > > I recall you made that comment back then and it "didn't work out" 
> > > > > > > when
> > > > > > > I tried is the memory I have of it.
> > > > > > > Might have to do with differences in how it lays out the code 
> > > > > > > when done
> > > > > > > via m4 divert, but not 100% sure.
> > > > > > > 
> > > > > > 
> > > > > > m4 divert  requires much less changes.   Here is the updated patch 
> > > > > > with
> > > > > > ASM_X86_ENDBR, ASM_X86_MARK_CET_ALIGN and ASM_X86_MARK_CET.
> > > > > > 
> > > > > > 
> > > > > 
> > > > > Two comments on your patch.
> > > > > 
> > > > > 1. It is an error to align based on architecture. All GNU Notes MUST 
> > > > > be
> > > > > aligned 8 bytes. Since 2018 GNU Libc ignores misaligned notes.
> > > > 
> > > > Ah nevermind this point, misunderstanding with my libc expert, the 4
> > > > bytes alignment is ok on 32 bit code.
> > > > 
> > > > > 2. It is better to use .pushsection .popsection pairs around the note
> > > > > instead of .section because of the side effects of using .section
> > > 
> > > Done.
> > > 
> > > > > The m4 divert looks smaller impact, feel free to lift the Gnu Note
> > > > > section in my patch #3 and place it into your patch if you want. My
> > > > > code also made it more explicit what all the sections values actually
> > > > > mean which will help in long term maintenance if someone else need to
> > > > > change anything (like for example changing to enable only ShadowStack
> > > > > vs IBT).
> > > > > 
> > > 
> > > Since CET support requires all objects are marked for CET,  CET marker on
> > > assembly sources is controlled by compiler options, not by configure 
> > > option.
> > > Also linker can merge multiple .note.gnu.property sections in a single
> > > input file:
> > > 
> > > [hjl@gnu-cfl-1 tmp]$ cat p.s
> > > .pushsection ".note.gnu.property", "a"
> > > .p2align 3
> > > .long 1f - 0f
> > > .long 4f - 1f
> > > .long 5
> > > 0:
> > > .asciz "GNU"
> > > 1:
> > > .p2align 3
> > > .long 0xc0000002
> > > .long 3f - 2f
> > > 2:
> > > .long 1
> > > 3:
> > > .p2align 3
> > > 4:
> > > .popsection
> > > .pushsection ".note.gnu.property", "a"
> > > .p2align 3
> > > .long 1f - 0f
> > > .long 4f - 1f
> > > .long 5
> > > 0:
> > > .asciz "GNU"
> > > 1:
> > > .p2align 3
> > > .long 0xc0000002
> > > .long 3f - 2f
> > > 2:
> > > .long 2
> > > 3:
> > > .p2align 3
> > > 4:
> > > .popsection
> > > [hjl@gnu-cfl-1 tmp]$ as -o p.o p.s -mx86-used-note=no
> > > [hjl@gnu-cfl-1 tmp]$ readelf -n p.o
> > > 
> > > Displaying notes found in: .note.gnu.property
> > >   Owner                Data size        Description
> > >   GNU                  0x00000010       NT_GNU_PROPERTY_TYPE_0
> > >       Properties: x86 feature: IBT
> > >   GNU                  0x00000010       NT_GNU_PROPERTY_TYPE_0
> > >       Properties: x86 feature: SHSTK
> > > [hjl@gnu-cfl-1 tmp]$ ld -r p.o
> > > [hjl@gnu-cfl-1 tmp]$ readelf -n a.out
> > >
> > > Displaying notes found in: .note.gnu.property
> > >   Owner                Data size        Description
> > >   GNU                  0x00000010       NT_GNU_PROPERTY_TYPE_0
> > >       Properties: x86 feature: IBT, SHSTK
> > > [hjl@gnu-cfl-1 tmp]$
> > > 
> > > New properties can be added without changing CET marker.
> > > 
> > > Here is the updated patch.
> > 
> > This patch looks good to me.
> > Unfortunately I never received the original email creating the thread,
> > did you send other patches too?
> 
> This is the only patch needed to enable CET.
> 
> > Or is the prologue stuff sufficient to pass test suite in CET emulator?
> > 
> 
> It is sufficient to pass all tests on real CET processors with
> 
> $ CC="gcc -Wl,-z,cet-report=error -fcf-protection" CXX="g++
> -Wl,-z,cet-report=error -fcf-protection"
> /home/hjl/work/git/gitlab/nettle/configure
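For reference, a minimal sketch of how C code can see the effect of
-fcf-protection at compile time. GCC documents that __CET__ is defined
when the option is in effect, with bit 0 set for branch protection (IBT)
and bit 1 set for return protection (SHSTK). The helper names below are
made up for illustration and are not part of Nettle or of the patch.

```c
/* Returns the compile-time CET mask, or 0 if the compiler did not
   define __CET__ (that is, -fcf-protection is not in effect). */
static unsigned
cet_compile_time_mask(void)
{
#ifdef __CET__
  return (unsigned) __CET__;
#else
  return 0;
#endif
}

/* Bit 0: indirect branch tracking, so functions need ENDBR. */
static int
cet_wants_endbr(unsigned mask)
{
  return (mask & 1u) != 0;
}

/* Bit 1: shadow stack (return address protection). */
static int
cet_wants_shstk(unsigned mask)
{
  return (mask & 2u) != 0;
}
```

With the CC setting quoted above, cet_compile_time_mask() would be
expected to report both bits. Without the option it reports 0, unless
the distribution's compiler enables -fcf-protection by default.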


Re: V3 [PATCH] x86: Add X86_ENDBR and CET marker to config.m4.in

2020-03-09 Thread Simo Sorce
On Mon, 2020-03-09 at 12:46 -0700, H.J. Lu wrote:
> On Mon, Mar 9, 2020 at 12:22 PM Simo Sorce  wrote:
> > On Mon, 2020-03-09 at 15:19 -0400, Simo Sorce wrote:
> > > On Mon, 2020-03-09 at 11:56 -0700, H.J. Lu wrote:
> > > > On Mon, Mar 9, 2020 at 11:19 AM Simo Sorce  wrote:
> > > > > On Mon, 2020-03-09 at 19:03 +0100, Niels Möller wrote:
> > > > > > Simo Sorce  writes:
> > > > > > 
> > > > > > > The patchset is older than I remembered, April 2019
> > > > > > > But I recall running at least one version of it on our CET 
> > > > > > > emulator @
> > > > > > > Red Hat.
> > > > > > 
> > > > > > Sorry I forgot to followup on that. It seems only the first easy 
> > > > > > cleanup
> > > > > > patch, "Add missing EPILOGUEs in assembly files", was applied back 
> > > > > > then.
> > > > > > 
> > > > > > Do you remember why you used GNU_CET_SECTION() explicitly in .asm 
> > > > > > files,
> > > > > > rather than using an m4 divert?
> > > > > 
> > > > > Not really I do not recall anymore, but I think there was a reason, as
> > > > > I recall you made that comment back then and it "didn't work out" when
> > > > > I tried is the memory I have of it.
> > > > > Might have to do with differences in how it lays out the code when 
> > > > > done
> > > > > via m4 divert, but not 100% sure.
> > > > > 
> > > > 
> > > > m4 divert  requires much less changes.   Here is the updated patch with
> > > > ASM_X86_ENDBR, ASM_X86_MARK_CET_ALIGN and ASM_X86_MARK_CET.
> > > > 
> > > > 
> > > 
> > > Two comments on your patch.
> > > 
> > > 1. It is an error to align based on architecture. All GNU Notes MUST be
> > > aligned 8 bytes. Since 2018 GNU Libc ignores misaligned notes.
> > 
> > Ah nevermind this point, misunderstanding with my libc expert, the 4
> > bytes alignment is ok on 32 bit code.
> > 
> > > 2. It is better to use .pushsection .popsection pairs around the note
> > > instead of .section because of the side effects of using .section
> 
> Done.
> 
> > > The m4 divert looks smaller impact, feel free to lift the Gnu Note
> > > section in my patch #3 and place it into your patch if you want. My
> > > code also made it more explicit what all the sections values actually
> > > mean which will help in long term maintenance if someone else need to
> > > change anything (like for example changing to enable only ShadowStack
> > > vs IBT).
> > > 
> 
> Since CET support requires all objects are marked for CET,  CET marker on
> assembly sources is controlled by compiler options, not by configure option.
> Also linker can merge multiple .note.gnu.property sections in a single
> input file:
> 
> [hjl@gnu-cfl-1 tmp]$ cat p.s
> .pushsection ".note.gnu.property", "a"
> .p2align 3
> .long 1f - 0f
> .long 4f - 1f
> .long 5
> 0:
> .asciz "GNU"
> 1:
> .p2align 3
> .long 0xc0000002
> .long 3f - 2f
> 2:
> .long 1
> 3:
> .p2align 3
> 4:
> .popsection
> .pushsection ".note.gnu.property", "a"
> .p2align 3
> .long 1f - 0f
> .long 4f - 1f
> .long 5
> 0:
> .asciz "GNU"
> 1:
> .p2align 3
> .long 0xc0000002
> .long 3f - 2f
> 2:
> .long 2
> 3:
> .p2align 3
> 4:
> .popsection
> [hjl@gnu-cfl-1 tmp]$ as -o p.o p.s -mx86-used-note=no
> [hjl@gnu-cfl-1 tmp]$ readelf -n p.o
> 
> Displaying notes found in: .note.gnu.property
>   Owner                Data size        Description
>   GNU                  0x00000010       NT_GNU_PROPERTY_TYPE_0
>       Properties: x86 feature: IBT
>   GNU                  0x00000010       NT_GNU_PROPERTY_TYPE_0
>       Properties: x86 feature: SHSTK
> [hjl@gnu-cfl-1 tmp]$ ld -r p.o
> [hjl@gnu-cfl-1 tmp]$ readelf -n a.out
>
> Displaying notes found in: .note.gnu.property
>   Owner                Data size        Description
>   GNU                  0x00000010       NT_GNU_PROPERTY_TYPE_0
>       Properties: x86 feature: IBT, SHSTK
> [hjl@gnu-cfl-1 tmp]$
> 
> New properties can be added without changing CET marker.
> 
> Here is the updated patch.

This patch looks good to me.
Unfortunately I never received the original email that started the thread;
did you send other patches too?
Or is the prologue change sufficient to pass the test suite in the CET
emulator?
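
The point quoted above, that CET support requires all objects to be
marked, can be sketched in C. This is only a model of the AND rule
suggested by the property name GNU_PROPERTY_X86_FEATURE_1_AND, not
linker code. merge_cet_features is a hypothetical helper, and an object
without any note is represented as a feature word of 0. The single-file
p.s example above is a different case: there, two notes inside one
object are combined into one note carrying both bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Feature bits carried by the x86 feature property. */
#define X86_FEATURE_1_IBT   (1u << 0)
#define X86_FEATURE_1_SHSTK (1u << 1)

/* Hypothetical model: objects[i] is the feature word of input object i,
   or 0 if that object has no CET marker. A feature survives in the
   output only if every input object sets it. */
static uint32_t
merge_cet_features(const uint32_t *objects, size_t n)
{
  uint32_t merged = X86_FEATURE_1_IBT | X86_FEATURE_1_SHSTK;
  size_t i;

  if (n == 0)
    return 0;

  for (i = 0; i < n; i++)
    merged &= objects[i];

  return merged;
}
```

A single unmarked assembly object is enough to clear both bits for the
whole link, which is why the marker has to follow the compiler options
rather than a separate configure switch.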

-- 
Simo Sorce
RHEL Crypto Team
Red Hat, Inc




___
nettle-bugs mailing list
nettle-bugs@lists.lysator.liu.se
http://lists.lysator.liu.se/mailman/listinfo/nettle-bugs


Re: [PATCH] x86: Add X86_ENDBR and CET marker to config.m4.in

2020-03-09 Thread Simo Sorce
On Mon, 2020-03-09 at 15:19 -0400, Simo Sorce wrote:
> On Mon, 2020-03-09 at 11:56 -0700, H.J. Lu wrote:
> > On Mon, Mar 9, 2020 at 11:19 AM Simo Sorce  wrote:
> > > On Mon, 2020-03-09 at 19:03 +0100, Niels Möller wrote:
> > > > Simo Sorce  writes:
> > > > 
> > > > > The patchset is older than I remembered, April 2019
> > > > > But I recall running at least one version of it on our CET emulator @
> > > > > Red Hat.
> > > > 
> > > > Sorry I forgot to followup on that. It seems only the first easy cleanup
> > > > patch, "Add missing EPILOGUEs in assembly files", was applied back then.
> > > > 
> > > > Do you remember why you used GNU_CET_SECTION() explicitly in .asm files,
> > > > rather than using an m4 divert?
> > > 
> > > Not really I do not recall anymore, but I think there was a reason, as
> > > I recall you made that comment back then and it "didn't work out" when
> > > I tried is the memory I have of it.
> > > Might have to do with differences in how it lays out the code when done
> > > via m4 divert, but not 100% sure.
> > > 
> > 
> > m4 divert  requires much less changes.   Here is the updated patch with
> > ASM_X86_ENDBR, ASM_X86_MARK_CET_ALIGN and ASM_X86_MARK_CET.
> > 
> > 
> 
> Two comments on your patch.
> 
> 1. It is an error to align based on architecture. All GNU Notes MUST be
> aligned 8 bytes. Since 2018 GNU Libc ignores misaligned notes.

Ah, never mind this point; I misunderstood my libc expert. The 4-byte
alignment is OK for 32-bit code.
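
A quick arithmetic check of how the alignment affects the note size:
the descriptor holds pr_type, pr_datasz and one 4-byte feature word,
followed by padding up to the note alignment. The sketch below assumes
that layout (it mirrors the .p2align directives in the patch); the
function names are illustrative only.

```c
#include <stdint.h>

/* Round n up to a multiple of align (align must be a power of two). */
static uint32_t
align_up(uint32_t n, uint32_t align)
{
  return (n + align - 1) & ~(align - 1);
}

/* Descriptor size of the CET note for a given .p2align value:
   4 bytes pr_type + 4 bytes pr_datasz + 4 bytes feature word,
   padded to the alignment. */
static uint32_t
cet_note_descsz(unsigned p2align)
{
  uint32_t align = 1u << p2align;
  return align_up(4 + 4 + 4, align);
}
```

With p2align=3 this gives 16, which matches the 16-byte data size that
readelf reports for the 64-bit example elsewhere in this thread; with
p2align=2 it gives 12.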

> 2. It is better to use .pushsection .popsection pairs around the note
> instead of .section because of the side effects of using .section

> The m4 divert looks smaller impact, feel free to lift the Gnu Note
> section in my patch #3 and place it into your patch if you want. My
> code also made it more explicit what all the sections values actually
> mean which will help in long term maintenance if someone else need to
> change anything (like for example changing to enable only ShadowStack
> vs IBT).
> 
> Simo.
> 

-- 
Simo Sorce
RHEL Crypto Team
Red Hat, Inc






Re: [PATCH] x86: Add X86_ENDBR and CET marker to config.m4.in

2020-03-09 Thread Simo Sorce
On Mon, 2020-03-09 at 11:56 -0700, H.J. Lu wrote:
> On Mon, Mar 9, 2020 at 11:19 AM Simo Sorce  wrote:
> > On Mon, 2020-03-09 at 19:03 +0100, Niels Möller wrote:
> > > Simo Sorce  writes:
> > > 
> > > > The patchset is older than I remembered, April 2019
> > > > But I recall running at least one version of it on our CET emulator @
> > > > Red Hat.
> > > 
> > > Sorry I forgot to followup on that. It seems only the first easy cleanup
> > > patch, "Add missing EPILOGUEs in assembly files", was applied back then.
> > > 
> > > Do you remember why you used GNU_CET_SECTION() explicitly in .asm files,
> > > rather than using an m4 divert?
> > 
> > Not really I do not recall anymore, but I think there was a reason, as
> > I recall you made that comment back then and it "didn't work out" when
> > I tried is the memory I have of it.
> > Might have to do with differences in how it lays out the code when done
> > via m4 divert, but not 100% sure.
> > 
> 
> m4 divert  requires much less changes.   Here is the updated patch with
> ASM_X86_ENDBR, ASM_X86_MARK_CET_ALIGN and ASM_X86_MARK_CET.
> 
> 

Two comments on your patch.

1. It is an error to align based on architecture. All GNU notes MUST be
aligned to 8 bytes. Since 2018, GNU libc ignores misaligned notes.

2. It is better to wrap the note in a .pushsection/.popsection pair
instead of using .section, because .section changes the current section
for everything that follows it.

The m4 divert looks like the smaller-impact approach; feel free to lift
the GNU note section from my patch #3 and place it into your patch if
you want. My code also made explicit what all the section values
actually mean, which will help long-term maintenance if someone else
needs to change anything (for example, to enable only ShadowStack or
only IBT).
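
To illustrate what the values in the note mean, here is a rough C
rendering of the 64-bit layout. The constant names are the ones used by
the ELF headers and the x86 psABI; struct cet_note and the two helpers
are hypothetical and exist only for this sketch.

```c
#include <stdint.h>
#include <string.h>

#define NT_GNU_PROPERTY_TYPE_0            5u          /* note type */
#define GNU_PROPERTY_X86_FEATURE_1_AND    0xc0000002u /* property type */
#define GNU_PROPERTY_X86_FEATURE_1_IBT    (1u << 0)
#define GNU_PROPERTY_X86_FEATURE_1_SHSTK  (1u << 1)

/* Illustrative in-memory image of the note with 8-byte alignment. */
struct cet_note
{
  uint32_t namesz;     /* 4: "GNU" plus the terminating NUL */
  uint32_t descsz;     /* 16: size of everything after the name */
  uint32_t type;       /* NT_GNU_PROPERTY_TYPE_0 */
  char name[4];        /* "GNU\0" */
  uint32_t pr_type;    /* GNU_PROPERTY_X86_FEATURE_1_AND */
  uint32_t pr_datasz;  /* 4: size of the feature word */
  uint32_t features;   /* IBT and/or SHSTK bits */
  uint32_t pad;        /* padding up to 8-byte alignment */
};

static struct cet_note
make_cet_note(uint32_t features)
{
  struct cet_note n;

  memset(&n, 0, sizeof(n));
  n.namesz = 4;
  n.descsz = 16;
  n.type = NT_GNU_PROPERTY_TYPE_0;
  memcpy(n.name, "GNU", 4);   /* copies the NUL as well */
  n.pr_type = GNU_PROPERTY_X86_FEATURE_1_AND;
  n.pr_datasz = 4;
  n.features = features;
  return n;
}

/* Human-readable form of the feature word. */
static const char *
cet_feature_names(uint32_t features)
{
  switch (features & 3u)
    {
    case 1u: return "IBT";
    case 2u: return "SHSTK";
    case 3u: return "IBT, SHSTK";
    default: return "none";
    }
}
```

Enabling only ShadowStack would then be a matter of passing
GNU_PROPERTY_X86_FEATURE_1_SHSTK alone, which corresponds to writing 2
instead of 3 in the final .long of the assembly version.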

Simo.

-- 
Simo Sorce
RHEL Crypto Team
Red Hat, Inc






Re: [PATCH v2 1/3] chacha: add function to set initial block counter

2020-03-09 Thread Niels Möller
Daiki Ueno  writes:

> From: Daiki Ueno 
>
> The ChaCha20 based header protection algorithm in QUIC requires a way
> to set the initial value of counter:
> https://quicwg.org/base-drafts/draft-ietf-quic-tls.html#name-chacha20-based-header-prote
>
> This will add a new function chacha_set_counter, which takes an
> 8-octet initial value of the block counter.

I've merged all three patches to master-updates. Two nits below:

> +void
> +chacha_crypt32(struct chacha_ctx *ctx,
> +               size_t length,
> +               uint8_t *c,
> +               const uint8_t *m)
> +{
> +  if (!length)
> +    return;
> +
> +  for (;;)
> +    {
> +      uint32_t x[_CHACHA_STATE_LENGTH];
> +
> +      _chacha_core (x, ctx->state, CHACHA_ROUNDS);
> +
> +      ++ctx->state[12];
> +
> +      /* stopping at 2^70 length per nonce is user's responsibility */

Should be 2^38, not 2^70, right?

> +Nettle's implementation of ChaCha-Poly1305 follows @cite{RFC 8439},
> +where the ChaCha cipher is initialized with a 12-byte nonce and a 4-byte
> +block counter. This allows up to 256 gigabytes of data to be encrypted
> +using the same key.

Should be "same key and nonce"; the counter size limits the size of a
message, but the nonce allows for many messages using the same key.
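
The two limits can be checked with a few lines of C: with 64-byte
blocks, a 32-bit block counter covers 2^32 * 64 = 2^38 bytes (256
gigabytes), while a 64-bit counter would give 2^70. The helper names
are made up for this sketch.

```c
#include <stdint.h>

#define CHACHA_BLOCK_SIZE 64   /* bytes per ChaCha block, 2^6 */

/* log2 of the maximum number of bytes per nonce for a block counter
   of the given width: counter bits plus log2(64). */
static unsigned
chacha_max_bytes_log2(unsigned counter_bits)
{
  return counter_bits + 6;
}

/* Maximum message length in bytes with a 32-bit block counter. */
static uint64_t
chacha32_max_bytes(void)
{
  return (UINT64_C(1) << 32) * CHACHA_BLOCK_SIZE;
}
```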

I'll fix these.

It would be nice to have a test case where the first 32 bits of the counter
wrap around, which is the only case where chacha_crypt and
chacha_crypt32 behave differently. Is that something you can look into?
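
A minimal model of the wraparound case, as a starting point for such a
test. It only mimics the counter handling, under the assumption that
chacha_crypt carries from state[12] into state[13] while chacha_crypt32
increments state[12] alone, as in the hunk quoted above. counter_step64
and counter_step32 are hypothetical names, not Nettle functions.

```c
#include <stdint.h>

/* 64-bit block counter in state[12] (low) and state[13] (high):
   carry into the high word when the low word wraps (assumed to be
   what chacha_crypt does). */
static void
counter_step64(uint32_t *state)
{
  state[13] += (++state[12] == 0);
}

/* 32-bit block counter in state[12] only; state[13] is part of the
   96-bit nonce and must not change. */
static void
counter_step32(uint32_t *state)
{
  ++state[12];
}
```

Starting from state[12] = 0xffffffff, the first helper bumps state[13]
and the second leaves it alone, which is the difference a real test
vector would need to exercise.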

Thanks!
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.


Re: [PATCH] x86: Add X86_ENDBR and CET marker to config.m4.in

2020-03-09 Thread Simo Sorce
On Mon, 2020-03-09 at 19:03 +0100, Niels Möller wrote:
> Simo Sorce  writes:
> 
> > The patchset is older than I remembered, April 2019
> > But I recall running at least one version of it on our CET emulator @
> > Red Hat.
> 
> Sorry I forgot to followup on that. It seems only the first easy cleanup
> patch, "Add missing EPILOGUEs in assembly files", was applied back then.
> 
> Do you remember why you used GNU_CET_SECTION() explicitly in .asm files,
> rather than using an m4 divert?

Not really, I do not recall anymore, but I think there was a reason. As
I recall, you made that comment back then, and my memory is that it
"didn't work out" when I tried it.
It might have to do with differences in how the code is laid out when
done via m4 divert, but I am not 100% sure.

Simo.

-- 
Simo Sorce
RHEL Crypto Team
Red Hat, Inc






Re: [PATCH] x86: Add X86_ENDBR and CET marker to config.m4.in

2020-03-09 Thread Niels Möller
Simo Sorce  writes:

> The patchset is older than I remembered, April 2019
> But I recall running at least one version of it on our CET emulator @
> Red Hat.

Sorry, I forgot to follow up on that. It seems only the first easy cleanup
patch, "Add missing EPILOGUEs in assembly files", was applied back then.

Do you remember why you used GNU_CET_SECTION() explicitly in .asm files,
rather than using an m4 divert?

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.


Re: [PATCH] x86: Add X86_ENDBR and CET marker to config.m4.in

2020-03-09 Thread Simo Sorce
On Mon, 2020-03-09 at 08:33 -0700, H.J. Lu wrote:
> On Mon, Mar 9, 2020 at 5:36 AM Simo Sorce  wrote:
> > On Sat, 2020-03-07 at 17:49 +0100, Niels Möller wrote:
> > > "H.J. Lu"  writes:
> > > 
> > > > Intel Control-flow Enforcement Technology (CET):
> > > > 
> > > > https://software.intel.com/en-us/articles/intel-sdm
> > > > 
> > > > contains shadow stack (SHSTK) and indirect branch tracking (IBT).  When
> > > > CET is enabled, ELF object files must be marked with .note.gnu.property
> > > > section.  Also when IBT is enabled, all indirect branch targets must
> > > > start with ENDBR instruction.
> > > > 
> > > > This patch adds X86_ENDBR and the CET marker to config.m4.in when CET
> > > > is enabled.  It updates PROLOGUE with X86_ENDBR.
> > > 
> > > I'd like to have a look at what gcc produces. How is it enabled with
> > > gcc? In the docs, I find
> > > 
> > >   -mshstk
> > > 
> > > The -mshstk option enables shadow stack built-in functions from x86
> > > Control-flow Enforcement Technology (CET).
> > > 
> > > but when I try compiling a trivial function,
> > > 
> > >   $ cat foo-cet.c
> > >   int foo(void) {return 0;}
> > >   $ gcc -save-temps -c -mshstk foo-cet.c
> > > 
> > > I get no endbr instruction and no note in the foo-cet.s. I'm using
> > > gcc-8.3. I do get an
> > > 
> > >   .section .note.GNU-stack,"",@progbits
> > > 
> > > corresponding to Nettle's ASM_MARK_NOEXEC_STACK
> > > 
> > > > --- a/config.m4.in
> > > > +++ b/config.m4.in
> > > > @@ -8,6 +8,10 @@ define(<ALIGN_LOG>, <@ASM_ALIGN_LOG@>)dnl
> > > >  define(<W64_ABI>, <@W64_ABI@>)dnl
> > > >  define(<RODATA>, <@ASM_RODATA@>)dnl
> > > >  define(<WORDS_BIGENDIAN>, <@ASM_WORDS_BIGENDIAN@>)dnl
> > > > +define(<X86_ENDBR>,<@X86_ENDBR@>)dnl
> > > > +divert(1)
> > > > +@X86_GNU_PROPERTY@
> > > > +divert
> > > >  divert(1)
> > > >  @ASM_MARK_NOEXEC_STACK@
> > > >  divert
> > > 
> > > You can put the two properties in the same m4 divert. Also, please
> > > rename the autoconf substitutions with ASM_ prefix, and something more
> > > descriptive than X86_GNU_PROPERTY. E.g., ASM_X86_ENDBR and
> > > ASM_X86_MARK_CET.
> > > 
> > > > diff --git a/configure.ac b/configure.ac
> > > > index ba3ab7c6..e9ed630c 100644
> > > > --- a/configure.ac
> > > > +++ b/configure.ac
> > > > @@ -803,6 +803,82 @@ EOF
> > > >ASM_ALIGN_LOG="$nettle_cv_asm_align_log"
> > > >  fi
> > > > 
> > > > +dnl  Define
> > > > +dnl  1. X86_ENDBR for endbr32/endbr64.
> > > > +dnl  2. X86_GNU_PROPERTY to add a .note.gnu.property section to mark
> > > > +dnl  Intel CET support if needed.
> > > > +dnl     .section ".note.gnu.property", "a"
> > > > +dnl     .p2align POINTER-ALIGN
> > > > +dnl     .long 1f - 0f
> > > > +dnl     .long 4f - 1f
> > > > +dnl     .long 5
> > > > +dnl 0:
> > > > +dnl     .asciz "GNU"
> > > > +dnl 1:
> > > > +dnl     .p2align POINTER-ALIGN
> > > > +dnl     .long 0xc0000002
> > > > +dnl     .long 3f - 2f
> > > > +dnl 2:
> > > > +dnl     .long 3
> > > > +dnl 3:
> > > > +dnl     .p2align POINTER-ALIGN
> > > > +dnl 4:
> > > 
> > > No need to repeat the definition in full in this comment. And as I think
> > > I've said before, I'm a bit surprised that it needs to be this verbose.
> > > 
> > > > +AC_CACHE_CHECK([if Intel CET is enabled],
> > > > +  [nettle_cv_asm_x86_intel_cet],
> > > > +  [AC_TRY_COMPILE([
> > > > +#ifndef __CET__
> > > > +#error Intel CET is not enabled
> > > > +#endif
> > > > +  ], [],
> > > > +  [nettle_cv_asm_x86_intel_cet=yes],
> > > > +  [nettle_cv_asm_x86_intel_cet=no])])
> > > > +if test "$nettle_cv_asm_x86_intel_cet" = yes; then
> > > > +  case $ABI in
> > > > +  32|standard)
> > > > +X86_ENDBR=endbr32
> > > > +p2align=2
> > > > +;;
> > > > +  64)
> > > > +X86_ENDBR=endbr64
> > > > +p2align=3
> > > > +;;
> > > > +  x32)
> > > > +X86_ENDBR=endbr64
> > > > +p2align=2
> > > > +;;
> > > > +  esac
> > > > +  AC_CACHE_CHECK([if .note.gnu.property section is needed],
> > > > +[nettle_cv_asm_x86_gnu_property],
> > > > +[AC_TRY_COMPILE([
> > > > +#if !defined __ELF__ || !defined __CET__
> > > > +#error GNU property is not needed
> > > > +#endif
> > > > +], [],
> > > > +[nettle_cv_asm_x86_gnu_property=yes],
> > > > +[nettle_cv_asm_x86_gnu_property=no])])
> > > > +else
> > > > +  nettle_cv_asm_x86_gnu_property=no
> > > > +fi
> > > > +if test "$nettle_cv_asm_x86_gnu_property" = yes; then
> > > > +  X86_GNU_PROPERTY="
> > > > +   .section \".note.gnu.property\", \"a\"
> > > > +   .p2align $p2align
> > > > +   .long 1f - 0f
> > > > +   .long 4f - 1f
> > > > +   .long 5
> > > > +0:
> > > > +   .asciz \"GNU\"
> > > > +1:
> > > > +   .p2align $p2align
> > > > +   .long 0xc0000002
> > > > +   .long 3f - 2f
> > > > +2:
> > > > +   .long 3
> > > > +3:
> > > > +   .p2align $p2align
> > > > +4:"
> > > > +fi
> > > 
> > > Maybe a bit easier to read if you use single quotes for
> > > X86_GNU_PROPERTY='...', don't escape the inner double quotes. That
> > > leaves the expansion of $p2align, maybe it's better to define a separate

Re: [PATCH] x86: Add X86_ENDBR and CET marker to config.m4.in

2020-03-09 Thread Simo Sorce
On Sat, 2020-03-07 at 17:49 +0100, Niels Möller wrote:
> "H.J. Lu"  writes:
> 
> > Intel Control-flow Enforcement Technology (CET):
> > 
> > https://software.intel.com/en-us/articles/intel-sdm
> > 
> > contains shadow stack (SHSTK) and indirect branch tracking (IBT).  When
> > CET is enabled, ELF object files must be marked with .note.gnu.property
> > section.  Also when IBT is enabled, all indirect branch targets must
> > start with ENDBR instruction.
> > 
> > This patch adds X86_ENDBR and the CET marker to config.m4.in when CET
> > is enabled.  It updates PROLOGUE with X86_ENDBR.
> 
> I'd like to have a look at what gcc produces. How is it enabled with
> gcc? In the docs, I find
> 
>   -mshstk
> 
> The -mshstk option enables shadow stack built-in functions from x86
> Control-flow Enforcement Technology (CET).
> 
> but when I try compiling a trivial function,
> 
>   $ cat foo-cet.c 
>   int foo(void) {return 0;}
>   $ gcc -save-temps -c -mshstk foo-cet.c 
> 
> I get no endbr instruction and no note in the foo-cet.s. I'm using
> gcc-8.3. I do get an
> 
>   .section .note.GNU-stack,"",@progbits
> 
> corresponding to Nettle's ASM_MARK_NOEXEC_STACK
> 
> > --- a/config.m4.in
> > +++ b/config.m4.in
> > @@ -8,6 +8,10 @@ define(<ALIGN_LOG>, <@ASM_ALIGN_LOG@>)dnl
> >  define(<W64_ABI>, <@W64_ABI@>)dnl
> >  define(<RODATA>, <@ASM_RODATA@>)dnl
> >  define(<WORDS_BIGENDIAN>, <@ASM_WORDS_BIGENDIAN@>)dnl
> > +define(<X86_ENDBR>,<@X86_ENDBR@>)dnl
> > +divert(1)
> > +@X86_GNU_PROPERTY@
> > +divert
> >  divert(1)
> >  @ASM_MARK_NOEXEC_STACK@
> >  divert
> 
> You can put the two properties in the same m4 divert. Also, please
> rename the autoconf substitutions with ASM_ prefix, and something more
> descriptive than X86_GNU_PROPERTY. E.g., ASM_X86_ENDBR and
> ASM_X86_MARK_CET.
> 
> > diff --git a/configure.ac b/configure.ac
> > index ba3ab7c6..e9ed630c 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -803,6 +803,82 @@ EOF
> >ASM_ALIGN_LOG="$nettle_cv_asm_align_log"
> >  fi
> >  
> > +dnl  Define
> > +dnl  1. X86_ENDBR for endbr32/endbr64.
> > +dnl  2. X86_GNU_PROPERTY to add a .note.gnu.property section to mark
> > +dnl  Intel CET support if needed.
> > +dnl     .section ".note.gnu.property", "a"
> > +dnl     .p2align POINTER-ALIGN
> > +dnl     .long 1f - 0f
> > +dnl     .long 4f - 1f
> > +dnl     .long 5
> > +dnl 0:
> > +dnl     .asciz "GNU"
> > +dnl 1:
> > +dnl     .p2align POINTER-ALIGN
> > +dnl     .long 0xc0000002
> > +dnl     .long 3f - 2f
> > +dnl 2:
> > +dnl     .long 3
> > +dnl 3:
> > +dnl     .p2align POINTER-ALIGN
> > +dnl 4:
> 
> No need to repeat the definition in full in this comment. And as I think
> I've said before, I'm a bit surprised that it needs to be this verbose.
> 
> > +AC_CACHE_CHECK([if Intel CET is enabled],
> > +  [nettle_cv_asm_x86_intel_cet],
> > +  [AC_TRY_COMPILE([
> > +#ifndef __CET__
> > +#error Intel CET is not enabled
> > +#endif
> > +  ], [],
> > +  [nettle_cv_asm_x86_intel_cet=yes],
> > +  [nettle_cv_asm_x86_intel_cet=no])])
> > +if test "$nettle_cv_asm_x86_intel_cet" = yes; then
> > +  case $ABI in
> > +  32|standard)
> > +X86_ENDBR=endbr32
> > +p2align=2
> > +;;
> > +  64)
> > +X86_ENDBR=endbr64
> > +p2align=3
> > +;;
> > +  x32)
> > +X86_ENDBR=endbr64
> > +p2align=2
> > +;;
> > +  esac
> > +  AC_CACHE_CHECK([if .note.gnu.property section is needed],
> > +[nettle_cv_asm_x86_gnu_property],
> > +[AC_TRY_COMPILE([
> > +#if !defined __ELF__ || !defined __CET__
> > +#error GNU property is not needed
> > +#endif
> > +], [],
> > +[nettle_cv_asm_x86_gnu_property=yes],
> > +[nettle_cv_asm_x86_gnu_property=no])])
> > +else
> > +  nettle_cv_asm_x86_gnu_property=no
> > +fi
> > +if test "$nettle_cv_asm_x86_gnu_property" = yes; then
> > +  X86_GNU_PROPERTY="
> > +   .section \".note.gnu.property\", \"a\"
> > +   .p2align $p2align
> > +   .long 1f - 0f
> > +   .long 4f - 1f
> > +   .long 5
> > +0:
> > +   .asciz \"GNU\"
> > +1:
> > +   .p2align $p2align
> > +   .long 0xc0000002
> > +   .long 3f - 2f
> > +2:
> > +   .long 3
> > +3:
> > +   .p2align $p2align
> > +4:"
> > +fi
> 
> Maybe a bit easier to read if you use single quotes for
> X86_GNU_PROPERTY='...', don't escape the inner double quotes. That
> leaves the expansion of $p2align, maybe it's better to define a separate
> substituted variable for pointer alignment? (If there's no easier way to
> enforce pointer-alignment).

Niels,
I sent patches to the list a few months ago to enable CET, and they
already went through review. Is there any reason why we are looking at a
different set and restarting the review from scratch now?

(Sorry for not catching this earlier; we seem to be having some delivery
issues and sometimes mailing list posts are not reaching me. Please keep
me in direct CC for now, hopefully that will help :-/ )

Simo.

-- 
Simo Sorce
RHEL Crypto Team
Red Hat, Inc





[PATCH v2 1/3] chacha: add function to set initial block counter

2020-03-09 Thread Daiki Ueno
From: Daiki Ueno 

The ChaCha20 based header protection algorithm in QUIC requires a way
to set the initial value of counter:
https://quicwg.org/base-drafts/draft-ietf-quic-tls.html#name-chacha20-based-header-prote

This will add a new function chacha_set_counter, which takes an
8-octet initial value of the block counter.

Signed-off-by: Daiki Ueno 
---
 chacha-set-nonce.c      |  7 +++
 chacha.h                |  5 +
 nettle.texinfo          | 12
 testsuite/chacha-test.c | 37 +++--
 4 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/chacha-set-nonce.c b/chacha-set-nonce.c
index 607f176b..2c34e498 100644
--- a/chacha-set-nonce.c
+++ b/chacha-set-nonce.c
@@ -68,3 +68,10 @@ chacha_set_nonce96(struct chacha_ctx *ctx, const uint8_t *nonce)
   ctx->state[14] = LE_READ_UINT32(nonce + 4);
   ctx->state[15] = LE_READ_UINT32(nonce + 8);
 }
+
+void
+chacha_set_counter(struct chacha_ctx *ctx, const uint8_t *counter)
+{
+  ctx->state[12] = LE_READ_UINT32(counter + 0);
+  ctx->state[13] = LE_READ_UINT32(counter + 4);
+}
diff --git a/chacha.h b/chacha.h
index 429a55b6..440fe968 100644
--- a/chacha.h
+++ b/chacha.h
@@ -46,6 +46,7 @@ extern "C" {
 #define chacha_set_key nettle_chacha_set_key
 #define chacha_set_nonce nettle_chacha_set_nonce
 #define chacha_set_nonce96 nettle_chacha_set_nonce96
+#define chacha_set_counter nettle_chacha_set_counter
 #define chacha_crypt nettle_chacha_crypt
 
 /* Currently, only 256-bit keys are supported. */
@@ -53,6 +54,7 @@ extern "C" {
 #define CHACHA_BLOCK_SIZE 64
 #define CHACHA_NONCE_SIZE 8
 #define CHACHA_NONCE96_SIZE 12
+#define CHACHA_COUNTER_SIZE 8
 
 #define _CHACHA_STATE_LENGTH 16
 
@@ -81,6 +83,9 @@ chacha_set_nonce(struct chacha_ctx *ctx, const uint8_t *nonce);
 void
 chacha_set_nonce96(struct chacha_ctx *ctx, const uint8_t *nonce);
 
+void
+chacha_set_counter(struct chacha_ctx *ctx, const uint8_t *counter);
+
 void
 chacha_crypt(struct chacha_ctx *ctx, size_t length, 
  uint8_t *dst, const uint8_t *src);
diff --git a/nettle.texinfo b/nettle.texinfo
index 19eb6d34..0b339f51 100644
--- a/nettle.texinfo
+++ b/nettle.texinfo
@@ -1669,6 +1669,10 @@ ChaCha block size, 64.
 Size of the nonce, 8.
 @end defvr
 
+@defvr Constant CHACHA_COUNTER_SIZE
+Size of the counter, 8.
+@end defvr
+
 @deftypefun void chacha_set_key (struct chacha_ctx *@var{ctx}, const uint8_t *@var{key})
 Initialize the cipher. The same function is used for both encryption and
 decryption. Before using the cipher,
@@ -1681,6 +1685,14 @@ octets. This function also initializes the block counter, setting it to
 zero.
 @end deftypefun
 
+@deftypefun void chacha_set_counter (struct chacha_ctx *@var{ctx}, const uint8_t *@var{counter})
+Sets the block counter. It is always of size @code{CHACHA_COUNTER_SIZE},
+8 octets. This is rarely needed since @code{chacha_set_nonce}
+initializes the block counter to zero. When it is still necessary, this
+function must be called after @code{chacha_set_nonce}.
+
+@end deftypefun
+
 @deftypefun void chacha_crypt (struct chacha_ctx *@var{ctx}, size_t @var{length}, uint8_t *@var{dst}, const uint8_t *@var{src})
 Encrypts or decrypts the data of a message, using ChaCha. When a
 message is encrypted using a sequence of calls to @code{chacha_crypt},
diff --git a/testsuite/chacha-test.c b/testsuite/chacha-test.c
index d6489e9c..6875d4bb 100644
--- a/testsuite/chacha-test.c
+++ b/testsuite/chacha-test.c
@@ -38,8 +38,9 @@
 #include "chacha-internal.h"
 
 static void
-test_chacha(const struct tstring *key, const struct tstring *nonce,
-   const struct tstring *expected, unsigned rounds)
+_test_chacha(const struct tstring *key, const struct tstring *nonce,
+const struct tstring *expected, unsigned rounds,
+const struct tstring *counter)
 {
   struct chacha_ctx ctx;
 
@@ -69,6 +70,9 @@ test_chacha(const struct tstring *key, const struct tstring *nonce,
  else
die ("Bad nonce size %u.\n", (unsigned) nonce->length);
 
+ if (counter)
+   chacha_set_counter(&ctx, counter->data);
+
  chacha_crypt (&ctx, length, data, data);
 
  ASSERT (data[-1] == 17);
@@ -98,6 +102,8 @@ test_chacha(const struct tstring *key, const struct tstring *nonce,
   ASSERT (nonce->length == CHACHA_NONCE_SIZE);
 
   chacha_set_nonce(&ctx, nonce->data);
+  if (counter)
+   chacha_set_counter(&ctx, counter->data);
   _chacha_core (out, ctx.state, rounds);
 
   if (!MEMEQ(CHACHA_BLOCK_SIZE, out, expected->data))
@@ -117,6 +123,21 @@ test_chacha(const struct tstring *key, const struct tstring *nonce,
 }
 }
 
+static void
+test_chacha(const struct tstring *key, const struct tstring *nonce,
+   const struct tstring *expected, unsigned rounds)
+{
+  _test_chacha(key, nonce, expected, rounds, NULL);
+}
+
+static void
+test_chacha_with_counter(const struct tstring *key, const struct tstring *nonce,
+const struct tstring 

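For readers following the thread, here is a minimal standalone sketch of what the chacha_set_counter function in the patch above does: it loads an 8-octet little-endian counter into state words 12 and 13. The names sketch_le_read_u32 and sketch_set_counter and the SKETCH_STATE_WORDS constant are hypothetical stand-ins, not part of Nettle. The real code uses the LE_READ_UINT32 macro and struct chacha_ctx.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_STATE_WORDS 16  /* the ChaCha state is 16 32-bit words */

/* Read a 32-bit little-endian value from p (hypothetical stand-in for
   Nettle's LE_READ_UINT32 macro). */
static uint32_t
sketch_le_read_u32(const uint8_t *p)
{
  return (uint32_t) p[0]
    | ((uint32_t) p[1] << 8)
    | ((uint32_t) p[2] << 16)
    | ((uint32_t) p[3] << 24);
}

/* Store an 8-octet counter in state words 12 and 13, low word first.
   Other state words (key, nonce, constants) are left untouched. */
static void
sketch_set_counter(uint32_t *state, const uint8_t *counter)
{
  assert(state != NULL && counter != NULL);
  state[12] = sketch_le_read_u32(counter + 0);
  state[13] = sketch_le_read_u32(counter + 4);
}
```

With the real API, a caller would set the nonce first and the counter second, since the patch's documentation says chacha_set_nonce resets the block counter to zero.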
[PATCH v2 3/3] doc: match ChaCha-Poly1305 documentation to the implementation

2020-03-09 Thread Daiki Ueno
From: Daiki Ueno 

While the documentation said the nonce size is 8 octets, the
implementation actually assumed 12 octets following RFC 7539.

Signed-off-by: Daiki Ueno 
---
 nettle.texinfo | 19 +++
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/nettle.texinfo b/nettle.texinfo
index fe44f6af..418f46d8 100644
--- a/nettle.texinfo
+++ b/nettle.texinfo
@@ -3323,17 +3323,12 @@ except that @var{cipher} and @var{f} are replaced with a context structure.
 ChaCha-Poly1305 is a combination of the ChaCha stream cipher and the
 poly1305 message authentication code (@pxref{Poly1305}). It originates
 from the NaCl cryptographic library by D. J. Bernstein et al, which
-defines a similar construction but with Salsa20 instead of ChaCha. 
-
-Nettle's implementation ChaCha-Poly1305 should be considered
-@strong{experimental}. At the time of this writing, there is no
-authoritative specification for ChaCha-Poly1305, and a couple of
-different incompatible variants. Nettle implements it using the original
-definition of ChaCha, with 64 bits (8 octets) each for the nonce and the
-block counter. Some protocols prefer to use nonces of 12 bytes, and it's
-a small change to ChaCha to use the upper 32 bits of the block counter
-as a nonce, instead limiting message size to @math{2^32} blocks or 256
-GBytes, but that variant is currently not supported.
+defines a similar construction but with Salsa20 instead of ChaCha.
+
+Nettle's implementation of ChaCha-Poly1305 follows @cite{RFC 8439},
+where the ChaCha cipher is initialized with a 12-byte nonce and a 4-byte
+block counter. This allows up to 256 gigabytes of data to be encrypted
+using the same key.
 
 For ChaCha-Poly1305, the ChaCha cipher is initialized with a key, of 256
 bits, and a per-message nonce. The first block of the key stream
@@ -3362,7 +3357,7 @@ ChaCha-Poly1305 key size, 32.
 @end defvr
 
 @defvr Constant CHACHA_POLY1305_NONCE_SIZE
-Same as the ChaCha nonce size, 16.
+ChaCha-Poly1305 nonce size, 12.
 @end defvr
 
 @defvr Constant CHACHA_POLY1305_DIGEST_SIZE
-- 
2.24.1


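The 256-gigabyte figure quoted in the documentation patch above follows from the counter width: a 32-bit block counter addresses 2^32 blocks of 64 octets, that is 2^38 octets. The sketch below works through that arithmetic and contrasts advancing a 64-bit counter (carry from word 12 into word 13) with advancing a 32-bit one (word 12 only). All names here are hypothetical and chosen for illustration. This is not Nettle's code, only a sketch assuming the state layout described in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BLOCK_SIZE 64u   /* ChaCha block size in octets */

/* Maximum number of octets one counter sequence can cover when the block
   counter is counter_bits wide (limited to 1..57 so the result fits in
   64 bits). */
static uint64_t
sketch_max_stream_bytes(unsigned counter_bits)
{
  assert(counter_bits >= 1 && counter_bits <= 57);
  return ((uint64_t) 1 << counter_bits) * SKETCH_BLOCK_SIZE;
}

/* Advance a 64-bit counter held in words 12 and 13: when the low word
   wraps to zero, carry one into the high word. */
static void
sketch_increment64(uint32_t *state)
{
  state[13] += (++state[12] == 0);
}

/* Advance a 32-bit counter held in word 12 only. Word 13 then belongs to
   the 12-octet nonce and must stay untouched, so the counter simply wraps
   to zero after 2^32 blocks. */
static void
sketch_increment32(uint32_t *state)
{
  ++state[12];
}
```

Because the 32-bit variant wraps silently, keeping the amount of data under the limit is left to the caller.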
[PATCH v2 2/3] chacha: add variant that treats counter value as 32-bit

2020-03-09 Thread Daiki Ueno
From: Daiki Ueno 

The ChaCha-Poly1305 implementation previously used the chacha_crypt
function, which assumes the block counter is 64 bits long, while RFC 8439
defines the counter as 32 bits long.  Although this should be fine
as long as no more than 256 gigabytes of data is encrypted with the same
key, it would be nice to use separate functions (chacha_set_counter32 and
chacha_crypt32) that assume the counter is 32 bits long.

Signed-off-by: Daiki Ueno 
---
 chacha-crypt.c          | 32
 chacha-poly1305.c       |  4 ++--
 chacha-set-nonce.c      |  6 ++
 chacha.h                | 10 ++
 nettle.texinfo          | 31 +++
 testsuite/chacha-test.c | 34 ++
 6 files changed, 111 insertions(+), 6 deletions(-)

diff --git a/chacha-crypt.c b/chacha-crypt.c
index 63d799ce..0bb44ed9 100644
--- a/chacha-crypt.c
+++ b/chacha-crypt.c
@@ -85,3 +85,35 @@ chacha_crypt(struct chacha_ctx *ctx,
   m += CHACHA_BLOCK_SIZE;
   }
 }
+
+void
+chacha_crypt32(struct chacha_ctx *ctx,
+  size_t length,
+  uint8_t *c,
+  const uint8_t *m)
+{
+  if (!length)
+return;
+
+  for (;;)
+{
+  uint32_t x[_CHACHA_STATE_LENGTH];
+
+  _chacha_core (x, ctx->state, CHACHA_ROUNDS);
+
+  ++ctx->state[12];
+
+  /* stopping at 2^38 length per nonce is user's responsibility */
+
+  if (length <= CHACHA_BLOCK_SIZE)
+   {
+ memxor3 (c, m, x, length);
+ return;
+   }
+  memxor3 (c, m, x, CHACHA_BLOCK_SIZE);
+
+  length -= CHACHA_BLOCK_SIZE;
+  c += CHACHA_BLOCK_SIZE;
+  m += CHACHA_BLOCK_SIZE;
+  }
+}
diff --git a/chacha-poly1305.c b/chacha-poly1305.c
index 974a5022..a15fef0c 100644
--- a/chacha-poly1305.c
+++ b/chacha-poly1305.c
@@ -130,7 +130,7 @@ chacha_poly1305_encrypt (struct chacha_poly1305_ctx *ctx,
   assert (ctx->data_size % CHACHA_POLY1305_BLOCK_SIZE == 0);
   poly1305_pad (ctx);
 
-  chacha_crypt (&ctx->chacha, length, dst, src);
+  chacha_crypt32 (&ctx->chacha, length, dst, src);
   poly1305_update (ctx, length, dst);
   ctx->data_size += length;
 }
@@ -146,7 +146,7 @@ chacha_poly1305_decrypt (struct chacha_poly1305_ctx *ctx,
   poly1305_pad (ctx);
 
   poly1305_update (ctx, length, src);
-  chacha_crypt (&ctx->chacha, length, dst, src);
+  chacha_crypt32 (&ctx->chacha, length, dst, src);
   ctx->data_size += length;
 }
 
diff --git a/chacha-set-nonce.c b/chacha-set-nonce.c
index 2c34e498..1547aea1 100644
--- a/chacha-set-nonce.c
+++ b/chacha-set-nonce.c
@@ -75,3 +75,9 @@ chacha_set_counter(struct chacha_ctx *ctx, const uint8_t *counter)
   ctx->state[12] = LE_READ_UINT32(counter + 0);
   ctx->state[13] = LE_READ_UINT32(counter + 4);
 }
+
+void
+chacha_set_counter32(struct chacha_ctx *ctx, const uint8_t *counter)
+{
+  ctx->state[12] = LE_READ_UINT32(counter + 0);
+}
diff --git a/chacha.h b/chacha.h
index 440fe968..fe28b835 100644
--- a/chacha.h
+++ b/chacha.h
@@ -47,7 +47,9 @@ extern "C" {
 #define chacha_set_nonce nettle_chacha_set_nonce
 #define chacha_set_nonce96 nettle_chacha_set_nonce96
 #define chacha_set_counter nettle_chacha_set_counter
+#define chacha_set_counter32 nettle_chacha_set_counter32
 #define chacha_crypt nettle_chacha_crypt
+#define chacha_crypt32 nettle_chacha_crypt32
 
 /* Currently, only 256-bit keys are supported. */
 #define CHACHA_KEY_SIZE 32
@@ -55,6 +57,7 @@ extern "C" {
 #define CHACHA_NONCE_SIZE 8
 #define CHACHA_NONCE96_SIZE 12
 #define CHACHA_COUNTER_SIZE 8
+#define CHACHA_COUNTER32_SIZE 4
 
 #define _CHACHA_STATE_LENGTH 16
 
@@ -86,10 +89,17 @@ chacha_set_nonce96(struct chacha_ctx *ctx, const uint8_t *nonce);
 void
 chacha_set_counter(struct chacha_ctx *ctx, const uint8_t *counter);
 
+void
+chacha_set_counter32(struct chacha_ctx *ctx, const uint8_t *counter);
+
 void
 chacha_crypt(struct chacha_ctx *ctx, size_t length, 
  uint8_t *dst, const uint8_t *src);
 
+void
+chacha_crypt32(struct chacha_ctx *ctx, size_t length,
+  uint8_t *dst, const uint8_t *src);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/nettle.texinfo b/nettle.texinfo
index 0b339f51..fe44f6af 100644
--- a/nettle.texinfo
+++ b/nettle.texinfo
@@ -1700,6 +1700,37 @@ all but the last call @emph{must} use a length that is a multiple of
 @code{CHACHA_BLOCK_SIZE}.
 @end deftypefun
 
+@subsubsection 32-bit counter variant
+
+While the original paper uses a 64-bit counter value, the variant defined
+in @cite{RFC 8439} uses a 32-bit counter value. This variant is
+particularly useful for the ChaCha-Poly1305 AEAD construction
+(@pxref{ChaCha-Poly1305}), which supports 12-octet nonces.
+
+@defvr Constant CHACHA_NONCE96_SIZE
+Size of the nonce, 12.
+@end defvr
+
+@defvr Constant CHACHA_COUNTER32_SIZE
+Size of the counter, 4.
+@end defvr
+
+@deftypefun void chacha_set_nonce96 (struct chacha_ctx *@var{ctx}, const uint8_t *@var{nonce})
+Sets the nonce. This is similar to the above @code{chacha_set_nonce},
+but the input is always of size