While following a number of tangents in the code (I was figuring out how to edit lib/Kconfig; don't ask), I came across a table of 256 64-bit words, all of which had the high half set to zero.

Since the code depends on both pclmulq and crc32, SSE 4.1 is obviously
present, so it could use pmovzxdq and save 1K of kernel data.

The following patch obviously lacks the kludges for old binutils,
but should convey the general idea.

Jan: Is support for SLE10's pre-2.18 binutils still required?
Your PEXTRD fix was only a year ago, so I expect so, but I wanted to ask.

Two other minor changes:

1. The current code unnecessarily puts the table in the read-write
   .data section. Moved to .text.
2. I'm also not sure why it's necessary to force such large alignment
   on K_table. Comments on reducing it?

Signed-off-by: George Spelvin <li...@horizon.com>

diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
index dbc4339b..9f885ee4 100644
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -216,15 +216,11 @@ LABEL crc_ %i
 	## 4) Combine three results:
 	################################################################
 
-	lea	(K_table-16)(%rip), bufp	# first entry is for idx 1
+	lea	(K_table-8)(%rip), bufp		# first entry is for idx 1
 	shlq    $3, %rax			# rax *= 8
-	subq    %rax, tmp			# tmp -= rax*8
-	shlq    $1, %rax
-	subq    %rax, tmp			# tmp -= rax*16
-						# (total tmp -= rax*24)
-	addq    %rax, bufp
-
-	movdqa  (bufp), %xmm0			# 2 consts: K1:K2
+	pmovzxdq (bufp,%rax), %xmm0		# 2 consts: K1:K2
+	leal	(%eax,%eax,2), %eax		# rax *= 3 (total *24)
+	subq    %rax, tmp			# tmp -= rax*24
 
 	movq    crc_init, %xmm1			# CRC for block 1
 	PCLMULQDQ 0x00,%xmm0,%xmm1		# Multiply by K2
@@ -331,136 +327,135 @@ ENDPROC(crc_pcl)
 	################################################################
 	## PCLMULQDQ tables
-	## Table is 128 entries x 2 quad words each
+	## Table is 128 entries x 2 words (8 bytes) each
 	################################################################
-.data
-.align 64
+.align 8
 K_table:
-	.quad 0x14cd00bd6,0x105ec76f0
+	.long 0x14cd00bd6,0x105ec76f0
-	.quad 0x0ba4fc28e,0x14cd00bd6
+	.long 0x0ba4fc28e,0x14cd00bd6
-	.quad 0x1d82c63da,0x0f20c0dfe
+	.long 0x1d82c63da,0x0f20c0dfe
-	.quad 0x09e4addf8,0x0ba4fc28e
+	.long 0x09e4addf8,0x0ba4fc28e
-	.quad 0x039d3b296,0x1384aa63a
+	.long 0x039d3b296,0x1384aa63a
-	.quad 0x102f9b8a2,0x1d82c63da
+	.long 0x102f9b8a2,0x1d82c63da
-	.quad 0x14237f5e6,0x01c291d04
+	.long 0x14237f5e6,0x01c291d04
-	.quad 0x00d3b6092,0x09e4addf8
+	.long 0x00d3b6092,0x09e4addf8

(Remaining boring bits of this hunk elided.)
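
For anyone who wants to sanity-check the indexing without assembling anything, here is a rough Python model of the two table layouts and the two ways of computing the "tmp -= rax*24" adjustment. It is only a sketch: the constants in PAIRS are made-up 32-bit placeholders (not the real K_table values), the helper names are hypothetical, and movdqa/pmovzxdq are modelled as plain little-endian unpacking rather than executed.

```python
import struct

# Hypothetical 32-bit placeholder constants; NOT the real K_table values.
PAIRS = [(0x11111111, 0x22222222),
         (0x33333333, 0x44444444),
         (0x55555555, 0x66666666)]

# Old layout: two .quad per entry (16 bytes), high halves all zero.
old_table = b"".join(struct.pack("<QQ", k1, k2) for k1, k2 in PAIRS)
# New layout: two .long per entry (8 bytes).
new_table = b"".join(struct.pack("<II", k1, k2) for k1, k2 in PAIRS)


def load_old(idx):
    """Model of movdqa from (K_table-16) + idx*16; idx starts at 1."""
    return struct.unpack_from("<QQ", old_table, idx * 16 - 16)


def load_new(idx):
    """Model of pmovzxdq from (K_table-8) + idx*8; idx starts at 1.

    Two dwords are read and zero-extended into two 64-bit lanes.
    Python ints are unbounded, so the zero extension is implicit.
    """
    return struct.unpack_from("<II", new_table, idx * 8 - 8)


def tmp_adjust_old(rax):
    """shlq $3; subq; shlq $1; subq: subtracts rax*8, then rax*16."""
    r = rax << 3
    total = r
    r <<= 1
    return total + r


def tmp_adjust_new(rax):
    """shlq $3; leal (%eax,%eax,2): (rax*8)*3, truncated to 32 bits."""
    r = rax << 3
    return (r + 2 * r) & 0xFFFFFFFF


# Both layouts yield the same pair of 64-bit lane values for every index.
for i in range(1, len(PAIRS) + 1):
    assert load_old(i) == load_new(i)
    assert tmp_adjust_old(i) == tmp_adjust_new(i) == 24 * i

# Bytes saved per entry; with 128 entries that is 128 * 8 = 1024 bytes.
saved_per_entry = (len(old_table) - len(new_table)) // len(PAIRS)
print(saved_per_entry, 128 * saved_per_entry)
```

The model only holds while each constant fits in 32 bits and rax*24 fits in 32 bits, which are the same conditions the pmovzxdq and leal forms rely on.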