[Bug gas/31964] Add directive for more efficient encoding of binary data

2024-07-11 Thread cvs-commit at gcc dot gnu.org
https://sourceware.org/bugzilla/show_bug.cgi?id=31964

--- Comment #12 from Sourceware Commits ---
The master branch has been updated by Alan Modra:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d686a2b68810b4b1f98930cebcf3b2ee256b1ce2

commit d686a2b68810b4b1f98930cebcf3b2ee256b1ce2
Author: Alan Modra 
Date:   Fri Jul 12 09:50:46 2024 +0930

Re: base64: Add support for targets with byte size > octet size.

Three extra octets are now expected with the latest change to base64.s.
They happened to be covered by patterns allowing for zero padding at
the end of the section, but we don't want to allow fewer octets than
expected.

PR 31964
* testsuite/gas/all/base64.d: Adjust.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


[Bug gas/31964] Add directive for more efficient encoding of binary data

2024-07-11 Thread cvs-commit at gcc dot gnu.org
https://sourceware.org/bugzilla/show_bug.cgi?id=31964

--- Comment #11 from Sourceware Commits ---
The master branch has been updated by Nick Clifton:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=a79094915578872a0360c78a54accff994b883b1

commit a79094915578872a0360c78a54accff994b883b1
Author: Nick Clifton 
Date:   Thu Jul 11 12:51:16 2024 +0100

base64: Add support for targets with byte size > octet size.

PR 31964



[Bug gas/31964] Add directive for more efficient encoding of binary data

2024-07-10 Thread cvs-commit at gcc dot gnu.org
https://sourceware.org/bugzilla/show_bug.cgi?id=31964

--- Comment #10 from Sourceware Commits ---
The master branch has been updated by Alan Modra:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=4cf957e7ac44097baa06e6caca5ad444cff78155

commit 4cf957e7ac44097baa06e6caca5ad444cff78155
Author: Alan Modra 
Date:   Thu Jul 11 11:08:50 2024 +0930

Re: Add support for a .base64 pseudo-op to gas

Fixes a failure on rx-elf where the standard data section isn't .data.
run_dump_test has machinery to translate .data in both options and
expected results for objdump, but not for readelf -x.

PR 31964
* testsuite/gas/all/base64.d: Dump .data with objdump.  Run on
all targets.



[Bug gas/31964] Add directive for more efficient encoding of binary data

2024-07-10 Thread nickc at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=31964

Nick Clifton changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #9 from Nick Clifton ---
Feature added.



[Bug gas/31964] Add directive for more efficient encoding of binary data

2024-07-10 Thread cvs-commit at gcc dot gnu.org
https://sourceware.org/bugzilla/show_bug.cgi?id=31964

--- Comment #8 from Sourceware Commits ---
The master branch has been updated by Nick Clifton:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=479edf0a6a61159486f14d5e62403f8769cc591d

commit 479edf0a6a61159486f14d5e62403f8769cc591d
Author: Nick Clifton 
Date:   Wed Jul 10 15:01:39 2024 +0100

Add support for a .base64 pseudo-op to gas

  PR 31964



[Bug gas/31964] Add directive for more efficient encoding of binary data

2024-07-10 Thread jakub at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=31964

--- Comment #7 from Jakub Jelinek ---
So, I've tried your patch on my short #embed testcase:
unsigned char a[] = {
#embed "cc1plus"
};
with the #embed patchset for GCC, where cc1plus is a 273372376-byte binary.
The assembly GCC emits for this is 1328371852 bytes long:
.file   "embed-11.c"
.text
.globl  a
.data
.align 32
.type   a, @object
.size   a, 273372376
a:
.byte   127
.string "ELF\002\001\001\003"
.string ""
.string ""
...
.string
"(\035\214\034\347_u\244\rz|~\002\253h\267\271\203v\244\266\372\001\353\363\026\346\365\305\211\005\220\372\215h\267\211{\022\257\277'\0256\215G\2013c.~\244\206\360\2
26|_\226\223\034\177j\232u\300,\003\3273kh\267q\221\302\326\3153\3772\202,\003\327\346\207\3662giJ3\202,\003\327\305\271\234@%v~\2446-\034\257\310\207\302\326=\256h\267\016\237h\267Q
\201\023\257\016\313\302\326q\032\\*\205(u\244\237\023t\244\344Vt\244\247\335\243k\007\256\302\326,th\267}\221h\267\317O\034\257\377\373v\244\227\202a\221$\236\3772\263\326X\221\215M
z\244\216\227\034\257F\213\302\326G\316\302\326\033\277\302\326\177\220h\267\023\263\302\326X\236v\244\034Zt\244\003>\177[\0135\022\257\226ph\267|\377\3033Ox\022\257\214\307\340`\356
\235\3772M>\245\013\321*\003\327=\377\3033"
...
.string ""
.string ""
.byte   0
.ident  "GCC: (GNU) 15.0.0 20240703 (experimental)"
.section        .note.GNU-stack,"",@progbits
Now, if I hand-edit this to replace everything from the first .byte through
the last one (including the .string etc. directives in between) with the
output of
cat cc1plus | base64 | sed 's/^/\t.base64\t"/;s/$/"/'
the new assembly is 422048853 bytes.
time .../gas/as-new -o embed-11.o embed-11.s 

real    0m10.481s
user    0m10.113s
sys     0m0.356s

time .../gas/as-new -o embed-11_.o embed-11_.s 

real    0m2.519s
user    0m2.282s
sys     0m0.233s
md5sum embed-11.o embed-11_.o
049aaf9fdb9cf6f84fd54984ab032ac0  embed-11.o
049aaf9fdb9cf6f84fd54984ab032ac0  embed-11_.o

So, this looks good to me.
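
The size reduction reported above can be roughly reproduced with arithmetic alone. The following is a minimal sketch, not code from gas or GCC: the function names are hypothetical, and it assumes that base64 wraps its output at 76 characters per line (the coreutils default) and that the sed command adds a tab, ".base64", a tab and two quote characters to every line.

```c
#include <assert.h>

/* Number of base64 characters needed to encode N octets, with '=' padding.  */
unsigned long long
b64_encoded_len (unsigned long long n)
{
  return (n + 2) / 3 * 4;
}

/* Estimated size of the directive text produced by
     base64 | sed 's/^/\t.base64\t"/;s/$/"/'
   for N octets of input, assuming base64 wraps at WRAP characters per line
   (76 is the coreutils default).  */
unsigned long long
b64_directive_text_size (unsigned long long n, unsigned wrap)
{
  /* Per line: tab, ".base64", tab, opening quote, closing quote, newline.  */
  const unsigned long long per_line = 1 + 7 + 1 + 1 + 1 + 1;
  unsigned long long chars, lines;

  assert (wrap > 0);
  chars = b64_encoded_len (n);
  lines = (chars + wrap - 1) / wrap;
  return chars + lines * per_line;
}
```

For the 273372376-byte input, b64_directive_text_size (273372376ULL, 76) gives 422048588, which is 265 bytes short of the 422048853 bytes reported; the remainder is presumably the handful of directives left around the data.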



[Bug gas/31964] Add directive for more efficient encoding of binary data

2024-07-10 Thread nickc at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=31964

--- Comment #6 from Nick Clifton ---
(In reply to Jakub Jelinek from comment #5)

> 1) @command{uuencode} program's @code{-m} option
>    I think base64 program from coreutils is more common than uuencode from
>    sharutils, so either mention just that, or both.  For no line wrapping
>    base64 has -w 0 option.

That is a fair point.  I will change the example to use the base64 program as
you suggested.

> 2) I don't know how FRAG_APPEND_1_CHAR is expensive compared to say
> appending more

It is actually pretty fast unless the specific backend involved needs to do
something funky.

> Just running coreutils base64 on 261M file
> took around 1s and base64 -d of that too.

I tried a similar test using /usr/bin/lto-dump (29MB) and it took the assembler
less than a second to convert the base64 encoded version of the file into an
object file containing the binary.



[Bug gas/31964] Add directive for more efficient encoding of binary data

2024-07-09 Thread jakub at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=31964

--- Comment #5 from Jakub Jelinek ---
Comment on attachment 15612
  --> https://sourceware.org/bugzilla/attachment.cgi?id=15612
Proposed patch

Thanks, will try tomorrow.

Just some nits:
1) @command{uuencode} program's @code{-m} option
   I think base64 program from coreutils is more common than uuencode from
   sharutils, so either mention just that, or both.  For no line wrapping
   base64 has -w 0 option.
2) I don't know how FRAG_APPEND_1_CHAR is expensive compared to say appending
   more characters at a time; if appending more at a time would be cheaper,
   with base64 one can cheaply check for the length of the addition (at least
   number of non-["=] characters divided by 4 times 3); but if it is
   inexpensive, just ignore
Guess the most important thing will be how fast will be the parsing of it (and
encoding on the gcc side).  Just running coreutils base64 on 261M file took
around 1s and base64 -d of that too.
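
The length check suggested in point 2 can be sketched as below. This is only an illustration under stated assumptions: b64_decoded_size is a hypothetical name, not a gas function, and every character other than a quote or '=' is assumed to be a valid base64 digit.

```c
#include <assert.h>
#include <stddef.h>

/* Return the number of octets that the base64 text in the NUL-terminated
   string S would decode to, without decoding it.  Quote and '=' padding
   characters are skipped; every other character is assumed to be a valid
   base64 digit carrying 6 bits.  */
size_t
b64_decoded_size (const char *s)
{
  size_t digits = 0;

  assert (s != NULL);
  for (; *s != '\0'; s++)
    if (*s != '"' && *s != '=')
      digits++;

  /* 4 digits carry 24 bits = 3 octets; a trailing 2 or 3 digits carry
     1 or 2 whole octets respectively.  */
  return digits / 4 * 3 + (digits % 4) * 3 / 4;
}
```

A caller could use the result to reserve space for the whole string in one step instead of growing the output one character at a time.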



[Bug gas/31964] Add directive for more efficient encoding of binary data

2024-07-09 Thread nickc at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=31964

--- Comment #4 from Nick Clifton ---
Created attachment 15612
  --> https://sourceware.org/bugzilla/attachment.cgi?id=15612&action=edit
Proposed patch

Hi Jakub,

  Would you like to try out this patch ?

  With it applied you can use the .base64 directive as you outlined in your
description.  The patch allows for multiple comma separated strings to be
specified for a single .base64 pseudo-op because this is in keeping with how
other assembler pseudo-ops behave.

Cheers
  Nick



[Bug gas/31964] Add directive for more efficient encoding of binary data

2024-07-09 Thread jakub at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=31964

--- Comment #3 from Jakub Jelinek ---
(In reply to Nick Clifton from comment #2)
> Hi Jakub,
> 
>   Does libiberty (or some other library) have a base64 decoding function ?

I don't think so.

>   If not, I guess I will have to steal^H^H^H^H borrow some code from some 
>   other project.

I simply wrote my own, see
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655156.html
(the base64_dec_fn helper function, base64_dec array and most of
finish_base64_embed).  Uses C++ for the base64_dec array initialization, of
course it could be initialized on demand at runtime, just wanted to make it
more efficient.
Now, I don't really remember if gas does any kind of character set translation
or not, or whether say 'A' in .ascii/.string etc. routines is expected to be
'A' in gas source and is what is being written to the sections.
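
For readers without the linked patch to hand, a table-driven decoder of the kind described can be sketched as follows. This is not the code from the GCC patch or from gas: the names b64_value, b64_init_table and b64_decode are hypothetical, and the table is filled in at run time on first use, which is the on-demand variant mentioned above. Because the table is built from a string literal, it follows the host compiler's execution character set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Maps a character to its 6-bit value, or -1 if it is not in the alphabet.
   Filled in on first use.  */
static signed char b64_value[256];
static int b64_value_ready;

static void
b64_init_table (void)
{
  static const char alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  int i;

  memset (b64_value, -1, sizeof b64_value);
  for (i = 0; i < 64; i++)
    b64_value[(unsigned char) alphabet[i]] = (signed char) i;
  b64_value_ready = 1;
}

/* Decode the NUL-terminated base64 string SRC into DST, which must have room
   for strlen (SRC) * 3 / 4 octets.  Returns the number of octets written, or
   (size_t) -1 on malformed input.  */
size_t
b64_decode (const char *src, unsigned char *dst)
{
  unsigned long acc = 0;
  int bits = 0;
  int seen_pad = 0;
  size_t o = 0;

  assert (src != NULL && dst != NULL);
  if (!b64_value_ready)
    b64_init_table ();

  for (; *src != '\0'; src++)
    {
      int v;

      if (*src == '=')
        {
          seen_pad = 1;
          continue;
        }
      v = b64_value[(unsigned char) *src];
      if (v < 0 || seen_pad)
        return (size_t) -1;	/* Bad character, or data after padding.  */

      /* Accumulate 6 bits; emit an octet whenever 8 or more are pending.  */
      acc = (acc << 6) | (unsigned long) v;
      bits += 6;
      if (bits >= 8)
        {
          bits -= 8;
          dst[o++] = (unsigned char) ((acc >> bits) & 0xff);
          acc &= (1UL << bits) - 1;
        }
    }
  return o;
}
```

The sketch is deliberately lenient about where '=' appears; a stricter decoder would also check the padding count against the number of digits.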



[Bug gas/31964] Add directive for more efficient encoding of binary data

2024-07-09 Thread nickc at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=31964

Nick Clifton changed:

           What    |Removed                          |Added
----------------------------------------------------------------------------
           Assignee|unassigned at sourceware dot org |nickc at redhat dot com
             Status|NEW                              |ASSIGNED

--- Comment #2 from Nick Clifton ---
Hi Jakub,

  Does libiberty (or some other library) have a base64 decoding function ?

  If not, I guess I will have to steal^H^H^H^H borrow some code from some 
  other project.

Cheers
  Nick



[Bug gas/31964] Add directive for more efficient encoding of binary data

2024-07-08 Thread jakub at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=31964

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org,
                   |                            |nickc at redhat dot com,
                   |                            |rguenth at gcc dot gnu.org



[Bug gas/31964] Add directive for more efficient encoding of binary data

2024-07-08 Thread jakub at redhat dot com
https://sourceware.org/bugzilla/show_bug.cgi?id=31964

--- Comment #1 from Jakub Jelinek ---
base64 encoding is 4 characters per 3 bytes.
Guess the directive shouldn't be supported for targets which don't have 8-bit
bytes (if there are any).
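
The 4-characters-per-3-bytes ratio can be illustrated with a small encoder. This is a minimal sketch, assuming the standard RFC 4648 alphabet, 8-bit octets and '=' padding; b64_encode is a hypothetical name and not something GCC or gas provides.

```c
#include <assert.h>
#include <stddef.h>

/* Standard base64 alphabet (RFC 4648).  */
static const char b64_alphabet[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode LEN octets from SRC into DST as base64 text.
   DST must have room for 4 * ((LEN + 2) / 3) + 1 characters.
   Returns the number of characters written, excluding the NUL.  */
size_t
b64_encode (const unsigned char *src, size_t len, char *dst)
{
  size_t o = 0;
  size_t i = 0;

  assert (src != NULL && dst != NULL);

  /* Each full group of 3 octets (24 bits) becomes 4 six-bit characters.  */
  for (; i + 3 <= len; i += 3)
    {
      unsigned long v = ((unsigned long) src[i] << 16)
                        | ((unsigned long) src[i + 1] << 8)
                        | src[i + 2];
      dst[o++] = b64_alphabet[(v >> 18) & 63];
      dst[o++] = b64_alphabet[(v >> 12) & 63];
      dst[o++] = b64_alphabet[(v >> 6) & 63];
      dst[o++] = b64_alphabet[v & 63];
    }

  /* A trailing group of 1 or 2 octets is padded with '=' to 4 characters.  */
  if (i < len)
    {
      unsigned long v = (unsigned long) src[i] << 16;
      if (i + 1 < len)
        v |= (unsigned long) src[i + 1] << 8;
      dst[o++] = b64_alphabet[(v >> 18) & 63];
      dst[o++] = b64_alphabet[(v >> 12) & 63];
      dst[o++] = (i + 1 < len) ? b64_alphabet[(v >> 6) & 63] : '=';
      dst[o++] = '=';
    }

  dst[o] = '\0';
  return o;
}
```

The grouping relies on each input unit being exactly 8 bits wide, which is why targets with larger bytes need separate handling.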
