Thank you so much for all the information and the help.
Seriously, this is much more than I was hoping to get, even the
suggestion for generating commented assembly code (which is, I assume,
the method that Compiler Explorer uses to relate the high-level Haskell
code with the assembly output of the compiler, which is really nice).
And for your RISC-V NCG, which I found easy to understand.
I guess these are the kind of professionals that Haskell attracts, which
is a big part of why I love it.
I will send an executive summary of my findings here, including
statistics about a couple of programs that I try. I don't know if I'll
be able to do a statistically significant analysis, because I still
have a lot of things to do (extend QEMU, maybe also gem5, and
implement it in a Clash microprocessor design, probably VELDT), but
maybe in the future I can automate more of it and run a more
comprehensive analysis. FYI, I have found that the RISC-V specs mention
Haskell among other languages in a still-empty J extension section,
which will be aimed at helping dynamically translated languages as well
as garbage-collected ones, but I guess the RISC-V people are still more
focused on other things and it will take some time to start work on
that extension.
I also find your suggestion for far-jumping very interesting, but I'm
afraid it will be very unpopular among hardware designers because it
messes with their highly appreciated and scarce L1 cache. Funnily
enough, before starting this project I had the impression that some
kind of simple mechanism for complex jumping would be a good idea. I
will keep this in mind when looking for patterns in the assembly code.
Once more, thank you so much for your work and the help, and I hope I
can soon deliver information that you will all find interesting.
Have a very nice weekend!
Cheers,
Dani.
On 17/4/25 9:19, Sven Tennie wrote:
Hey Daniel 👋
That's really an interesting topic, because we never analyzed the
emitted RISC-V assembly with statistical measures.
So, if I may ask for a favour: If you spot anything that could be
better expressed with the current ISA, please open a ticket and label
it as RISC-V: https://gitlab.haskell.org/ghc/ghc/-/issues
(We haven't decided which RISC-V profile to require. I.e. requiring
the very latest extensions would frustrate people with older
hardware... However, it's in any case good to have possible
improvements documented in tickets.)
I'm wondering if you really have to go through QEMU, or if feeding
assembly code to a parser and then doing the math on that would be
sufficient. (Of course, tracing the execution is more accurate.
However, it's much more complicated as well.)
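As a rough illustration of the "parse the assembly and do the math" idea, here is a minimal Python sketch that builds a static histogram of instruction mnemonics. It assumes GNU-assembler-style text (one statement per line, `#` comments, labels ending in `:`, directives starting with `.`); the sample input and the helper names are made up for the example and are not taken from GHC output.

```python
import re
from collections import Counter

# Assumption: GNU-assembler-style text, one statement per line, labels
# ending in ':' and directives starting with '.'.
LABEL = re.compile(r"^\s*[\w.$]+:")

def mnemonics(asm_text):
    """Yield the mnemonic of every instruction line in asm_text."""
    for line in asm_text.splitlines():
        line = line.split("#", 1)[0]        # drop '#' comments
        line = LABEL.sub("", line).strip()  # drop a leading label
        if not line or line.startswith("."):
            continue                        # blank line or directive
        yield line.split()[0].lower()

def histogram(asm_text):
    """Count how often each mnemonic appears (static, not execution, counts)."""
    return Counter(mnemonics(asm_text))

# Hypothetical hand-written sample, only for demonstration.
SAMPLE = """
main:
    addi sp, sp, -16   # allocate frame
    sd   ra, 8(sp)
    .align 2
.Lloop: addi a0, a0, -1
    bnez a0, .Lloop
    ld   ra, 8(sp)
    addi sp, sp, 16
    ret
"""

if __name__ == "__main__":
    for mnemonic, count in histogram(SAMPLE).most_common():
        print(f"{mnemonic:6} {count}")
```

Static counts like these say nothing about how often a block actually runs, which is the accuracy trade-off mentioned above.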
To attribute assembly instructions to Cmm statements you can use the
GHC flags -ddump-cmm and -dppr-debug (and -ddump-to-file to write the
dumps to files instead of stdout). This adds comments for most Cmm
statements to the dumped assembly code.
At first, I thought that sign-extension / truncation might be a good
candidate. However, it turned out that this is already covered by the
RISC-V B-extension. Which led to this new ticket:
https://gitlab.haskell.org/ghc/ghc/-/issues/25966
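To make the sign-extension case concrete, here is a small Python model, a sketch and not the NCG's actual code. It compares the two-shift sequence that base RV64I needs for sign-extending a byte (shift left by 56, arithmetic shift right by 56) with the semantics of a single sign-extend-byte instruction such as `sext.b` from the Zbb part of the B extension. The function names are illustrative only.

```python
MASK64 = (1 << 64) - 1  # registers are modelled as unsigned 64-bit values

def slli(x, shamt):
    """Logical left shift on a 64-bit register value."""
    return (x << shamt) & MASK64

def srai(x, shamt):
    """Arithmetic right shift on a 64-bit register value."""
    signed = x - (1 << 64) if x >> 63 else x
    return (signed >> shamt) & MASK64

def sext_b_two_shifts(x):
    """Base-ISA style: two instructions, slli by 56 then srai by 56."""
    return srai(slli(x, 56), 56)

def sext_b_single(x):
    """Reference semantics of a one-instruction sign-extend-byte."""
    byte = x & 0xFF
    return (byte - 0x100 if byte & 0x80 else byte) & MASK64

if __name__ == "__main__":
    for value in (0x7F, 0x80, 0x1FF):
        print(hex(value), hex(sext_b_two_shifts(value)), hex(sext_b_single(value)))
```

Both functions compute the same result, so the difference is purely instruction count, which is the kind of pattern a statistical analysis could surface.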
Skimming over the NCG code and watching out for longer or repeating
instruction lists might be a good strategy to make educated guesses.
From a developer's perspective, I found the immediate sizes (usually
12 bits) rather limiting. E.g. the Note [RISCV64 far jumps]
(https://gitlab.haskell.org/ghc/ghc/-/blob/395e0ad17c0d309637f079a05dbdc23e0d4188f6/compiler/GHC/CmmToAsm/RV64/CodeGen.hs?page=2#L1996)
tells the story of how we had to work around this limit for addresses
in conditional jumps.
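To give a feel for how tight these ranges are, here is a small Python sketch of the reachability checks implied by the standard RISC-V encodings: 12-bit signed immediates for I-type instructions, a 13-bit signed even byte offset for conditional branches, and a 21-bit signed even byte offset for JAL. The helper names are made up for illustration; how the NCG actually handles out-of-range targets is described in the Note, not here.

```python
def fits_signed(value, bits):
    """True if value is representable as a signed two's-complement
    integer of the given width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

def fits_i_type(imm):
    """I-type immediate (e.g. addi, ld): 12-bit signed."""
    return fits_signed(imm, 12)

def branch_reachable(offset):
    """Conditional branch (B-type): 13-bit signed byte offset, always even."""
    return offset % 2 == 0 and fits_signed(offset, 13)

def jal_reachable(offset):
    """JAL (J-type): 21-bit signed byte offset, always even."""
    return offset % 2 == 0 and fits_signed(offset, 21)

if __name__ == "__main__":
    for offset in (2046, 4094, 4096, 1 << 19, 1 << 20):
        print(offset, branch_reachable(offset), jal_reachable(offset))
```

A conditional branch therefore only reaches about 4 KiB in either direction, while an unconditional JAL reaches about 1 MiB.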
So, you could raise the question of whether, analogous to compressed
instructions, it would make sense to have extended instructions that
cover two words, such that the first word is the instruction and the
second its immediate(s). (Hardware designers would probably hate that,
because it means a bigger change to the instruction decoding unit.
However, I got asked as a software developer ;) )
Other than that, I've unfortunately got no great ideas.
Please feel free to keep us in the loop (especially regarding the
results of your analyses). And, if you've got any questions regarding
the RISC-V NCG, please feel free to reach out either here or directly
to me. There's also a #GHC "room" on Matrix where you can quickly drop
smaller-scoped questions.
I hope that was of some help. Best regards,
Sven
On Wed, 16 Apr 2025 at 10:34, Matthew Pickering
<matthewtpicker...@gmail.com> wrote:
Hi Daniel. I think Sven Tennie and the LoongArch contributors are
the experts on the NCG for these kinds of instruction sets. I have
CCed them.
Cheers,
Matt
On Tue, Apr 15, 2025 at 5:40 PM Daniel Trujillo Viedma
<danihacker.vie...@gmail.com> wrote:
Hello, ghc-devs! My name is Daniel Trujillo, I'm a Haskell enthusiast
from Spain and I'm trying to write my Master's thesis on accelerating
Haskell programs with a custom ISA extension.
Right now, my focus is on executing software written in Haskell within
QEMU in order to get traces that tell me, basically, how many times
each block (not exactly basic blocks, but sort of) of assembly code has
been executed, with the hope of finding some patterns of RISC-V
instructions that I could implement together as 1 instruction.
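One possible way to turn such per-block execution counts into fusion candidates is to count adjacent instruction sequences (n-grams) weighted by how often their block ran. The Python sketch below assumes the trace has already been reduced to (list of mnemonics, execution count) pairs; that input format and the sample data are hypothetical and do not reflect QEMU's actual trace output.

```python
from collections import Counter

def weighted_ngrams(blocks, n=2):
    """Count every run of n consecutive mnemonics inside a block,
    weighted by the number of times that block was executed.

    blocks: iterable of (mnemonic_list, execution_count) pairs.
    """
    counts = Counter()
    for mnems, hits in blocks:
        for i in range(len(mnems) - n + 1):
            counts[tuple(mnems[i:i + n])] += hits
    return counts

# Hypothetical, hand-made trace summary; real data would come from the
# emulator's per-block counters.
TRACE = [
    (["slli", "add", "ld"], 1000),
    (["slli", "add", "sd"], 10),
    (["addi", "ret"], 1),
]

if __name__ == "__main__":
    for pattern, weight in weighted_ngrams(TRACE, n=2).most_common(3):
        print(" ; ".join(pattern), weight)
```

Sequences never cross block boundaries here, which keeps the count conservative; raising `n` looks for longer candidate patterns.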
As you can see, my method is a bit crude, and I was wondering if the
people involved with any of the different internal representations
(STG, Cmm...) and/or native code generators (particularly RISC-V) could
give me hints about assembly instructions that would have made the work
easier, by removing the need for "massaging" the Cmm code to make
CodeGen easier, or the need for particular optimizations, or, in
general, dirty tricks caused by a lack of proper support in the
standard RISC-V ISA.
And of course, I would also very much appreciate other hints from
people involved in general performance (as opposed to, for example,
libraries for SIMD and parallel execution, or Haskell wrappers to
lower-level code for performance reasons).
P.S. I'm sorry if I broke any netiquette rule, but I'm very new to the
mailing list, and haven't yet received any email from it.
Looking forward to hearing from you!
Cheers,
Dani.
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs