Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
Daniel P. Berrangé writes: > On Mon, May 27, 2024 at 04:14:15PM +0800, Zhao Liu wrote: >> Hi maintainers and list, >> >> This RFC series attempts to re-implement simpletrace.py with Rust, which >> is the 1st task of Paolo's GSoC 2024 proposal. >> >> There are two motivations for this work: >> 1. This is an open chance to discuss how to integrate Rust into QEMU. > > I don't think this proposal really triggers that discussion to any > great extent, because 'simpletrace.py' is not a part of the QEMU > codebase in any meaningul way. It is a standalone python script > that just happens to live in the qemu.git repository. > > The difficult questions around Rust integration will arise when we > want to have Rust used to /replace/ some existing non-optional > functionality. IOW, if Rust were to become a mandatory dep of QEMU > that could not be avoided. We hope to post some PoC device models in Rust soon with exactly that debate in mind. > In fact I kinda wonder whether this Rust simpletrace code could > simply be its own git repo under gitlab.com/qemu-project/, > rather than put into the monolithic QEMU repo ? That just makes > it a "normal" Rust project and no questions around integration > with QEMU's traditional build system would arise. Do we export the necessary bits for external projects to use? I don't think we've had the equivalent of a qemu-devel package yet and doing so start implying externally visible versioning and deprecation policies for QEMU APIs and data formats. TCG plugins have a header based API but currently everyone builds against their local checkout and we are fairly liberal about bumping the API versions. > > > With regards, > Daniel -- Alex Bennée Virtualisation Tech Lead @ Linaro
Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
On Mon, May 27, 2024 at 04:14:15PM +0800, Zhao Liu wrote: > Hi maintainers and list, > > This RFC series attempts to re-implement simpletrace.py with Rust, which > is the 1st task of Paolo's GSoC 2024 proposal. > > There are two motivations for this work: > 1. This is an open chance to discuss how to integrate Rust into QEMU. I don't think this proposal really triggers that discussion to any great extent, because 'simpletrace.py' is not a part of the QEMU codebase in any meaningul way. It is a standalone python script that just happens to live in the qemu.git repository. The difficult questions around Rust integration will arise when we want to have Rust used to /replace/ some existing non-optional functionality. IOW, if Rust were to become a mandatory dep of QEMU that could not be avoided. In fact I kinda wonder whether this Rust simpletrace code could simply be its own git repo under gitlab.com/qemu-project/, rather than put into the monolithic QEMU repo ? That just makes it a "normal" Rust project and no questions around integration with QEMU's traditional build system would arise. With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
On Wed, May 29, 2024 at 10:10:00PM +0800, Zhao Liu wrote: > Hi Stefan and Mads, > > On Wed, May 29, 2024 at 11:33:42AM +0200, Mads Ynddal wrote: > > Date: Wed, 29 May 2024 11:33:42 +0200 > > From: Mads Ynddal > > Subject: Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust > > X-Mailer: Apple Mail (2.3774.600.62) > > > > > > >> Maybe later, Rust-simpletrace and python-simpletrace can differ, e.g. > > >> the former goes for performance and the latter for scalability. > > > > > > Rewriting an existing, maintained component without buy-in from the > > > maintainers is risky. Mads is the maintainer of simpletrace.py and I am > > > the overall tracing maintainer. While the performance improvement is > > > nice, I'm a skeptical about the need for this and wonder whether thought > > > was put into how simpletrace should evolve. > > > > > > There are disadvantages to maintaining multiple implementations: > > > - File format changes need to be coordinated across implementations to > > > prevent compatibility issues. In other words, changing the > > > trace-events format becomes harder and discourages future work. > > > - Multiple implementations makes life harder for users because they need > > > to decide between implementations and understand the trade-offs. > > > - There is more maintenance overall. > > > > > > I think we should have a single simpletrace implementation to avoid > > > these issues. The Python implementation is more convenient for > > > user-written trace analysis scripts. The Rust implementation has better > > > performance (although I'm not aware of efforts to improve the Python > > > implementation's performance, so who knows). > > > > > > I'm ambivalent about why a reimplementation is necessary. What I would > > > like to see first is the TCG binary tracing functionality. Find the > > > limits of the Python simpletrace implementation and then it will be > > > clear whether a Rust reimplementation makes sense. > > > > > > If Mads agrees, I am happy with a Rust reimplementation, but please > > > demonstrate why a reimplementation is necessary first. > > > > > > Stefan > > > > I didn't want to shoot down the idea, since it seemed like somebody had a > > plan > > with GSoC. But I actually agree, that I'm not quite convinced. > > > > I think I'd need to see some data that showed the Python version is > > inadequate. > > I know Python is not the fastest, but is it so prohibitively slow, that we > > cannot make the TCG analysis? I'm not saying it can't be true, but it'd be > > nice > > to see it demonstrated before making decisions. > > > > Because, as you point out, there's a lot of downsides to having two > > versions. So > > the benefits have to clearly outweigh the additional work. > > > > I have a lot of other questions, but let's maybe start with the core idea > > first. > > > > — > > Mads Ynddal > > > > I really appreciate your patience and explanations, and your feedback > and review has helped me a lot! > > Yes, simple repetition creates much maintenance burden (though I'm happy > to help with), and the argument for current performance isn't convincing > enough. > > Getting back to the project itself, as you have said, the core is still > further support for TCG-related traces, and I'll continue to work on it, > and then look back based on such work to see what issues there are with > traces or what improvements can be made. Thanks for doing that and sorry for holding up the work you have already done! Stefan signature.asc Description: PGP signature
Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
Hi Stefan and Mads, On Wed, May 29, 2024 at 11:33:42AM +0200, Mads Ynddal wrote: > Date: Wed, 29 May 2024 11:33:42 +0200 > From: Mads Ynddal > Subject: Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust > X-Mailer: Apple Mail (2.3774.600.62) > > > >> Maybe later, Rust-simpletrace and python-simpletrace can differ, e.g. > >> the former goes for performance and the latter for scalability. > > > > Rewriting an existing, maintained component without buy-in from the > > maintainers is risky. Mads is the maintainer of simpletrace.py and I am > > the overall tracing maintainer. While the performance improvement is > > nice, I'm a skeptical about the need for this and wonder whether thought > > was put into how simpletrace should evolve. > > > > There are disadvantages to maintaining multiple implementations: > > - File format changes need to be coordinated across implementations to > > prevent compatibility issues. In other words, changing the > > trace-events format becomes harder and discourages future work. > > - Multiple implementations makes life harder for users because they need > > to decide between implementations and understand the trade-offs. > > - There is more maintenance overall. > > > > I think we should have a single simpletrace implementation to avoid > > these issues. The Python implementation is more convenient for > > user-written trace analysis scripts. The Rust implementation has better > > performance (although I'm not aware of efforts to improve the Python > > implementation's performance, so who knows). > > > > I'm ambivalent about why a reimplementation is necessary. What I would > > like to see first is the TCG binary tracing functionality. Find the > > limits of the Python simpletrace implementation and then it will be > > clear whether a Rust reimplementation makes sense. > > > > If Mads agrees, I am happy with a Rust reimplementation, but please > > demonstrate why a reimplementation is necessary first. > > > > Stefan > > I didn't want to shoot down the idea, since it seemed like somebody had a plan > with GSoC. But I actually agree, that I'm not quite convinced. > > I think I'd need to see some data that showed the Python version is > inadequate. > I know Python is not the fastest, but is it so prohibitively slow, that we > cannot make the TCG analysis? I'm not saying it can't be true, but it'd be > nice > to see it demonstrated before making decisions. > > Because, as you point out, there's a lot of downsides to having two versions. > So > the benefits have to clearly outweigh the additional work. > > I have a lot of other questions, but let's maybe start with the core idea > first. > > — > Mads Ynddal > I really appreciate your patience and explanations, and your feedback and review has helped me a lot! Yes, simple repetition creates much maintenance burden (though I'm happy to help with), and the argument for current performance isn't convincing enough. Getting back to the project itself, as you have said, the core is still further support for TCG-related traces, and I'll continue to work on it, and then look back based on such work to see what issues there are with traces or what improvements can be made. Thanks again! Regards, Zhao
Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
>> Maybe later, Rust-simpletrace and python-simpletrace can differ, e.g. >> the former goes for performance and the latter for scalability. > > Rewriting an existing, maintained component without buy-in from the > maintainers is risky. Mads is the maintainer of simpletrace.py and I am > the overall tracing maintainer. While the performance improvement is > nice, I'm a skeptical about the need for this and wonder whether thought > was put into how simpletrace should evolve. > > There are disadvantages to maintaining multiple implementations: > - File format changes need to be coordinated across implementations to > prevent compatibility issues. In other words, changing the > trace-events format becomes harder and discourages future work. > - Multiple implementations makes life harder for users because they need > to decide between implementations and understand the trade-offs. > - There is more maintenance overall. > > I think we should have a single simpletrace implementation to avoid > these issues. The Python implementation is more convenient for > user-written trace analysis scripts. The Rust implementation has better > performance (although I'm not aware of efforts to improve the Python > implementation's performance, so who knows). > > I'm ambivalent about why a reimplementation is necessary. What I would > like to see first is the TCG binary tracing functionality. Find the > limits of the Python simpletrace implementation and then it will be > clear whether a Rust reimplementation makes sense. > > If Mads agrees, I am happy with a Rust reimplementation, but please > demonstrate why a reimplementation is necessary first. > > Stefan I didn't want to shoot down the idea, since it seemed like somebody had a plan with GSoC. But I actually agree, that I'm not quite convinced. I think I'd need to see some data that showed the Python version is inadequate. I know Python is not the fastest, but is it so prohibitively slow, that we cannot make the TCG analysis? I'm not saying it can't be true, but it'd be nice to see it demonstrated before making decisions. Because, as you point out, there's a lot of downsides to having two versions. So the benefits have to clearly outweigh the additional work. I have a lot of other questions, but let's maybe start with the core idea first. — Mads Ynddal
Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
On Tue, May 28, 2024 at 02:48:42PM +0800, Zhao Liu wrote: > Hi Stefan, > > On Mon, May 27, 2024 at 03:59:44PM -0400, Stefan Hajnoczi wrote: > > Date: Mon, 27 May 2024 15:59:44 -0400 > > From: Stefan Hajnoczi > > Subject: Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust > > > > On Mon, May 27, 2024 at 04:14:15PM +0800, Zhao Liu wrote: > > > Hi maintainers and list, > > > > > > This RFC series attempts to re-implement simpletrace.py with Rust, which > > > is the 1st task of Paolo's GSoC 2024 proposal. > > > > > > There are two motivations for this work: > > > 1. This is an open chance to discuss how to integrate Rust into QEMU. > > > 2. Rust delivers faster parsing. > > > > > > > > > Introduction > > > > > > > > > Code framework > > > -- > > > > > > I choose "cargo" to organize the code, because the current > > > implementation depends on external crates (Rust's library), such as > > > "backtrace" for getting frameinfo, "clap" for parsing the cli, "rex" for > > > regular matching, and so on. (Meson's support for external crates is > > > still incomplete. [2]) > > > > > > The simpletrace-rust created in this series is not yet integrated into > > > the QEMU compilation chain, so it can only be compiled independently, e.g. > > > under ./scripts/simpletrace/, compile it be: > > > > > > cargo build --release > > > > Please make sure it's built by .gitlab-ci.d/ so that the continuous > > integration system prevents bitrot. You can add a job that runs the > > cargo build. > > Thanks! I'll do this. > > > > > > > The code tree for the entire simpletrace-rust is as follows: > > > > > > $ script/simpletrace-rust . > > > . > > > ├── Cargo.toml > > > └── src > > > └── main.rs // The simpletrace logic (similar to simpletrace.py). > > > └── trace.rs // The Argument and Event abstraction (refer to > > > // tracetool/__init__.py). > > > > > > My question about meson v.s. cargo, I put it at the end of the cover > > > letter (the section "Opens on Rust Support"). > > > > > > The following two sections are lessons I've learned from this Rust > > > practice. > > > > > > > > > Performance > > > --- > > > > > > I did the performance comparison using the rust-simpletrace prototype with > > > the python one: > > > > > > * On the i7-10700 CPU @ 2.90GHz machine, parsing and outputting a 35M > > > trace binary file for 10 times on each item: > > > > > > AVE (ms) Rust v.s. Python > > > Rust (stdout) 12687.16114.46% > > > Python (stdout) 14521.85 > > > > > > Rust (file) 1422.44264.99% > > > Python (file) 3769.37 > > > > > > - The "stdout" lines represent output to the screen. > > > - The "file" lines represent output to a file (via "> file"). > > > > > > This Rust version contains some optimizations (including print, regular > > > matching, etc.), but there should be plenty of room for optimization. > > > > > > The current performance bottleneck is the reading binary trace file, > > > since I am parsing headers and event payloads one after the other, so > > > that the IO read overhead accounts for 33%, which can be further > > > optimized in the future. > > > > Performance will become more important when large amounts of TCG data is > > captured, as described in the project idea: > > https://wiki.qemu.org/Internships/ProjectIdeas/TCGBinaryTracing > > > > While I can't think of a time in the past where simpletrace.py's > > performance bothered me, improving performance is still welcome. Just > > don't spend too much time on performance (and making the code more > > complex) unless there is a real need. > > Yes, I agree that it shouldn't be over-optimized. > > The logic in the current Rust version is pretty much a carbon copy of > the Python version, without additional complex logic introduced, but the > improvements in x2.6 were obtained by optimizing IO: > > * reading the event configuration file, where I called the buffered > interface, > * and the output formatted trace log, w
Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
Hi Stefan, On Mon, May 27, 2024 at 03:59:44PM -0400, Stefan Hajnoczi wrote: > Date: Mon, 27 May 2024 15:59:44 -0400 > From: Stefan Hajnoczi > Subject: Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust > > On Mon, May 27, 2024 at 04:14:15PM +0800, Zhao Liu wrote: > > Hi maintainers and list, > > > > This RFC series attempts to re-implement simpletrace.py with Rust, which > > is the 1st task of Paolo's GSoC 2024 proposal. > > > > There are two motivations for this work: > > 1. This is an open chance to discuss how to integrate Rust into QEMU. > > 2. Rust delivers faster parsing. > > > > > > Introduction > > > > > > Code framework > > -- > > > > I choose "cargo" to organize the code, because the current > > implementation depends on external crates (Rust's library), such as > > "backtrace" for getting frameinfo, "clap" for parsing the cli, "rex" for > > regular matching, and so on. (Meson's support for external crates is > > still incomplete. [2]) > > > > The simpletrace-rust created in this series is not yet integrated into > > the QEMU compilation chain, so it can only be compiled independently, e.g. > > under ./scripts/simpletrace/, compile it be: > > > > cargo build --release > > Please make sure it's built by .gitlab-ci.d/ so that the continuous > integration system prevents bitrot. You can add a job that runs the > cargo build. Thanks! I'll do this. > > > > The code tree for the entire simpletrace-rust is as follows: > > > > $ script/simpletrace-rust . > > . > > ├── Cargo.toml > > └── src > > └── main.rs // The simpletrace logic (similar to simpletrace.py). > > └── trace.rs // The Argument and Event abstraction (refer to > > // tracetool/__init__.py). > > > > My question about meson v.s. cargo, I put it at the end of the cover > > letter (the section "Opens on Rust Support"). > > > > The following two sections are lessons I've learned from this Rust > > practice. > > > > > > Performance > > --- > > > > I did the performance comparison using the rust-simpletrace prototype with > > the python one: > > > > * On the i7-10700 CPU @ 2.90GHz machine, parsing and outputting a 35M > > trace binary file for 10 times on each item: > > > > AVE (ms) Rust v.s. Python > > Rust (stdout) 12687.16114.46% > > Python (stdout) 14521.85 > > > > Rust (file) 1422.44264.99% > > Python (file) 3769.37 > > > > - The "stdout" lines represent output to the screen. > > - The "file" lines represent output to a file (via "> file"). > > > > This Rust version contains some optimizations (including print, regular > > matching, etc.), but there should be plenty of room for optimization. > > > > The current performance bottleneck is the reading binary trace file, > > since I am parsing headers and event payloads one after the other, so > > that the IO read overhead accounts for 33%, which can be further > > optimized in the future. > > Performance will become more important when large amounts of TCG data is > captured, as described in the project idea: > https://wiki.qemu.org/Internships/ProjectIdeas/TCGBinaryTracing > > While I can't think of a time in the past where simpletrace.py's > performance bothered me, improving performance is still welcome. Just > don't spend too much time on performance (and making the code more > complex) unless there is a real need. Yes, I agree that it shouldn't be over-optimized. The logic in the current Rust version is pretty much a carbon copy of the Python version, without additional complex logic introduced, but the improvements in x2.6 were obtained by optimizing IO: * reading the event configuration file, where I called the buffered interface, * and the output formatted trace log, which I output all via std_out.lock() followed by write_all(). So, just the simple tweak of the interfaces brings much benefits. :-) > > Security > > > > > > This is an example. > > > > Rust is very strict about type-checking, and it found timestamp reversal > > issue in simpletrace-rust [3] (sorry, haven't gotten around to digging > > deeper with more time)...in this RFC, I workingaround it by allowing > > negative values. And the python version, just silently covered this > > issue up. > >
Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
Hi Mads, On Mon, May 27, 2024 at 12:49:06PM +0200, Mads Ynddal wrote: > Date: Mon, 27 May 2024 12:49:06 +0200 > From: Mads Ynddal > Subject: Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust > X-Mailer: Apple Mail (2.3774.600.62) > > Hi, > > Interesting work. I don't have any particular comments for the code, but I > wanted to address a few of the points here. > > > 2. Rust delivers faster parsing. > > For me, the point of simpletrace.py is not to be the fastest at parsing, but > rather to open the door for using Python libraries like numpy, matplotlib, > etc. > for analysis. > > There might be room for improvement in the Python version, especially in > minimizing memory usage, when parsing large traces. Thanks for pointing this out, the Python version is also very extensible and easy to develop. Perhaps ease of scalability vs. performance could be the difference that the two versions focus on? > > Security > > > > > > This is an example. > > > > Rust is very strict about type-checking, and it found timestamp reversal > > issue in simpletrace-rust [3] (sorry, haven't gotten around to digging > > deeper with more time)...in this RFC, I workingaround it by allowing > > negative values. And the python version, just silently covered this > > issue up. > > I'm not particularly worried about the security of the Python version. We're > not > doing anything obviously exploitable. I agree with this, this tool is mainly for parsing. I think one of the starting points for providing a Rust version was also to explore whether this could be an opportunity to integrate Rust into QEMU. Thanks, Zhao
Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
On Mon, May 27, 2024 at 04:14:15PM +0800, Zhao Liu wrote: > Hi maintainers and list, > > This RFC series attempts to re-implement simpletrace.py with Rust, which > is the 1st task of Paolo's GSoC 2024 proposal. > > There are two motivations for this work: > 1. This is an open chance to discuss how to integrate Rust into QEMU. > 2. Rust delivers faster parsing. > > > Introduction > > > Code framework > -- > > I choose "cargo" to organize the code, because the current > implementation depends on external crates (Rust's library), such as > "backtrace" for getting frameinfo, "clap" for parsing the cli, "rex" for > regular matching, and so on. (Meson's support for external crates is > still incomplete. [2]) > > The simpletrace-rust created in this series is not yet integrated into > the QEMU compilation chain, so it can only be compiled independently, e.g. > under ./scripts/simpletrace/, compile it be: > > cargo build --release Please make sure it's built by .gitlab-ci.d/ so that the continuous integration system prevents bitrot. You can add a job that runs the cargo build. > > The code tree for the entire simpletrace-rust is as follows: > > $ script/simpletrace-rust . > . > ├── Cargo.toml > └── src > └── main.rs // The simpletrace logic (similar to simpletrace.py). > └── trace.rs // The Argument and Event abstraction (refer to > // tracetool/__init__.py). > > My question about meson v.s. cargo, I put it at the end of the cover > letter (the section "Opens on Rust Support"). > > The following two sections are lessons I've learned from this Rust > practice. > > > Performance > --- > > I did the performance comparison using the rust-simpletrace prototype with > the python one: > > * On the i7-10700 CPU @ 2.90GHz machine, parsing and outputting a 35M > trace binary file for 10 times on each item: > > AVE (ms) Rust v.s. Python > Rust (stdout) 12687.16114.46% > Python (stdout) 14521.85 > > Rust (file) 1422.44264.99% > Python (file) 3769.37 > > - The "stdout" lines represent output to the screen. > - The "file" lines represent output to a file (via "> file"). > > This Rust version contains some optimizations (including print, regular > matching, etc.), but there should be plenty of room for optimization. > > The current performance bottleneck is the reading binary trace file, > since I am parsing headers and event payloads one after the other, so > that the IO read overhead accounts for 33%, which can be further > optimized in the future. Performance will become more important when large amounts of TCG data is captured, as described in the project idea: https://wiki.qemu.org/Internships/ProjectIdeas/TCGBinaryTracing While I can't think of a time in the past where simpletrace.py's performance bothered me, improving performance is still welcome. Just don't spend too much time on performance (and making the code more complex) unless there is a real need. > Security > > > This is an example. > > Rust is very strict about type-checking, and it found timestamp reversal > issue in simpletrace-rust [3] (sorry, haven't gotten around to digging > deeper with more time)...in this RFC, I workingaround it by allowing > negative values. And the python version, just silently covered this > issue up. > > Opens on Rust Support > = > > Meson v.s. Cargo > > > The first question is whether all Rust code (including under scripts) > must be integrated into meson? > > If so, because of [2] then I have to discard the external crates and > build some more Rust wheels of my own to replace the previous external > crates. > > For the main part of the QEMU code, I think the answer must be Yes, but > for the tools in the scripts directory, would it be possible to allow > the use of cargo to build small tools/program for flexibility and > migrate to meson later (as meson's support for rust becomes more > mature)? I have not seen a satisfying way to natively build Rust code using meson. I remember reading about a tool that converts Cargo.toml files to meson wrap files or something similar. That still doesn't feel great because upstream works with Cargo and duplicating build information in meson is a drag. Calling cargo from meson is not ideal either, but it works and avoids duplicating build information. This is the approach I would use for now unless someone can point to an example of native Rust support in meson that is clean. Here is how libblkio calls cargo from meson: https://gitlab.com/libblkio/libblkio/-/blob/main/src/meson.build https://gitlab.com/libblkio/libblkio/-/blob/main/src/cargo-build.sh > > > External crates > --- > > This is an additional question that naturally follows from the above > question, do we have requirements for Rust's external crate? Is only std > allowed? There is no
Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
Hi, Interesting work. I don't have any particular comments for the code, but I wanted to address a few of the points here. > 2. Rust delivers faster parsing. For me, the point of simpletrace.py is not to be the fastest at parsing, but rather to open the door for using Python libraries like numpy, matplotlib, etc. for analysis. There might be room for improvement in the Python version, especially in minimizing memory usage, when parsing large traces. > Security > > > This is an example. > > Rust is very strict about type-checking, and it found timestamp reversal > issue in simpletrace-rust [3] (sorry, haven't gotten around to digging > deeper with more time)...in this RFC, I workingaround it by allowing > negative values. And the python version, just silently covered this > issue up. I'm not particularly worried about the security of the Python version. We're not doing anything obviously exploitable. — Mads Ynddal
Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
Cc'ing a few more Rust integration reviewers :) On 27/5/24 10:14, Zhao Liu wrote: Hi maintainers and list, This RFC series attempts to re-implement simpletrace.py with Rust, which is the 1st task of Paolo's GSoC 2024 proposal. There are two motivations for this work: 1. This is an open chance to discuss how to integrate Rust into QEMU. 2. Rust delivers faster parsing. Introduction Code framework -- I choose "cargo" to organize the code, because the current implementation depends on external crates (Rust's library), such as "backtrace" for getting frameinfo, "clap" for parsing the cli, "rex" for regular matching, and so on. (Meson's support for external crates is still incomplete. [2]) The simpletrace-rust created in this series is not yet integrated into the QEMU compilation chain, so it can only be compiled independently, e.g. under ./scripts/simpletrace/, compile it be: cargo build --release The code tree for the entire simpletrace-rust is as follows: $ script/simpletrace-rust . . ├── Cargo.toml └── src └── main.rs // The simpletrace logic (similar to simpletrace.py). └── trace.rs // The Argument and Event abstraction (refer to // tracetool/__init__.py). My question about meson v.s. cargo, I put it at the end of the cover letter (the section "Opens on Rust Support"). The following two sections are lessons I've learned from this Rust practice. Performance --- I did the performance comparison using the rust-simpletrace prototype with the python one: * On the i7-10700 CPU @ 2.90GHz machine, parsing and outputting a 35M trace binary file for 10 times on each item: AVE (ms) Rust v.s. Python Rust (stdout) 12687.16114.46% Python (stdout) 14521.85 Rust (file) 1422.44264.99% Python (file) 3769.37 - The "stdout" lines represent output to the screen. - The "file" lines represent output to a file (via "> file"). This Rust version contains some optimizations (including print, regular matching, etc.), but there should be plenty of room for optimization. The current performance bottleneck is the reading binary trace file, since I am parsing headers and event payloads one after the other, so that the IO read overhead accounts for 33%, which can be further optimized in the future. Security This is an example. Rust is very strict about type-checking, and it found timestamp reversal issue in simpletrace-rust [3] (sorry, haven't gotten around to digging deeper with more time)...in this RFC, I workingaround it by allowing negative values. And the python version, just silently covered this issue up. Opens on Rust Support = Meson v.s. Cargo The first question is whether all Rust code (including under scripts) must be integrated into meson? If so, because of [2] then I have to discard the external crates and build some more Rust wheels of my own to replace the previous external crates. For the main part of the QEMU code, I think the answer must be Yes, but for the tools in the scripts directory, would it be possible to allow the use of cargo to build small tools/program for flexibility and migrate to meson later (as meson's support for rust becomes more mature)? External crates --- This is an additional question that naturally follows from the above question, do we have requirements for Rust's external crate? Is only std allowed? Welcome your feedback! [1]: https://wiki.qemu.org/Google_Summer_of_Code_2024 [2]: https://github.com/mesonbuild/meson/issues/2173 [3]: https://lore.kernel.org/qemu-devel/20240509134712.ga515...@fedora.redhat.com/ Thanks and Best Regards, Zhao --- Zhao Liu (6): scripts/simpletrace-rust: Add the basic cargo framework scripts/simpletrace-rust: Support Event & Arguments in trace module scripts/simpletrace-rust: Add helpers to parse trace file scripts/simpletrace-rust: Parse and check trace recode file scripts/simpletrace-rust: Format simple trace output docs/tracing: Add simpletrace-rust section docs/devel/tracing.rst | 35 ++ scripts/simpletrace-rust/.gitignore| 1 + scripts/simpletrace-rust/.rustfmt.toml | 9 + scripts/simpletrace-rust/Cargo.lock| 370 +++ scripts/simpletrace-rust/Cargo.toml| 17 + scripts/simpletrace-rust/src/main.rs | 633 + scripts/simpletrace-rust/src/trace.rs | 339 + 7 files changed, 1404 insertions(+) create mode 100644 scripts/simpletrace-rust/.gitignore create mode 100644 scripts/simpletrace-rust/.rustfmt.toml create mode 100644 scripts/simpletrace-rust/Cargo.lock create mode 100644 scripts/simpletrace-rust/Cargo.toml create mode 100644 scripts/simpletrace-rust/src/main.rs create mode 100644 scripts/simpletrace-rust/src/trace.rs