Hi Stefan,

On Mon, May 27, 2024 at 03:59:44PM -0400, Stefan Hajnoczi wrote:
> Date: Mon, 27 May 2024 15:59:44 -0400
> From: Stefan Hajnoczi <stefa...@redhat.com>
> Subject: Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
> 
> On Mon, May 27, 2024 at 04:14:15PM +0800, Zhao Liu wrote:
> > Hi maintainers and list,
> > 
> > This RFC series attempts to re-implement simpletrace.py with Rust, which
> > is the 1st task of Paolo's GSoC 2024 proposal.
> > 
> > There are two motivations for this work:
> > 1. This is an open chance to discuss how to integrate Rust into QEMU.
> > 2. Rust delivers faster parsing.
> > 
> > 
> > Introduction
> > ============
> > 
> > Code framework
> > --------------
> > 
> > I choose "cargo" to organize the code, because the current
> > implementation depends on external crates (Rust's library), such as
> > "backtrace" for getting frameinfo, "clap" for parsing the cli, "rex" for
> > regular matching, and so on. (Meson's support for external crates is
> > still incomplete. [2])
> > 
> > The simpletrace-rust created in this series is not yet integrated into
> > the QEMU compilation chain, so it can only be compiled independently, e.g.
> > under ./scripts/simpletrace/, compile it be:
> > 
> >     cargo build --release
> 
> Please make sure it's built by .gitlab-ci.d/ so that the continuous
> integration system prevents bitrot. You can add a job that runs the
> cargo build.

Thanks! I'll do this.

> > 
> > The code tree for the entire simpletrace-rust is as follows:
> > 
> > $ script/simpletrace-rust .
> > .
> > ├── Cargo.toml
> > └── src
> >     └── main.rs   // The simpletrace logic (similar to simpletrace.py).
> >     └── trace.rs  // The Argument and Event abstraction (refer to
> >                   // tracetool/__init__.py).
> > 
> > My question about meson v.s. cargo, I put it at the end of the cover
> > letter (the section "Opens on Rust Support").
> > 
> > The following two sections are lessons I've learned from this Rust
> > practice.
> > 
> > 
> > Performance
> > -----------
> > 
> > I did the performance comparison using the rust-simpletrace prototype with
> > the python one:
> > 
> > * On the i7-10700 CPU @ 2.90GHz machine, parsing and outputting a 35M
> > trace binary file for 10 times on each item:
> > 
> >                       AVE (ms)       Rust v.s. Python
> > Rust   (stdout)       12687.16            114.46%
> > Python (stdout)       14521.85
> > 
> > Rust   (file)          1422.44            264.99%
> > Python (file)          3769.37
> > 
> > - The "stdout" lines represent output to the screen.
> > - The "file" lines represent output to a file (via "> file").
> > 
> > This Rust version contains some optimizations (including print, regular
> > matching, etc.), but there should be plenty of room for optimization.
> > 
> > The current performance bottleneck is the reading binary trace file,
> > since I am parsing headers and event payloads one after the other, so
> > that the IO read overhead accounts for 33%, which can be further
> > optimized in the future.
> 
> Performance will become more important when large amounts of TCG data is
> captured, as described in the project idea:
> https://wiki.qemu.org/Internships/ProjectIdeas/TCGBinaryTracing
> 
> While I can't think of a time in the past where simpletrace.py's
> performance bothered me, improving performance is still welcome. Just
> don't spend too much time on performance (and making the code more
> complex) unless there is a real need.

Yes, I agree that it shouldn't be over-optimized.

The logic in the current Rust version is pretty much a carbon copy of
the Python version, without additional complex logic introduced, but the
improvements in x2.6 were obtained by optimizing IO:

* reading the event configuration file, where I called the buffered
  interface,
* and the output formatted trace log, which I output all via std_out.lock()
  followed by write_all().

So, just the simple tweak of the interfaces brings much benefits. :-)

> > Security
> > --------
> > 
> > This is an example.
> > 
> > Rust is very strict about type-checking, and it found timestamp reversal
> > issue in simpletrace-rust [3] (sorry, haven't gotten around to digging
> > deeper with more time)...in this RFC, I workingaround it by allowing
> > negative values. And the python version, just silently covered this
> > issue up.
> >
> > Opens on Rust Support
> > =====================
> > 
> > Meson v.s. Cargo
> > ----------------
> > 
> > The first question is whether all Rust code (including under scripts)
> > must be integrated into meson?
> > 
> > If so, because of [2] then I have to discard the external crates and
> > build some more Rust wheels of my own to replace the previous external
> > crates.
> > 
> > For the main part of the QEMU code, I think the answer must be Yes, but
> > for the tools in the scripts directory, would it be possible to allow
> > the use of cargo to build small tools/program for flexibility and
> > migrate to meson later (as meson's support for rust becomes more
> > mature)?
> 
> I have not seen a satisfying way to natively build Rust code using
> meson. I remember reading about a tool that converts Cargo.toml files to
> meson wrap files or something similar. That still doesn't feel great
> because upstream works with Cargo and duplicating build information in
> meson is a drag.
> 
> Calling cargo from meson is not ideal either, but it works and avoids
> duplicating build information. This is the approach I would use for now
> unless someone can point to an example of native Rust support in meson
> that is clean.
> 
> Here is how libblkio calls cargo from meson:
> https://gitlab.com/libblkio/libblkio/-/blob/main/src/meson.build
> https://gitlab.com/libblkio/libblkio/-/blob/main/src/cargo-build.sh

Many thanks! This is a good example and I'll try to build similarly to
it.

> > 
> > 
> > External crates
> > ---------------
> > 
> > This is an additional question that naturally follows from the above
> > question, do we have requirements for Rust's external crate? Is only std
> > allowed?
> 
> There is no policy. My suggestion:
> 
> If you need a third-party crate then it's okay to use it, but try to
> minimize dependencies. Avoid adding dependening for niceties that are
> not strictly needed. Third-party crates are a burden for package
> maintainers and anyone building from source. They increase the risk that
> the code will fail to build. They can also be a security risk.

Thanks for the suggestion, that's clear to me, I'll try to control the
third party dependencies.

> > 
> > Welcome your feedback!
> 
> It would be easier to give feedback if you implement some examples of
> TCG binary tracing before rewriting simpletrace.py. It's unclear to me
> why simpletrace needs to be rewritten at this point. If you are
> extending the simpletrace file format in ways that are not suitable for
> Python or can demonstrate that Python performance is a problem, then
> focussing on a Rust simpletrace implementation is more convincing.
> 
> Could you use simpletrace.py to develop TCG binary tracing first?

Yes, I can. :-)

Rewriting in Rust does sound duplicative, but I wonder if this could be
viewed as an open opportunity to add Rust support for QEMU, looking at
the scripts directory to start exploring Rust support/build would be
a relatively easy place to start.

I think the exploration of Rust's build of the simpletrace tool under
scripts parts can help with subsequent work on supporting Rust in other
QEMU core parts.

>From this point, may I ask your opinion?

Maybe later, Rust-simpletrace and python-simpletrace can differ, e.g.
the former goes for performance and the latter for scalability.

Thanks,
Zhao



Reply via email to