-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Tue, Jul 21, 2020 at 06:31:55PM +0000, WillyPillow wrote:
> On Monday, July 20, 2020 4:52 PM, Wojtek Porczyk 
> <w...@invisiblethingslab.com> wrote:
> > On Sun, Jul 19, 2020 at 06:20:06PM +0000, WillyPillow wrote:
> 
> > > On another note, I'm wonder which fields are needed in the output of the 
> > > "info"
> > > operation. Comparing my WIP code to DNF, I currently do not have the
> > > architecture [2], URL, licence, and description fields. This is due to
> > > `qubes.TemplateSearch` not currently returning those fields.
> > > For the packages in the official repos, those fields do not contain much
> > > information (in particular, the description field contains the same 
> > > information
> > > as the summary), though I'm not sure if they might be useful in the 
> > > future or
> > > for unofficial templates.
> > 
> 
> > I don't know, could you design that so that those can be added in the future
> > if need be? Those need to be understood and properly validated, because some
> > software might act upon that information. For example: Debian provides
> > a directory with licence texts, which allows for
> > /usr/share/common-licenses/$license, which smells path traversal.
> > Fedora's RPM guidelines is even worse, they support operators like "()",
> > "and", "or":
> > https://docs.fedoraproject.org/en-US/packaging-guidelines/LicensingGuidelines/
> 
> To include the information, all that is needed is to include it in the
> `queryformat` string of the qrexec call and read it in `qvm-template` (Aside
> from dealing with the newline issue [4] -- mentioned below.)
> 
> As for validation, on our end, as we (AFAICT) only need to display the entry 
> as
> is and don't need to parse it, we should be fine.
> 
> For other programs parsing the information, IMO it should be up to the program
> in question to sanitize its inputs.

That would be in ideal world, unfortunately many developers can't be bothered
to do it properly. That's why Qubes exists in the first place.

> This is because the fields contain, by
> design, arbitrary text, and my understanding of the linked article is that 
> it's
> merely a guideline for Fedora repos instead of RPMs in general.

Can't we make our own guidelines? Like explicit whitelist of known licences
and maybe "other"? Linux templates are mostly under GPL, unikernels can have
other licences, but there aren't many unikernels. Unless I'm missing
something, known-good values would be easy to enumerate and check.

Also, some checking probably applies also to name and description: they
shouldn't have fancy characters, because UI toolkits might support things
like pango markup or whatever.

> > > One tricky thing is that the description may contain newlines, while `dnf 
> > > repoquery` does not escape them at all [3]. This may mean that another 
> > > method
> > > of querying the repo needs to be considered if the description is 
> > > included. (Or
> > > use unconventional characters/strings as separators. In particular, I 
> > > can't
> > > pass NULL characters in the arguments to DNF for obvious reasons.)
> > 
> 
> > Yes, another qrexec call is OK, provided this won't be too slow, i.e., to
> > display a list of 15 templates available you won't refresh the repo cache
> > 15 times...
> 
> (Speaking of refreshing the repo cache, a `--refresh` option that forces this
> may need to be added, either as an option to the existing qrexec calls or as
> another call.)
> 
> Creating another qrexec call is an interesting idea, but I'm unsure if it's
> really feasible performance-wise. In particular, running `dnf info` (which 
> does
> not refresh the cache by default) on any package in `qubes-template-itl` takes
> almost a second on my machine.
> 
> What I meant by "another method" is actually an alternative to `dnf 
> repoquery`.
> The issue is that (AFAIK) DNF provides no methods other than `repoquery` to
> obtain a machine-readable form of the information (short of using the API from
> Python). However, it has the issue when dealing with newlines.
> 
> IMO, the easiest way of doing this in terms of code changes is to modify
> `repoquery` so that it allows expanding `\0` as null characters. Currently, it
> already does similar replacements with `\n` and `\t`, and adding `\0` should
> only be a one-line change.

> However, I imagine that it would not be ideal to maintain a patched version of
> a package in VMs. Even if the patch gets merged upstream, it'll probably still
> take a few months for it to land in the next Fedora release.
> 
> Other solutions I can think of right now are:
> 
> - Maintain a separate modified copy of `repoquery.py`. From a first glance, it
>   seems to be fairly self-contained, and the only internal thing it calls,
>   AFAICT, is for i18n, which we don't care about in this context.
> - Write a simplified version of `repoquery` using the DNF API. Probably not 
> too
>   hard but feels like reinventing the wheel.
>
> > > [3]: Speaking of which, I'm also unsure what would happen if newlines 
> > > appear
> > > in, say, the summary field. Maybe I can conduct some experiments about 
> > > this...
> > 
> 
> > Certainly.
> 
> Unfortunately, newlines in the summary field *does* break things. In
> particular, besides malformed repo XML files, a malformed package with 
> newlines
> in the summary can also introduce erroneous newlines in the output of
> `repoquery`.

> [4]: Technically, there may also be issues with the colon separator we
>      currently use, though it's in essence the same issue.

I wouldn't worry much about either newlines or additional colons generated on
the untrusted side of qrexec call. I'd worry about proper checking of the
untrusted input. Yes, I can make .rpm package with colon and newline in some
metadata fields. So what? The template manager on the trusted side should just
reject any malformed input from the untrusted side.

Now if it would be easy, you could also make some provisions to also reject
such malformed input inside the downloading vm, but it's not as critical as in
dom0. That's why I certainly wouldn't patch dnf, because effort to gains ratio
doesn't look promising, and I'd only fork the the repoquery if it would look
easy to maintain. That ':'.join(['%{something}']) formatstring looks good
enough for PoC.


> Also, initial implementations of `qvm-template {info,search}` along with some
> other changes have been added to [the PR to
> core-admin-client](https://github.com/QubesOS/qubes-core-admin-client/pull/145).

Acked, thanks.



- -- 
pozdrawiam / best regards
Wojtek Porczyk
Invisible Things Lab
 
 I do not fear computers,
 I fear lack of them.
    -- Isaac Asimov
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEaO0VFfpr0tEF6hYkv2vZMhA6I1EFAl8YGfwACgkQv2vZMhA6
I1HWQQ//UtvsL3ACQClT/0eeTFA9qqrYsmgnRF/yOtIPZOB3FbNMklrwK70arfbE
RGK1Iq8697mBtrBH9yishkTiVKN7FeetW8jkLEaA1hLgLYTc+TuL8EgrXoDT0CoZ
cEDcFYTs4zX7rvmIOM0ZGq5JOSOQcRWf6ZNuRr0+GnpYNbYN83s7Fi40freogizk
HFzVRsx4BsC9nKLyE3nAAIcTUt0L/GhZ3lgRhUWTZdYRCrYYSFuF5dzAl3Aacovz
3pVjfgdLU3e0JAbFUTI0MRENtd+XIVHm5MBpqlM1zp1fMZNtDeRGSppqOPWxKU6H
JbrO9AXeiMdzKjdZlB6QZ81gNVBSgsQU2AtYLkvHcx8BxH4axzeQZMiUc3SyXLn2
AsMo0l0IBlj+mjF/FVxBJ3YDQfDS7r5T3ile1NEOvMATjn2C5AlkGAgUM+kERE5U
uobNqb8bPkNkAIL+U7uAox+CXqDMvQ4+InJk7wRKKbj4dPVPSA8KCCirxhF0AtpH
ub8ZWihh6QhtBU8G8By8fMAz4VvEV7oi0rZOYaIGrNoOGutP3KiPB66M7OKA2QG6
zAMdQ2aKn/ODxFncqj4WA2djGEB7ewIYCL4m57CUp1O7tXBukKxtH+GWSckdkx8p
8J5bQfuRGr6WwLkvdQUTYlMDVO8bk7nQ/zZ1OGuwMDGQj9/izgM=
=GlEM
-----END PGP SIGNATURE-----

-- 
You received this message because you are subscribed to the Google Groups 
"qubes-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to qubes-devel+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/qubes-devel/20200722105036.GD2122%40invisiblethingslab.com.

Reply via email to