I have been working with a client engineer who is evaluating BuildStream
as their tool to integrate various binaries.

They are using import elements and some in-tree custom plugins
(including a fun source plugin that turns a binary's metadata into a
BuildStream project tree, which is junctioned to provide dependency
information for its components).
The main artifact they are producing from this is a manifest, usable
as an SBOM, listing all the various components that need to be flashed
onto test rig hardware.

I think this is going well, but there have been a couple of feature
requests, which I'll expand on below:

1. Is there a way to provide autocompletions and validation when modifying
   .bst files?
2. Is there a way to generate a manifest of all elements and sources
   reachable under any combination of project.conf options?

We have some ideas and can put the effort in to implement them,
but these are features that ought to live upstream,
so we'd like to know what options have already been considered,
both so we don't end up going down any dead ends
and so that any solution we do find can hopefully find a home upstream.

# 1. Validating and providing autocompletions for .bst files

The client engineer has been successfully using a vscode plugin that
uses json schema to validate and provide autocompletions for yaml on
other projects, and would like to be able to provide tooling support to
help other developers in their organisation who are unfamiliar with
projects built out of a tree of yaml documents, or who might have
trouble running buildstream to validate their work pre-push.
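
For reference, if the JSON Schema route works out, wiring it up in
vscode with the Red Hat YAML extension is just a settings entry
mapping the generated schema to a .bst file glob (the schema path and
glob here are hypothetical, matching whatever the script emits):

```
{
  "yaml.schemas": {
    "./schemas/buildstream-element.schema.json": ["elements/**/*.bst"]
  }
}
```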

I'm aware of conversations long ago about implementing a BuildStream
language server, but I'm not aware of any recent developments
and this JSON Schema approach might be an intermediate step
that can also be useful in CI pipelines to validate changes before
starting a build.

The general idea is that plugins can define a schema that can be
combined into a larger schema that can be used to validate element
files.

This would initially be done with a script in the repository,
since all of the interesting plugins are defined in the repository
too, so integrating with BuildStream's plugin loader machinery is not
immediately necessary; eventually though it would need to be
integrated into BuildStream to prevent the model getting out of sync.

The client has had success in using pydantic to declare the data
structures and use them both to validate loaded data at runtime
and generate the json schema to validate and autocomplete while
editing.

My current thinking is that plugins would define their state as a
pydantic model. The plugins use:

```
state = PluginState.model_validate(node.strip_node_info())
```

to validate their data, though it's unfortunate that this loses the
provenance reporting in error messages, and I've not had enough time
to dig deeply enough into pydantic to determine whether I can extend
the validation to include these reports.
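
I haven't solved the provenance problem, but pydantic does at least
report a structured `loc` path for every failure, and that path could
plausibly be walked back through the original nodes to look the
provenance up again. A small sketch of extracting those paths (the
model here is illustrative, not a real plugin's):

```
from typing import Literal

from pydantic import BaseModel, ValidationError


class LocalSource(BaseModel):
    kind: Literal["local"]
    path: str


def error_paths(errors):
    # Each pydantic error carries a "loc" tuple of keys and indices
    # into the input data; join it into a dotted path that could be
    # mapped back onto the node tree for a provenance lookup.
    return [".".join(str(part) for part in err["loc"]) for err in errors]


try:
    LocalSource.model_validate({"kind": "local"})
except ValidationError as exc:
    print(error_paths(exc.errors()))  # -> ['path']
```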

The script for generating the schema would look something like this:

```
import json
from typing import Annotated, List, Literal, Union

from pydantic import BaseModel, Field

import plugins.foo
import plugins.bar


class LocalSource(BaseModel):
    kind: Literal["local"]
    path: str


# Collect the built-in and plugin-provided source models.
sources = [
    LocalSource,
    plugins.foo.FooSource,
    plugins.bar.BarSource,
]

# Discriminating on "kind" keeps validation errors scoped to the
# schema of the source kind actually in use.
Source = Annotated[Union[tuple(sources)], Field(discriminator="kind")]


class Element(BaseModel):
    ...
    sources: List[Source]


print(json.dumps(Element.model_json_schema()))
```

I'm not sure how includes could be handled in this, since their
schema depends on where they are included from, and they could be
included from multiple contexts. This sounds like something only
possible to support with a full language server.

The fact that conditionals can be used pretty much anywhere makes the
schema somewhat complicated, but it should be possible to implement
using generic models:
https://docs.pydantic.dev/latest/concepts/models/#generic-models
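
Conditionals could perhaps be layered on with a generic mixin along
these lines. This is a rough sketch: in reality each override is a
*partial* form of the model it wraps, which would need a relaxed
variant of that model rather than the plain mapping used here:

```
from typing import Dict, Generic, List, Optional, TypeVar

from pydantic import BaseModel, Field

T = TypeVar("T")


class WithConditionals(BaseModel, Generic[T]):
    # BuildStream spells a conditional as a "(?)" key holding a list
    # of single-entry mappings from option expression to overrides.
    conditionals: Optional[List[Dict[str, T]]] = Field(
        default=None, alias="(?)"
    )


# Illustrative source model; overrides are simplified to plain dicts.
class LocalSource(WithConditionals[dict]):
    kind: str
    path: str


source = LocalSource.model_validate(
    {
        "kind": "local",
        "path": "files/common",
        "(?)": [{'arch == "arm"': {"path": "files/arm"}}],
    }
)
print(source.conditionals)
```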

Which brings me onto the second question.

# 2. Generating a manifest for all combinations

After the success of collecting all the integration information into
BuildStream and being able to build a machine-parsable manifest
that could be used to visualise the dependencies,
we were asked whether it is possible to generate a "generic" form of
this that includes every possible source and binary artifact's
dependencies, annotated with the option combinations that select each
node.

My gut feeling is that this would be very complicated,
far from "just" splitting the parser into a loading stage that gets
everything and a resolving stage that prunes the tree based on options.
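
If the loader-splitting approach turns out to be too deep a change,
there is at least a brute-force fallback: enumerate every combination
of declared options and invoke `bst show` once per combination with
the matching `--option` flags, then merge the per-combination
manifests using the option sets as annotations. The option names and
values here are hypothetical; in practice they would be read from
project.conf:

```
import itertools

# Hypothetical project options for illustration.
options = {
    "arch": ["x86_64", "aarch64"],
    "debug": ["true", "false"],
}


def option_combinations(options):
    # Yield one {name: value} dict per point in the cartesian product.
    names = sorted(options)
    for values in itertools.product(*(options[name] for name in names)):
        yield dict(zip(names, values))


for combo in option_combinations(options):
    # Each combination becomes a set of top-level bst flags, e.g.
    #   bst --option arch x86_64 --option debug true show ...
    flags = [arg for name, value in sorted(combo.items())
             for arg in ("--option", name, value)]
    print(flags)
```

This scales exponentially in the number of options, of course, which
is part of why it's a fallback rather than a design.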

I suspect it's possible for options and conditionals to affect which
plugins are loaded. That would mean that every possible plugin has to
be loaded and indexed by the set of options that enables it,
that each conditional branch has to be annotated with the set of
options that enables it (so it's possible to look up which plugin is
used in that context), and that all of the possible junction sources
have to be fetched.

I think if we encode BuildStream's data format in enough pydantic models
we could probably implement the subset of this that is necessary
when the set of plugins is fixed and no options cross junctions,
but this is probably also going to involve importing BuildStream
internal libraries or running the command with specific options to
fetch things, so it sounds like a challenge either way.

I don't really have questions about this beyond "should I abandon
hope immediately and seek an alternative" and "is this an X/Y problem
where I'm missing an obvious Y solution".

-- 
Richard Maw (he/him)
Codethink Ltd., 3rd Floor, Dale House, 35 Dale Street, MANCHESTER, M1 2HF.
https://www.codethink.co.uk/privacy.html
