In the coming weeks we will need to settle on a strategy for the Pulp 3
advanced copy APIs for the RPM plugin. This is one of the most complicated
plugins, if not the most complicated, so there are a lot of factors to
consider. We'd like to invite the community to join the discussion: tell us
what patterns and workflows you would like to have, help elaborate on the
pros and cons of each approach, and possibly suggest new approaches we
haven't thought of.
Here are the basic use cases that we have come up with, and some points of
interest/concern that we noted during the RPM meeting today:
Use cases?
- As a user, I can copy all content from one repository to another
  repository
- As a user, I can copy content matching certain "search criteria" from
  one repository to another repository
  - Specifying search criteria when copying multiple content types is a
    difficult problem -- what content types do the criteria correspond to?
    - Possible solutions:
      - Allow criteria to be specified as a JSON data structure so that
        it can be kept organized by type
      - Only allow copying one content type at a time -- but couple this
        with a feature to allow "incomplete" repository versions to be
        built up over the course of many different operations
        - This was suggested as a possible necessary use case, but one
          that we need more feedback on
- As a user, when copying content that directly references other content,
  the referenced content is *always* copied (if present)
  - e.g. Modules referencing Modules, Modules referencing RPMs, Erratum
    referencing RPMs, and {{other types}}
  - Can't think of a reason to allow bypassing this -- the manual UX if
    you wanted to bypass this would be excruciatingly painful and require
    looking at the details of the individual content units
  - Modules declare debug packages as artifacts, but debug packages are
    not always present in the same repo, so we can only copy referenced
    units "if present"
- As a user, I can optionally choose to copy all indirect dependencies of
  content that is being copied (recursive copy)
  - Should this be the default?
    - Arguments for both 'yes' and 'no'
    - Let’s see what perf looks like in real-world scenarios
- Some content types create a new content unit in the destination repo
  instead of just copying
  - e.g. yum_repo_metadata_file
- Multi-repo copy needed for modularity
  - Multiple source repositories are used for depsolving, search criteria
    only apply to the primary source repo, and multiple target
    repositories correspond with matching source repos
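
To make the "always copy direct references, if present" and optional
recursive copy use cases more concrete, here is a minimal Python sketch.
Everything in it is hypothetical: the function name `expand_selection`, the
string unit ids, and the plain dicts standing in for reference and dependency
data are assumptions for illustration only. A real implementation would
presumably query the plugin's content models and rely on a depsolver instead
of a prebuilt dependency dict.

```python
def expand_selection(selected, source_units, direct_refs, indirect_deps=None):
    """Return the set of unit ids that a copy would include.

    selected      -- ids matched by the user's search criteria
    source_units  -- set of ids present in the source repository
    direct_refs   -- dict: id -> ids it directly references (hypothetical data)
    indirect_deps -- dict: id -> ids it depends on; None disables recursive copy
    """
    to_copy = set()
    queue = [unit for unit in selected if unit in source_units]
    while queue:
        unit = queue.pop()
        if unit in to_copy:
            continue
        to_copy.add(unit)
        # Direct references are always followed.
        related = set(direct_refs.get(unit, ()))
        # Indirect dependencies are followed only when recursive copy is on.
        if indirect_deps is not None:
            related |= set(indirect_deps.get(unit, ()))
        # "If present": references missing from the source repo are skipped.
        queue.extend(r for r in related if r in source_units and r not in to_copy)
    return to_copy
```

In this sketch a module that lists a debug package as an artifact still
copies cleanly when that debug package is absent from the source repo,
because missing references are dropped rather than treated as errors.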
What should the API look like?
- If we want to support all of the use cases with one API endpoint and one
  task, then we might need to use ggainey’s proposal for a complex filter
  provided by a JSON blob (proposal #3 on the issue)
  - {
      "package": ["name=firefox, arch=x86_64, version>=70",
                  "name=chrome, version==72.0.1"],
      "modulemd": ["name=ripgrep, stream=master"]
    }
  - The Python plugin does something like this, but the criteria matching
    library is provided for us by the Python packaging utils. We would
    have to implement this ourselves for RPM
  - Brian proposed a modification of this idea where the search criteria
    can be saved and re-used between tasks rather than provided each and
    every time
  - Ina’s concern: How would we process these queries?
    - Would we perform a search on each content type separately and
      return a result only if both of the queries give back a result?
      - A repo may have package foo but not modulemd ripgrep
    - Do we fail if any are missing, or succeed regardless of matching
      content?
- If we had a feature where the user can progressively build up a complete
  repository version by modifying an incomplete repository version, then a
  single very complex search criteria layout is unnecessary. You could run
  several copy tasks, one for each content type you want to copy, with
  criteria corresponding only to that type.
  - BUT, with recursive copy, that might lead to a lot of overhead since
    it has to be set up for each individual task
    - BUT, there are ways to mitigate that overhead, although doing so
      would be very challenging
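
As a rough illustration of what "implementing criteria matching ourselves"
could involve, here is a small Python sketch that parses criteria strings in
the style of the JSON blob above and evaluates them per content type. All
names (`parse_clause`, `unit_matches`, `select`) are hypothetical, units are
plain dicts, and the semantics are assumptions: terms within one string are
ANDed, and multiple strings for a type are ORed. Version comparison here is a
naive dotted-integer comparison, not RPM's actual version ordering. The
function returns the clauses that matched nothing, so the "fail or succeed?"
question is left to the caller.

```python
import operator
import re

OPS = {"=": operator.eq, "==": operator.eq, ">=": operator.ge,
       "<=": operator.le, ">": operator.gt, "<": operator.lt}
# Two-character operators are listed first so ">=" is not read as ">".
TERM = re.compile(r"^\s*(\w+)\s*(==|>=|<=|=|>|<)\s*(\S+)\s*$")


def parse_clause(clause):
    """Split 'name=firefox, version>=70' into (field, op, value) tuples."""
    terms = []
    for raw in clause.split(","):
        match = TERM.match(raw)
        if match is None:
            raise ValueError(f"cannot parse criterion: {raw!r}")
        terms.append(match.groups())
    return terms


def _comparable(have, want):
    # Naive assumption: compare as integer tuples when both look like
    # dotted numbers, otherwise fall back to plain string comparison.
    parts_have, parts_want = have.split("."), want.split(".")
    if all(p.isdigit() for p in parts_have + parts_want):
        return tuple(map(int, parts_have)), tuple(map(int, parts_want))
    return have, want


def unit_matches(unit, terms):
    """True if the unit satisfies every term of one clause (AND)."""
    for field, op, wanted in terms:
        if field not in unit:
            return False
        have, want = _comparable(str(unit[field]), wanted)
        if not OPS[op](have, want):
            return False
    return True


def select(criteria, repo_content):
    """Return (matched units by type, list of (type, clause) with no hits)."""
    matched, unmatched = {}, []
    for content_type, clauses in criteria.items():
        units = repo_content.get(content_type, [])
        matched[content_type] = []
        for clause in clauses:
            terms = parse_clause(clause)
            hits = [u for u in units if unit_matches(u, terms)]
            if not hits:
                unmatched.append((content_type, clause))
            matched[content_type].extend(
                h for h in hits if h not in matched[content_type])
    return matched, unmatched
```

A caller that wants strict behavior could raise an error whenever the second
return value is non-empty, while a lenient caller could simply copy whatever
matched. The same `select` function would also work for the one-type-per-task
approach by passing a criteria dict with a single key.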
Hopefully it is clear from this summary that the topic is complicated and
that it could be accomplished in several very different ways. We would love
your feedback!
_______________________________________________
Pulp-dev mailing list
https://www.redhat.com/mailman/listinfo/pulp-dev