Re: [Pulp-dev] RPM plugin Copy API discussion

Ina Panova Fri, 10 Jan 2020 04:36:50 -0800

Bumping this thread in case it got lost before the holidays.

Any feedback is better than no feedback; it will help us drive the
feature implementation.


Thank you!
--------
Regards,

Ina Panova
Senior Software Engineer| Pulp| Red Hat Inc.

"Do not go where the path may lead,
 go instead where there is no path and leave a trail."


On Thu, Dec 12, 2019 at 8:52 PM Daniel Alley <[email protected]> wrote:

> In the coming weeks we will need to settle on a strategy for the Pulp 3
> advanced copy APIs for the RPM plugin. This is one of the most complicated
> plugins, if not the most complicated, so there are a lot of factors to
> consider. We'd like
> to invite the community to participate in the discussion and get an idea of
> what patterns and workflows you would like to have, and help elaborate on
> the pros and cons of each approach, and possibly suggest new approaches we
> haven't thought of.
>
> Here are the basic use cases that we have come up with, and some points of
> interest/concern that we noted during the RPM meeting today:
>
> Use cases?
>
>    -
>
>    As a user, I can copy all content from one repository to another
>    repository
>    -
>
>    As a user, I can copy content matching certain "search criteria" from
>    one repository to another repository
>    -
>
>       Search criteria for copying multiple content types at once are a
>       difficult problem -- which content types do the criteria correspond to?
>       - Possible solutions:
>          - Allow criteria to be specified as a JSON data structure so
>          that it can be kept organized by type
>          - Only allow copying one content type at a time -- but couple
>          this with a feature to allow "incomplete" repository versions to be
>          built up over the course of many different operations
>             - This was suggested as a possible necessary use case, but
>             one that we need more feedback on
>             -
>
>    As a user, when copying content that directly references other
>    content, the referenced content is *always* copied (if present)
>    -
>
>       e.g. Modules referencing Modules, Modules referencing RPMs, Erratum
>       referencing RPMs, and {{other types}}
>       - Can't think of a reason to allow bypassing this -- the manual UX
>       if you wanted to bypass this would be excruciatingly painful and require
>       looking at the details of the individual content units
>       - Modules declare debug packages as artifacts, but debug packages are
>       not always present in the same repo, so we can only copy referenced
>       units "if present"
>
>
>    -
>
>    As a user, I can optionally choose to copy all indirect dependencies
>    of content that is being copied (recursive copy)
>    -
>
>       Should this be the default?
>       -
>
>       Arguments for both 'yes' and 'no'
>       -
>
>       Let’s see what perf looks like in real-world scenarios
>       -
>
>    Some content types create a new content unit in the destination repo
>    instead of just copying
>    -
>
>       e.g. yum_repo_metadata_file
>       -
>
>    Multi-repo copy needed for modularity
>    - Multiple source repositories used for depsolving, search criteria
>       only applies to the primary source repo, multiple target repositories
>       correspond with matching source repos
>
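
To make the "always copy direct references, optionally copy indirect
dependencies" use cases above more concrete, here is a minimal Python sketch.
Everything in it is hypothetical (the Unit class, copy_units, and the
references/requires fields are made up for illustration and are not part of
Pulp or the RPM plugin), and a real recursive copy would need an actual
depsolver rather than this simple graph walk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """Hypothetical content unit; not a Pulp model."""
    type: str
    name: str
    references: tuple = ()  # names of directly referenced units
    requires: tuple = ()    # names of indirect dependencies (assumed pre-resolved)


def copy_units(selected, source, recursive=False):
    """Return the names of units to copy from `source` (dict: name -> Unit).

    Direct references are always followed. Indirect dependencies are followed
    only when recursive=True. Names missing from `source` are skipped, which
    models the "if present" rule.
    """
    to_copy = set()
    stack = list(selected)
    while stack:
        name = stack.pop()
        if name in to_copy or name not in source:
            continue
        to_copy.add(name)
        unit = source[name]
        stack.extend(unit.references)
        if recursive:
            stack.extend(unit.requires)
    return to_copy


# The module references a debuginfo package that is absent from this repo.
source = {
    "ripgrep:master": Unit("modulemd", "ripgrep:master",
                           references=("ripgrep", "ripgrep-debuginfo")),
    "ripgrep": Unit("package", "ripgrep", requires=("glibc",)),
    "glibc": Unit("package", "glibc"),
}

default_copy = copy_units(["ripgrep:master"], source)
recursive_copy = copy_units(["ripgrep:master"], source, recursive=True)
```

In this sketch the default copy picks up the module and its ripgrep package but
silently skips the missing debuginfo package, while the recursive copy also
pulls in glibc.
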
>
> What should the API look like?
>
>    -
>
>    If we want to support all of the use cases with one API endpoint and one
>    task, then we might need to use ggainey’s proposal for a complex filter
>    provided by a JSON blob (proposal #3 on the issue)
>    - {
>           "package": ["name=firefox, arch=x86_64, version>=70",
>                       "name=chrome, version==72.0.1"],
>           "modulemd": ["name=ripgrep, stream=master"]
>       }
>       -
>
>       The Python plugin does something like this, but the criteria
>       matching library is provided for us by the Python packaging utils. We
>       would have to implement this ourselves for RPM
>       -
>
>       Brian proposed a modification of this idea where the search
>       criteria can be saved and re-used between tasks rather than provided
>       each and every time
>       -
>
>       Ina’s concern: How would we process these queries?
>       -
>
>          Would we perform the search on each content type separately and
>          return a result only if both queries give back a result?
>          -
>
>          It can happen that a repo has package foo but not modulemd
>          ripgrep
>          - Do we fail if any are missing, or succeed regardless of
>          matching content?
>       -
>
>    If we had a feature where the user can progressively build up a
>    complete repository version by modifying an incomplete repository version,
>    then one single very complex search criteria layout is unnecessary. You
>    could run several copy tasks, one for each content type you want to copy,
>    with criteria corresponding only to that type.
>    -
>
>       BUT, with recursive copy, that might lead to a lot of overhead
>       since it has to be set up for each individual task
>       -
>
>          BUT, there are ways to mitigate that overhead, although doing so
>          would be very challenging
>
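
As a rough illustration of the per-type criteria idea discussed above, the
sketch below parses strings such as "name=firefox, arch=x86_64, version>=70"
and runs each content type's query separately. It is only a sketch under
stated assumptions: the function names, the `strict` flag (one possible answer
to the "fail if any are missing?" question), and the dict-based units are all
hypothetical, and versions are compared as naive tuples of integers, which is
NOT how RPM really compares versions.

```python
import operator
import re

OPS = {"==": operator.eq, "=": operator.eq, ">=": operator.ge,
       "<=": operator.le, ">": operator.gt, "<": operator.lt}
# Two-character operators are listed first so ">=" is not read as ">".
CLAUSE = re.compile(r"^\s*(\w+)\s*(==|>=|<=|=|>|<)\s*(\S+)\s*$")


def parse(expr):
    """Split 'name=firefox, version>=70' into (field, op, value) triples."""
    clauses = []
    for part in expr.split(","):
        m = CLAUSE.match(part)
        if m is None:
            raise ValueError(f"cannot parse clause: {part!r}")
        clauses.append(m.groups())
    return clauses


def as_key(field, value):
    """Naive comparison key; purely numeric dotted versions only (assumption)."""
    if field == "version":
        return tuple(int(p) for p in value.split("."))
    return value


def matches(unit, clauses):
    """True if the unit (a dict) satisfies every clause."""
    return all(
        field in unit and OPS[op](as_key(field, unit[field]), as_key(field, value))
        for field, op, value in clauses
    )


def select(criteria, repo, strict=False):
    """Query each content type separately; repo maps type -> list of unit dicts.

    With strict=True, an expression that matches nothing raises LookupError;
    otherwise it simply contributes no units.
    """
    selected = {}
    for ctype, exprs in criteria.items():
        hits = []
        for expr in exprs:
            clauses = parse(expr)
            found = [u for u in repo.get(ctype, []) if matches(u, clauses)]
            if strict and not found:
                raise LookupError(f"no {ctype} matches {expr!r}")
            hits.extend(found)
        selected[ctype] = hits
    return selected


criteria = {
    "package": ["name=firefox, arch=x86_64, version>=70",
                "name=chrome, version==72.0.1"],
    "modulemd": ["name=ripgrep, stream=master"],
}
repo = {
    "package": [
        {"name": "firefox", "arch": "x86_64", "version": "70.0.1"},
        {"name": "firefox", "arch": "x86_64", "version": "68.0"},
        {"name": "chrome", "arch": "x86_64", "version": "72.0.1"},
    ],
    "modulemd": [],  # the repo has the packages but not modulemd ripgrep
}

lenient = select(criteria, repo)
```

With this sample data the lenient call returns the two matching packages and
an empty modulemd list, whereas `select(criteria, repo, strict=True)` raises
LookupError because nothing matches the ripgrep criteria.
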
>
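
The alternative described above, building up an "incomplete" repository
version over several single-type copy tasks, could look roughly like the
following from a user's point of view. This is a hypothetical sketch: the
DraftVersion class and its methods do not exist in Pulp, and units are
modeled as plain (content_type, name) tuples.

```python
class DraftVersion:
    """Hypothetical 'incomplete' repository version built up step by step."""

    def __init__(self, base=()):
        self.units = set(base)
        self.finalized = False

    def copy(self, source, content_type, predicate=lambda unit: True):
        """Add units of exactly one content type that satisfy `predicate`."""
        if self.finalized:
            raise RuntimeError("version is already finalized")
        self.units |= {u for u in source
                       if u[0] == content_type and predicate(u)}
        return self  # allows chaining several single-type copy steps

    def finalize(self):
        """Mark the version complete and return its content as a frozenset."""
        self.finalized = True
        return frozenset(self.units)


# Units are (content_type, name) tuples purely for illustration.
source_repo = [("package", "firefox"), ("package", "chrome"),
               ("modulemd", "ripgrep")]

draft = DraftVersion()
draft.copy(source_repo, "package", lambda u: u[1] == "firefox")  # task 1
draft.copy(source_repo, "modulemd")                              # task 2
final_version = draft.finalize()
```

Each copy step only needs criteria for its own content type, which is the
simplification the proposal is after; the sketch does not address the
per-task setup overhead that recursive copy would add.
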
> Hopefully it is clear from this summary that the topic is complicated and
> that it could be accomplished in several very different ways. We would love
> your feedback!
>
> _______________________________________________
> Pulp-dev mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/pulp-dev
>
