We realized in our discussion that the original proposal described in my email will not work, because "relative_path" ultimately describes the path of the published *artifacts* (not content), and for content types with multiple artifacts, storing this information in a field on RepositoryContent would not be possible.
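To make the multi-artifact point concrete, here is a minimal sketch in plain Python. The classes are simplified stand-ins, not pulpcore's actual models, and the package name and all paths are made up for illustration. It only shows that a content unit with several artifacts needs several paths, which a single relative_path value per (repository, content) row cannot hold.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContentArtifact:
    # Simplified stand-in: one published file belonging to a content unit.
    relative_path: str


@dataclass
class Content:
    # Simplified stand-in for a content unit and its artifacts.
    name: str
    artifacts: List[ContentArtifact] = field(default_factory=list)


def paths_needed(content: Content) -> List[str]:
    """Return every path a publication must serve for this content unit."""
    return [ca.relative_path for ca in content.artifacts]


# A single-artifact unit (e.g. one RPM package): one path is enough.
package = Content(
    "foo-1.0-1.noarch",
    [ContentArtifact("Packages/f/foo-1.0-1.noarch.rpm")],
)

# A hypothetical multi-artifact unit: each artifact needs its own path.
multi = Content(
    "example-multi",
    [ContentArtifact("example/manifest.json"), ContentArtifact("example/blob.bin")],
)

# True where a single relative_path value would be enough for the unit.
single_field_fits = {c.name: len(paths_needed(c)) <= 1 for c in (package, multi)}
print(single_field_fits)
```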
On Mon, Apr 27, 2020 at 6:08 PM Daniel Alley <dal...@redhat.com> wrote:

> There is a video call scheduled to discuss this issue tomorrow (Tuesday
> April 28th) at 13:30 UTC (please convert to your local time).
> https://meet.google.com/scy-csbx-qiu
>
> On Sat, Apr 25, 2020 at 7:02 AM David Davis <davidda...@redhat.com> wrote:
>
>> I had a chance to think about this some more yesterday and wanted to
>> email out my thoughts. I also think that this change sounds scary and will
>> have a big impact on plugin writers so I thought of a couple alternatives:
>>
>> First, we could add a relative_path field to RepositoryContent instead of
>> moving it there. This would be an optional field. It would be up to plugins
>> to manage this field and they would still need to populate the
>> relative_path field on ContentArtifact. But plugins could use this optional
>> field to store relative paths per repository and then use this field when
>> generating publications.
>>
>> The second alternative is one that is already laid out in the original
>> email but to call it out again: it would be to not solve this in pulpcore.
>> RPM would create its own object that would map content in a repository to
>> relative_paths.
>>
>> David
>>
>>
>> On Tue, Apr 21, 2020 at 9:22 AM Quirin Pamp <p...@atix.de> wrote:
>>
>>> Hi,
>>>
>>>
>>> I am not currently very well versed in the classes involved, but moving
>>> relative_path around sounds slightly scary with the potential to break
>>> things.
>>>
>>>
>>> As such, I would be interested to be kept in the loop as this moves
>>> forward.
>>> (An update on the mailing list once there is some movement is entirely
>>> sufficient 😉)
>>>
>>>
>>> Thanks,
>>>
>>> Quirin Pamp
>>> ------------------------------
>>> *From:* pulp-dev-boun...@redhat.com <pulp-dev-boun...@redhat.com> on
>>> behalf of Ina Panova <ipan...@redhat.com>
>>> *Sent:* 21 April 2020 14:07:13
>>> *To:* Daniel Alley <dal...@redhat.com>
>>> *Cc:* Pulp-dev <pulp-dev@redhat.com>
>>> *Subject:* Re: [Pulp-dev] the "relative path" problem
>>>
>>> Daniel,
>>>
>>> how about setting up a meeting and brainstorming the alternatives,
>>> pros/cons there?
>>>
>>>
>>> --------
>>> Regards,
>>>
>>> Ina Panova
>>> Senior Software Engineer| Pulp| Red Hat Inc.
>>>
>>> "Do not go where the path may lead,
>>> go instead where there is no path and leave a trail."
>>>
>>>
>>> On Fri, Apr 17, 2020 at 5:57 PM Daniel Alley <dal...@redhat.com> wrote:
>>>
>>> Bump, this item needs to move forwards soon. Does anyone have any
>>> thoughts?
>>>
>>> On Wed, Apr 1, 2020 at 9:40 AM Pavel Picka <ppi...@redhat.com> wrote:
>>>
>>> Hi,
>>> I'd like to add one more question to this topic. Do you think it is a
>>> blocker for PRs [0] & [1]? While testing [2] these features I haven't run
>>> into a real-world example where two packages with exactly the same name
>>> appear.
>>> I think this is a 'must have' feature, but until we solve/decide it we
>>> can have the two features working, maybe with a warning in the docs for
>>> users that this can happen in some 'special' repositories.
>>>
>>> To address the topic directly: I like the proposed move to
>>> 'RepositoryContent' and adding it to its uniqueness constraint (if I
>>> understand correctly).
>>>
>>> [0] https://github.com/pulp/pulp_rpm/pull/1657
>>> [1] https://github.com/pulp/pulp_rpm/pull/1642
>>> [2] tested with centos 7, 8, opensuse and SLE repositories
>>>
>>> On Wed, Apr 1, 2020 at 3:22 PM Daniel Alley <dal...@redhat.com> wrote:
>>>
>>> We'd like to start a discussion on the "relative path problem"
>>> identified recently.
>>> Problem:
>>>
>>> Currently, a relative_path is tied to content in Pulp. This means that
>>> if a content unit exists in two places within a repository or across
>>> repositories, it has to be stored as two separate content units. This
>>> creates redundant data and potential confusion for users.
>>>
>>> As a specific example, we need to support mirroring content in pulp_rpm
>>> <https://pulp.plan.io/issues/6353>. Currently, for each location at
>>> which a single package is stored, we’ll need to create a content unit. We
>>> could end up with several records representing a single package. Users may
>>> be confused about why they see multiple records for a package and they may
>>> have trouble, for example, deciding which content unit to copy.
>>>
>>> Proposed Solution:
>>>
>>> Move “relative_path” from its current location on ContentArtifact, to
>>> RepositoryContent. This will require a sizable data migration. It is
>>> possible that, in rare cases, repository versions may change
>>> slightly due to deduplication.
>>>
>>> A repository-version-wide uniqueness constraint will be present on
>>> “relative_path”, independently of any other repository uniqueness
>>> constraints (repo_key_fields) defined by the plugin writer.
>>>
>>> Modify the Stages API so that the relative_path can be processed in the
>>> correct location – instead of “DeclarativeArtifact” it will likely need to
>>> go on “DeclarativeContent”
>>>
>>> Remove “location_href” from the RPM Package content model – it was never
>>> a true part of the RPM (file) metadata, it is derived from the repository
>>> metadata. So storing it as a part of the Content unit doesn’t entirely make
>>> sense.
>>>
>>> Alternatives
>>>
>>> In most cases, a content unit will have a single relative path.
>>> Creating a general solution to solve a one-off problem is
>>> usually not a good idea. As an alternative, we could look at another
>>> solution for mirroring content.
>>> One example might be to create a new object
>>> (e.g. RpmRepoMirrorContentMapping) that maps content to specific paths
>>> within a repo or repo version.
>>>
>>> Questions
>>>
>>> - How do we handle this in pulp_file? How are content units
>>>   identified in pulp_file without relative_path?
>>>    - Checksum?
>>> - How was this problem handled in Pulp 2?
>>>
>>>
>>> Please weigh in if you have any input on potential problems with the
>>> proposal, potential alternate solutions, or other insights or questions!
>>> _______________________________________________
>>> Pulp-dev mailing list
>>> Pulp-dev@redhat.com
>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>
>>>
>>>
>>> --
>>> Pavel Picka
>>> Red Hat
>>>
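For reference, the plugin-side alternative from the quoted thread (an object such as RpmRepoMirrorContentMapping) could be sketched roughly as follows. This is an in-memory illustration only, not pulp_rpm code: the class names, fields, and paths are hypothetical, and the dictionary stands in for a database table with a uniqueness constraint on (repository version, relative_path).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MirrorContentMapping:
    # Hypothetical record: where one content unit lives in one repo version.
    repo_version: int
    content_id: str
    relative_path: str


class MappingTable:
    """In-memory stand-in for a table unique on (repo_version, relative_path)."""

    def __init__(self) -> None:
        self._by_path: Dict[Tuple[int, str], MirrorContentMapping] = {}

    def add(self, mapping: MirrorContentMapping) -> None:
        key = (mapping.repo_version, mapping.relative_path)
        existing = self._by_path.get(key)
        # Two different content units may not claim the same path in one version.
        if existing is not None and existing.content_id != mapping.content_id:
            raise ValueError(
                f"path {mapping.relative_path!r} already used "
                f"in repository version {mapping.repo_version}"
            )
        self._by_path[key] = mapping

    def paths_for(self, repo_version: int, content_id: str) -> List[str]:
        """All paths at which one content unit is published in a version."""
        return sorted(
            m.relative_path
            for m in self._by_path.values()
            if m.repo_version == repo_version and m.content_id == content_id
        )


# One content unit mirrored at two locations, stored as a single unit.
table = MappingTable()
table.add(MirrorContentMapping(1, "pkg-a", "Packages/a/pkg-a.rpm"))
table.add(MirrorContentMapping(1, "pkg-a", "mirror/pkg-a.rpm"))
print(table.paths_for(1, "pkg-a"))
```

With this shape the package stays a single content unit, and only the mapping rows multiply per location.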
_______________________________________________
Pulp-dev mailing list
Pulp-dev@redhat.com
https://www.redhat.com/mailman/listinfo/pulp-dev