Bump, this item needs to move forwards soon. Does anyone have any thoughts?
On Wed, Apr 1, 2020 at 9:40 AM Pavel Picka <ppi...@redhat.com> wrote:

> Hi,
> I'd like to add one more question to this topic. Do you think it is a
> blocker for PRs [0] & [1]? While testing [2] these features I haven't run
> into a real-world example where two packages with exactly the same name
> appear. I think this is a 'must have' feature, but until we solve/decide
> it we can have the two features working, maybe with a warning in the docs
> for users that this can happen in some 'special' repositories.
>
> To follow the topic directly, I like the proposed move to
> 'RepositoryContent' and adding it to its uniqueness constraint (if I
> understand well).
>
> [0] https://github.com/pulp/pulp_rpm/pull/1657
> [1] https://github.com/pulp/pulp_rpm/pull/1642
> [2] tested with centos 7, 8, opensuse and SLE repositories
>
> On Wed, Apr 1, 2020 at 3:22 PM Daniel Alley <dal...@redhat.com> wrote:
>
>> We'd like to start a discussion on the "relative path problem" identified
>> recently.
>>
>> Problem:
>>
>> Currently, a relative_path is tied to content in Pulp. This means that if
>> a content unit exists in two places within a repository or across
>> repositories, it has to be stored as two separate content units. This
>> creates redundant data and potential confusion for users.
>>
>> As a specific example, we need to support mirroring content in pulp_rpm
>> <https://pulp.plan.io/issues/6353>. Currently, for each location at
>> which a single package is stored, we’ll need to create a content unit. We
>> could end up with several records representing a single package. Users may
>> be confused about why they see multiple records for a package, and they
>> may have trouble deciding, for example, which content unit to copy.
>>
>> Proposed Solution:
>>
>> Move “relative_path” from its current location on ContentArtifact to
>> RepositoryContent. This will require a sizable data migration. It is
>> possible that, in rare cases, repository versions may change slightly
>> due to deduplication.
>>
>> A repository-version-wide uniqueness constraint will be present on
>> “relative_path”, independently of any other repository uniqueness
>> constraints (repo_key_fields) defined by the plugin writer.
>>
>> Modify the Stages API so that the relative_path can be processed in the
>> correct location: instead of “DeclarativeArtifact” it will likely need to
>> go on “DeclarativeContent”.
>>
>> Remove “location_href” from the RPM Package content model. It was never
>> a true part of the RPM (file) metadata; it is derived from the repository
>> metadata, so storing it as part of the Content unit doesn’t entirely make
>> sense.
>>
>> Alternatives
>>
>> In most cases, a content unit will have a single relative path. Creating
>> a general solution to solve a one-off problem is usually not a good idea.
>> As an alternative, we could look at another solution for mirroring
>> content. One example might be to create a new object (e.g.
>> RpmRepoMirrorContentMapping) that maps content to specific paths within
>> a repo or repo version.
>>
>> Questions
>>
>> - How do we handle this in pulp_file? How are content units
>>   identified in pulp_file without relative_path?
>>   - Checksum?
>> - How was this problem handled in Pulp 2?
>>
>> Please weigh in if you have any input on potential problems with the
>> proposal, potential alternate solutions, or other insights or questions!
>> _______________________________________________
>> Pulp-dev mailing list
>> Pulp-dev@redhat.com
>> https://www.redhat.com/mailman/listinfo/pulp-dev
>
> --
> Pavel Picka
> Red Hat
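To make the quoted proposal a bit more concrete, here is a rough sketch of what "relative_path lives on the repository side, unique per repository version" could look like. This is plain Python, not Pulp code: the names Content, RepositoryVersion, add and content_units are hypothetical stand-ins invented for illustration, and the checksums and paths are placeholders.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Content:
    """Hypothetical content unit: identified by checksum only, no path."""
    sha256: str


@dataclass
class RepositoryVersion:
    """Hypothetical stand-in: maps each relative_path to one content unit."""
    paths: dict = field(default_factory=dict)

    def add(self, content, relative_path):
        # Version-wide uniqueness: a path may point at only one content unit.
        existing = self.paths.get(relative_path)
        if existing is not None and existing != content:
            raise ValueError(f"{relative_path!r} is already used in this version")
        self.paths[relative_path] = content

    def content_units(self):
        # Distinct content units, however many paths point at them.
        return set(self.paths.values())


version = RepositoryVersion()
pkg = Content(sha256="abc123")  # placeholder checksum
version.add(pkg, "Packages/f/foo-1.0-1.noarch.rpm")
version.add(pkg, "mirror/foo-1.0-1.noarch.rpm")
```

In this sketch the same package appears at two paths but is still a single content unit, which is the deduplication the proposal is after. Trying to add a different unit at an already-used path fails.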