Pavel, I meant to say that pulp3 does not have the limitation that pulp2 had (saving rpms with the same NEVRA on the filesystem). The error is raised in pulp3 [0] when a repo version is created: because of the repo key [1], we cannot have 2 rpms with the same NEVRA in one repo version.
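
To make the repo key behaviour concrete, here is a minimal sketch in plain Python. It is not the pulpcore or pulp_rpm implementation: `find_duplicates` and the dict-based package records are hypothetical names used only for illustration. Only the field names (NEVRA, location_href) and the two glibc entries are taken from this thread.

```python
from collections import defaultdict

# Fields that make up NEVRA; treated here as the repo key (an assumption
# based on this thread, not copied from pulp_rpm).
NEVRA = ("name", "epoch", "version", "release", "arch")


def find_duplicates(packages, key_fields):
    """Group packages by the given key fields and return only the groups
    holding more than one package (hypothetical helper)."""
    groups = defaultdict(list)
    for pkg in packages:
        groups[tuple(pkg[field] for field in key_fields)].append(pkg)
    return {key: pkgs for key, pkgs in groups.items() if len(pkgs) > 1}


# The two glibc entries from the primary.xml quoted in this thread.
packages = [
    {"name": "glibc", "epoch": "0", "version": "2.19", "release": "20.3",
     "arch": "src", "location_href": "nosrc/glibc-2.19-20.3.nosrc.rpm"},
    {"name": "glibc", "epoch": "0", "version": "2.19", "release": "20.3",
     "arch": "src", "location_href": "src/glibc-2.19-20.3.src.rpm"},
]

# Keyed on NEVRA alone, the two entries collide.
by_nevra = find_duplicates(packages, NEVRA)

# Keyed on NEVRA plus location_href, they are distinct.
by_nevra_and_href = find_duplicates(packages, NEVRA + ("location_href",))

print(len(by_nevra), len(by_nevra_and_href))
```

With NEVRA alone the two glibc entries fall into the same group; with location_href added to the key they do not. Whether such a change is safe for consumers of the repo is a separate question that this sketch does not answer.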
We can allow two such rpms in a repo, if we decide to, by adding location_href to the repo_key, *but* this needs to be evaluated: it can have side effects, and we should involve our stakeholders to weigh in.

[0] https://github.com/pulp/pulpcore/blob/master/pulpcore/app/models/repository.py#L570
[1] https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/models/package.py#L188

--------
Regards,

Ina Panova
Senior Software Engineer| Pulp| Red Hat Inc.

"Do not go where the path may lead,
go instead where there is no path and leave a trail."


On Wed, Mar 18, 2020 at 2:24 PM Pavel Picka <ppi...@redhat.com> wrote:

> True, in the opensuse repository there are two possibilities, 'src' and 'nosrc'
> (this one should be legacy without source code); both are recognized by
> createrepo_c as arch 'src'.
>
> The pulp2 code I mentioned is here [0] (the base rpm package, as far as
> I understood).
>
> The error in pulp3 is raised here [1], in pulpcore, when adding
> packages to a repository version.
> So, as Ina mentioned, it doesn't have to be an issue with the packages
> themselves, but rather with the logic in sync.
>
> [0]
> https://github.com/pulp/pulp_rpm/blob/2-master/plugins/pulp_rpm/plugins/db/models.py#L779
> [1]
> https://github.com/pulp/pulpcore/blob/master/pulpcore/app/models/repository.py#L570
>
> On Wed, Mar 18, 2020 at 1:55 PM Ina Panova <ipan...@redhat.com> wrote:
>
>> Tanya and Pavel,
>> this issue explains why we cannot keep 2 packages with the same
>> NEVRA but different checksums within a repo:
>> https://pulp.plan.io/issues/494
>>
>> Pulp2 had a limitation where it was not able to save 2 rpms with the
>> same filename on the filesystem; this led to a primary.xml that could
>> point to an rpm that did not actually get saved.
>> I believe in Pulp3 we could allow rpms with the same NEVRA within a repo
>> if they have different location_href.
>>
>> --------
>> Regards,
>>
>> Ina Panova
>> Senior Software Engineer| Pulp| Red Hat Inc.
>>
>> "Do not go where the path may lead,
>> go instead where there is no path and leave a trail."
>>
>>
>> On Wed, Mar 18, 2020 at 10:47 AM Tatiana Tereshchenko <
>> ttere...@redhat.com> wrote:
>>
>>> Hi Pavel,
>>>
>>> On Tue, Mar 17, 2020 at 7:31 PM Pavel Picka <ppi...@redhat.com> wrote:
>>>
>>>> Hello, I would like to ask you how to proceed with an issue with duplicate
>>>> (but not really) packages.
>>>>
>>>> I am syncing suse repositories (opensuse42 and SLE12) and get a
>>>> duplicate error. But when checking the packages [0] (from primary.xml), glibc
>>>> and glibc, they have the same nevra but different checksums (and a few more
>>>> differences, such as size), so they don't look like real duplicates.
>>>>
>>> Those are weird: they have the same nevra, but see the location_href, one
>>> is src and the other one is nosrc! :/ :
>>> <location href="nosrc/glibc-2.19-20.3.nosrc.rpm"/>
>>> <location href="src/glibc-2.19-20.3.src.rpm"/>
>>>
>>> It looks like something OpenSUSE specific. I'm not sure if it's a valid
>>> way to create a repo with such metadata; we need to figure it out at some
>>> point.
>>>
>>>
>>>> I've checked Pulp2, and nevra+sum is used there for repository
>>>> uniqueness. In pulp3 we use only nevra.
>>>>
>>> Why do you think that in pulp 2 we use NEVRA + checksum? Have you tested
>>> it? Please point to the code.
>>> I believe in Pulp 2 as well as in Pulp 3 we allow packages with
>>> different checksums in Pulp storage.
>>> I don't think we allow having the same packages with different checksums
>>> in the same repo.
>>> FWIW, in pulp 2 the most recently added package is chosen to stay in a
>>> repo; no packages with duplicate NEVRA are left after sync, see
>>> https://github.com/pulp/pulp_rpm/blob/2-master/plugins/pulp_rpm/plugins/importers/yum/purge.py#L285-L333
>>>
>>>
>>>>
>>>> My suggestion is to extend repo_key_fields for rpm package, as in
>>>> pulp2, with pkgId (checksum).
>>>> As I don't think they are really duplicates,
>>>> and other software can rely on a specific version of a package.
>>>>
>>>
>>> Unfortunately, I don't remember the main reason to remove duplicates
>>> based on nevra. Was it because some tooling will complain, or was it just
>>> to avoid duplicates at resync time? Does anyone know?
>>> We should not change it unless we know for sure that it's needed, and we
>>> would need agreement from all our stakeholders for that change.
>>>
>>> For now, I think we can move on and ensure that no duplicates are in a
>>> repo version. To my understanding, the behaviour will be the same as in
>>> pulp 2.
>>> Feel free to share where you get the duplicate error, to see if it's a bug or
>>> not. I wonder why duplicates are not removed automatically. Maybe because
>>> the first version contains duplicates due to this bug
>>> https://pulp.plan.io/issues/6217 ?
>>>
>>> Tanya
>>>
>>>
>>>>
>>>> What do you think?
>>>>
>>>>
>>>> [0]
>>>>
>>>>> <package type="rpm">
>>>>> <name>glibc</name>
>>>>> <arch>src</arch>
>>>>> <version epoch="0" ver="2.19" rel="20.3"/>
>>>>> <checksum type="sha256"
>>>>> pkgid="YES">00d36c0f741b0c01a77ce318a2bbcfa59cb4dd0b24ce61f57c6205e4fa1bb310</checksum>
>>>>> <summary>Standard Shared Libraries (from the GNU C Library)</summary>
>>>>> <description>The GNU C Library provides the most important standard
>>>>> libraries used
>>>>> by nearly all programs: the standard C library, the standard math
>>>>> library, and the POSIX thread library.
>>>>> A system is not functional
>>>>> without these libraries.</description>
>>>>> <packager>https://www.suse.com/</packager>
>>>>> <url>http://www.gnu.org/software/libc/libc.html</url>
>>>>> <time file="1426696882" build="1425645307"/>
>>>>> <size package="591662" installed="13047428" archive="974464"/>
>>>>> <location href="nosrc/glibc-2.19-20.3.nosrc.rpm"/>
>>>>> <format>
>>>>> <rpm:license>LGPL-2.1+ and SUSE-LGPL-2.1+-with-GCC-exception and
>>>>> GPL-2.0+</rpm:license>
>>>>> <rpm:vendor>SUSE LLC &lt;https://www.suse.com/&gt;</rpm:vendor>
>>>>> <rpm:group>System/Libraries</rpm:group>
>>>>> <rpm:buildhost>sheep16</rpm:buildhost>
>>>>> <rpm:sourcerpm/>
>>>>> <rpm:header-range start="872" end="144403"/>
>>>>> <rpm:requires>
>>>>> <rpm:entry name="pwdutils"/>
>>>>> <rpm:entry name="xz"/>
>>>>> <rpm:entry name="fdupes"/>
>>>>> <rpm:entry name="systemd-rpm-macros"/>
>>>>> <rpm:entry name="libselinux-devel"/>
>>>>> <rpm:entry name="makeinfo"/>
>>>>> </rpm:requires>
>>>>> </format>
>>>>> </package>
>>>>>
>>>>> <package type="rpm">
>>>>> <name>glibc</name>
>>>>> <arch>src</arch>
>>>>> <version epoch="0" ver="2.19" rel="20.3"/>
>>>>> <checksum type="sha256"
>>>>> pkgid="YES">353e1dc85eab8d434be83160eca4fcee11a72eec345385df125ca0835abd6068</checksum>
>>>>> <summary>Standard Shared Libraries (from the GNU C Library)</summary>
>>>>> <description>The GNU C Library provides the most important standard
>>>>> libraries used
>>>>> by nearly all programs: the standard C library, the standard math
>>>>> library, and the POSIX thread library.
>>>>> A system is not functional
>>>>> without these libraries.</description>
>>>>> <packager>https://www.suse.com/</packager>
>>>>> <url>http://www.gnu.org/software/libc/libc.html</url>
>>>>> <time file="1426696883" build="1423750734"/>
>>>>> <size package="12678975" installed="13047285" archive="13057760"/>
>>>>> <location href="src/glibc-2.19-20.3.src.rpm"/>
>>>>> <format>
>>>>> <rpm:license>LGPL-2.1+ and SUSE-LGPL-2.1+-with-GCC-exception and
>>>>> GPL-2.0+</rpm:license>
>>>>> <rpm:vendor>SUSE LLC &lt;https://www.suse.com/&gt;</rpm:vendor>
>>>>> <rpm:group>System/Libraries</rpm:group>
>>>>> <rpm:buildhost>sheep02</rpm:buildhost>
>>>>> <rpm:sourcerpm/>
>>>>> <rpm:header-range start="872" end="144334"/>
>>>>> <rpm:requires>
>>>>> <rpm:entry name="pwdutils"/>
>>>>> <rpm:entry name="xz"/>
>>>>> <rpm:entry name="fdupes"/>
>>>>> <rpm:entry name="systemd-rpm-macros"/>
>>>>> <rpm:entry name="libselinux-devel"/>
>>>>> <rpm:entry name="makeinfo"/>
>>>>> </rpm:requires>
>>>>> </format>
>>>>> </package>
>>>>
>>>>
>>>> --
>>>> Pavel Picka
>>>> Red Hat
>>>> _______________________________________________
>>>> Pulp-dev mailing list
>>>> Pulp-dev@redhat.com
>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>>
>
> --
> Pavel Picka
> Red Hat