Is this just about where to store the files on disk? -- bk
On 3/20/20 7:24 AM, David Davis wrote:
> I think using pkgid is problematic though. Consider the case where you have two
> packages with the same location_href but different pkgIds. Since the pulp_rpm code
> uses location_href (which also gets stored as relative_path) as the filename, which
> one will get published when a repo version is published?
>
> PS - Don't tell me that two different packages will never have the same
> location_href. If it's one thing I've learned working on RPM, things that will never
> happen sometimes do happen.
>
> David
>
>
> On Fri, Mar 20, 2020 at 4:46 AM Pavel Picka <ppi...@redhat.com> wrote:
>
>   I think we should keep nevra as unique constraint, but as I mentioned before
>   (above in this thread) your idea is similar to mine, as my suggestion was NEVRA +
>   checksum (pkgId).
>   With pkgId I've already tested it and it works well.
>
>   On Fri, Mar 20, 2020 at 5:43 AM Daniel Alley <dal...@redhat.com> wrote:
>
>     I discussed this a little bit on the #rpm.org <http://rpm.org> channel. Here
>     is the gist of that discussion:
>
>     * The metadata is "crazy, but technically valid"
>     * "the entire SUSE ecosystem tends to do this a lot, anything using OBS,
>       including nvidia and dell and friends"
>     * "also, SUSE packages can have the same NEVRA with being completely
>       different packages because of how their build system makes packages"
>
>     I'm not sure what the best means to fix it would be. Perhaps the uniqueness
>     constraint should be on the location_href, instead of on the NEVRA? Or on
>     NEVRA + location_href?
>
>     On Wed, Mar 18, 2020 at 9:47 AM Ina Panova <ipan...@redhat.com> wrote:
>
>       Pavel,
>       I meant to say that pulp3 does not have the limitation that pulp2 had
>       (saving rpms on the filesystem with the same nevra).
>       The error is raised in pulp3 [0] when a repo version is created: because
>       of the repo key [1], we cannot have 2 rpms with the same NEVRA.
>
>       We can enable that, if we decide to, by adding location_href to the
>       repo_key, *but* this needs to be evaluated; it can have side effects, and
>       we should ask our stakeholders to weigh in.
>
>       [0] https://github.com/pulp/pulpcore/blob/master/pulpcore/app/models/repository.py#L570
>       [1] https://github.com/pulp/pulp_rpm/blob/master/pulp_rpm/app/models/package.py#L188
>
>       --------
>       Regards,
>
>       Ina Panova
>       Senior Software Engineer| Pulp| Red Hat Inc.
>
>       "Do not go where the path may lead,
>        go instead where there is no path and leave a trail."
>
>
>       On Wed, Mar 18, 2020 at 2:24 PM Pavel Picka <ppi...@redhat.com> wrote:
>
>         True, in the opensuse repository there are two possibilities, 'src' and
>         'nosrc' (the latter should be legacy, without source code); both are
>         recognized by createrepo_c as arch 'src'.
>
>         The pulp2 code I mentioned is here [0] (the base rpm package, as far
>         as I understand).
>
>         The error in pulp3 is raised here [1], in pulpcore, when adding
>         packages to a repository version.
>         So, as Ina mentioned, the issue need not be with the packages
>         themselves but rather with the logic in sync.
>
>         [0] https://github.com/pulp/pulp_rpm/blob/2-master/plugins/pulp_rpm/plugins/db/models.py#L779
>         [1] https://github.com/pulp/pulpcore/blob/master/pulpcore/app/models/repository.py#L570
>
>         On Wed, Mar 18, 2020 at 1:55 PM Ina Panova <ipan...@redhat.com> wrote:
>
>           Tanya and Pavel,
>           this issue explains why we cannot keep 2 packages with the same
>           NEVRA but different checksums within a repo:
>           https://pulp.plan.io/issues/494
>
>           Pulp2 had a limitation where it could not save 2 rpms with the same
>           filename on the filesystem; as a result, primary.xml could point to
>           an rpm that did not actually get saved.
>           I believe in Pulp3 we could allow rpms with the same NEVRA within a
>           repo if they have different location_href.
>
>           --------
>           Regards,
>
>           Ina Panova
>           Senior Software Engineer| Pulp| Red Hat Inc.
>
>           "Do not go where the path may lead,
>            go instead where there is no path and leave a trail."
>
>
>           On Wed, Mar 18, 2020 at 10:47 AM Tatiana Tereshchenko
>           <ttere...@redhat.com> wrote:
>
>             Hi Pavel,
>
>             On Tue, Mar 17, 2020 at 7:31 PM Pavel Picka <ppi...@redhat.com> wrote:
>
>               Hello, I would like to ask how to proceed with an issue with
>               duplicate (but not really) packages.
>
>               I am syncing SUSE repositories (opensuse42 and SLE12) and get a
>               duplicate error. But when checking the packages [0] (from
>               primary.xml), glibc and glibc, they have the same nevra but a
>               different checksum (and a few other differences, such as size),
>               so they don't look like real duplicates.
>
>             Those are weird: they have the same nevra, but see the
>             location_href, one is src and the other one is nosrc! :/
>               <location href="nosrc/glibc-2.19-20.3.nosrc.rpm"/>
>               <location href="src/glibc-2.19-20.3.src.rpm"/>
>
>             It looks like something OpenSUSE specific. I'm not sure if it's a
>             valid way to create a repo with such metadata; we need to figure
>             it out at some point.
>
>               I've checked Pulp2, and it uses nevra+sum for repository
>               uniqueness. In pulp3 we use only nevra.
>
>             Why do you think that in pulp 2 we use NEVRA + checksum? Have you
>             tested it? Please point to the code.
>             I believe that in Pulp 2, as well as in Pulp 3, we allow packages
>             with different checksums in Pulp storage.
>             I don't think we allow having the same packages with different
>             checksums in the same repo.
>             FWIW, in pulp 2 the most recently added package is chosen to stay
>             in a repo, and no packages with duplicate NEVRA are left after sync,
>             see
>             https://github.com/pulp/pulp_rpm/blob/2-master/plugins/pulp_rpm/plugins/importers/yum/purge.py#L285-L333
>
>               My suggestion is to extend repo_key_fields for the rpm package
>               with pkgId (checksum), as in pulp2. I don't think they are
>               really duplicates, and other software can rely on a specific
>               version of a package.
>
>             Unfortunately, I don't remember the main reason to remove
>             duplicates based on nevra. Was it because some tooling will
>             complain, or was it just to avoid duplicates at resync time?
>             Does anyone know?
>             We should not change it unless we know for sure that it's needed,
>             and we would need agreement from all our stakeholders for that
>             change.
>
>             For now, I think we can move on and ensure that no duplicates are
>             in a repo version. To my understanding, the behaviour will be the
>             same as in pulp 2.
>             Feel free to share where you get the duplicate error, to see if
>             it's a bug or not. I wonder why duplicates are not removed
>             automatically. Maybe because the first version contains
>             duplicates due to this bug https://pulp.plan.io/issues/6217 ?
>
>             Tanya
>
>               What do you think?
>
>               [0]
>
>               <package type="rpm">
>                 <name>glibc</name>
>                 <arch>src</arch>
>                 <version epoch="0" ver="2.19" rel="20.3"/>
>                 <checksum type="sha256" pkgid="YES">00d36c0f741b0c01a77ce318a2bbcfa59cb4dd0b24ce61f57c6205e4fa1bb310</checksum>
>                 <summary>Standard Shared Libraries (from the GNU C Library)</summary>
>                 <description>The GNU C Library provides the most important standard libraries used
>               by nearly all programs: the standard C library, the standard math
>               library, and the POSIX thread library. A system is not functional
>               without these libraries.</description>
>                 <packager>https://www.suse.com/</packager>
>                 <url>http://www.gnu.org/software/libc/libc.html</url>
>                 <time file="1426696882" build="1425645307"/>
>                 <size package="591662" installed="13047428" archive="974464"/>
>                 <location href="nosrc/glibc-2.19-20.3.nosrc.rpm"/>
>                 <format>
>                   <rpm:license>LGPL-2.1+ and SUSE-LGPL-2.1+-with-GCC-exception and GPL-2.0+</rpm:license>
>                   <rpm:vendor>SUSE LLC <https://www.suse.com/></rpm:vendor>
>                   <rpm:group>System/Libraries</rpm:group>
>                   <rpm:buildhost>sheep16</rpm:buildhost>
>                   <rpm:sourcerpm/>
>                   <rpm:header-range start="872" end="144403"/>
>                   <rpm:requires>
>                     <rpm:entry name="pwdutils"/>
>                     <rpm:entry name="xz"/>
>                     <rpm:entry name="fdupes"/>
>                     <rpm:entry name="systemd-rpm-macros"/>
>                     <rpm:entry name="libselinux-devel"/>
>                     <rpm:entry name="makeinfo"/>
>                   </rpm:requires>
>                 </format>
>               </package>
>
>               <package type="rpm">
>                 <name>glibc</name>
>                 <arch>src</arch>
>                 <version epoch="0" ver="2.19" rel="20.3"/>
>                 <checksum type="sha256" pkgid="YES">353e1dc85eab8d434be83160eca4fcee11a72eec345385df125ca0835abd6068</checksum>
>                 <summary>Standard Shared Libraries (from the GNU C Library)</summary>
>                 <description>The GNU C Library provides the most important standard libraries used
>               by nearly all programs: the standard C library, the standard math
>               library, and the POSIX thread library. A system is not functional
>               without these libraries.</description>
>                 <packager>https://www.suse.com/</packager>
>                 <url>http://www.gnu.org/software/libc/libc.html</url>
>                 <time file="1426696883" build="1423750734"/>
>                 <size package="12678975" installed="13047285" archive="13057760"/>
>                 <location href="src/glibc-2.19-20.3.src.rpm"/>
>                 <format>
>                   <rpm:license>LGPL-2.1+ and SUSE-LGPL-2.1+-with-GCC-exception and GPL-2.0+</rpm:license>
>                   <rpm:vendor>SUSE LLC <https://www.suse.com/></rpm:vendor>
>                   <rpm:group>System/Libraries</rpm:group>
>                   <rpm:buildhost>sheep02</rpm:buildhost>
>                   <rpm:sourcerpm/>
>                   <rpm:header-range start="872" end="144334"/>
>                   <rpm:requires>
>                     <rpm:entry name="pwdutils"/>
>                     <rpm:entry name="xz"/>
>                     <rpm:entry name="fdupes"/>
>                     <rpm:entry name="systemd-rpm-macros"/>
>                     <rpm:entry name="libselinux-devel"/>
>                     <rpm:entry name="makeinfo"/>
>                   </rpm:requires>
>                 </format>
>               </package>
>
>               --
>               Pavel Picka
>               Red Hat
>               _______________________________________________
>               Pulp-dev mailing list
>               Pulp-dev@redhat.com
>               https://www.redhat.com/mailman/listinfo/pulp-dev
>
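
To make the options in the quoted thread concrete, here is a minimal sketch of how each candidate uniqueness key treats the two glibc entries from the primary.xml above. It is not pulp_rpm or pulpcore code: find_collisions, PACKAGES and HYPOTHETICAL are names made up for illustration, the records are plain dicts trimmed to the fields the thread discusses, and the third package (same location_href, different pkgId) is invented purely to show the case David raises.

```python
from collections import defaultdict

# Trimmed-down records for the two glibc entries quoted in the thread.
PACKAGES = [
    {"name": "glibc", "epoch": "0", "version": "2.19", "release": "20.3",
     "arch": "src",
     "pkgId": "00d36c0f741b0c01a77ce318a2bbcfa59cb4dd0b24ce61f57c6205e4fa1bb310",
     "location_href": "nosrc/glibc-2.19-20.3.nosrc.rpm"},
    {"name": "glibc", "epoch": "0", "version": "2.19", "release": "20.3",
     "arch": "src",
     "pkgId": "353e1dc85eab8d434be83160eca4fcee11a72eec345385df125ca0835abd6068",
     "location_href": "src/glibc-2.19-20.3.src.rpm"},
]

# Invented package (assumption, not from any real repo): same location_href
# as the src entry, but a different checksum.
HYPOTHETICAL = dict(PACKAGES[1], pkgId="f" * 64)

NEVRA = ("name", "epoch", "version", "release", "arch")


def find_collisions(packages, key_fields):
    """Group packages by the given key and return only groups with >1 member."""
    groups = defaultdict(list)
    for pkg in packages:
        groups[tuple(pkg[field] for field in key_fields)].append(pkg)
    return {key: pkgs for key, pkgs in groups.items() if len(pkgs) > 1}


if __name__ == "__main__":
    for label, key in [
        ("NEVRA", NEVRA),
        ("NEVRA + pkgId", NEVRA + ("pkgId",)),
        ("NEVRA + location_href", NEVRA + ("location_href",)),
    ]:
        print(label, "->", len(find_collisions(PACKAGES, key)), "colliding key(s)")

    # David's concern: a NEVRA + pkgId key admits both packages, yet they
    # would share one filename when published.
    everything = PACKAGES + [HYPOTHETICAL]
    print("admitted by NEVRA + pkgId:",
          not find_collisions(everything, NEVRA + ("pkgId",)))
    print("filename clashes:",
          list(find_collisions(everything, ("location_href",))))
```

Under these assumptions, the NEVRA-only key rejects the SUSE pair, either extended key accepts it, and only a key that includes location_href also guards against two packages landing on the same published path.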