On 04/24/2017 06:31 AM, Mihai Ibanescu wrote:
> Jeff,
>
> A few comments to your strawman:
>
> * What is an artifact? If it is a database model, then why not call it a unit
>   if that's what it's called everywhere else in the code?

In Pulp3, content units and their associated files are separate. Each content unit has one or more
artifacts. An RPM content unit has exactly 1 artifact whereas docker has 3 artifacts.

> * How would you deal with metadata-only units that don't have a file
>   representation, but show up in some kind of metadata (e.g. package groups /
>   errata). associate() doesn't seem to give me that.

The straw-man splits published items into 2 categories:

1. Files associated with content which are artifacts. Handled by associate().
   - RPMs
   - ISOs
   - Docker images
   - etc ...

2. Files generated by the publisher. Handled by add().
   - Metadata (like everything in repodata/)
   - Metadata only units like errata and package groups.

For category #2 files, the publisher uses the information stored in the content unit to generate a
file in the staging directory. It then uses add() to include it in the publication.

> * For that matter, how would you deal with files that are not representations
>   of units, but new artifacts? (e.g. repomd.xml and the like). It feels like it
>   can be possible by extending my commit() with writing the metadata and then
>   calling the parent class' commit() (which does the atomic publish), but I
>   think that's not pretty.

See above ^^.

>
> On Fri, Apr 21, 2017 at 5:09 PM, Jeff Ortel <jor...@redhat.com> wrote:
>
>     I like this Brian and want to take it one step further. I think there is value in
>     abstracting how a publication is composed. Files like metadata need to be composed by the
>     publisher (as needed) in the working_dir then "added" to the publication. Artifacts could be
>     "associated" to the publication and the platform determines how this happens (symlinks/in
>     the DB).
>
>     Assuming the Publisher is instantiated with a 'working_dir' attribute.
>
>     ---------------------------------------
>
>     Something like this to kick around:
>
>
>     class Publication:
>         """
>         The Publication provided by the plugin API.
>
>         Examples:
>
>             A crude example with lots of hand waving.
>
>             In Publisher.publish()
>
>             >>>
>             >>> publication = Publication(self.working_dir)
>             >>>
>             >>> # Artifacts
>             >>> for artifact in []: # artifacts
>             >>>     path = '<determine relative path>'
>             >>>     publication.associate(artifact, path)
>             >>>
>             >>> # Metadata created in self.staging_dir <here>.
>             >>>
>             >>> publication.add('repodata/primary.xml')
>             >>> publication.add('repodata/others.xml')
>             >>> publication.add('repodata/repomd.xml')
>             >>>
>             >>> # - OR -
>             >>>
>             >>> publication.add('repodata/')
>             >>>
>             >>> publication.commit()
>         """
>
>         def __init__(self, staging_dir):
>             """
>             Args:
>                 staging_dir: Absolute path to where publication is staged.
>             """
>             self.staging_dir = staging_dir
>
>         def associate(self, artifact, path):
>             """
>             Associate an artifact to the publication.
>             This could result in creating a symlink in the staging directory
>             or (later) creating a record in the db.
>
>             Args:
>                 artifact: A content artifact
>                 path: Relative path within the staging directory AND eventually
>                     within the published URL.
>             """
>
>         def add(self, path):
>             """
>             Add a file within the staging directory to the publication by relative path.
>
>             Args:
>                 path: Relative path within the staging directory AND eventually within
>                     the published URL. When *path* is a directory, all files
>                     within the directory are added.
>             """
>
>         def commit(self):
>             """
>             When committed, the publication is atomically published.
>             """
>             # atomic magic
>
>
>     On 04/19/2017 10:16 AM, Brian Bouterse wrote:
>     > I was thinking about the design here and I wanted to share some thoughts.
>     >
>     > For the MVP, I think a publisher implemented by a plugin developer would write all files
>     > into the working directory and the platform will "atomically publish" that data into the
>     > location configured by the repository. The "atomic publish" aspect would copy/stage the
>     > files in a permanent location but would use a single symlink to the top level folder to go
>     > live with the data. This would make atomic publication the default behavior. This runs
>     > after the publish() implemented by the plugin developer returns, after it has written all
>     > of its data to the working dir.
>     >
>     > Note that ^ allows for the plugin writer to write the actual contents of files in the
>     > working directory instead of symlinks, causing Pulp to duplicate all content on disk with
>     > every publish. That would be an incredibly inefficient way to write a plugin but it's
>     > something the platform would not prevent in any explicit way. I'm not sure if this is
>     > something we should improve on or not.
>     >
>     > At a later point, we could add in the incremental publish maybe as a method on a Publisher
>     > called incremental_publish() which would only be called if the previous publish only had
>     > units added.
>     >
>     >
>     > On Mon, Apr 17, 2017 at 4:22 PM, Brian Bouterse <bbout...@redhat.com> wrote:
>     >
>     >     For plugin writers who are writing a publisher for Pulp3, what do they need to handle
>     >     during publishing versus platform? To make a comparison against sync, the "Download
>     >     API" and "Changesets" [0] allow the plugin writer to tell platform about a remote
>     >     piece of content. Then platform handles creating the unit, fetching it, and saving it.
>     >     Will there be a similar API to support publishing to ease the burden of a plugin
>     >     writer? Also will this allow platform to have a structured knowledge of a publication
>     >     with Pulp3?
>     >
>     >     I wanted to try to characterize the problem statement as two separate questions:
>     >
>     >     1) How will units be recorded to allow platform to know which units comprise a
>     >        specific publish?
>     >     2) What are a plugin writer's needs at publish time, and what repetitive tasks could
>     >        be moved to platform?
>     >
>     >     As a quick recap of how Pulp2 works: each publisher would write files into the working
>     >     directory and then they would get moved into their permanent home. Also there is the
>     >     incrementalPublisher base machinery which allowed for an additive publication which
>     >     would use the previous publish and was "faster". Finally, in Pulp2, the only record of
>     >     a publication is the symlinks on the filesystem.
>     >
>     >     I have some of my own ideas on these things, but I'll start the conversation.
>     >
>     >     [0]: https://github.com/pulp/pulp/pull/2876
>     >
>     >     -Brian
>     >
>     >
>     > _______________________________________________
>     > Pulp-dev mailing list
>     > Pulp-dev@redhat.com
>     > https://www.redhat.com/mailman/listinfo/pulp-dev
>
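To make the strawman quoted above a bit more concrete, here is a minimal, filesystem-only sketch of how associate(), add() and commit() might behave. It is not the plugin API, just something to kick around. Assumptions that are not in the thread: an artifact is represented by the absolute path of its stored file, commit() takes hypothetical `storage_dir` and `live_link` arguments (the strawman's commit() takes none), the `paths` attribute is invented for bookkeeping, and the host is POSIX so that renaming a symlink over another is atomic.

```python
import os
import shutil
import tempfile


class Publication:
    """Filesystem-only sketch of the strawman; not the real plugin API."""

    def __init__(self, staging_dir):
        self.staging_dir = staging_dir
        self.paths = []  # hypothetical: relative paths included in the publication

    def _staged(self, path):
        return os.path.join(self.staging_dir, path)

    def associate(self, artifact_file, path):
        # Assumption: the artifact is given as the absolute path of its stored file.
        # A symlink in the staging directory avoids duplicating content on disk.
        target = self._staged(path)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        os.symlink(artifact_file, target)
        self.paths.append(path)

    def add(self, path):
        # Add a generated file (or every file under a directory) by relative path.
        target = self._staged(path)
        if os.path.isdir(target):
            for root, _dirs, files in os.walk(target):
                for name in files:
                    full = os.path.join(root, name)
                    self.paths.append(os.path.relpath(full, self.staging_dir))
        elif os.path.isfile(target):
            self.paths.append(path)
        else:
            raise FileNotFoundError(target)

    def commit(self, storage_dir, live_link):
        # Stage everything in a fresh permanent directory first.
        os.makedirs(storage_dir, exist_ok=True)
        permanent = tempfile.mkdtemp(prefix="publication-", dir=storage_dir)
        for path in self.paths:
            dest = os.path.join(permanent, path)
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            # Symlinks are copied as symlinks; generated files are copied by content.
            shutil.copy2(self._staged(path), dest, follow_symlinks=False)
        # Go live with a single top-level symlink, swapped in by rename.
        tmp_link = live_link + ".tmp"
        if os.path.lexists(tmp_link):
            os.remove(tmp_link)
        os.symlink(permanent, tmp_link)
        os.replace(tmp_link, live_link)
        return permanent
```

In Publisher.publish() this would be used the same way as the docstring example: associate() each artifact, write the metadata into the staging directory, add() it, then commit(). Readers of `live_link` see either the previous publication or the new one, never a partially written tree, which is the "single symlink to the top level folder" behavior Brian described. Cleaning up older publication directories is left out of the sketch.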