Thanks for the explanation!

On Mon, May 28, 2018 at 4:17 PM, Daniel Alley <dal...@redhat.com> wrote:
>> Is that because the rollback actually creates version #3 that's "newer" but lacks the rolled-back commits?
>> So is there some "merge" conflict if folks who cloned #2 want to pull from version #3, but their branch contains a commit the origin now lacks?
>> Or rather that the published bits of version #2 don't exist anymore at all?
>
> The first one. It would be as if someone force-pushed to the git repository, removing the last couple of commits of history. It's basically the same problem.
>
>> Does it mean a publication directory git tree is built anew every time a rollback happens?
>
> What it would have to do is take the existing git tree and apply new commits on top to return the contents of the repository to the state you want to roll it back to.
>
>> So Pulp history and the original project history are meant to be different?
>> Can there ever be conflicts?
>
> It's not that they're meant to be different, but I think it is an unavoidable problem if you want to do rollbacks in Pulp.
>
> The source git repository for the project, whether it's on GitHub or the admin's machine, is separate from Pulp's copy. The second you add a commit to one and not the other (by doing a rollback with linear git history from the client's perspective), the histories will diverge. It's unavoidable; that's just how git works. You can keep the content of the files in the repo identical, but the history will never be equivalent again.
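A rough sketch of the point about identical content but non-equivalent history. It assumes a deliberately simplified model in which a commit id is a hash of the parent id plus the file contents (real git hashes more fields); `commit_id` and the tree contents are hypothetical and not part of git or Pulp.

```python
import hashlib


def commit_id(parent_id, tree):
    # Simplified stand-in for a git commit hash: it depends on the parent
    # commit id and on the file contents (an assumption for illustration).
    payload = repr((parent_id, sorted(tree.items()))).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


v1_tree = {"index.txt": "crate-a 1.0"}
v2_tree = {"index.txt": "crate-a 1.0, crate-a 1.1"}
v3_tree = {"index.txt": "crate-a 1.0, crate-a 1.2"}

# Shared history: both the admin's repository and Pulp's copy have c1 and c2.
c1 = commit_id(None, v1_tree)
c2 = commit_id(c1, v2_tree)

# Pulp rolls back by adding a revert commit on top of c2. The files match
# version 1 again, but the commit is a new one.
pulp_revert = commit_id(c2, v1_tree)

# Later, both sides end up holding the same files (v3_tree), but the
# admin's repository never received Pulp's revert commit.
upstream_head = commit_id(c2, v3_tree)
pulp_head = commit_id(pulp_revert, v3_tree)

print("revert commit equals c1:", pulp_revert == c1)
print("heads equal despite same files:", upstream_head == pulp_head)
```

In this model both comparisons come out false: the rollback commit is not the version 1 commit, and the two heads differ even though they hold the same files.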
...impairing the usability of Pulp as the "master" repository

> Basically, it is mutually exclusive to have:
>
> * Pulp not be the "master" git repository, e.g. the admin is syncing / uploading it from somewhere else
> * maintain linear git history
> * be able to do rollbacks in Pulp
> * keep identical git history between Pulp and the git repository being synced / uploaded into Pulp
>
> One of them has to give.

+1

I believe any content type/plug-in with its own idea of content versioning will have the same conflict.
Wrapping/translating from a content-specific versioning scheme to the Pulp versioning scheme sounds like a headache, even if Pulp supports a non-linear history one day.
Let's forget about it and give the plug-in the ability to opt out of the core versioning scheme instead?

Cheers,
milan

> On Mon, May 28, 2018 at 8:01 AM, Milan Kovacik <mkova...@redhat.com> wrote:
>>
>> On Sat, May 26, 2018 at 2:23 AM, Daniel Alley <dal...@redhat.com> wrote:
>> > @Brian
>> >
>> > I agree with a lot of those points, but I would say that we're not just competing against hodgepodge collections of "scripts", but also against writing small microservice-y Flask apps that only implement the API for one content type.
>> >
>> > Also, rollback is not something Pulp would necessarily be able to offer with respect to history-sensitive content and metadata, like git repositories, or the Cargo example I provided. It's still something the plugin writer would have to implement themselves in this case.
>> >
>> > @Jeff
>> >
>> >> perhaps a new component of a Publication like PublishedDirectory that references an OSTree/Git repository created in /var/lib/pulp/published.
>> >
>> > I like the idea generally, but I don't think it would be able to be a component of a Publication. I think it would need to be an alternative to a Publication which fulfills a similar function.
>> >
>> > The fundamental problem is this scenario:
>> >
>> > 1. You upload a git repository with a git repository plugin
>> > 2. You publish and distribute version 1 of the git repository
>> > 3. You publish and distribute version 2 of the git repository
>> > 4. A client downloads the git repository
>> > 5. You notice a problem and decide to roll back to version 1. A publication of version 1 already exists, which you distribute.
>> > 6. Clients have a broken git history. New clients can download the old version, but anyone who has already downloaded version 2 will not be able to roll back to version 1 by pulling from Pulp
>>
>> Just trying to understand the situation:
>> Is that because the rollback actually creates version #3 that's "newer" but lacks the rolled-back commits?
>> So is there some "merge" conflict if folks who cloned #2 want to pull from version #3, but their branch contains a commit the origin now lacks?
>> Or rather that the published bits of version #2 don't exist anymore at all?
>>
>> >
>> > We need to prevent step 5 from happening.
>> >
>> > There are a couple of possible solutions to this problem:
>> >
>> > 1. As a Pulp admin, you ignore Pulp's rollback functionality. Instead of using Pulp to roll back, you manually revert the commits using git, and upload a new version of the repository to Pulp as "version 3". You then distribute version 3 instead of version 1. You understand that if you were to publish an old version using Pulp, it would misbehave for clients that tried to pull / update instead of cloning.
>>
>> In my opinion folks needing Pulp to track a git(-like) repo are probably interested in more workflows than just the clone.
>>
>> >
>> > 2. As a Pulp admin / plugin writer / user, you know that the client for the content type will never try to pull or update, only clone. Therefore it is not a problem for you and can be ignored.
>>
>> The cloning might be the equivalent of just snapshotting the tree at a particular commit and publishing a plain tar.gz w/o the git structures.
>> Limiting but clean?
>>
>> >
>> > 3. As a plugin writer, whenever you publish a new version of the git repository, you delete or invalidate every publication for previous versions for the distribution base path. If a Pulp admin wants to roll back, they need to create a new Publication. The plugin knows to apply revert commits on top of the repository to keep history linear.
>> >
>> >    But really we've just pushed the problem forwards. What happens when you want to upload future versions? Now the history of the git repository in Pulp is different from the Pulp admin's git repo history.
>> >    This is only acceptable for content types where the history is immaterial to the content itself. Probably viable for Cargo, but probably not for a Git content type.
>> >
>>
>> Does it mean a publication directory git tree is built anew every time a rollback happens?
>> So Pulp history and the original project history are meant to be different?
>> Can there ever be conflicts?
>>
>> >
>> > 4. As a plugin writer, you ignore publications entirely. You don't make it possible to do the wrong thing. You have something along the lines of a "PluginManagedDirectory" which core does not try to mess with. If you want to implement rollback functionality, you do it through your own API where the side effects are more easily controlled and reasoned about.
>>
>> +1 seems like the cleanest way to me
>>
>> >
>> > I have doubts about whether Option 3 is viable - it seems like making it work reliably would be difficult.
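A small sketch of the difference between distributing the old version 1 publication and applying a revert commit on top, as seen by a client that already pulled version 2. Histories are modelled as plain lists of hypothetical commit labels, and `can_fast_forward` is an illustrative helper under that assumption, not a git or Pulp API.

```python
def can_fast_forward(client_history, server_history):
    # A pull is a clean fast-forward only if the server's history starts
    # with everything the client already has.
    return server_history[:len(client_history)] == client_history


existing_client = ["v1", "v2"]   # cloned while version 2 was distributed
new_client = []                  # has not cloned anything yet

# Rollback by distributing the existing publication of version 1.
old_publication = ["v1"]
# Rollback by adding a revert commit on top of version 2.
revert_on_top = ["v1", "v2", "revert-of-v2"]

for name, server in [("old publication", old_publication),
                     ("revert on top", revert_on_top)]:
    print(name,
          "| existing client:", can_fast_forward(existing_client, server),
          "| new client:", can_fast_forward(new_client, server))
```

Under this model only the existing client pulling from the old publication fails, which matches the scenario: new clients are fine either way, and the revert commit keeps history linear for everyone at the cost of diverging from the admin's own repository.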
>>
>> I'd say options #1 and #3 are the same, with #3 adding the complexity of automating the rollback in Pulp;
>> options #2 and #4 are the same in the sense of Pulp staying away from the incompatible workflow a content type has while providing a limited functionality subset to the consumer. In addition, #4 allows a Pulp service/host to provide both the Pulp-specific, limited functionality and the incompatible, content-type-specific workflows from a "single" point. This might be a benefit to some folks.
>>
>>
>> Option #5: somehow make core Pulp (content versioning) compatible with the Git model ;)
>>
>> --
>> milan
>>
>> >
>> > On Fri, May 25, 2018 at 5:05 PM, Brian Bouterse <bbout...@redhat.com> wrote:
>> >>
>> >> I think Pulp does have enough value proposition over a script-based alternative to make it worthwhile for all of those types of plugins. Here are a few points I think about:
>> >>
>> >> * Scalability. A common story users tell is that scripts work well up until a point. Doing it for an entire organization, or when content comes from many places, or with more than a few people involved in maintaining the content, it becomes unmaintainable.
>> >>
>> >> * Stacks of content. Often a group of content goes together, but each piece of content is updated separately. For instance with Ansible roles, you may use many of them together to deploy something, but each role may receive changes separately. I think of all this content together as a "stack". Keeping everything up to date can be challenging. Managing that change with scripts can be hard and fragile. Also the ability to roll back quickly and confidently is something Pulp can offer.
>> >>
>> >> * Organizing content is easier. Having an API that you can use to organize content is easier than doing lots and lots of git yourself or with scripts.
>> >>
>> >> * Tasking. Long running tasks (and a lot of them) can be unwieldy, and Pulp keeps them organized and running well.
>> >>
>> >> * Static and vulnerability analysis. We're seeing interest in using analysis projects like Clair (https://github.com/arminc/clair-scanner) to scan content in Pulp. By bringing all the content into one place, and with that place having a tasking system, plugin writers can control how their content is analyzed continuously.
>> >>
>> >> Also +1 to jortel's idea. I think that's a great idea and exactly what we need.
>> >>
>> >>
>> >> On Thu, May 24, 2018 at 1:33 PM, Jeff Ortel <jor...@redhat.com> wrote:
>> >>>
>> >>> On 05/17/2018 07:46 AM, Daniel Alley wrote:
>> >>>
>> >>> Some content types are not going to be compatible with the normal sync/publish/distribute Pulp workflows, and will need to be live API-only. To what degree should Pulp accommodate these use cases?
>> >>>
>> >>> Example:
>> >>>
>> >>> Pulp makes the assumptions that
>> >>>
>> >>> A) the metadata for a repository can be generated in its entirety from the known set of content in a RepositoryVersion, and
>> >>>
>> >>> B) the client wouldn't care if you point it at an older version of the same repository.
>> >>>
>> >>> Cargo, the package manager for the Rust programming language, expects the registry url to be a git repository. When a user does a "cargo update", cargo essentially does a "git pull" to update a local copy of the registry.
>> >>>
>> >>> Both of those assumptions are false in this case. You cannot generate the git history just from the set of content, and you cannot "roll back" the state of the repository without either breaking it for clients, or adding new commits on top.
>> >>>
>> >>> A theoretical Pulp plugin that worked with Cargo would need to ignore almost all of the existing Pulp primitives, and very little (if any) of the normal Pulp workflow could be used.
>> >>>
>> >>> Should Pulp attempt to cater to plugins like these? What could Pulp do to provide a benefit for such plugins over writing something from scratch? To what extent would such plugins be able to integrate with the rest of Pulp, if at all?
>> >>>
>> >>>
>> >>> I think OSTree and Ansible plugins will be in the same boat as Cargo. In the case of OSTree, libostree does the heavy lifting for sync and publishing, and I suspect the same is true for Git-based repositories. We should consider ways to best support distributing (serving) content in core for these content types. I suspect this will mainly entail something in the content app and perhaps a new component of a Publication like PublishedDirectory that references an OSTree/Git repository created in /var/lib/pulp/published. This may benefit Maven as well.
>> >>>
>> >>>
>> >>>
>> >>> We don't have to commit to anything pre-GA but it is a good thing to keep in mind. I'm sure there are other content types out there (not just Cargo) which would face similar problems. pulp_git was inquired about a few months ago; it seems like it would share a few of them.
_______________________________________________
Pulp-dev mailing list
Pulp-dev@redhat.com
https://www.redhat.com/mailman/listinfo/pulp-dev