I was just reviewing the task search API that we designed for Pulp 3 a few months ago. Two of the requirements [0] were "As a user of the task search API I want to search for all tasks that operated on repo zoo" and "As a user of the task search API I want to search all publish tasks performed by a particular publisher." My question: will users be able to search for tasks by publication href, or for a publication by task href, the same way they can with a repo, publisher, etc.? If so, that could be the link between task and publication [1].
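To make the question concrete, here is a minimal in-memory sketch of the two lookups. It assumes a task records the hrefs of every resource it touched or produced. The `Task` class, its `resources` field, the helper names, and the example hrefs are all hypothetical and are not part of the Pulp API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    href: str
    # Hypothetical: hrefs of resources the task operated on or produced.
    resources: List[str] = field(default_factory=list)


def tasks_for_resource(tasks: List[Task], resource_href: str) -> List[Task]:
    """Return every task that references the given resource href."""
    return [task for task in tasks if resource_href in task.resources]


def publications_for_task(
    tasks: List[Task], task_href: str, prefix: str = "/publications/"
) -> List[str]:
    """Return the publication hrefs referenced by the task with this href."""
    for task in tasks:
        if task.href == task_href:
            return [r for r in task.resources if r.startswith(prefix)]
    return []


# Example data (made up for illustration).
tasks = [
    Task("/tasks/1/", ["/repositories/zoo/", "/publishers/p1/", "/publications/10/"]),
    Task("/tasks/2/", ["/repositories/zoo/"]),
]
```

With this shape, searching by publication href works the same way as searching by repo or publisher href, and the reverse lookup answers "which publication did my publish task produce?".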
[0] https://pulp.plan.io/issues/2482
[1] https://pulp.plan.io/issues/2890

David

On Tue, Oct 24, 2017 at 2:11 PM, Brian Bouterse <bbout...@redhat.com> wrote:

> Thanks everyone for all the discussion! I'll try to recap the problem and
> some of the solutions I've heard. I'll also share some of my perspective on
> them too.
>
> What problem are we solving?
> When a user calls "publish" (the action API endpoint) they get a 202 w/ a
> link to the task. That task will produce a publication. How can the user
> find the publication that was produced by the task? How can the user be
> sure the publication is fully complete?
>
>
> What are our options?
> 1) Start linking to created objects from task status. I believe it's been
> clearly stated why we can't do this. If it's not clear, or if there
> are other things we should consider, let's talk about it. Acknowledging or
> establishing agreement on this is crucial because a change like this would
> bring back a lot of the user pain from pulp2. I believe the HAL suggestion
> falls into this area.
>
> 2) Have the user find the publication via query that sorts on time and
> filters only for a specific publisher. This could be fragile because with a
> multi-user system and no hard references between publications and tasks,
> answering the question "which is the publication for me" is hard because
> another user could have submitted a publish too. While not totally perfect,
> this could work.
>
> 3) Have the user create a publication directly like any other REST
> resource, and help the user understand the state of that resource over
> time. I believe the proposal at the start of this thread is recommending
> this solution. I'm also +1 on this solution.
>
>
> As an aside, I don't think considering versioned repos as a possible
> solution is helping us with this problem. The scope of the current problem
> is relatively small and the scope of planning for versioned repos is large.
>
> On Tue, Oct 24, 2017 at 9:43 AM, Jeff Ortel <jor...@redhat.com> wrote:
>
>>
>> On 10/23/2017 06:14 PM, Dennis Kliban wrote:
>> > On Mon, Oct 23, 2017 at 3:20 PM, Michael Hrivnak <mhriv...@redhat.com> wrote:
>> >
>> > On Mon, Oct 23, 2017 at 12:30 PM, Dennis Kliban <dkli...@redhat.com> wrote:
>> >
>> > On Mon, Oct 23, 2017 at 10:56 AM, Jeff Ortel <jor...@redhat.com> wrote:
>> >
>> > This is interesting.
>> >
>> > Some thoughts:
>> >
>> > If adopted, I propose the publication task create the publication and
>> > pass to the publisher which would require a change in the plugin API -
>> > Publisher.publish(publication). If the publisher fails, I think the
>> > publication should be deleted.
>> >
>> >
>> > The ViewSet would create the publication, dispatch a publish task with
>> > the publication id as an argument, update the publication with the task
>> > id, return a serialized Publication to the API user. The user is
>> > responsible for deleting any publication that is not created
>> > successfully.
>> >
>> >
>> > For me, your wording illustrates the problem well. Why should a user
>> > have to delete a resource that was never created?
>> >
>> > This sounds like we'd be introducing a partially-created state for
>> > publications. There would be some kind of placeholder representation
>> > that could be referenced as a location where a real publication *might
>> > or might not* eventually appear. And this representation would live
>> > side-by-side in a "publications/" endpoint with representations of
>> > actual publications? How would a user know which are which? It seems
>> > like this just shifts the async problem onto the publication model.
>> >
>> > I go back to this: When creation of a resource is requested, the
>> > response should either be 201 if the resource was created, or 202 if
>> > creation is deferred.
>> > We should not attempt partial creation.
>> >
>> > It's easy to lose sight of this, so maybe it's worth also observing that
>> > a resource is not just a DB record or some JSON. The existence of a
>> > resource representation requires that the resource itself exists in
>> > every way that is necessary for it to make sense. We should be careful
>> > not to misrepresent the existence of a publication.
>> >
>> >
>> > The description of issue 3033[0] does not clearly establish what a
>> > serialized version of a Publication looks like. In our current design, I
>> > imagine that it will contain three fields: _href, created, and
>> > publisher. @jortel, do you have the same vision?
>>
>> Yes.
>>
>> >
>> > If we start associating tasks with Publications, then the serialized
>> > publication would have 4 fields: _href, created, publisher, task. The
>> > API would then allow filtering based on the status of the associated
>> > task. e.g. publications/?task__status=successful to retrieve all
>> > publications that are successfully created.
>> >
>> > We could also add validation on the Distribution that will check whether
>> > the publication being associated with the Distribution has a task
>> > associated with it, and if so that it successfully completed.
>>
>> I don't think we should store broken publications in the DB.
>>
>> >
>> > A POST to /publications/ could return a 202 and a serialized version of
>> > the publication. This lets the user know that the task of creating a
>> > publication was accepted. Any GET requests to
>> > /publications/<publication_id> would return 202 until the publication
>> > task has completed. Once the publication task is complete a GET request
>> > to /publications/<publication_id> would return 200 if the task finished
>> > successfully or 410 (gone) if it did not complete successfully.
>>
>> My main objection to storing the task_id on the publication is that
>> task_id is only meaningful to the user for a very short period. Just long
>> enough to make subsequent API calls but nothing further unless the user
>> writes it down with a note giving it meaning. But imagine a user listing
>> publications later, trying to select one to associate with a
>> distribution. Or to be deleted. The task ID would be meaningless. The
>> natural key Publication.name was an attempt to give the user something
>> meaningful for all use cases. After further consideration, I'm not
>> convinced that adding "name" is the best solution either.
>>
>> I wonder if versioned repositories isn't the real answer. If the
>> repository was versioned then publications would be naturally versioned
>> as well. The serialized publication could include the repository
>> "version" number. This would be meaningful to the user for all use cases.
>>
>> >
>> > [0] https://pulp.plan.io/issues/3033
>> >
>> > --
>> >
>> > Michael Hrivnak
>> >
>> > Principal Software Engineer, RHCE
>> >
>> > Red Hat
>> >
>>
>> _______________________________________________
>> Pulp-dev mailing list
>> Pulp-dev@redhat.com
>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>
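For reference, the status-code scheme proposed in the quoted thread (GET on /publications/<publication_id> returns 202 while the task is pending, 200 on success, 410 on failure) could be sketched roughly as below. This is only an illustration under assumptions: the function name and the "waiting" and "running" state names are hypothetical, and only "successful" appears in the thread. A publication with no associated task is treated as complete, which is also an assumption.

```python
from http import HTTPStatus
from typing import Optional

# Hypothetical names for not-yet-finished task states.
PENDING_STATES = {"waiting", "running"}


def get_status_for_publication(task_state: Optional[str]) -> HTTPStatus:
    """Map the state of a publication's task to the GET response status."""
    if task_state is None or task_state == "successful":
        # No task recorded, or the publish finished: publication exists.
        return HTTPStatus.OK
    if task_state in PENDING_STATES:
        # Creation was accepted but has not completed yet.
        return HTTPStatus.ACCEPTED
    # Any other state is treated as a failed publish.
    return HTTPStatus.GONE
```

The open question from the thread still applies to a scheme like this: a pending or failed publication has to be stored somewhere for the 202 and 410 answers to be possible.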