On 04/17/2017 09:50 AM, Michael Hrivnak wrote:
> That MVP scoping sounds reasonable.
>
> I'm not sure we decided 100% on whether those operations need to be
> tasks, but everyone seems to be leaning in the direction that update
> does not. It will partly come down to a plugin API question, though.
> If updates can happen at any time, then the plugin should load its
> config exactly once and not re-read it during the task. The plugin
> API should encourage that workflow, if not enforce it.
The Importer instance is fetched from the DB and its attributes are the
/config/, so the config would only be fetched once from the DB for any
given task.

> Delete could be a little more challenging, because of the potential
> desire for an importer or publisher to want to update attributes on
> it, like the last time a sync succeeded. Maybe we can manage to avoid
> the need for a plugin to write to its config, which also seems like a
> very reasonable goal, but that again could be part of the plugin API
> discussion. Although, thinking about relationships that are likely to
> exist in the database, having an importer or publisher disappear
> while a corresponding task is running sounds like it could cause a
> lot of potential problems.

What kind of problems?

> On Fri, Apr 14, 2017 at 11:23 AM, Brian Bouterse <[email protected]> wrote:
>
>> Did we ever decide whether delete and modify operations on an
>> importer would be synchronous API operations or tasking system
>> tasks?
>>
>> From what I'm reading, there isn't an MVP need (and maybe never will
>> be) to verify that the feed is set both at the sync API call and in
>> the task. We definitely need that business logic, i.e. verifying
>> feed != None, in the task for the MVP. If we don't need to do it in
>> two places now, then I think doing it procedurally in the sync task
>> here [0] would be good enough. That means we would not add it to the
>> model as a method.
>> I think this business logic in the sync task would be:
>>
>> ```
>> if importer.feed is None:
>>     # This should probably be a descendant of PulpException, since
>>     # it's a predictable/common error.
>>     raise ValueError(
>>         "The 'feed' attribute on the importer cannot be None when syncing")
>> ```
>>
>> [0]: https://github.com/pulp/pulp/pull/2991/files#diff-b9ddc3416cf3909687419960d59316bbR30
>>
>> -Brian
>>
>> On Wed, Apr 12, 2017 at 10:11 AM, Jeff Ortel <[email protected]> wrote:
>>
>>> On 04/11/2017 04:08 PM, Michael Hrivnak wrote:
>>>> I like thinking about this as business logic. Data may be valid,
>>>> but it may not be usable in a particular context.
>>>>
>>>> To help figure out where such logic should live, it may help to
>>>> think about where the check is most important. I've described that
>>>> time as "at the time of use" earlier in this discussion (maybe
>>>> just on IRC). With sync as an example, a workflow will load an
>>>> importer's config from the database, check it for problems, and
>>>> then immediately use the values it just inspected. This is the
>>>> critical moment where it must gracefully handle unusable data.
>>>> This check ensures correct behavior and avoids an unhandled
>>>> exception or crash.
>>>
>>> +1
>>>
>>>> We can and should also check for problems at earlier
>>>> opportunities, such as at the time a user tries to queue a sync.
>>>> This improves the user experience, but it is not required for
>>>> correct and safe operation.
>>>
>>> The data can change between the time when the task is queued and
>>> when it is executed. This adds complexity, and the value added is
>>> not deterministic. -1 for a check here.
>>>
>>>> Given that, I think it makes sense to put the check close to the
>>>> data. A method on the model seems reasonable.
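Brian's snippet above, sketched out with the exception family his comment suggests. This is a sketch only: `PulpException` is assumed to exist as a base class for predictable errors, and `MissingFeed` is a hypothetical name, not actual Pulp code.

```python
class PulpException(Exception):
    """Assumed base class for predictable, user-facing Pulp errors."""


class MissingFeed(PulpException):
    """Hypothetical subclass raised when an importer has no feed."""


def check_feed(importer):
    # Brian's check, raised as a predictable Pulp error rather than a
    # bare ValueError, so callers can catch the Pulp-specific family.
    if importer.feed is None:
        raise MissingFeed(
            "The 'feed' attribute on the importer cannot be None when syncing")
```

The sync task would call `check_feed(importer)` before handing off to the plugin.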
>>>> In terms of polluting the model with business logic, it isn't
>>>> that different from defining a custom queryset on the model, which
>>>> Django encourages.
>>>
>>> Agreed, but only for data integrity. Each field should be
>>> validated, and the record (cross-field) should be validated. We
>>> don't want to store things in the DB that are invalid. Seems like
>>> the Django ORM already provides for this, right? But the model
>>> should not have methods to validate itself for a particular
>>> operation (usage). For example, an ensure_syncable() that validates
>>> the feed URL would not be appropriate. This should be checked at
>>> run-time by the object doing the sync. Perhaps a double-dispatch
>>> model in plugin.Importer could do:
>>>
>>>     def synchronize(self):
>>>         # Do validation and preparation.
>>>         return self._synchronize()
>>>
>>>     def _synchronize(self):
>>>         # Overridden by plugins.
>>>         raise NotImplementedError()
>>>
>>>> As a slight tangent, some applications take this sort of checking
>>>> even further. An admirable approach in REST API design, which may
>>>> not be a good idea for us at this time but is interesting to note,
>>>> is to make a behavior such as "sync" available only via a link
>>>> accessed via a known name in an object's representation. That's a
>>>> mouthful, so here's an example:
>>>>
>>>>     {
>>>>         "id": "foo",
>>>>         "feed": "http://cdn.redhat.com/stuff/",
>>>>         "_links": {
>>>>             "self": "http://here.com/importers/foo",
>>>>             "sync": "http://here.com/importers/foo/sync"
>>>>         }
>>>>     }
>>>>
>>>> Consider that the link for starting a sync is not part of the
>>>> published API, except that it must be obtained from this
>>>> representation. There are two advantages here.
>>>>
>>>> The main advantage I'm pointing out is that when the server
>>>> creates this representation of an Importer, it would only include
>>>> the "sync" link if the current state of the object would allow for
>>>> a sync.
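Jeff's double-dispatch sketch above, filled out as a runnable template-method example. The class and method names mirror his sketch, but `ExampleImporter` is illustrative only; this is not the actual plugin API.

```python
class Importer:
    """Platform-side sketch: the public synchronize() validates, then
    dispatches to the plugin-provided _synchronize()."""

    feed = None

    def synchronize(self):
        # Validation and preparation happen once, here, in platform code.
        if self.feed is None:
            raise ValueError("cannot sync an importer without a feed")
        return self._synchronize()

    def _synchronize(self):
        # Overridden by plugins; the base implementation is abstract.
        raise NotImplementedError()


class ExampleImporter(Importer):
    """Illustrative plugin subclass."""

    def __init__(self, feed):
        self.feed = feed

    def _synchronize(self):
        return "synced from %s" % self.feed
```

The design point is that plugins never re-implement (or forget) the run-time check; they only supply `_synchronize()`.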
>>>> If there were no feed, there would be no sync link, and thus the
>>>> client would be unable to even try starting one. So this is a
>>>> third opportunity to check whether the object's state is suitable
>>>> for a sync. It even allows the client to show or hide a "sync"
>>>> button without having to re-implement the business logic that's
>>>> already present on the server side. Neat, huh?
>>>
>>> Interesting.
>>>
>>> However, how long the 'sync' link stays valid is not deterministic.
>>> 1 ms after the link is included in the returned document, a DB
>>> commit could set the feed to NULL. The link could be invalid by the
>>> time the user even receives the REST reply.
>>>
>>>> Another advantage to this kind of approach is a smaller API
>>>> surface area. We could theoretically change the sync URL schema at
>>>> any time. We could even move it to a new, separate service. We'd
>>>> still need to document how to use it, but its actual location can
>>>> change. In practice I don't think this aspect is all that valuable
>>>> unless you are 100% bought in to this design. But it's fun to
>>>> think about.
>>>>
>>>> And to re-state, this kind of thing may not be worth our time to
>>>> actually do right now, and I'm not proposing it. I don't know to
>>>> what extent DRF would make this easy. But I wanted to bring it up
>>>> for interest's sake as yet another place in the workflow, even
>>>> closer to the end user than the other two we've discussed, where
>>>> applications have an opportunity to utilize context checking of
>>>> data.
>>>>
>>>> On Tue, Apr 11, 2017 at 3:49 PM, Austin Macdonald <[email protected]> wrote:
>>>>
>>>>> Where should business logic live? As an example, I want to
>>>>> consider the sync task [0] and the need to ensure that an
>>>>> importer is syncable. For now, let's say that an importer is
>>>>> syncable if it has a feed_url.
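Michael's conditional-link representation could be sketched as a plain function. The field and link names mirror his JSON example; this is not DRF code, and per Jeff's objection the link can go stale, so the server would still re-check at sync time.

```python
def importer_representation(importer, base="http://here.com/importers"):
    # Build the HATEOAS-style document. The "sync" link is only
    # present when the importer's current state allows a sync (here:
    # it has a feed). The link can be stale by the time the client
    # reads it, so this is an optimization, not the real check.
    links = {"self": "%s/%s" % (base, importer["id"])}
    if importer.get("feed"):
        links["sync"] = "%s/%s/sync" % (base, importer["id"])
    doc = dict(importer)
    doc["_links"] = links
    return doc
```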
>>>>> Since the call to sync an importer comes from the API, the
>>>>> earliest we can check that the configuration is syncable is in a
>>>>> not-yet-written SyncView. Checking syncability here is ideal
>>>>> because it allows us to immediately return a meaningful HTTP
>>>>> response instead of happily returning information about a Task
>>>>> that we already know is doomed to fail.
>>>>>
>>>>> Performing the check in the API layer is not enough, though. We
>>>>> have discussed edge cases that lead to an importer's feed_url
>>>>> being removed while the sync task is in the WAITING state. To
>>>>> make an assertion to the plugin that the feed_url exists, we have
>>>>> to check syncability again when the Task moves into an active
>>>>> state.
>>>>>
>>>>> My thinking is that we should put this business logic on the
>>>>> model.
>>>>>
>>>>> Admittedly, it is not a clean fit with the separation-of-concerns
>>>>> philosophy, but we have already violated that concept by putting
>>>>> the sync method on the model. If sync is on the model, it seems
>>>>> like ensure_syncable should be too.
>>>>>
>>>>> If we write the platform API layer to use this kind of business
>>>>> logic, then the plugins can add double-checking business logic
>>>>> without modifying the API and Task layers.
>>>>>
>>>>> [0]: https://pulp.plan.io/issues/2399
>>>>>
>>>>> On Fri, Apr 7, 2017 at 2:14 PM, Sean Myers <[email protected]> wrote:
>>>>>
>>>>>> On 04/07/2017 12:08 PM, Brian Bouterse wrote:
>>>>>>> == questions ==
>>>>>>> * Where should ^ terms be documented?
>>>>>>
>>>>>> I'm not really sure, but I recommend the wiki as a good starting
>>>>>> point for putting information that we should probably
>>>>>> "officially" document *somewhere*, but at the moment aren't
>>>>>> quite sure where.
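Austin's two-phase check (once in the API view for a fast failure, once when the task actually runs) could look roughly like this. All names here (`NotSyncable`, `ensure_syncable`, the view/task functions) are illustrative, not the real Pulp 3 code, and the queue is a stand-in for real task dispatch.

```python
class NotSyncable(Exception):
    """Illustrative error for an importer that has no feed_url."""


def ensure_syncable(importer):
    # Shared business logic, called from both the view and the task.
    if not getattr(importer, "feed_url", None):
        raise NotSyncable("importer has no feed_url")


def sync_view(importer, queue):
    # API layer: fail fast with a meaningful response instead of
    # queuing a Task we already know is doomed to fail.
    ensure_syncable(importer)
    queue.append(importer)  # stand-in for dispatching the real task


def sync_task(importer):
    # Task layer: feed_url may have been removed while the task sat
    # in WAITING, so the same check runs again at the time of use.
    ensure_syncable(importer)
    return "synced %s" % importer.feed_url
```

Whether `ensure_syncable` lives on the model or in the task code is exactly the open question in this thread; the sketch only shows that both call sites can share one check.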
>>>>>> https://pulp.plan.io/projects/pulp/wiki/Pulp_3_Developer_Notes
>>>>>>
>>>>>>> * Take the case of a sync which has overrides provided. This
>>>>>>> isn't in the MVP, but in the future it could be. In that case,
>>>>>>> does the serializer associated with the importer validate the
>>>>>>> database data with the overrides "added" on top of it?
>>>>>>
>>>>>> My question here is "validate against what?". It makes sense to
>>>>>> validate against database data, but as long as the overrides
>>>>>> aren't themselves stored in the database, what does this really
>>>>>> stop?
>>>>>>
>>>>>> For example, what prevents two simultaneous syncs of repos using
>>>>>> overrides that would trigger a constraint violation if they were
>>>>>> saved, but don't do so because we don't save the overrides?
>>>>>>
>>>>>>> * For plugin writers writing a serializer for a subclassed
>>>>>>> Importer, do they also need to express validations for fields
>>>>>>> defined on the base Importer?
>>>>>>
>>>>>> It depends on the validation. If it's just validating an
>>>>>> individual field, no. If it's validation of a combination of
>>>>>> values in multiple fields, and one of those fields was defined
>>>>>> in a subclass, the subclass will need to add the proper
>>>>>> validation support.
>>>>>>
>>>>>>> * The database still rejects data that doesn't adhere to the
>>>>>>> data-layer definition, right? That occurs even without the DRF
>>>>>>> serializer, correct?
>>>>>>
>>>>>> Again, this depends. For example, attempting to store an int in
>>>>>> a CharField will work, because Django will coerce that int to a
>>>>>> string on save. Attempting to store a string in an IntegerField
>>>>>> will fail, though, because Django is not able to coerce str to
>>>>>> int prior to saving. Generally, though, your understanding is
>>>>>> correct.
>>>>>> Anything that the database can't handle will be rejected.
>>>>>>
>>>>>>> * In cases where data is created in the backend, do we need to
>>>>>>> validate that as a general practice? If we do, do we call the
>>>>>>> DRF serializer regularly in the backend, or just let the
>>>>>>> database reject "bad" data at the db level?
>>>>>>
>>>>>> As a general practice, I don't think so. Specifically, though,
>>>>>> when we're passing data around, like when a bit of platform code
>>>>>> is taking incoming plugin data and passing it into some standard
>>>>>> workflow that platform provides (like running sync on an
>>>>>> importer, say), I think it's going to be a good idea, and an
>>>>>> all-around gentlemanly thing to do, to validate that data in
>>>>>> some way that is appropriate to the process/workflow being
>>>>>> invoked.
>>>>>>
>>>>>> I'm concerned about finding the balance between making things
>>>>>> user-friendly for plugin writers, and having the checking code
>>>>>> that provides that user-friendliness itself be difficult to
>>>>>> maintain and end up being pulp-developer-unfriendly.
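The CharField/IntegerField behavior Sean describes comes down to which direction coercion can go. A Django-free illustration of the same asymmetry; these helpers only mimic the coercion, they are not Django's API.

```python
def charfield_prep(value):
    # A CharField-style column: anything can be coerced to str, so an
    # int stored here "works" (it silently becomes a string).
    return str(value)


def integerfield_prep(value):
    # An IntegerField-style column: int(value) only succeeds for
    # numeric input, so a non-numeric string raises before it ever
    # reaches the database.
    return int(value)
```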
>>>>>> _______________________________________________
>>>>>> Pulp-dev mailing list
>>>>>> [email protected]
>>>>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>
> --
> Michael Hrivnak
> Principal Software Engineer, RHCE
> Red Hat
_______________________________________________
Pulp-dev mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/pulp-dev
