Re: [Pulp-dev] bulk_create for Artifact, Content, ContentArtifact, and RemoteArtifact?
I just tried an implementation of DeclarativeVersion that uses bulk_create for all content units, content artifacts, and remote artifacts. The content units are incompatible with bulk_create(). Trying to save a batch of content units with bulk_create raises:

ValueError: Can't bulk create a multi-table inherited model

On Thu, Jun 21, 2018 at 4:19 PM, Brian Bouterse wrote:

> I'm only considering these changes for the plugin writer API to help
> resolve the performance issues.

___
Pulp-dev mailing list
Pulp-dev@redhat.com
https://www.redhat.com/mailman/listinfo/pulp-dev
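A minimal sketch of the likely reason behind that ValueError, assuming the usual multi-table inheritance layout in which each child row points at a parent row by primary key. Django's documentation lists child models in a multi-table inheritance scenario among the cases bulk_create does not support. The sketch uses plain sqlite3 rather than Django or Pulp code, and the table names `content` and `file_content` are hypothetical. It shows one possible workaround shape: insert parent rows individually to learn their keys, then insert all child rows in a single batch call.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical parent/child tables mimicking multi-table inheritance:
    # every file_content row reuses the primary key of a content row.
    conn.execute(
        "CREATE TABLE content (id INTEGER PRIMARY KEY AUTOINCREMENT, type TEXT)"
    )
    conn.execute(
        "CREATE TABLE file_content ("
        "content_ptr_id INTEGER PRIMARY KEY REFERENCES content(id), path TEXT)"
    )


def save_units(conn, paths):
    """Insert parents one at a time, then all children in one batch call."""
    child_rows = []
    for path in paths:
        # The parent key is only known after the parent INSERT has run,
        # which is what blocks a naive single bulk insert of child objects.
        cursor = conn.execute("INSERT INTO content (type) VALUES ('file')")
        child_rows.append((cursor.lastrowid, path))
    conn.executemany(
        "INSERT INTO file_content (content_ptr_id, path) VALUES (?, ?)",
        child_rows,
    )
    return child_rows
```

Only the child table is batched here; the parent inserts still cost one statement per unit, so this halves the work at best.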
Re: [Pulp-dev] bulk_create for Artifact, Content, ContentArtifact, and RemoteArtifact?
On Thu, Jun 21, 2018 at 4:19 PM, Brian Bouterse wrote:

> I'm only considering these changes for the plugin writer API to help
> resolve the performance issues.

Cool. +1
Re: [Pulp-dev] bulk_create for Artifact, Content, ContentArtifact, and RemoteArtifact?
I'm only considering these changes for the plugin writer API to help resolve the performance issues.

On Thu, Jun 21, 2018 at 4:11 PM, Austin Macdonald wrote:

> For models, bulk_create seems good to me. Endpoints to kick off tasks like
> sync that use bulk_create seem fine.
>
> Are you also proposing we have bulk_create for non-task REST API calls?
> Should a user be able to POST a list of dictionaries that becomes a set of
> Content? I'm open to it, but it seems like it could get ugly.
Re: [Pulp-dev] bulk_create for Artifact, Content, ContentArtifact, and RemoteArtifact?
+1

David

On Thu, Jun 21, 2018 at 3:55 PM Brian Bouterse wrote:

> I've run cprofile on some of the sync code for Pulp3 and I've noticed that
> we may have some problems with bulk_create on some of the object types.
>
> Here is a small analysis I did: https://pulp.plan.io/issues/3770#note-2
>
> As an aside, we don't have a bulk add option for
> RepositoryVersion.add_content, which ensures each round trip to the db will
> be for one unit. When you're processing 70K units, that's a lot of trips. I
> don't think we have to add this right now, but to resolve an issue like
> 3770 we may need to.
>
> I do think we should make our models compatible with bulk_create now
> either way.
>
> What do you think?
>
> -Brian
[Pulp-dev] bulk_create for Artifact, Content, ContentArtifact, and RemoteArtifact?
I've run cProfile on some of the sync code for Pulp3 and I've noticed that we may have some problems with bulk_create on some of the object types.

Here is a small analysis I did: https://pulp.plan.io/issues/3770#note-2

As an aside, we don't have a bulk add option for RepositoryVersion.add_content, which means every unit costs its own round trip to the db. When you're processing 70K units, that's a lot of trips. I don't think we have to add this right now, but to resolve an issue like 3770 we may need to.

I do think we should make our models compatible with bulk_create now either way.

What do you think?

-Brian
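To put a rough number on the round-trip concern, here is a minimal sketch that compares one INSERT per unit against multi-row INSERTs for 70K units. It is not Pulp's API: the `repository_content` table, both function names, and the batch size of 400 are assumptions made for illustration. An in-memory SQLite database has no network latency, so the statement count stands in for the number of round trips.

```python
import sqlite3

# Hypothetical association table between a repository version and its content.
SCHEMA = "CREATE TABLE repository_content (version_id INTEGER, content_id INTEGER)"


def add_one_by_one(conn, version_id, content_ids):
    """Issue one INSERT per unit; returns the number of statements sent."""
    statements = 0
    for content_id in content_ids:
        conn.execute(
            "INSERT INTO repository_content VALUES (?, ?)",
            (version_id, content_id),
        )
        statements += 1
    return statements


def add_in_batches(conn, version_id, content_ids, batch_size=400):
    """Issue one multi-row INSERT per batch; returns the statements sent."""
    statements = 0
    for start in range(0, len(content_ids), batch_size):
        batch = content_ids[start:start + batch_size]
        # Two bound parameters per row, so 400 rows stay below the
        # 999-variable limit of older SQLite builds.
        placeholders = ", ".join(["(?, ?)"] * len(batch))
        params = [value for cid in batch for value in (version_id, cid)]
        conn.execute(
            f"INSERT INTO repository_content VALUES {placeholders}", params
        )
        statements += 1
    return statements


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM repository_content").fetchone()[0]
```

With 70,000 units the first function sends 70,000 statements and the second sends 175, while both leave the same rows in the table.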
[Pulp-dev] 2.17.0 Release request
A 2.17.0 release is being planned with some features and recent fixes. Here [0] is a release schedule page that outlines some tentative dates, starting with a dev freeze on July 30, 2018. If this schedule needs to be adjusted, please reply with alternate dates.

[0] https://pulp.plan.io/projects/pulp/wiki/2170_Release_Schedule

Regards,

Ina Panova
Software Engineer | Pulp | Red Hat Inc.

"Do not go where the path may lead, go instead where there is no path and leave a trail."
Re: [Pulp-dev] 'id' versus 'pulp_id' on Content
Base URLs should never change. That's an expectation that all web application clients everywhere should be able to rely on. "Cool URIs don't change." If anything, storing IDs is the worse practice, because that implies that the client is going to use pre-existing knowledge to locally build URLs, instead of asking Pulp for the URLs it needs.
Re: [Pulp-dev] 'id' versus 'pulp_id' on Content
Another way of thinking of it would be: "don't store this unless you absolutely know that the base of the URL will never, ever change". Storing IDs is fine; storing hrefs may not be, because an href can change out from underneath you. I think it's actually a similar concept.

On Thu, Jun 21, 2018 at 9:58 AM, Jeremy Audet wrote:

> Outside perspective: My experience with Python, JavaScript, Ruby, C++, and
> so on has led me to believe that the leading underscore means "only touch
> if you know what you're doing." However, the _href attribute is something
> that I, as an end user, have to use all the time. Thus, the lesson I've
> taken away from this naming convention is "pulp abuses naming conventions."
> I certainly didn't think that the leading underscore means "generated on
> the fly" or "some kind of href." Others might think similarly.
Re: [Pulp-dev] 'id' versus 'pulp_id' on Content
> I'm -1 on going with the underscore idea, partly because of the
> aforementioned confusion issue, but also partly because I've noticed that
> in our API, the "underscore" basically has a semantic meaning of "href,
> [which is] generated on the fly, not stored in the db".
>
> Specifically:
>
>    - '_href'
>    - '_added_href'
>    - '_removed_href'
>    - '_content_href'
>
> So I think if we use a prefix, we should avoid using one that already has
> a semantic meaning (I don't know whether we actually planned for that to be
> the case, but I think it's a useful pattern / distinction and I don't think
> we should mess with it).

Outside perspective: My experience with Python, JavaScript, Ruby, C++, and so on has led me to believe that the leading underscore means "only touch if you know what you're doing." However, the _href attribute is something that I, as an end user, have to use all the time. Thus, the lesson I've taken away from this naming convention is "pulp abuses naming conventions." I certainly didn't think that the leading underscore means "generated on the fly" or "some kind of href." Others might think similarly.