Re: [Pulp-dev] bulk_create for Artifact, Content, ContentArtifact, and RemoteArtifact?

2018-06-21 Thread Brian Bouterse
I just tried an implementation of DeclarativeVersion that uses bulk_create
for all content units, content artifacts, and remote artifacts.

The content units are incompatible with bulk_create(). Trying to save a
batch of content units with bulk_create raises:  ValueError: Can't bulk
create a multi-table inherited model
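
For anyone wondering where that restriction comes from, here is a rough
sketch. It uses plain sqlite3 rather than Django, and the table and column
names (content, file_content, content_ptr_id, relative_path) are made up for
illustration. The assumption is that multi-table inheritance stores each
object as a parent row plus a child row keyed by the parent's primary key, so
the child insert has to wait for the key the database assigns to the parent:

```python
import sqlite3

# Hypothetical schema mimicking multi-table inheritance: a parent "content"
# table plus a child "file_content" table keyed by the parent's primary key.
SCHEMA = """
CREATE TABLE content (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    type TEXT NOT NULL
);
CREATE TABLE file_content (
    content_ptr_id INTEGER PRIMARY KEY REFERENCES content (id),
    relative_path TEXT NOT NULL
);
"""


def make_db():
    """Create an in-memory database with the two illustrative tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def save_units(conn, relative_paths):
    """Save each unit as a parent row plus a child row.

    Returns the number of INSERT statements issued (two per unit).
    """
    statements = 0
    for path in relative_paths:
        # The parent row goes first so the database can assign its id.
        cursor = conn.execute("INSERT INTO content (type) VALUES (?)", ("file",))
        # The child row cannot be built until that id is known.
        conn.execute(
            "INSERT INTO file_content (content_ptr_id, relative_path) VALUES (?, ?)",
            (cursor.lastrowid, path),
        )
        statements += 2
    return statements
```

Because every child row depends on a key generated by the matching parent
insert, the pairs can't simply be collapsed into one bulk statement per table.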

On Thu, Jun 21, 2018 at 4:19 PM, Brian Bouterse  wrote:

> I'm only considering these changes for the plugin writer API to help
> resolve the performance issues.
>
> On Thu, Jun 21, 2018 at 4:11 PM, Austin Macdonald 
> wrote:
>
>> For models, bulk_create seems good to me. Endpoints to kick off tasks
>> like sync that use bulk_create seem fine.
>>
>> Are you also proposing we have bulk_create for non-task REST API calls?
>> Should a user be able to POST a list of dictionaries that becomes a set of
>> Content? I'm open to it, but it seems like it could get ugly.
>>
>> On Thu, Jun 21, 2018 at 3:54 PM, Brian Bouterse 
>> wrote:
>>
>>> I've run cProfile on some of the sync code for Pulp3 and I've noticed
>>> that we may have some problems with bulk_create on some of the object types.
>>>
>>> Here is a small analysis I did: https://pulp.plan.io/issues/3770#note-2
>>>
>>> As an aside, we don't have a bulk add option for
>>> RepositoryVersion.add_content, so each unit added costs its own round trip
>>> to the db. When you're processing 70K units, that's a lot of trips. I
>>> don't think we have to add this right now, but to resolve an issue like
>>> 3770 we may need to.
>>>
>>> I do think we should make our models compatible with bulk_create now
>>> either way.
>>>
>>> What do you think?
>>>
>>> -Brian
>>>
>>> ___
>>> Pulp-dev mailing list
>>> Pulp-dev@redhat.com
>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>
>>>
>>
>
___
Pulp-dev mailing list
Pulp-dev@redhat.com
https://www.redhat.com/mailman/listinfo/pulp-dev


Re: [Pulp-dev] bulk_create for Artifact, Content, ContentArtifact, and RemoteArtifact?

2018-06-21 Thread Austin Macdonald
On Thu, Jun 21, 2018 at 4:19 PM, Brian Bouterse  wrote:

> I'm only considering these changes for the plugin writer API to help
> resolve the performance issues.
>

Cool. +1


Re: [Pulp-dev] bulk_create for Artifact, Content, ContentArtifact, and RemoteArtifact?

2018-06-21 Thread Brian Bouterse
I'm only considering these changes for the plugin writer API to help
resolve the performance issues.

On Thu, Jun 21, 2018 at 4:11 PM, Austin Macdonald 
wrote:

> For models, bulk_create seems good to me. Endpoints to kick off tasks like
> sync that use bulk_create seem fine.
>
> Are you also proposing we have bulk_create for non-task REST API calls?
> Should a user be able to POST a list of dictionaries that becomes a set of
> Content? I'm open to it, but it seems like it could get ugly.
>
> On Thu, Jun 21, 2018 at 3:54 PM, Brian Bouterse 
> wrote:
>
>> I've run cProfile on some of the sync code for Pulp3 and I've noticed
>> that we may have some problems with bulk_create on some of the object types.
>>
>> Here is a small analysis I did: https://pulp.plan.io/issues/3770#note-2
>>
>> As an aside, we don't have a bulk add option for
>> RepositoryVersion.add_content, so each unit added costs its own round trip
>> to the db. When you're processing 70K units, that's a lot of trips. I
>> don't think we have to add this right now, but to resolve an issue like
>> 3770 we may need to.
>>
>> I do think we should make our models compatible with bulk_create now
>> either way.
>>
>> What do you think?
>>
>> -Brian
>>
>


Re: [Pulp-dev] bulk_create for Artifact, Content, ContentArtifact, and RemoteArtifact?

2018-06-21 Thread David Davis
+1

David


On Thu, Jun 21, 2018 at 3:55 PM Brian Bouterse  wrote:

> I've run cProfile on some of the sync code for Pulp3 and I've noticed that
> we may have some problems with bulk_create on some of the object types.
>
> Here is a small analysis I did: https://pulp.plan.io/issues/3770#note-2
>
> As an aside, we don't have a bulk add option for
> RepositoryVersion.add_content, so each unit added costs its own round trip
> to the db. When you're processing 70K units, that's a lot of trips. I
> don't think we have to add this right now, but to resolve an issue like
> 3770 we may need to.
>
> I do think we should make our models compatible with bulk_create now
> either way.
>
> What do you think?
>
> -Brian


[Pulp-dev] bulk_create for Artifact, Content, ContentArtifact, and RemoteArtifact?

2018-06-21 Thread Brian Bouterse
I've run cProfile on some of the sync code for Pulp3 and I've noticed that
we may have some problems with bulk_create on some of the object types.

Here is a small analysis I did: https://pulp.plan.io/issues/3770#note-2

As an aside, we don't have a bulk add option for
RepositoryVersion.add_content, so each unit added costs its own round trip
to the db. When you're processing 70K units, that's a lot of trips. I
don't think we have to add this right now, but to resolve an issue like
3770 we may need to.
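
To make the round-trip point concrete, here is a rough sketch using plain
sqlite3, not Pulp or Django code. The repository_content table, the function
names, and the batch size are all made up for illustration, and counting SQL
statements is only a stand-in for counting network round trips to a real
database server:

```python
import sqlite3

INSERT_PREFIX = "INSERT INTO repository_content (version_id, content_id) VALUES "


def make_db():
    """Create an in-memory database with a hypothetical association table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE repository_content ("
        "version_id INTEGER NOT NULL, content_id INTEGER NOT NULL)"
    )
    return conn


def add_one_at_a_time(conn, version_id, content_ids):
    """Issue one INSERT per unit; returns the number of statements sent."""
    statements = 0
    for content_id in content_ids:
        conn.execute(INSERT_PREFIX + "(?, ?)", (version_id, content_id))
        statements += 1
    return statements


def add_in_batches(conn, version_id, content_ids, batch_size=400):
    """Issue one multi-row INSERT per batch; returns the number of statements sent."""
    statements = 0
    for start in range(0, len(content_ids), batch_size):
        batch = content_ids[start:start + batch_size]
        # One "(?, ?)" group per row, all sent in a single statement.
        placeholders = ", ".join(["(?, ?)"] * len(batch))
        params = [value for cid in batch for value in (version_id, cid)]
        conn.execute(INSERT_PREFIX + placeholders, params)
        statements += 1
    return statements
```

With the default batch size, 70K units would take 175 statements instead of
70,000. The batch size is kept small here so the number of bound parameters
stays under the limit of older SQLite builds.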

I do think we should make our models compatible with bulk_create now either
way.

What do you think?

-Brian


[Pulp-dev] 2.17.0 Release request

2018-06-21 Thread Ina Panova
A 2.17.0 is being planned with some features and recent fixes.

Here [0] is a release schedule page which outlines some tentative dates,
starting with a dev freeze on July 30, 2018.

If this schedule needs to be adjusted, please reply with alternate dates.

[0] https://pulp.plan.io/projects/pulp/wiki/2170_Release_Schedule



Regards,

Ina Panova
Software Engineer| Pulp| Red Hat Inc.

"Do not go where the path may lead,
 go instead where there is no path and leave a trail."


Re: [Pulp-dev] 'id' versus 'pulp_id' on Content

2018-06-21 Thread Jeremy Audet
Base URLs should never change. That's an expectation that all web
application clients everywhere should be able to rely on. "Cool URIs don't
change." If anything, storing IDs is the worse practice, because that
implies that the client is going to use pre-existing knowledge to locally
build URLs, instead of asking Pulp for the URLs it needs.
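
A small sketch of the difference, in case it helps. Everything here is
hypothetical: the path layout, the example.com host, and the
fake_server_response helper are invented for illustration, and whether
'_href' holds an absolute URL is an assumption, not a statement about Pulp's
API:

```python
from urllib.parse import urljoin


def build_url_from_id(base_url, content_id):
    """Client-side construction: bakes an assumed path layout into the client."""
    return urljoin(base_url, "api/v3/content/{}/".format(content_id))


def url_from_resource(resource):
    """Server-provided link: the client just reads the field it was given."""
    return resource["_href"]


def fake_server_response(base_url, content_id, prefix="api/v3/content/"):
    """Stand-in for a server response; the server decides the path layout."""
    href = urljoin(base_url, "{}{}/".format(prefix, content_id))
    return {"id": content_id, "_href": href}
```

If the server later moves its resources under a different prefix, a client
that calls url_from_resource keeps working, while build_url_from_id keeps
producing the old path.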


Re: [Pulp-dev] 'id' versus 'pulp_id' on Content

2018-06-21 Thread Daniel Alley
Another way of thinking of it would be: "don't store this unless you
absolutely know that the base of the URL will never, ever change".  Storing
IDs is fine; storing hrefs may not be, because they can change
out from underneath you.  I think it's actually a similar concept.

On Thu, Jun 21, 2018 at 9:58 AM, Jeremy Audet  wrote:

>
>> I'm -1 on going with the underscore idea, partly because of the aforementioned
>> confusion issue, but also partly because I've noticed that in our API,
>> the "underscore" basically has a semantic meaning of "href, [which is]
>> generated on the fly, not stored in the db".
>>
>> Specifically:
>>
>>    - '_href'
>>    - '_added_href'
>>    - '_removed_href'
>>    - '_content_href'
>>
>> So I think if we use a prefix, we should avoid using one that already has
>> a semantic meaning (I don't know whether we actually planned for that to be
>> the case, but I think it's a useful pattern / distinction and I don't think
>> we should mess with it).
>>
>
> Outside perspective: My experience with Python, JavaScript, Ruby, C++, and
> so on has led me to believe that the leading underscore means "only touch
> if you know what you're doing." However, the _href attribute is something
> that I, as an end user, have to use all the time. Thus, the lesson I've
> taken away from this naming convention is "pulp abuses naming conventions."
> I certainly didn't think that the leading underscore means "generated on
> the fly" or "some kind of href." Others might think similarly.
>


Re: [Pulp-dev] 'id' versus 'pulp_id' on Content

2018-06-21 Thread Jeremy Audet
> I'm -1 on going with the underscore idea, partly because of the aforementioned
> confusion issue, but also partly because I've noticed that in our API,
> the "underscore" basically has a semantic meaning of "href, [which is]
> generated on the fly, not stored in the db".
>
> Specifically:
>
>    - '_href'
>    - '_added_href'
>    - '_removed_href'
>    - '_content_href'
>
> So I think if we use a prefix, we should avoid using one that already has
> a semantic meaning (I don't know whether we actually planned for that to be
> the case, but I think it's a useful pattern / distinction and I don't think
> we should mess with it).
>

Outside perspective: My experience with Python, JavaScript, Ruby, C++, and
so on has led me to believe that the leading underscore means "only touch
if you know what you're doing." However, the _href attribute is something
that I, as an end user, have to use all the time. Thus, the lesson I've
taken away from this naming convention is "pulp abuses naming conventions."
I certainly didn't think that the leading underscore means "generated on
the fly" or "some kind of href." Others might think similarly.