Re: [Pulp-dev] repository versions update

2017-12-19 Thread Robin Chan
Thanks for sending this concise summary to update where we are on this.

I think the format is great and very helpful!

On Mon, Dec 18, 2017 at 5:09 PM, David Davis wrote:
> tl;dr - @dkliban, @bmbouter, and I met and we propose adopting the second
> proposal because it has better performance and is more in line with how we
> think users will use repository versions (i.e. in a linear fashion rather
> than a tree/branching model). We've also updated the user stories to remove
> the base_version features and we're hoping to get @mhrivnak's PR merged this
> week.
>
> # Background
>
> I ran through some performance tests on the first proposal which involved
> storing a direct relationship between repository versions and content. The
> results[0] show that for a small/medium-size system with 100M associations
> between repository versions and content, it would take about a minute to
> create a new repo version with 10,000 units in the database. 100M
> associations also required a table size of at least 7GB and an index size of
> 15GB.
>
> I don't think this is a dealbreaker in and of itself. It's possible we could
> do some optimizations if we really want to adopt the first proposal (e.g.
> use int keys instead of UUIDs, table partitioning, etc). I think it's worth
> asking though what we want to optimize for which brings me to the next
> point.
>
> # Linear vs Branching
>
> A main consideration for us was how users would use Pulp 3. The strength of
> the second proposal (in which additions/removals are stored) is when a few
> units are added/removed to the latest repo version. This case captures how a
> majority of users will create new versions in Pulp. This is basically a
> linear sort of model in which new versions are always based off the previous
> version.
>
> The first proposal better supports creating versions from a base_version
> which may or may not be a latest version. This is a branching sort of model
> (like git) that offers more flexibility to our users but we feel like a
> majority of the time, users would not be doing this when creating a new
> version. And optimizing for a less frequently used use case is imprudent.
>
> Therefore, we think it makes sense to adopt the second proposal and store
> only additions/removals of content from a repository version. Also, we think
> that the base_version feature (allowing users to make changes to an older
> repo version) should not be a part of the MVP and maybe we can consider it
> for 3.1+.
>
> # Next Steps
>
> We've updated the user stories in the MVP document to remove the terminology
> around base_version[1]. We're going to break them up into separate user
> stories under our Repo Version tracker[2] and add a few of the basic ones
> around CRD repo versions to the sprint.
>
> Also, we're going to work on accepting @mhrivnak's repo version PR[3]. I
> think it's mostly ready, and just needs some re-review and ACKs.
>
> # Feedback
>
> If you have any thoughts, please respond. We're hoping to get the ball
> rolling on repo versions ASAP. Thank you all for your help!
>
> [0] https://github.com/daviddavis/pulp_repo_version_test#results
> [1]
> https://pulp.plan.io/projects/pulp/wiki/Pulp_3_Minimum_Viable_Product/diff?utf8=%E2%9C%93=136_from=135=View+differences
> [2] https://pulp.plan.io/issues/3209
> [3] https://github.com/pulp/pulp/pull/3228
>
>
> David
>
> On Sun, Dec 17, 2017 at 3:30 PM, Michael Hrivnak 
> wrote:
>>
>> I decided to rebase the PR onto latest 3.0-dev just so it doesn't get too
>> stale, particularly since the un-nesting work had a substantial impact. I
>> also updated the gist containing tests. Feel free to have a look.
>>
>> I also addressed all the feedback on the PR. I did not implement any new
>> behavior, such as adding a boolean value to the version model, since it
>> seems like discussions may not be complete about what to name it and how it
>> should be used. That seems easy enough to implement as an additional change.
>>
>> On Mon, Dec 4, 2017 at 10:11 AM, Dennis Kliban  wrote:
>>>
>>> I am looking forward to discussing the use cases. I hope we can get
>>> versioned repositories into 3.0. Thanks everyone for the discussion so far.
>>>
>>> -Dennis
>>>
>>> On Fri, Dec 1, 2017 at 5:16 PM, Brian Bouterse 
>>> wrote:

>>>> Thank you all for such great discussion!
>>>>
>>>> To recap some discussion we had today. We are going to look at the
>>>> versioned repos use cases at an upcoming MVP call in the near future
>>>> (probably 12/8). Look for the pulp-list announcement. If you have use cases
>>>> you want to share, you can add them in red in the Versioned Repos section
>>>> of the MVP here:
>>>> https://pulp.plan.io/projects/pulp/wiki/Pulp_3_Minimum_Viable_Product/#Versioned-Repositories
>>>>
>>>> Once the use cases are known, we can look at the PR and see if it
>>>> fulfills them. From the discussion today, the general consensus is that

Re: [Pulp-dev] repository versions update

2017-12-18 Thread David Davis
tl;dr - @dkliban, @bmbouter, and I met and we propose adopting the second
proposal because it has better performance and is more in line with how we
think users will use repository versions (i.e. in a linear fashion rather
than a tree/branching model). We've also updated the user stories to remove
the base_version features and we're hoping to get @mhrivnak's PR merged
this week.

# Background

I ran through some performance tests on the first proposal which involved
storing a direct relationship between repository versions and content. The
results[0] show that for a small/medium-size system with 100M associations
between repository versions and content, it would take about a minute to
create a new repo version with 10,000 units in the database. 100M
associations also required a table size of at least 7GB and an index size
of 15GB.
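
As a rough sanity check on those figures, the per-association cost can be worked out directly. This is only back-of-envelope arithmetic on the numbers quoted above, assuming 1GB means 10^9 bytes and treating the "at least 7GB" table size as exactly 7GB; the variable names are made up for illustration.

```python
# Figures taken from the test results quoted above.
ASSOCIATIONS = 100_000_000   # 100M repo-version/content associations
TABLE_BYTES = 7 * 10**9      # "at least 7GB", assuming 1GB = 10^9 bytes
INDEX_BYTES = 15 * 10**9     # 15GB of index

# Average storage cost of a single association row.
table_bytes_per_row = TABLE_BYTES / ASSOCIATIONS
index_bytes_per_row = INDEX_BYTES / ASSOCIATIONS

print(f"table: {table_bytes_per_row:.0f} bytes/association")
print(f"index: {index_bytes_per_row:.0f} bytes/association")
```

Under those assumptions the index costs roughly twice as much per association as the table itself, which may be part of why smaller keys were floated as an optimization.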

I don't think this is a dealbreaker in and of itself. It's possible we
could do some optimizations if we really want to adopt the first proposal
(e.g. use int keys instead of UUIDs, table partitioning, etc). I think it's
worth asking though what we want to optimize for which brings me to the
next point.

# Linear vs Branching

A main consideration for us was how users would use Pulp 3. The strength of
the second proposal (in which additions/removals are stored) is when a few
units are added/removed to the latest repo version. This case captures how
a majority of users will create new versions in Pulp. This is basically a
linear sort of model in which new versions are always based off the
previous version.

The first proposal better supports creating versions from a base_version
which may or may not be a latest version. This is a branching sort of model
(like git) that offers more flexibility to our users but we feel like a
majority of the time, users would not be doing this when creating a new
version. And optimizing for a less frequently used use case is imprudent.

Therefore, we think it makes sense to adopt the second proposal and store
only additions/removals of content from a repository version. Also, we
think that the base_version feature (allowing users to make changes to an
older repo version) should not be a part of the MVP and maybe we can
consider it for 3.1+.
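
To make the linear model concrete, here is a minimal sketch of how content could be resolved when each version stores only its additions and removals. It is not the schema from the PR; the `RepoVersion` class, its fields, and `content_at` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RepoVersion:
    """One version in a linear history, stored only as a delta (hypothetical model)."""
    number: int
    added: frozenset = field(default_factory=frozenset)
    removed: frozenset = field(default_factory=frozenset)


def content_at(versions, number):
    """Replay deltas in order, up to and including `number`, to get the content set."""
    content = set()
    for version in sorted(versions, key=lambda v: v.number):
        if version.number > number:
            break
        content -= version.removed
        content |= version.added
    return content


# Each new version is based on the previous one and records only what changed.
history = [
    RepoVersion(1, added=frozenset({"a", "b", "c"})),
    RepoVersion(2, added=frozenset({"d"}), removed=frozenset({"a"})),
    RepoVersion(3, removed=frozenset({"b"})),
]

print(sorted(content_at(history, 2)))
```

In this sketch, creating a version that changes a few units stores only a few entries, which is the case the second proposal is meant to favor. Branching from an arbitrary older version does not fit this shape as naturally.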

# Next Steps

We've updated the user stories in the MVP document to remove the
terminology around base_version[1]. We're going to break them up into
separate user stories under our Repo Version tracker[2] and add a few of
the basic ones around CRD repo versions to the sprint.

Also, we're going to work on accepting @mhrivnak's repo version PR[3]. I
think it's mostly ready, and just needs some re-review and ACKs.

# Feedback

If you have any thoughts, please respond. We're hoping to get the ball
rolling on repo versions ASAP. Thank you all for your help!

[0] https://github.com/daviddavis/pulp_repo_version_test#results
[1]
https://pulp.plan.io/projects/pulp/wiki/Pulp_3_Minimum_Viable_Product/diff?utf8=%E2%9C%93=136_from=135=View+differences
[2] https://pulp.plan.io/issues/3209
[3] https://github.com/pulp/pulp/pull/3228


David

On Sun, Dec 17, 2017 at 3:30 PM, Michael Hrivnak 
wrote:

> I decided to rebase the PR onto latest 3.0-dev just so it doesn't get too
> stale, particularly since the un-nesting work had a substantial impact. I
> also updated the gist containing tests. Feel free to have a look.
>
> I also addressed all the feedback on the PR. I did not implement any new
> behavior, such as adding a boolean value to the version model, since it
> seems like discussions may not be complete about what to name it and how it
> should be used. That seems easy enough to implement as an additional change.
>
> On Mon, Dec 4, 2017 at 10:11 AM, Dennis Kliban  wrote:
>
>> I am looking forward to discussing the use cases. I hope we can get
>> versioned repositories into 3.0. Thanks everyone for the discussion so far.
>>
>> -Dennis
>>
>> On Fri, Dec 1, 2017 at 5:16 PM, Brian Bouterse 
>> wrote:
>>
>>> Thank you all for such great discussion!
>>>
>>> To recap some discussion we had today. We are going to look at the
>>> versioned repos use cases at an upcoming MVP call in the near future
>>> (probably 12/8). Look for the pulp-list announcement. If you have use cases
>>> you want to share, you can add them in red in the Versioned Repos section
>>> of the MVP here:
>>> https://pulp.plan.io/projects/pulp/wiki/Pulp_3_Minimum_Viable_Product/#Versioned-Repositories
>>>
>>> Once the use cases are known, we can look at the PR and see if it
>>> fulfills them. From the discussion today, the general consensus is that gap
>>> will be relatively small, which makes including it in Pulp3 feasible.
>>>
>>> @misa providing those types of features may be possible. Imagine an
>>> optional attribute on a repo version named 'frozen' that defaults to True.
>>> While the latest repo_version for a repo has frozen=False, any action that
>>> would normally create a new repo version 

Re: [Pulp-dev] repository versions update

2017-12-04 Thread Dennis Kliban
I am looking forward to discussing the use cases. I hope we can get
versioned repositories into 3.0. Thanks everyone for the discussion so far.

-Dennis

On Fri, Dec 1, 2017 at 5:16 PM, Brian Bouterse wrote:

> Thank you all for such great discussion!
>
> To recap some discussion we had today. We are going to look at the
> versioned repos use cases at an upcoming MVP call in the near future
> (probably 12/8). Look for the pulp-list announcement. If you have use cases
> you want to share, you can add them in red in the Versioned Repos section
> of the MVP here:
> https://pulp.plan.io/projects/pulp/wiki/Pulp_3_Minimum_Viable_Product/#Versioned-Repositories
>
> Once the use cases are known, we can look at the PR and see if it fulfills
> them. From the discussion today, the general consensus is that gap will be
> relatively small, which makes including it in Pulp3 feasible.
>
> @misa providing those types of features may be possible. Imagine an
> optional attribute on a repo version named 'frozen' that defaults to True.
> While the latest repo_version for a repo has frozen=False, any action that
> would normally create a new repo version (copy, add/remove, delete, etc)
> would act on the existing repo version and *not* create a new one. Then the
> user can update the frozen attribute of the repo version when they want,
> which commits the transaction as a repo version. I don't think this would
> be too hard to implement.
>
>
> On Thu, Nov 30, 2017 at 3:20 PM, Michael Hrivnak 
> wrote:
>
>>
>>
>> On Thu, Nov 30, 2017 at 11:43 AM, Mihai Ibanescu <
>> mihai.ibane...@gmail.com> wrote:
>>
>>> I am late to the thread, so I apologize if I repeat things that have
>>> been discussed already.
>>>
>>> Is it a meaningful use case to publish an older version of the repo?
>>> Once published, do you keep track of which version got published, and how
>>> do you decide which version to push next? This seems like a complication to
>>> me.
>>>
>>>
>> A publication will have a reference to the version that it was created
>> from. To illustrate how that would get used: Your CTO calls early on a
>> Saturday morning and says "I read in the news about a major security flaw
>> in cowsay, and I know our applications depend heavily on it. What version
>> do we have deployed right now???!!!" You can concretely determine which
>> publications are being currently "distributed" to your infrastructure, and
>> from there see their exact content sets by virtue of the repo version.
>>
>> Then there is the promotion workflow, which in Pulp 2 requires a lot of
>> copying and re-publishing. With repo versions, you'll have a sequence of
>> versions of course. Let's say there's 1, 2 and 3. Version 1 is deployed
>> now, version 2 is undergoing testing, and version 3 got created last night
>> by the weekly sync job you set up. You would have two different distributors
>> that make these publications available to clients: one for production, and
>> one for testing. "Promotion" becomes just the act of updating the reference
>> on a distribution to a different publication. When testing on version 2 is
>> done, assuming it passes, you can update the production distribution to
>> make it use version 2.
>>
>> There are a few use cases for publishing an old version.
>>
>> One is: I want to publish the same exact content set two different ways,
>> with two different publishers. If the contents change between publishes, I
>> want a guarantee that it won't cause the second publish to use different
>> content than the first.
>>
>> Second: I like the state of the content in a repo as it is right now. I
>> want to publish that exact content set. If any changes happen to the
>> content in that repo between now and when my publish task gets run by a
>> worker, I don't want those changes to affect the publish I'm requesting
>> right now.
>>
>> Third: I want the ability to roll back from a bad content set to a
>> known-good one. How many publications must I keep around to have confidence
>> that if I need to roll back some distance, that publication will still be
>> available? It's valuable to know I can re-publish an older version any time
>> I need it.
>>
>> Fourth: In some cases you may decide after-the-fact that you need to
>> publish the same content set a different way. Maybe you went to kickstart
>> from a yum repo and then remembered that (this is a true story) one version
>> of your installer is too old to know about sha256 checksums, so you have to
>> go re-publish the same content set with different settings for how the
>> metadata gets generated.
>>
>> Otherwise, just as reproducible builds of software are very valuable,
>> reproducible publishes of repositories are valuable for similar
>> reasons.
>>
>>
>>
>>> As a user / content developer, it seems more useful to me to always
>>> publish the latest (i.e. don't have an optional version for publishing),
>>> but have the ability to copy from a specific 

Re: [Pulp-dev] repository versions update

2017-11-30 Thread Michael Hrivnak
On Thu, Nov 30, 2017 at 11:43 AM, Mihai Ibanescu 
wrote:

> I am late to the thread, so I apologize if I repeat things that have been
> discussed already.
>
> Is it a meaningful use case to publish an older version of the repo? Once
> published, do you keep track of which version got published, and how do you
> decide which version to push next? This seems like a complication to me.
>
>
A publication will have a reference to the version that it was created
from. To illustrate how that would get used: Your CTO calls early on a
Saturday morning and says "I read in the news about a major security flaw
in cowsay, and I know our applications depend heavily on it. What version
do we have deployed right now???!!!" You can concretely determine which
publications are being currently "distributed" to your infrastructure, and
from there see their exact content sets by virtue of the repo version.

Then there is the promotion workflow, which in Pulp 2 requires a lot of
copying and re-publishing. With repo versions, you'll have a sequence of
versions of course. Let's say there's 1, 2 and 3. Version 1 is deployed
now, version 2 is undergoing testing, and version 3 got created last night
by the weekly sync job you set up. You would have two different distributors
that make these publications available to clients: one for production, and
one for testing. "Promotion" becomes just the act of updating the reference
on a distribution to a different publication. When testing on version 2 is
done, assuming it passes, you can update the production distribution to
make it use version 2.
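
The chain of references described above can be sketched in a few lines. This is only an illustration of the idea, not Pulp's actual models: the class names mirror the terms used in this thread, and the fields, helper functions, and package names are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepoVersion:
    number: int
    content: frozenset  # immutable content set of this version


@dataclass(frozen=True)
class Publication:
    version: RepoVersion  # each publication records the version it was built from


@dataclass
class Distribution:
    name: str
    publication: Publication  # what clients of this distribution currently see


def promote(distribution, publication):
    """Promotion is only a reference update; nothing is copied or re-published."""
    distribution.publication = publication


def deployed_content(distribution):
    """Follow distribution -> publication -> version to see exactly what is served."""
    return distribution.publication.version.content


# Package names below are made up for the example.
v1 = RepoVersion(1, frozenset({"cowsay-3.03"}))
v2 = RepoVersion(2, frozenset({"cowsay-3.04"}))
pub1, pub2 = Publication(v1), Publication(v2)

production = Distribution("production", pub1)
testing = Distribution("testing", pub2)

# Testing on version 2 passed, so point production at the same publication.
promote(production, pub2)
print(sorted(deployed_content(production)))
```

The same lookup answers the Saturday-morning question: start from whatever is being distributed and read off the content of the version behind it.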

There are a few use cases for publishing an old version.

One is: I want to publish the same exact content set two different ways,
with two different publishers. If the contents change between publishes, I
want a guarantee that it won't cause the second publish to use different
content than the first.

Second: I like the state of the content in a repo as it is right now. I
want to publish that exact content set. If any changes happen to the
content in that repo between now and when my publish task gets run by a
worker, I don't want those changes to affect the publish I'm requesting
right now.

Third: I want the ability to roll back from a bad content set to a
known-good one. How many publications must I keep around to have confidence
that if I need to roll back some distance, that publication will still be
available? It's valuable to know I can re-publish an older version any time
I need it.

Fourth: In some cases you may decide after-the-fact that you need to
publish the same content set a different way. Maybe you went to kickstart
from a yum repo and then remembered that (this is a true story) one version
of your installer is too old to know about sha256 checksums, so you have to
go re-publish the same content set with different settings for how the
metadata gets generated.

Otherwise, just as reproducible builds of software are very valuable,
reproducible publishes of repositories are valuable for similar
reasons.



> As a user / content developer, it seems more useful to me to always
> publish the latest (i.e. don't have an optional version for publishing),
> but have the ability to copy from a specific version of a repo into another
> repo (or the same repo, effectively reverting the content of latest).
>
> So I would shift the discussion away from the REST API (for now), and more
> into the expected behavior for manipulating content within pulp. The
> operations I am aware of are: syncing units, importing units,
> copying/deleting units, and I am seeking clarification on how versioning
> will work for each.
>
> Syncing is probably the easiest, because it can handle all the changes
> internally and create a new version at the end.
>
> For importing, if you don't want to create unnecessary intermediate
> versions that are meaningless, I would want the ability to upload more than
> one unit and associate it to the repo, and then create a version. In other
> words, a transactional multi-upload.
>

Indeed. We want to have a behavior in Pulp 3 anyway that lets you
arbitrarily add and remove multiple content units in one operation. That's
one of the more notable missing features from Pulp 2. As Brian has pointed
out, one option is to let a user directly POST to a "versions" endpoint and
express what content they want to add/remove. Even without repo versions,
we'd still want an API that lets you bulk add/remove.
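
As a rough sketch of what a bulk add/remove request body might look like, consider the helper below. The field names and content paths are hypothetical, since nothing in this thread fixes them; the point is only that one request carries every addition and removal, so at most one new version would result.

```python
import json


def build_version_request(add=(), remove=()):
    """Build one request body that adds and removes many units at once.

    The field names used here are hypothetical; the thread does not define them.
    """
    overlap = set(add) & set(remove)
    if overlap:
        raise ValueError(f"units listed for both add and remove: {sorted(overlap)}")
    return json.dumps(
        {"add_content_units": sorted(add), "remove_content_units": sorted(remove)}
    )


# Hypothetical content unit paths, for illustration only.
body = build_version_request(
    add=["/content/unit-1/", "/content/unit-2/"],
    remove=["/content/unit-9/"],
)
print(body)
```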


> For copying, as suggested above, I want to optionally specify the version.
>
> Deleting by itself is not hard, it does what it needs to do and then
> creates a version.
>
> The more complicated use case would be: what if I wanted to change the
> contents of repoA:
> * add 3 packages from repo1 version 1
> * add 4 packages from repo2 (latest)
> * delete 5 packages
>
> and at the end have a single version change for repoA.
>
> Or, for the same repoA:
> * delete all units of type "rpm" and name "glibc"
> * 

Re: [Pulp-dev] repository versions update

2017-11-30 Thread Mihai Ibanescu
I am late to the thread, so I apologize if I repeat things that have been
discussed already.

Is it a meaningful use case to publish an older version of the repo? Once
published, do you keep track of which version got published, and how do you
decide which version to push next? This seems like a complication to me.

As a user / content developer, it seems more useful to me to always publish
the latest (i.e. don't have an optional version for publishing), but have
the ability to copy from a specific version of a repo into another repo (or
the same repo, effectively reverting the content of latest).

So I would shift the discussion away from the REST API (for now), and more
into the expected behavior for manipulating content within pulp. The
operations I am aware of are: syncing units, importing units,
copying/deleting units, and I am seeking clarification on how versioning
will work for each.

Syncing is probably the easiest, because it can handle all the changes
internally and create a new version at the end.

For importing, if you don't want to create unnecessary intermediate
versions that are meaningless, I would want the ability to upload more than
one unit and associate it to the repo, and then create a version. In other
words, a transactional multi-upload.

For copying, as suggested above, I want to optionally specify the version.

Deleting by itself is not hard, it does what it needs to do and then
creates a version.

The more complicated use case would be: what if I wanted to change the
contents of repoA:
* add 3 packages from repo1 version 1
* add 4 packages from repo2 (latest)
* delete 5 packages

and at the end have a single version change for repoA.

Or, for the same repoA:
* delete all units of type "rpm" and name "glibc"
* copy unit type "rpm" and name "glibc" from two versions ago


If you wanted this use case, then you need a new resource type, somewhat
similar to a Task, let's call it Transaction. It is tied to the repository
it operates on (repoA in the example above), and locks it from further
changes until the transaction is committed or aborted. It could be
implemented internally as a repository. You start with the current contents
of repoA, and you perform whatever operations you need to do (including
changing repo metadata). When you "commit" the Transaction, it becomes
*the* new version of the repository and unlocks repoA.
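
Here is a minimal in-memory sketch of that Transaction idea, assuming a repository is just an ordered list of immutable content sets. All class and method names are hypothetical, and a real implementation would need persistent locking rather than a `threading.Lock`.

```python
import threading


class Repository:
    def __init__(self, content=()):
        self.versions = [frozenset(content)]  # index 0 holds the first version
        self._lock = threading.Lock()

    @property
    def latest(self):
        return self.versions[-1]


class Transaction:
    """Stage any number of changes, then commit them as exactly one new version."""

    def __init__(self, repo):
        self.repo = repo
        self.staged = None

    def __enter__(self):
        # Lock the repository so nothing else can change it mid-transaction.
        if not self.repo._lock.acquire(blocking=False):
            raise RuntimeError("repository is locked by another transaction")
        self.staged = set(self.repo.latest)  # start from the current contents
        return self

    def add(self, units):
        self.staged |= set(units)

    def remove(self, units):
        self.staged -= set(units)

    def __exit__(self, exc_type, exc, tb):
        try:
            if exc_type is None:
                self.repo.versions.append(frozenset(self.staged))  # commit
            # If an exception occurred, the staged changes are dropped (abort).
        finally:
            self.repo._lock.release()  # unlock in both cases
        return False


repo_a = Repository({"glibc-1", "bash-1", "old-1", "old-2"})
with Transaction(repo_a) as txn:
    txn.add({"pkg-from-repo1-v1"})
    txn.add({"pkg-from-repo2-latest"})
    txn.remove({"old-1", "old-2"})

print(len(repo_a.versions), sorted(repo_a.latest))
```

Three separate operations produce a single version change for repoA, and an exception inside the `with` block leaves the version list untouched.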

Whether a Version is a full copy of the repo or a delta is an
implementation detail. I would argue for full copy, otherwise you run into
the inefficiencies of cvs which had to apply patches in reverse order just
to get to a version in the past. I would find it more useful to have a repo
diff resource (diff version 1 with version 3, or repo1 version 1 with repo2
latest).
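
If each version can be resolved to a full set of units, a diff is just two set differences. The sketch below assumes exactly that; `diff_versions` and the unit names are invented for the example.

```python
def diff_versions(old, new):
    """Return the units added and removed when going from `old` to `new`."""
    old, new = set(old), set(new)
    return {"added": new - old, "removed": old - new}


# Made-up content sets standing in for "version 1" and "version 3" of a repo.
version_1 = {"glibc-2.17", "bash-4.2", "cowsay-3.03"}
version_3 = {"glibc-2.26", "bash-4.2", "cowsay-3.03"}

changes = diff_versions(version_1, version_3)
print(changes)
```

The same function would work across repositories (repo1 version 1 against repo2 latest), since it only compares content sets.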

Unfortunately, it is a rather large paradigm shift, and not one that you
can push in a 3.0 -> 3.1 transition. Parts of it will need to land in 3.0
proper; determining what can be left out is an exercise for the reader who
managed to keep up with my long emails.

Hey, a man can dream.

Mihai


On Thu, Nov 30, 2017 at 11:00 AM, David Davis wrote:

> Brian,
>
> The issue is not adding new paths in the API to support versioned
> repositories. The issue is having to remove or change paths which we can’t
> do if we want to adhere to semantic versioning. See the example from my
> email about /repositorycontents/. Also, see @mhrivnak’s writeup here:
>
> https://pulp.plan.io/projects/pulp/wiki/Repository_Versions#REST-API
>
> I think we can probably make tweaks to our API to support the adding of
> versioned repos in 3.1+ but we’re not set up right now to do that. It also
> might be worth considering how much time it’ll take to make our APIs
> compatible with versioned repos in 3.1+ versus just adopting @mhrivnak’s
> versioned repo changes.
>
>
> David
>
> On Thu, Nov 30, 2017 at 10:32 AM, Brian Bouterse 
> wrote:
>
>> I agree if we couldn't add versioned repos later, then we should add it
>> now. I believe the current API is already well set up to add it later.
>> Specifically, here is a non-nested example of the versioned repository
>> resource to be added in 3.1+
>>
>> /api/v3/repo_version/
>>
>> The user would POST to that to cause a new repo version to be created.
>> What are the contents of that repo version? Well it depends on the POST
>> data. That repo version could be the result of a sync being performed, or
>> perhaps it could also include mass associate/unassociate operations right
>> there. In other words, I see a clear path to adding such a resource, and I
>> see endless flexibility in terms of the user's intention (via POST data) of
>> what should be contained in that repo version.
>>
>> Can some feedback be given on this design? Why won't this work in 3.1+?
>>
>> To give my own answer to the question: the main impact I see when adding
>> this in 3.1+ is on plugin writers, not users. Specifically the plugin writer
>> API would change because a plugin 

Re: [Pulp-dev] repository versions update

2017-11-30 Thread Brian Bouterse
I agree if we couldn't add versioned repos later, then we should add it
now. I believe the current API is already well set up to add it later.
Specifically, here is a non-nested example of the versioned repository
resource to be added in 3.1+

/api/v3/repo_version/

The user would POST to that to cause a new repo version to be created. What
are the contents of that repo version? Well it depends on the POST data.
That repo version could be the result of a sync being performed, or perhaps
it could also include mass associate/unassociate operations right there. In
other words, I see a clear path to adding such a resource, and I see
endless flexibility in terms of the user's intention (via POST data) of what
should be contained in that repo version.

Can some feedback be given on this design? Why won't this work in 3.1+?

To give my own answer to the question: the main impact I see when adding
this in 3.1+ is on plugin writers, not users. Specifically the plugin writer
API would change because a plugin writer would no longer be associating
content with a repo as its primary activity. Instead their code would be
to associate content with a repo version. We can roll out that change in a
clear, coordinated way with the plugin community. It's also appropriate
because the plugin API is still < 1.0. We would bump the plugin api from
0.1 to 0.2 and plugins can easily declare their compatibility as
pulpcore-plugin < 0.2. Relative to the effort of creating a plugin, I think
porting plugin code for a change like this would be low effort.
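
To illustrate what a "pulpcore-plugin < 0.2" declaration amounts to, here is a small sketch of the comparison. It uses plain tuple comparison on dotted integers and is not how pip or setuptools evaluate version specifiers; `parse_version` and `is_compatible` are hypothetical helpers.

```python
def parse_version(text):
    """Turn a dotted version such as '0.1.5' into a tuple of ints: (0, 1, 5)."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(plugin_api_version, upper_bound="0.2"):
    """True if the installed plugin API is older than the first incompatible release."""
    return parse_version(plugin_api_version) < parse_version(upper_bound)


for candidate in ("0.1", "0.1.5", "0.2", "0.10"):
    print(candidate, is_compatible(candidate))
```

A plugin written against the 0.1 plugin API would keep working with any 0.1.x release and would be excluded once 0.2 changes how content is associated.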


On Wed, Nov 29, 2017 at 10:19 AM, Jeff Ortel wrote:

>
>
> On 11/29/2017 07:22 AM, David Davis wrote:
> > I think we could design an API in 3.0 that would support versioned repos
> > in 3.1+. However, our current API does not. For example, the
> > /repositorycontents/ endpoint doesn't make sense with versioned repos as
> > no one would want to add/remove content units one-by-one when doing so
> > would generate a new repo version each time. Imagine that we end up with
> > an endpoint in 3.0 that’s not compatible with versioned repos. What would
> > we do? I think this is a strong argument for adding versioned repos now.
>
> agreed.
>
> >
> > Of course the main drawback is that it might delay the beta. But I wonder
> > by how much. It might be good to groom the versioned repo user stories so
> > that (a) we can see how much value they provide to end users and (b) how
> > closely they align with the work @mhrivnak has done.
>
> agreed.
>
> >
> >
> > David
> >
> > On Tue, Nov 28, 2017 at 4:00 PM, Brian Bouterse wrote:
> >
> > In reading back over the last email thread in May, it ended with us
> > looking at URL options to ensure we could release 3.0 and add in repo
> > versions in 3.1+. We definitely want repo versions in the 3.y line, so we
> > wanted to make sure that was possible. If it wasn't, then we may have to
> > add it into 3.0.
> >
> > That question is a lot easier now given how firm the API is. I think we
> > can add in versioned repos in 3.1+, in a natural way. Just like a user
> > creates a Publication which triggers a publish, a user would create a
> > RepoVersion which would trigger a sync to produce that new RepoVersion.
> > The repo versions work needs to continue, but first I hope we prioritize
> > getting to Beta 1 for core. There are a lot of use cases in black on the
> > MVP which are not implemented or written in Redmine. I believe closing
> > that gap would be a better use of time given that we can add this later.
> >
> > What do others think?
> >
> >
> > On Tue, Nov 28, 2017 at 2:24 PM, Dennis Kliban wrote:
> >
> > I have a hard objection to including versioned repositories in 3.0. We
> > agreed to make sure that our current design would not prevent us from
> > adding versioned repositories in the future. We did NOT agree to
> > including versioned repositories in 3.0 release. This is a big code
> > change that did not go through our regular planning process. I greatly
> > appreciate your effort in driving this feature forward, but we should
> > take a step back and go through our regular process. I am also concerned
> > that adding such a big change at this time will delay the beta.
> >
> > -Dennis
> >
> >
> > On Tue, Nov 28, 2017 at 10:10 AM, Michael Hrivnak
> > <mhriv...@redhat.com> wrote:
> >
> > Following up on previous discussions, I did an analysis of how
> > repository versioning would impact Pulp 3's current REST API and plugin
> > API. A lot has changed since we last discussed the topic (in May 2017),
> > such as how we handle publications, and how the REST API is laid out.
> > You can read the analysis here:
> >
> > 

Re: [Pulp-dev] repository versions update

2017-11-29 Thread Jeff Ortel


On 11/29/2017 07:22 AM, David Davis wrote:
> I think we could design an API in 3.0 that would support versioned repos in
> 3.1+. However, our current API does not. For example, the
> /repositorycontents/ endpoint doesn't make sense with versioned repos as no
> one would want to add/remove content units one-by-one when doing so would
> generate a new repo version each time. Imagine that we end up with an
> endpoint in 3.0 that’s not compatible with versioned repos. What would we
> do? I think this is a strong argument for adding versioned repos now.

agreed.

> 
> Of course the main drawback is that it might delay the beta. But I wonder by
> how much. It might be good to groom the versioned repo user stories so that
> (a) we can see how much value they provide to end users and (b) how closely
> they align with the work @mhrivnak has done.

agreed.

> 
> 
> David
> 
> On Tue, Nov 28, 2017 at 4:00 PM, Brian Bouterse wrote:
>
> In reading back over the last email thread in May, it ended with us looking
> at URL options to ensure we could release 3.0 and add in repo versions in
> 3.1+. We definitely want repo versions in the 3.y line, so we wanted to make
> sure that was possible. If it wasn't, then we may have to add it into 3.0.
>
> That question is a lot easier now given how firm the API is. I think we can
> add in versioned repos in 3.1+, in a natural way. Just like a user creates a
> Publication which triggers a publish, a user would create a RepoVersion
> which would trigger a sync to produce that new RepoVersion. The repo
> versions work needs to continue, but first I hope we prioritize getting to
> Beta 1 for core. There are a lot of use cases in black on the MVP which are
> not implemented or written in Redmine. I believe closing that gap would be a
> better use of time given that we can add this later.
> 
> What do others think?
> 
> 
> On Tue, Nov 28, 2017 at 2:24 PM, Dennis Kliban wrote:
>
> I have a hard objection to including versioned repositories in 3.0. We
> agreed to make sure that our current design would not prevent us from adding
> versioned repositories in the future. We did NOT agree to including
> versioned repositories in 3.0 release. This is a big code change that did
> not go through our regular planning process. I greatly appreciate your
> effort in driving this feature forward, but we should take a step back and
> go through our regular process. I am also concerned that adding such a big
> change at this time will delay the beta.
> 
> -Dennis
> 
> 
> On Tue, Nov 28, 2017 at 10:10 AM, Michael Hrivnak wrote:
> 
> Following up on previous discussions, I did an analysis of how repository
> versioning would impact Pulp 3's current REST API and plugin API. A lot has
> changed since we last discussed the topic (in May 2017), such as how we
> handle publications, and how the REST API is laid out. You can read the
> analysis here:
>
> https://pulp.plan.io/projects/pulp/wiki/Repository_Versions
>
> We previously discussed and vetted the mechanics at great length. While
> there was broad agreement on the value to Pulp 3, there was uncertainty
> about the details of how it would impact REST clients and plugin writers,
> and also uncertainty about how long it would take to fully implement.
>
> In the course of my recent analysis, two things became clear. 1) both
> current APIs are not compatible and would have to change. Details are on
> the wiki page above. 2) the PoC from earlier this year indeed covers the
> hard parts, leaving mostly DRF details to sort out.
> 
> 
> I don't agree with your assessment that the current REST API is not
> compatible with adding repository versions. A repository version is its
> own resource that can be added
>  
> 
> 
> I started rebasing the PoC onto current 3.0-dev, and within an hour I had
> it working with the updated REST endpoints. With that having been so easy,
> I threw caution to the wind, and within a few hours I had a fully
> functional branch that covered all the key use cases.
> 
> - sync creates a new version
> - versions and their content sets are visible through the REST API
> - each version shows what content was added and removed
> - versions can be deleted, which queues a task that squashes 
> changes as previously discussed
> - the ChangeSet and pulp_file were updated 

Re: [Pulp-dev] repository versions update

2017-11-29 Thread David Davis
I think we could design an API in 3.0 that would support versioned repos in
3.1+. However, our current API does not. For example, the
/repositorycontents/ endpoint doesn't make sense with versioned repos as no
one would want to add/remove content units one-by-one when doing so would
generate a new repo version each time. Imagine that we end up with an
endpoint in 3.0 that’s not compatible with versioned repos. What would we
do? I think this is a strong argument for adding versioned repos now.

Of course the main drawback is that it might delay the beta. But I wonder
by how much. It might be good to groom the versioned repo user stories so
that (a) we can see how much value they provide to end users and (b) how
closely they align with the work @mhrivnak has done.


David

On Tue, Nov 28, 2017 at 4:00 PM, Brian Bouterse  wrote:

> In reading back over the last email thread in May, it ended with us
> looking at URL options to ensure we could release 3.0 and add in repo
> versions in 3.1+. We definitely want repo versions in the 3.y line, so we
> wanted to make sure that was possible. If it wasn't, then we may have to
> add it into 3.0.
>
> That question is a lot easier now given how firm the API is. I think we
> can add in versioned repos in 3.1+, in a natural way. Just like a user
> creates a Publication which triggers a publish, a user would create a
> RepoVersion which would trigger a sync to produce that new RepoVersion. The
> repo versions work needs to continue, but first I hope we prioritize
> getting to Beta 1 for core. There are a lot of use cases in black on the
> MVP which are not implemented or written in Redmine. I believe closing that
> gap would be a better use of time given that we can add this later.
>
> What do others think?
>
>
> On Tue, Nov 28, 2017 at 2:24 PM, Dennis Kliban  wrote:
>
>> I have a hard objection to including versioned repositories in 3.0. We
>> agreed to make sure that our current design would not prevent us from
>> adding versioned repositories in the future. We did NOT agree to include
>> versioned repositories in the 3.0 release. This is a big code change that did
>> not go through our regular planning process. I greatly appreciate your
>> effort in driving this feature forward, but we should take a step back and
>> go through our regular process. I am also concerned that adding such a big
>> change at this time will delay the beta.
>>
>> -Dennis
>>
>>
>> On Tue, Nov 28, 2017 at 10:10 AM, Michael Hrivnak 
>> wrote:
>>
>>> Following up on previous discussions, I did an analysis of how
>>> repository versioning would impact Pulp 3's current REST API and plugin
>>> API. A lot has changed since we last discussed the topic (in May 2017),
>>> such as how we handle publications, and how the REST API is laid out. You
>>> can read the analysis here:
>>>
>>> https://pulp.plan.io/projects/pulp/wiki/Repository_Versions
>>>
>>> We previously discussed and vetted the mechanics at great length. While
>>> there was broad agreement on the value to Pulp 3, there was uncertainty
>>> about the details of how it would impact REST clients and plugin writers,
>>> and also uncertainty about how long it would take to fully implement.
>>>
>>> In the course of my recent analysis, two things became clear. 1) both
>>> current APIs are not compatible and would have to change. Details are on
>>> the wiki page above. 2) the PoC from earlier this year indeed covers the
>>> hard parts, leaving mostly DRF details to sort out.
>>>
>>
>> I don't agree with your assessment that the current REST API is not
>> compatible with adding repository versions. A repository version is its
>> own resource that can be added
>>
>>
>>>
>>> I started rebasing the PoC onto current 3.0-dev, and within an hour I
>>> had it working with the updated REST endpoints. With that having been so
>>> easy, I threw caution to the wind, and within a few hours I had a fully
>>> functional branch that covered all the key use cases.
>>>
>>> - sync creates a new version
>>> - versions and their content sets are visible through the REST API
>>> - each version shows what content was added and removed
>>> - versions can be deleted, which queues a task that squashes changes as
>>> previously discussed
>>> - the ChangeSet and pulp_file were updated to work with versions
>>> - publish defaults to using the latest version
>>>
>>> I also created a set of tests to help prove that it behaves correctly:
>>>
>>> https://gist.github.com/mhrivnak/69af54063dff7465212914094dff34c2
>>>
>>> I have just about 12 hours of recent work into it, and the code is
>>> PR-ready. It's just missing doc updates and release notes. It's been
>>> difficult to keep discussion moving toward a full plan due to the
>>> uncertainties mentioned above, so hopefully this can alleviate those
>>> concerns and give everyone something concrete to look at.
>>>
>>> https://github.com/pulp/pulp/pull/3228
>>> 

Re: [Pulp-dev] repository versions update

2017-11-28 Thread Brian Bouterse
In reading back over the last email thread in May, it ended with us looking
at URL options to ensure we could release 3.0 and add in repo versions in
3.1+. We definitely want repo versions in the 3.y line, so we wanted to
make sure that was possible. If it wasn't, then we may have to add it into
3.0.

That question is a lot easier now given how firm the API is. I think we can
add in versioned repos in 3.1+, in a natural way. Just like a user creates
a Publication which triggers a publish, a user would create a RepoVersion
which would trigger a sync to produce that new RepoVersion. The repo
versions work needs to continue, but first I hope we prioritize getting to
Beta 1 for core. There are a lot of use cases in black on the MVP which are
not implemented or written in Redmine. I believe closing that gap would be
a better use of time given that we can add this later.

What do others think?


On Tue, Nov 28, 2017 at 2:24 PM, Dennis Kliban  wrote:

> I have a hard objection to including versioned repositories in 3.0. We
> agreed to make sure that our current design would not prevent us from
> adding versioned repositories in the future. We did NOT agree to include
> versioned repositories in the 3.0 release. This is a big code change that did
> not go through our regular planning process. I greatly appreciate your
> effort in driving this feature forward, but we should take a step back and
> go through our regular process. I am also concerned that adding such a big
> change at this time will delay the beta.
>
> -Dennis
>
>
> On Tue, Nov 28, 2017 at 10:10 AM, Michael Hrivnak 
> wrote:
>
>> Following up on previous discussions, I did an analysis of how repository
>> versioning would impact Pulp 3's current REST API and plugin API. A lot has
>> changed since we last discussed the topic (in May 2017), such as how we
>> handle publications, and how the REST API is laid out. You can read the
>> analysis here:
>>
>> https://pulp.plan.io/projects/pulp/wiki/Repository_Versions
>>
>> We previously discussed and vetted the mechanics at great length. While
>> there was broad agreement on the value to Pulp 3, there was uncertainty
>> about the details of how it would impact REST clients and plugin writers,
>> and also uncertainty about how long it would take to fully implement.
>>
>> In the course of my recent analysis, two things became clear. 1) both
>> current APIs are not compatible and would have to change. Details are on
>> the wiki page above. 2) the PoC from earlier this year indeed covers the
>> hard parts, leaving mostly DRF details to sort out.
>>
>
> I don't agree with your assessment that the current REST API is not
> compatible with adding repository versions. A repository version is its
> own resource that can be added
>
>
>>
>> I started rebasing the PoC onto current 3.0-dev, and within an hour I had
>> it working with the updated REST endpoints. With that having been so easy,
>> I threw caution to the wind, and within a few hours I had a fully
>> functional branch that covered all the key use cases.
>>
>> - sync creates a new version
>> - versions and their content sets are visible through the REST API
>> - each version shows what content was added and removed
>> - versions can be deleted, which queues a task that squashes changes as
>> previously discussed
>> - the ChangeSet and pulp_file were updated to work with versions
>> - publish defaults to using the latest version
>>
>> I also created a set of tests to help prove that it behaves correctly:
>>
>> https://gist.github.com/mhrivnak/69af54063dff7465212914094dff34c2
>>
>> I have just about 12 hours of recent work into it, and the code is
>> PR-ready. It's just missing doc updates and release notes. It's been
>> difficult to keep discussion moving toward a full plan due to the
>> uncertainties mentioned above, so hopefully this can alleviate those
>> concerns and give everyone something concrete to look at.
>>
>> https://github.com/pulp/pulp/pull/3228
>> https://github.com/pulp/pulp_file/pull/20
>>
>> Two notable items are missing. One is that there is no way to arbitrarily
>> add and remove content from a repo now, since this removes the
>> "repositorycontent" endpoint. But we need to solve that with a more formal
>> and bulk add/remove API anyway. I also found that the "repositorycontent"
>> endpoint was not using tasks, and thus there was no repo locking, so it
>> needed additional work anyway. Based on this overall effort, I think it
>> will be very easy to add if we just agree on what the endpoints should look
>> like.
>>
>> The other is that publish does not in this PR accept a reference to a
>> version. It always uses the latest. That would also be a very easy
>> enhancement to make.
>>
>> I am happy to support getting this merged as I transition to being a more
>> passive community member, assuming there are no objections. I am also of
>> course happy to help support this into the future, as I believe 

Re: [Pulp-dev] repository versions update

2017-11-28 Thread Dennis Kliban
I have a hard objection to including versioned repositories in 3.0. We
agreed to make sure that our current design would not prevent us from
adding versioned repositories in the future. We did NOT agree to include
versioned repositories in the 3.0 release. This is a big code change that did
not go through our regular planning process. I greatly appreciate your
effort in driving this feature forward, but we should take a step back and
go through our regular process. I am also concerned that adding such a big
change at this time will delay the beta.

-Dennis


On Tue, Nov 28, 2017 at 10:10 AM, Michael Hrivnak 
wrote:

> Following up on previous discussions, I did an analysis of how repository
> versioning would impact Pulp 3's current REST API and plugin API. A lot has
> changed since we last discussed the topic (in May 2017), such as how we
> handle publications, and how the REST API is laid out. You can read the
> analysis here:
>
> https://pulp.plan.io/projects/pulp/wiki/Repository_Versions
>
> We previously discussed and vetted the mechanics at great length. While
> there was broad agreement on the value to Pulp 3, there was uncertainty
> about the details of how it would impact REST clients and plugin writers,
> and also uncertainty about how long it would take to fully implement.
>
> In the course of my recent analysis, two things became clear. 1) both
> current APIs are not compatible and would have to change. Details are on
> the wiki page above. 2) the PoC from earlier this year indeed covers the
> hard parts, leaving mostly DRF details to sort out.
>

I don't agree with your assessment that the current REST API is not
compatible with adding repository versions. A repository version is its
own resource that can be added


>
> I started rebasing the PoC onto current 3.0-dev, and within an hour I had
> it working with the updated REST endpoints. With that having been so easy,
> I threw caution to the wind, and within a few hours I had a fully
> functional branch that covered all the key use cases.
>
> - sync creates a new version
> - versions and their content sets are visible through the REST API
> - each version shows what content was added and removed
> - versions can be deleted, which queues a task that squashes changes as
> previously discussed
> - the ChangeSet and pulp_file were updated to work with versions
> - publish defaults to using the latest version
>
> I also created a set of tests to help prove that it behaves correctly:
>
> https://gist.github.com/mhrivnak/69af54063dff7465212914094dff34c2
>
> I have just about 12 hours of recent work into it, and the code is
> PR-ready. It's just missing doc updates and release notes. It's been
> difficult to keep discussion moving toward a full plan due to the
> uncertainties mentioned above, so hopefully this can alleviate those
> concerns and give everyone something concrete to look at.
>
> https://github.com/pulp/pulp/pull/3228
> https://github.com/pulp/pulp_file/pull/20
>
> Two notable items are missing. One is that there is no way to arbitrarily
> add and remove content from a repo now, since this removes the
> "repositorycontent" endpoint. But we need to solve that with a more formal
> and bulk add/remove API anyway. I also found that the "repositorycontent"
> endpoint was not using tasks, and thus there was no repo locking, so it
> needed additional work anyway. Based on this overall effort, I think it
> will be very easy to add if we just agree on what the endpoints should look
> like.
>
> The other is that publish does not in this PR accept a reference to a
> version. It always uses the latest. That would also be a very easy
> enhancement to make.
>
> I am happy to support getting this merged as I transition to being a more
> passive community member, assuming there are no objections. I am also of
> course happy to help support this into the future, as I believe strongly in
> its value and importance (see previous thread).
>
> Please provide feedback and questions. If a live meeting this week would
> help expedite evaluation of this effort, I'm happy to schedule that. And
> assuming there are no hard objections, I'm happy to proceed with
> documentation updates.
>
> Thanks!
>
> --
>
> Michael Hrivnak
> Principal Software Engineer, RHCE
>
> Red Hat
>
___
Pulp-dev mailing list
Pulp-dev@redhat.com
https://www.redhat.com/mailman/listinfo/pulp-dev


Re: [Pulp-dev] repository versions update

2017-11-28 Thread David Davis
Nice! +1 to a live meeting.


David

On Tue, Nov 28, 2017 at 10:10 AM, Michael Hrivnak 
wrote:

> Following up on previous discussions, I did an analysis of how repository
> versioning would impact Pulp 3's current REST API and plugin API. A lot has
> changed since we last discussed the topic (in May 2017), such as how we
> handle publications, and how the REST API is laid out. You can read the
> analysis here:
>
> https://pulp.plan.io/projects/pulp/wiki/Repository_Versions
>
> We previously discussed and vetted the mechanics at great length. While
> there was broad agreement on the value to Pulp 3, there was uncertainty
> about the details of how it would impact REST clients and plugin writers,
> and also uncertainty about how long it would take to fully implement.
>
> In the course of my recent analysis, two things became clear. 1) both
> current APIs are not compatible and would have to change. Details are on
> the wiki page above. 2) the PoC from earlier this year indeed covers the
> hard parts, leaving mostly DRF details to sort out.
>
> I started rebasing the PoC onto current 3.0-dev, and within an hour I had
> it working with the updated REST endpoints. With that having been so easy,
> I threw caution to the wind, and within a few hours I had a fully
> functional branch that covered all the key use cases.
>
> - sync creates a new version
> - versions and their content sets are visible through the REST API
> - each version shows what content was added and removed
> - versions can be deleted, which queues a task that squashes changes as
> previously discussed
> - the ChangeSet and pulp_file were updated to work with versions
> - publish defaults to using the latest version
>
> I also created a set of tests to help prove that it behaves correctly:
>
> https://gist.github.com/mhrivnak/69af54063dff7465212914094dff34c2
>
> I have just about 12 hours of recent work into it, and the code is
> PR-ready. It's just missing doc updates and release notes. It's been
> difficult to keep discussion moving toward a full plan due to the
> uncertainties mentioned above, so hopefully this can alleviate those
> concerns and give everyone something concrete to look at.
>
> https://github.com/pulp/pulp/pull/3228
> https://github.com/pulp/pulp_file/pull/20
>
> Two notable items are missing. One is that there is no way to arbitrarily
> add and remove content from a repo now, since this removes the
> "repositorycontent" endpoint. But we need to solve that with a more formal
> and bulk add/remove API anyway. I also found that the "repositorycontent"
> endpoint was not using tasks, and thus there was no repo locking, so it
> needed additional work anyway. Based on this overall effort, I think it
> will be very easy to add if we just agree on what the endpoints should look
> like.
>
> The other is that publish does not in this PR accept a reference to a
> version. It always uses the latest. That would also be a very easy
> enhancement to make.
>
> I am happy to support getting this merged as I transition to being a more
> passive community member, assuming there are no objections. I am also of
> course happy to help support this into the future, as I believe strongly in
> its value and importance (see previous thread).
>
> Please provide feedback and questions. If a live meeting this week would
> help expedite evaluation of this effort, I'm happy to schedule that. And
> assuming there are no hard objections, I'm happy to proceed with
> documentation updates.
>
> Thanks!
>
> --
>
> Michael Hrivnak
>
> Principal Software Engineer, RHCE
>
> Red Hat
>


[Pulp-dev] repository versions update

2017-11-28 Thread Michael Hrivnak
Following up on previous discussions, I did an analysis of how repository
versioning would impact Pulp 3's current REST API and plugin API. A lot has
changed since we last discussed the topic (in May 2017), such as how we
handle publications, and how the REST API is laid out. You can read the
analysis here:

https://pulp.plan.io/projects/pulp/wiki/Repository_Versions

We previously discussed and vetted the mechanics at great length. While
there was broad agreement on the value to Pulp 3, there was uncertainty
about the details of how it would impact REST clients and plugin writers,
and also uncertainty about how long it would take to fully implement.

In the course of my recent analysis, two things became clear. 1) Neither
current API is compatible with repository versions, so both would have to
change. Details are on the wiki page above. 2) The PoC from earlier this
year indeed covers the hard parts, leaving mostly DRF details to sort out.

I started rebasing the PoC onto current 3.0-dev, and within an hour I had
it working with the updated REST endpoints. With that having been so easy,
I threw caution to the wind, and within a few hours I had a fully
functional branch that covered all the key use cases.

- sync creates a new version
- versions and their content sets are visible through the REST API
- each version shows what content was added and removed
- versions can be deleted, which queues a task that squashes changes as
previously discussed
- the ChangeSet and pulp_file were updated to work with versions
- publish defaults to using the latest version
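
To make the mechanics in the list above concrete, here is a minimal Python
sketch. It is not the code from the PR. It assumes each version stores only
the units added and removed relative to the previous version, and that
deleting a version folds its changes into the following one. The names
Repository, RepoVersion, new_version, content, and delete_version are
hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RepoVersion:
    """One version: units added and removed relative to the previous version."""
    number: int
    added: set = field(default_factory=set)
    removed: set = field(default_factory=set)


class Repository:
    """Hypothetical in-memory repository with a linear version history."""

    def __init__(self):
        self.versions = []
        self._next_number = 1

    def content(self, number=None):
        """Replay deltas up to `number` (default: latest) to get the content set."""
        units = set()
        for version in self.versions:
            if number is not None and version.number > number:
                break
            units -= version.removed
            units |= version.added
        return units

    def new_version(self, desired):
        """Record a sync: store only the difference from the latest content."""
        current = self.content()
        desired = set(desired)
        version = RepoVersion(self._next_number, desired - current, current - desired)
        self._next_number += 1
        self.versions.append(version)
        return version

    def delete_version(self, number):
        """Delete a version by squashing its delta into the following version."""
        index = next(i for i, v in enumerate(self.versions) if v.number == number)
        doomed = self.versions.pop(index)
        if index < len(self.versions):
            nxt = self.versions[index]
            # A unit added by one version and removed by the other cancels out.
            added = (doomed.added - nxt.removed) | (nxt.added - doomed.removed)
            removed = (doomed.removed - nxt.added) | (nxt.removed - doomed.added)
            nxt.added, nxt.removed = added, removed


repo = Repository()
repo.new_version({"a", "b"})       # version 1
repo.new_version({"b", "c"})       # version 2: adds c, removes a
repo.new_version({"a", "c", "d"})  # version 3: adds a and d, removes b
repo.delete_version(2)             # version 3 now describes the change from 1
```

After the deletion, the latest content set is unchanged, and version 3
reports its additions and removals relative to version 1.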

I also created a set of tests to help prove that it behaves correctly:

https://gist.github.com/mhrivnak/69af54063dff7465212914094dff34c2

I have just about 12 hours of recent work into it, and the code is
PR-ready. It's just missing doc updates and release notes. It's been
difficult to keep discussion moving toward a full plan due to the
uncertainties mentioned above, so hopefully this can alleviate those
concerns and give everyone something concrete to look at.

https://github.com/pulp/pulp/pull/3228
https://github.com/pulp/pulp_file/pull/20

Two notable items are missing. One is that there is no way to arbitrarily
add and remove content from a repo now, since this removes the
"repositorycontent" endpoint. But we need to solve that with a more formal
and bulk add/remove API anyway. I also found that the "repositorycontent"
endpoint was not using tasks, and thus there was no repo locking, so it
needed additional work anyway. Based on this overall effort, I think it
will be very easy to add if we just agree on what the endpoints should look
like.
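
As a rough illustration of why a bulk add/remove call fits versioned repos
better than one-unit-at-a-time changes, here is a small Python sketch. It is
an assumption-laden toy, not a proposal for the endpoint: VersionedRepo and
modify are hypothetical names, each version is stored as a full set for
brevity, and a plain threading.Lock stands in for whatever repo locking the
task system would provide.

```python
import threading


class VersionedRepo:
    """Hypothetical repo where every modification creates one new version."""

    def __init__(self):
        self._lock = threading.Lock()  # stand-in for task-based repo locking
        self.versions = [frozenset()]  # version 0 is the empty repository

    def modify(self, add=(), remove=()):
        """Apply a whole batch of changes and return the new version number."""
        with self._lock:
            latest = self.versions[-1]
            new = (latest - frozenset(remove)) | frozenset(add)
            self.versions.append(new)
            return len(self.versions) - 1


# One bulk call produces a single new version.
bulk = VersionedRepo()
bulk_number = bulk.modify(add=["a", "b", "c"])

# Adding the same units one by one produces a version per unit.
one_by_one = VersionedRepo()
for unit in ["a", "b", "c"]:
    one_by_one.modify(add=[unit])
```

Both repos end with the same content, but the second carries two extra
intermediate versions that nobody asked for.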

The other is that, in this PR, publish does not accept a reference to a
version. It always uses the latest. That would also be a very easy
enhancement to make.
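
A sketch of what that enhancement could look like, in plain Python rather
than the actual publish code: the publish function, its version_number
parameter, and the dict of versions are all hypothetical. Omitting the
version keeps the current behavior of using the latest.

```python
def publish(versions, version_number=None):
    """Return a publication record for one repository version.

    `versions` maps version numbers to sets of content units.
    """
    if not versions:
        raise ValueError("repository has no versions to publish")
    if version_number is None:
        # Default: publish the latest version.
        version_number = max(versions)
    if version_number not in versions:
        raise KeyError(f"unknown version: {version_number}")
    return {"version": version_number, "units": sorted(versions[version_number])}


versions = {1: {"a", "b"}, 2: {"b", "c"}}
latest = publish(versions)                    # uses version 2
pinned = publish(versions, version_number=1)  # uses version 1 explicitly
```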

I am happy to support getting this merged as I transition to being a more
passive community member, assuming there are no objections. I am also of
course happy to help support this into the future, as I believe strongly in
its value and importance (see previous thread).

Please provide feedback and questions. If a live meeting this week would
help expedite evaluation of this effort, I'm happy to schedule that. And
assuming there are no hard objections, I'm happy to proceed with
documentation updates.

Thanks!

-- 

Michael Hrivnak

Principal Software Engineer, RHCE

Red Hat